Seminar: Web Scraping
Overview
Web scraping, the automated extraction of data from websites, has emerged as a powerful tool for gathering structured and unstructured data from the internet. Vast amounts of valuable information are embedded within web pages, for instance real estate listings, customer reviews, job postings, and social media data.
Web scraping enables individuals and businesses to harness this wealth of data for a wide range of research applications and business cases. Web scrapers are now widely used for tasks such as real-time data monitoring and market research; they are part of the day-to-day business of many companies and the foundation of many research projects in economics.
This hands-on seminar offers an opportunity to learn how to scrape data from websites in R. During the first part of the semester, students teach each other the necessary tools and skills in seminar presentations of roughly 35 minutes. During the second part, students implement a web scraper for a project of their choice and use the scraped data to address a chosen research question (possible examples: product price comparisons, analysis of text from speeches of monetary policy makers, features of job vacancies).
Organization
I teach this course in the summer term (in 2024, and potentially in the following years). The course language is English.
There will be weekly seminar sessions with student presentations during roughly the first two-thirds of the semester. After that, each participant implements a small web-scraping project and writes a project report (seminar paper) that has to be handed in at the end of the semester.
You earn 5 ECTS by passing the seminar. To pass, you have to hand in a seminar paper (60% of the grade) and present one topic in the seminar (40%).
Syllabus
More information about how to register for the seminar, grading, available topics, and the seminar in general can be found in the syllabus.