Olav ten Bosch MSIS, Dublin, 14-16 April 2014 On the use of internet robots for official statistics.


1 Olav ten Bosch MSIS, Dublin, 14-16 April 2014 On the use of internet robots for official statistics

2 Overview
– Why internet as a data source (IAD)?
– Internet robots, how do they work?
– Applications: airline tickets, housing market, clothing, robot-assisted data collection
– Conclusion

3 Why IAD? (1)
Administrative sources:
– Tax, social security services
– Municipalities / provinces
– Supermarkets
Surveys
Internet sources
Less!!!
Faster, better, more efficient
New indicators


5 Why IAD? (2)
Internet sources:
– Internet prices for CPI?
– Real estate sites for housing statistics?
– Internet vacancies for job statistics?
– Social media sentiment for consumer confidence?
– Trade in second-hand goods as economic indicators?
– Travel activity for tourism statistics?
Which content is original, reliable, stable, representative and accessible?

6 Robots / crawlers / bots / spiders / scrapers: how do they work? (1)
[Diagram: You → Commands → Browser → Requests → Internet → Website. The website returns code, images, style, data, etc., which the browser shows as graphical markup.]

7 Robots / crawlers / bots / spiders / scrapers: how do they work? (2)
[Diagram: You → Robot/spider/crawler → Requests, navigation → Internet → Website. The website returns code, images, style, data, etc.; the robot hands back data.]

8 Robots / crawlers / bots / spiders / scrapers: how do they work? (3)
[Diagram: You → Robot/spider/crawler → Requests, navigation → Internet → Website. The website returns code, images, style, data, etc.; the robot hands back data. Monitor actively.]
Generic software for:
– site navigation
– product details
– monitoring
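The request-and-extract cycle in these diagrams can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the software shown in the talk: the `price` class name, the `example-stat-robot/0.1` user agent, the `SAMPLE_PAGE` markup and the helper names `fetch` and `extract_prices` are all assumptions made up for the example.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen

# Hypothetical page fragment; a real site will use different markup.
SAMPLE_PAGE = """
<ul>
  <li><span class="name">Ticket A</span> <span class="price">129.00</span></li>
  <li><span class="name">Ticket B</span> <span class="price">84.50</span></li>
</ul>
"""


class PriceParser(HTMLParser):
    """Collects the text of elements whose class attribute contains 'price'."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "price" in classes:
            self._in_price = True

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current price element.
        self._in_price = False

    def handle_data(self, data):
        if self._in_price and data.strip():
            self.prices.append(data.strip())


def fetch(url, user_agent="example-stat-robot/0.1"):
    """Request a page the way a browser would, but without rendering it."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_prices(html):
    """Turn the returned markup into data: a list of prices as floats."""
    parser = PriceParser()
    parser.feed(html)
    return [float(p) for p in parser.prices]


print(extract_prices(SAMPLE_PAGE))
```

On a live site, `extract_prices(fetch(url))` would combine the two steps; the parsing rule would have to be adapted per site, which is presumably where the generic navigation and product-detail software mentioned on the slide comes in.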

9 Airline tickets (1) Robot collection versus manual collection

10 Airline tickets (2) Price of a ticket over time

11 Housing market (1)

12 Housing market (2) The dynamics of the database behind the site become visible

13 Clothing (1)

14 Clothing (2)
2 sites: very volatile data
Seasonal pattern
Challenges:
– From volatile data to stable statistics
– How to classify multiple less-structured data sources
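One simple way to picture the step from volatile observations to a steadier series is a rolling median, which damps isolated spikes. The slides do not say which method was used, so this is only an illustrative sketch; the function name `rolling_median`, the window size and the sample values are assumptions.

```python
from statistics import median


def rolling_median(values, window=3):
    """Median over the current and up to window-1 preceding observations."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        median(values[max(0, i - window + 1): i + 1])
        for i in range(len(values))
    ]


# Hypothetical daily average prices with one outlier on the second day.
daily_prices = [10, 50, 12, 11, 13]
print(rolling_median(daily_prices, window=3))
```

A median is used here rather than a mean because a single extreme observation then has much less influence on the smoothed value.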

15 Robot-assisted data collection (1)
– Use case: few price observations on many sites
– Example: price of a cinema ticket
– Robot tool to automatically check whether prices have changed
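The change check in this use case can be sketched as a comparison between the previous run and the current one, so that a human collector only has to visit the sites that are flagged. This is a hedged illustration, not the tool from the talk: the function name `changed_sites`, the site keys and the prices are hypothetical.

```python
def changed_sites(previous, current):
    """Return sites whose observed price differs from the previous run,
    or for which no price could be read this time."""
    flagged = []
    for site, old_price in previous.items():
        new_price = current.get(site)
        if new_price is None or new_price != old_price:
            flagged.append(site)
    return sorted(flagged)


# Hypothetical observations from two consecutive robot runs.
previous_run = {"cinema-a": "9.50", "cinema-b": "8.00", "cinema-c": "10.00"}
current_run = {"cinema-a": "9.50", "cinema-b": "8.50"}

print(changed_sites(previous_run, current_run))
```

Sites with a missing observation are flagged as well, since a page that can no longer be read is just as much a reason for a manual look as a changed price.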

16 Robot-assisted data collection (2)

17 Conclusion
– Using the internet as a data source, we can measure statistical phenomena in a completely different way
– It is powerful to combine fast internet data with reliable (but slower) administrative data
– We should redesign statistics with the possibilities of internet data in mind
Challenges:
– Legal framework
– The internet changes continuously: how to turn volatile data sources into reliable statistics?
– We need advanced statistical methods, processes and IT

