Job Vacancies Experiment Boro Nikić Satellite workshop on Big Data, NTTS 2015.


1 Job Vacancies Experiment Boro Nikić Satellite workshop on Big Data, NTTS 2015

2 Job Vacancies experiment (1)
- Idea for the experiment: Rome Workshop (May 2014)
- Started by identifying websites that advertise jobs and searching for APIs available for those websites
- The UNECE Task Team consisted of representatives from Austria, Hungary, Italy, the Netherlands, Sweden and Slovenia

3 Job Vacancies experiment (2)
Goals:
- Overview of the methodologies used by national statistical institutes (NSIs) to calculate job vacancy (JV) statistics
- Identification of possible web scraping tools
- Determination of a Big Data (BD) methodology for calculating JV statistics
- Testing of the BD quality indicators proposed by the UNECE Quality Task Team

4 Overview of the methodologies of calculation of JV statistics at NSIs
EU regulation requires quarterly statistics on JV data to be published:
- Totals of advertised JVs at national level
- Totals for domains defined by size of units
- Totals for domains defined by NACE activity groups
Documents on the Wiki: http://www1.unece.org/stat/platform/pages/viewpageattachments.action?pageId=100303739&metadataLink=true
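The three kinds of totals listed on this slide can be illustrated with a minimal sketch. It assumes each advertised vacancy has already been linked to a unit with a known NACE group and number of employees; the record layout, the field names and the size-class boundaries are illustrative assumptions, not taken from the presentation.

```python
from collections import Counter

# Hypothetical records: one per advertised vacancy, already linked to a unit.
# Field names and values are invented for illustration only.
vacancies = [
    {"nace_section": "C", "employees": 12},
    {"nace_section": "C", "employees": 250},
    {"nace_section": "G", "employees": 3},
    {"nace_section": "M", "employees": 48},
]


def size_class(employees):
    """Assign an assumed size class from the number of employees."""
    if employees < 10:
        return "1-9"
    if employees < 50:
        return "10-49"
    if employees < 250:
        return "50-249"
    return "250+"


def jv_totals(records):
    """Return the national total plus totals per size class and per NACE group."""
    by_size = Counter(size_class(r["employees"]) for r in records)
    by_nace = Counter(r["nace_section"] for r in records)
    return {
        "national": len(records),
        "by_size": dict(by_size),
        "by_nace": dict(by_nace),
    }


totals = jv_totals(vacancies)
```

Counting one record per advertised vacancy is a simplification; an advertisement that covers several posts would need a weight per record.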

5 Identification of web scraping tools
Tools:
- http://www.irobotsoft.com/
- https://www.kimonolabs.com

6 Aim of the Irobot tool
IRobotSoft for Visual Web Scraping
IRobotSoft is a visual Web robot software for Web scraping and Web automation. With IRobotSoft, you can scrape tons of data from the deep Web with a single click! You don't need to have computer skills to do this! IRobotSoft is for Everyone! Follow our discussions and become a Web geek!
- for novice data collectors
- for Web testers
- for data experts
Link: http://www.irobotsoft.com/

7 Basic Steps
1. Define the name of the Irobot
2. Define the name of the Task
3. Copy and paste the link of the desired website into the URL field
4. Start Recording Actions
5. Give names to the „scraped“ variables
6. Save the variables
7. Use the option „Repeat Property“
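The steps above are carried out in the IRobotSoft interface. For readers who prefer code, the same idea (name the variables to extract, then repeat the extraction for every listing on the page) can be sketched with Python's standard `html.parser`. This is not how IRobotSoft works internally; the sample markup, the class names and the field names are all hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical markup of a job listing page; class names are assumptions.
SAMPLE_PAGE = """
<div class="job"><span class="employer">AVIAT d.o.o.</span><span class="town">Trzin</span></div>
<div class="job"><span class="employer">ARENDA d.o.o.</span><span class="town">Ljubljana</span></div>
"""


class JobParser(HTMLParser):
    """Collect the named fields of every repeated 'job' block."""

    FIELDS = {"employer", "town"}  # the "scraped" variables

    def __init__(self):
        super().__init__()
        self.rows = []
        self._field = None  # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "div" and cls == "job":
            self.rows.append({})      # a new repeated item starts
        elif tag == "span" and cls in self.FIELDS:
            self._field = cls         # next text node belongs to this field

    def handle_data(self, data):
        if self._field and self.rows:
            self.rows[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


parser = JobParser()
parser.feed(SAMPLE_PAGE)
rows = parser.rows
```

A real page would be fetched over HTTP and would need its own selectors; the inline sample keeps the sketch self-contained.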

8 Determination of BD methodology of calculation of JV statistics (1)
- Cleaning of data
- Methodology for the replacement of existing statistics (at the level of the NSI)
- Methodology for the calculation of new statistics (at the level of the NSI)
- Methodology for the calculation of new statistics (at international level)
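The slide does not describe the cleaning rules used in the experiment. As a rough sketch of what a cleaning step for scraped employer records might involve, the example below normalizes names, drops records without a name and removes duplicates. The rules and the record layout are assumptions made for illustration.

```python
import re


def normalize_name(name):
    """Uppercase, join abbreviations such as 'd.o.o.', drop other punctuation."""
    name = name.upper().replace(".", "")
    name = re.sub(r"[^\w\s]", " ", name)  # \w is Unicode-aware, so Š, Č, Ž survive
    return re.sub(r"\s+", " ", name).strip()


def clean(records):
    """Drop records without a name and keep one record per (name, town) pair."""
    seen = set()
    cleaned = []
    for rec in records:
        key = (normalize_name(rec["name"]), rec["town"].strip().upper())
        if not key[0] or key in seen:
            continue
        seen.add(key)
        cleaned.append({"name": key[0], "town": key[1]})
    return cleaned


# Illustrative input: the second record duplicates the first, the third has no name.
scraped = [
    {"name": "Savatech, d.o.o.", "town": "Kranj"},
    {"name": "SAVATECH,  d.o.o.", "town": "kranj "},
    {"name": "", "town": "Trzin"},
    {"name": "VIP Virant d.o.o", "town": "Komenda"},
]
cleaned = clean(scraped)
```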

9 Interface with the parameters

10 Determination of BD methodology of calculation of JV statistics (2)
All the documentation about the experiment can be found at: http://www1.unece.org/stat/platform/pages/viewpageattachments.action?pageId=100303739&metadataLink=true
Document: Information which could be extracted from the Slovenian Websites and the proposed statistics for the job vacancies.doc

11 Determination of BD methodology of calculation of JV statistics (3)
One of the steps in the statistical processing of JV data is assigning the ID of the Legal Unit (LeU) from the Business Register. Linking the ID to the „scraped“ unit gives us the activity and the size of the LeU (according to its number of employees).
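A minimal sketch of this linking step, assuming an exact match on normalized name and town: once the ID is assigned, activity and size come from the register record. The register entries, IDs and employee counts below are invented placeholders, and the function names are hypothetical.

```python
def normalize(text):
    """Uppercase, join abbreviations such as 'd.o.o.', collapse whitespace."""
    return " ".join(text.upper().replace(".", "").replace(",", " ").split())


# Hypothetical Business Register extract; every value here is made up.
REGISTER = [
    {"id": "0000000001", "name": "Primer Ena d.o.o.", "town": "Kranj",
     "nace_code": "22.190", "employees": 120},
    {"id": "0000000002", "name": "Primer Dva, d.o.o.", "town": "Ljubljana",
     "nace_code": "68.200", "employees": 7},
]


def build_index(register):
    """Index register units by (normalized name, normalized town)."""
    return {(normalize(u["name"]), normalize(u["town"])): u for u in register}


def link(scraped, index):
    """Attach register ID, activity and size to a scraped unit, if it matches."""
    unit = index.get((normalize(scraped["name"]), normalize(scraped["town"])))
    if unit is None:
        return {**scraped, "id": None}  # left for fuzzy or manual matching
    return {**scraped, "id": unit["id"],
            "nace_code": unit["nace_code"], "employees": unit["employees"]}


index = build_index(REGISTER)
linked = link({"name": "PRIMER DVA d.o.o.", "town": "Ljubljana"}, index)
unlinked = link({"name": "Neznano d.o.o.", "town": "Trzin"}, index)
```

Exact matching only covers names that agree after normalization; units that stay unlinked would need approximate matching.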

12 „Scraped“ data
Name_LeU | Tel numb | Mob_numb | Town | Street | Streat_numb | Postal_code
AR PLANE d.o.o. | 03-809-4100 | 040 383840 | Bistrica ob Sotli | | |
Savatech, d.o.o. | | | Kranj | | |
ARENDA d.o.o. | | | Ljubljana | | |
Knauf Insulation d.o.o | 04 5114 219 | | Škofja Loka | Trata | 32 | 4220
AVIAT d.o.o. | | | Trzin | | |
VIP Virant d.o.o | | | Komenda | | |

13 „Matched“ data
iskani | Name_LeU | Town_BR | id | complete_nmae | nace_code | adress VID dist1
1 | AR PLANE d.o.o. | BISTRICAOBSOTLI | | | |
1 | AR PLANE d.o.o. | ZAGAJ | 3290476000 | AR PLANE, korporacijsko upravljanje in pravna pisarna, d.o.o. | 70.220 | 14742380
1 | APLANE d.o.o. | SOLKAN | 3307611000 | Letalska družba APLANE d.o.o. | 30.300 | 10342698
1 | ARTPLANET | SLOVENSKABISTRICA | 3498417000 | ARTPLANET, zavod za razvoj umetnosti, kulture in kakovosti življenja, Slovenska Bistrica | 72.200 | 15
1 | ARTPLAN, d.o.o. | KRANJ | 6188265000 | ARTPLAN, proizvodnja in trgovina d.o.o. | 31.010 | 242989121
1 | ARPLAN, ANŽE REZAR s.p. | PROSENIŠKO | 3761843000 | ARPLAN, projektiranje, inženiring, svetovanje in storitve v gradbeništvu, ANŽE REZAR s.p. | 71.129 | 231547425
1 | AL PLANET, Dejan Janež s.p. | SEŽANA | 3356892000 | AL PLANET, Stavbno pohištvo iz aluminija, Dejan Janež s.p. | 25.120 | 93079126
1 | AR-AL NET d.o.o. | ČENTIBA | 6072526000 | AR-AL NET, trgovina in posredništvo d.o.o. | 47.910 | 28
1 | ARTLINE d.o.o. | MENGEŠ | 5333644000 | ARTLINE, studio za oblikovanje, d.o.o. | 73.110 | 141705528
2 | Savatech, d.o.o. | KRANJ | | | |
2 | SAVATECH d.o.o. | KRANJ | 1661205000 | SAVATECH družba za proizvodnjo in trženje gumenotehničnih proizvodov in pnevmatike, d.o.o. | 22.190 | 24045550
2 | SAITECH d.o.o. | CELJE | 5311292000 | SAITECH podjetje za trgovino in storitve d.o.o. | 43.290 | 142836321
2 | SAVA TMC, d.o.o. | LJUBLJANA | 1893718000 | SAVA TURIZEM - TMC, podjetje za upravljanje dejavnosti turizem, d.o.o. | 70.100 | 258532521
2 | ASTECH d.o.o. | LOGATEC | 1661078000 | ASTECH d.o.o., Inženiring in servisiranje strojnih instalacij | 43.220 | 161796525
2 | AVTECH D.O.O. | VIDRGA | 3282058000 | AVTECH, SVETOVANJE, ZASTOPSTVO, PROIZVODNJA, D.O.O. | 70.220 | 28455225
2 | SANOTECHNIK d.o.o. | MARIBOR | 5850908000 | SANOTECHNIK trgovsko podjetje d.o.o. | 46.730 | 149014927
3 | ARENDA d.o.o. | LJUBLJANA | | | |
3 | ARENDA d.o.o. | LJUBLJANA | 1629417000 | ARENDA, nepremičninska družba, d.o.o. | 68.200 | 12425480
3 | OPTIKA ARENA d.o.o. | MARIBOR | 1873512000 | OPTIKA ARENA, družba za trgovino in storitve d.o.o. | 47.781 | 49998110
3 | PEKARNA ARENA d.o.o. | LJUBLJANA | 3918076000 | PEKARNA ARENA, pekarstvo in trgovina, d.o.o. | 10.710 | 231348810
3 | ARENA SERVIS d.o.o. | OSLUŠEVCI | 6318797000 | ARENA SERVIS, izposojanje šotorov, šankov in gostinske opreme ter gostinske storitve, d.o.o. | 77.390 | 10
3 | ADENDA d.o.o. | MIREN | 5743729000 | ADENDA d.o.o. grafične storitve in oblikovanje | 18.130 | 136558016
3 | AGENDA d.o.o. | MARIBOR | 5656222000 | AGENDA komunikacijski in informacijski inženiring d.o.o. | 62.020 | 16318716
3 | RANDA d.o.o. | LJUBLJANA | 6011624000 | RANDA gradbeništvo, storitve in prevozi d.o.o. | 41.200 | 189049620
3 | AGENDA 2003 d.o.o. | LJUBLJANA | 1824775000 | AGENDA 2003 premoženjsko svetovanje in računovodske storitve d.o.o. | 69.200 | 6384924
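The „Matched“ data slide lists, for each scraped name, several candidate units from the Business Register together with a distance score (dist1). The transcript does not say which distance measure was used, so the sketch below uses plain Levenshtein edit distance as a stand-in; its scores will not reproduce the dist1 values on the slide. The function names are hypothetical, and the candidate names are taken from the slide.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def rank_candidates(query, candidates, limit=3):
    """Return the closest register names as (distance, name), best first."""
    scored = [(levenshtein(query.upper(), c.upper()), c) for c in candidates]
    return sorted(scored)[:limit]


candidates = [
    "ARENDA d.o.o.",
    "ADENDA d.o.o.",
    "AGENDA d.o.o.",
    "RANDA d.o.o.",
    "OPTIKA ARENA d.o.o.",
]
ranked = rank_candidates("ARENDA d.o.o.", candidates)
```

A distance of 0 marks an exact name match. Ties, such as ADENDA and AGENDA here, would still need the town or address to decide which unit to link.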

14 Testing the BD quality indicators proposed by Quality Team
The quality framework consists of three quality hyperdimensions: input, throughput and output.
http://www1.unece.org/stat/platform/pages/viewpageattachments.action?pageId=101158888&metadataLink=true

15 Conclusions (1)
BD could be used as a source:
- for new types of statistics
- for existing statistics
- for validation of existing statistics
In the case of scraping JV data:
- Change of the mode of collection
- Validation of data collected in the traditional way (administrative sources, questionnaire)
- Flash statistics

16 Conclusions (2)
Before the JV BD source is employed in regular statistical production, the scraping tools, the data manipulation procedures and the resulting statistics must be carefully tested over a period of at least one year in order to ensure the stability of the sources and the statistics.
More about the experiment can be found at http://www1.unece.org/stat/platform/display/BDP/Sandbox+Task+Team

17 Thank you for your attention!

