Removing Duplicate Job Ads

Presentation transcript:

ESSNet Pilot: WP1 - Web Scraping for Job Vacancy Statistics - UK update 7th November 2016, Rome

Removing Duplicate Job Ads
Diagram: Job Portal > Concatenated list > Deduplicate > Final deduplicated list
1. Create common variable list: Job_title, Job_description, Location_city, Location_region, Date_posted, Enterprise name
2. Clean data: e.g. " .NET Developer - Stoke-On-Trent - £35-£40K "
3. Run dedup to produce candidate matches
4. Active learning step (manual coding of > 100 records)
5. Rerun to automatically remove “duplicate” job ads
A lot of high quality training data needed to work effectively!
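Steps 2 and 3 can be illustrated with a minimal sketch. It is not the dedup tool named on the slide and it leaves out the active learning step. It only assumes that salary ranges and the location are stripped from the job title, and that candidate matches are pairs of ads in the same city with very similar cleaned titles. The function names, the salary pattern, and the 0.85 threshold are all hypothetical.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations

# Assumed pattern for salary text such as "£35-£40K" or "£30,000".
SALARY = re.compile(r"£\s*\d[\d,.]*k?(?:\s*-\s*£?\s*\d[\d,.]*k?)?", re.IGNORECASE)


def clean_title(raw, location=None):
    """Strip salary text, the location, and dangling separators from a job title."""
    text = SALARY.sub(" ", raw)
    if location:
        text = re.sub(re.escape(location), " ", text, flags=re.IGNORECASE)
    text = re.sub(r"\s+", " ", text)   # collapse runs of whitespace
    return text.strip(" -").lower()    # drop leftover " - " separators at the ends


def candidate_matches(ads, threshold=0.85):
    """Return (i, j, score) for pairs of ads that look like duplicates.

    Ads are only compared within the same city (a simple blocking rule),
    then scored on the similarity of their cleaned titles.
    """
    cleaned = [clean_title(ad["job_title"], ad["location_city"]) for ad in ads]
    pairs = []
    for i, j in combinations(range(len(ads)), 2):
        if ads[i]["location_city"].lower() != ads[j]["location_city"].lower():
            continue
        score = SequenceMatcher(None, cleaned[i], cleaned[j]).ratio()
        if score >= threshold:
            pairs.append((i, j, round(score, 2)))
    return pairs


ads = [
    {"job_title": " .NET Developer - Stoke-On-Trent - £35-£40K ", "location_city": "Stoke-On-Trent"},
    {"job_title": ".Net Developer", "location_city": "Stoke-on-Trent"},
    {"job_title": "Staff Nurse", "location_city": "Stoke-On-Trent"},
]
pairs = candidate_matches(ads)
```

In this sketch a fixed threshold stands in for the trained model. In the workflow on the slide, the manually coded records from step 4 would decide which candidate pairs count as duplicates.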

Conceptual model for measuring job vacancies from on-line sources
Diagram labels: ‘Ghost’ Vacancies; Target Population: All job vacancies; Employing business is identifiable; Advertised through an agency; Advertised on a job portal; Advertised on enterprise website

Current Workplan
Matching job ad counts by advertising business from five portals to JV survey
Focusing on 1300 largest reporting units (approx 33% of all JVs)
About 25% can be matched easily
Need more BR data?
Manual matching of residuals?
What about smaller enterprises?
Single location enterprises may be easier?
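The "easy" matches could be found along the following lines. This is a sketch under stated assumptions, not the method used in the pilot: advertiser names are normalised by lower-casing, dropping punctuation and removing trailing legal suffixes, and job ad counts are then joined to reporting units on the exact normalised name. The suffix list, function names and example enterprises are hypothetical.

```python
import re
from collections import Counter

# Assumed list of legal-form suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"ltd", "limited", "plc", "llp"}


def normalise_name(name):
    """Lower-case, drop punctuation, and remove trailing legal suffixes."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def match_counts(ad_names, reporting_units):
    """Count job ads per advertiser and join them to reporting units.

    Returns (matched, residuals): matched maps each reporting unit to its
    job ad count, residuals lists the units left for manual matching.
    """
    ad_counts = Counter(normalise_name(n) for n in ad_names)
    matched, residuals = {}, []
    for unit in reporting_units:
        key = normalise_name(unit)
        if key in ad_counts:
            matched[unit] = ad_counts[key]
        else:
            residuals.append(unit)
    return matched, residuals


ad_names = ["Acme Widgets Ltd.", "ACME WIDGETS LIMITED", "Example Retail plc", "Unknown Co"]
reporting_units = ["Acme Widgets Limited", "Example Holdings Ltd"]
matched, residuals = match_counts(ad_names, reporting_units)
```

Units that end up in the residuals list correspond to the cases the slide flags for more BR data or manual matching.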

Data collection

CEDEFOP
Pilot system for online vacancy analysis
4.2 million job vacancies (5 countries: UK, Ireland, Germany, Italy, Czech Republic)
May 2016: Training session
June 2016: Access agreement to online analysis system => Initial assessment: Limited functionality
August 2016: Agreement to supply underlying data: very large file delivered requiring bespoke database solution … not what we were expecting!
Latest: CEDEFOP tendering to undertake further work…

CEDEFOP tender

Review of UK Standard Occupational Classification (SOC)
UK SOC (UK version of ESCO)
Last reviewed in 2010
Public consultation supported by analysis of new occupations (including 2011 Census)
Proposal is to use bulk job titles (and skills) scraped from job portals
Benefits:
High volume and up-to-date data to inform SOC review
New job titles to enhance SOC coding frame
Deduplication, text mining and coding methods could be developed and applied to WP1 ESSNet pilot
SOC Pilot focusing on job titles in IT sector

Current Challenges
Staffing
API limitations
Working across two environments
Resources for doing supervised learning and matching
Paperwork, meetings and conferences!

Looking ahead
Data scientist recruited last week!
Continue with work plan (plus SOC review)
Re-engagement with CEDEFOP (and successful bid?)
Engagement with job portal owners?
Preparing for end of SGA-1 technical report