Crawling, Parsing and Semantic Matching of Vacancies and CV’s Semantic Recruitment Technology Jakub Zavrel, Textkernel InGRID Workshop 11-2-2014.

Presentation transcript:

1 Crawling, Parsing and Semantic Matching of Vacancies and CV’s Semantic Recruitment Technology Jakub Zavrel, Textkernel InGRID Workshop 11-2-2014

2 Textkernel
- Spinoff from R&D in machine learning and language technology
- Founded 2001; offices in Amsterdam (HQ), Frankfurt, Paris; 45 employees; strong R&D focus
- Deloitte Fast 50 2007, 2010; 30% YoY growth
- Core technology: understanding unstructured text data, multi-lingual
- Market: job boards, recruitment software, staffing and recruitment, mobility, large employers
- Products:
  - Multi-lingual tools (15 languages) to extract CVs and jobs
  - Jobfeed: largest real-time DB for job market analysis
  - Search! & Match! to connect people and jobs
- Customers: UWV, Pole Emploi, Adecco, Randstad, USG, Monster, Stepstone, XING, SAP, Unisys, Bosch, Axa, Philips, etc. (350 direct, 2000+ indirect)
- Large partner network (HR & recruitment software)

3 Language gap
Employee:
- I like programming, but I'm interested in taking on more project management responsibility.
- Is there a job in our organisation that better fits my degree?
- I'd like to work on our mobile strategy. I've helped a friend develop a mobile app.
- I'd like to do more with my organisational talent.
Employer:
- We are looking to hire: an experienced tech team lead
- The ideal candidate has:
  - min. 5 yr of experience
  - Certified ScrumMaster
  - Exp. w/ iOS, Android
- Completed academic studies in Computer Science or related
- 30% travel for customer presentations

4 The Job ad searches directly in a database and identifies relevant candidates (or vice versa) …

5

6 Extract! CV/Job Parsing: automatically convert each document into a complete record

7 Extract!

8

9

10

11 Extract! – Zero data entry job application

12 Extract!

13 Extract!
- Time savings when coding CVs and jobs
- If you accept noise, 100% time savings
- Structured data allows better search: semantic searching and matching
- Coding enables reporting and statistics

14 Occupation coding!
- Coding follows extraction
- Customer-specific or standard taxonomies
- String-similarity-based normalization
- Many synonyms per language
- Distance = confidence
- Problem cases: ambiguity, context, long tail
- More complex models can help (classifiers, multi-variate models)
- Semantic matching works better (occupation coding errors are counterbalanced by other variables)
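
The string-similarity normalization on slide 14 could look roughly like the sketch below. It is an illustration only, not Textkernel's implementation: the taxonomy codes, the `normalize_title` helper, and the 0.6 threshold are hypothetical, and Python's standard `difflib.SequenceMatcher` stands in for whatever similarity measure is actually used. The synonyms for the administrative entry come from the example on slide 24; the Java synonyms are invented for the demo.

```python
from difflib import SequenceMatcher

# Hypothetical mini-taxonomy: occupation code -> known synonyms.
# The codes are made up; real taxonomies hold thousands of codes.
TAXONOMY = {
    "ADMIN_ASSISTANT": [
        "administrative assistant",
        "administrative employee",
        "assistant clerk",
        "office support",
    ],
    "JAVA_DEVELOPER": [
        "java developer",
        "java software engineer",
        "j2ee developer",
    ],
}


def normalize_title(raw_title, taxonomy=TAXONOMY, min_confidence=0.6):
    """Map a free-text job title to the code of its closest synonym.

    Returns (code, confidence). The similarity ratio (0.0 to 1.0) doubles
    as the confidence; below min_confidence the code is None.
    """
    query = raw_title.lower().strip()
    best_code, best_score = None, 0.0
    for code, synonyms in taxonomy.items():
        for synonym in synonyms:
            score = SequenceMatcher(None, query, synonym).ratio()
            if score > best_score:
                best_code, best_score = code, score
    if best_score < min_confidence:
        return None, best_score
    return best_code, best_score


if __name__ == "__main__":
    for title in ["Administrative Assistant", "Senior Java Developer", "zzz"]:
        print(title, "->", normalize_title(title))
```

A purely string-based matcher like this has no notion of context, which is where the ambiguity and long-tail problem cases on the slide come from.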

15 Search! Semantic search: "Lets you find what you mean, not what you type" Impression...

16 CV Parsing Job Parsing Match!

17 Semantic Matching Technology:
- Natural Language Processing
- Machine Learning
- Semantic Analysis
- Probabilistic Language Model
- Search Engine
- Multi-lingual taxonomies
- Recruitment knowledge-bases

18 Demo

19 Jobfeed: search and analyse real-time online job ads as well as historical data

20

21 Jobfeed!
Knowledge of all demand for labour in the European job market
– Sales leads for recruitment and staffing companies
– Real-time labour market analytics tools
– Largest database of jobs for matching the unemployed
– Perfect data source for text mining

22 Jobfeed!
- Real-time collection of online job ads from any (unstructured) source
- Available in NL, DE, FR, IT
- Gradually rolling out in the rest of Europe
- Richly semantically structured data

23 Jobfeed!

24 Jobfeed: Multilingual Occupation Taxonomy
Occupations:
- >4000 codes
- 4 languages
- 3-layer hierarchy
- >50K synonyms
Link to other concepts:
- Skills
- Education level
- Sector
- O*NET
- UWV (Dutch Employment Agency)
- ROME
Based on millions of jobs, years of customer feedback and experience!
Example:
- NL: administratief medewerker; EN: administrative assistant; FR: employé administratif; DE: Verwaltungsassistent (m/w)
- Group: administrative personnel
- Class: Administration and Customer Service
- Synonyms: administrative employee, assistant clerk, office support
- Skills: MS Office, Excel, English language, etc.
- O*NET: 43-9199.00: Office and Administrative Support Workers, All Other
- UWV: 1000402563: Administratief medewerker secretariaat

25 Demo

26 Jobfeed as material for Research

27

28

29

30

31 Frequent words for "Java developer":
en van de een je met in het Java of Je op is voor te ervaring aan als and software om team zijn kennis bij Ervaring die the naar a jaar jij bent Developer HBO hebt to werken werk

32 Frequent words for all professions:
en van de een in het je met op Je voor te is of zijn aan bent naar bij om als ervaring die Het hebt deze werken zoek De wij functie onze ben tot over werk opleiding uit and werkzaamheden dat binnen u Als Voor zelfstandig kennis ook s verantwoordelijk

33 Solution: contrast frequencies
- Observed frequency of w: O(w) = A
- Expected frequency of w: E(w) = C * B / D
- Pick words with highest score: score(w) = (O(w) - E(w))^2 / E(w)

                       | Java developer jobs | All jobs
# jobs where w occurs  | A                   | B
Total # jobs           | C                   | D
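
The scoring on slide 33 can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the code behind Jobfeed: the function names and the toy job ads are hypothetical, and each job is assumed to be already tokenized into a list of words. A, B, C and D follow the slide's table.

```python
from collections import Counter


def contrast_scores(target_jobs, all_jobs):
    """Score each word by (O - E)^2 / E, following slide 33.

    target_jobs: token lists for one occupation (e.g. Java developer jobs)
    all_jobs:    token lists for the full reference collection
    """
    C = len(target_jobs)  # total number of target jobs
    D = len(all_jobs)     # total number of jobs overall
    # Document frequencies: in how many jobs does each word occur?
    df_target = Counter(w for job in target_jobs for w in set(job))
    df_all = Counter(w for job in all_jobs for w in set(job))

    scores = {}
    for w, A in df_target.items():
        B = df_all[w]
        expected = C * B / D  # E(w)
        if expected == 0:
            continue  # word never seen in the reference collection
        scores[w] = (A - expected) ** 2 / expected
    return scores


def top_words(target_jobs, all_jobs, n=10):
    """Return the n highest-scoring words, best first."""
    scores = contrast_scores(target_jobs, all_jobs)
    return sorted(scores, key=scores.get, reverse=True)[:n]


if __name__ == "__main__":
    # Toy data, invented for the demo.
    all_jobs = [
        ["java", "en", "de"],
        ["java", "spring", "en"],
        ["verkoop", "en", "de"],
        ["zorg", "en", "de"],
    ]
    java_jobs = all_jobs[:2]
    print(top_words(java_jobs, all_jobs, n=3))
```

Because the difference is squared, the score is symmetric: a word that occurs much less often than expected in the target jobs also scores high, which may explain why some function words still show up among the top words.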

34 Top words for "Java developer":
java developer software spring scrum agile hibernate ontwikkelaar u j2ee development maven applicaties ervaring web de frameworks jboss mbo senior wij xml jee o javascript you kennis ontwikkelen oracle ontwikkeling architectuur webservices informatica werkzaamheden technologie developers eclipse bezit het team wo rijbewijs technieken tomcat the vca zelfstandig architect werklocatie html
Building rich skills profiles for thousands of occupations from millions of real-time jobs… new trends and occupations…

35 Supply & Demand
- Have: lots of data, technology, ideas
- Want: labor market expertise, students, research

36 Semantic Recruitment Technology Thanks!

