iCrawl – Master Thesis and Hiwi Jobs

1 iCrawl – Master Thesis and Hiwi Jobs
Context
  iCrawl Project – A novel approach for the creation of high-quality Web archives
  Easy to use and extensible Web archive crawler framework
  Usable also by non-technicians
User Interface
  Key component to interact with the crawler
  Setting up crawls
  Maintaining and monitoring crawls
  Quality assurance of crawls
Thomas Risse 08/11/18

2 Master Thesis: Crawl Specification Wizard
Problem Statement
  The quality of a Web archive depends on the quality of the crawl specification
  Crawl specifications for focused crawls are complex and hard to define (initial starting points, good descriptions of terms, entities, etc.)
  Crawl specifications are similar to search engine queries but more complex
Aim of the Master Thesis
  Development of a semi-automatic tool that learns the intention of a crawl
  Based on a set of reference pages or on search engine results
  Iterative and interactive process
  Requires analysis and extraction of information from Web pages
Requirements
  Interest in doing cool things in the context of a research project
  A “feeling” for good design and user friendliness
  Programming skills in Java
Contact: Thomas Risse (L3S)

3 Master Thesis: Entity-centric Linked Data Crawler
Topic
  Development of an entity-centric Linked Data crawler
  Automatic collection of metadata for Linked Data sources to enable crawler prioritization
  Integration of the crawler with the iCrawl platform for integrated crawling of Web pages and Linked Data
Requirements
  Good grades in the IR-related courses
  Good programming skills in Java
  Interest in research-oriented projects
Contact: Elena Demidova

4 Hiwi Job in the context of Web Archiving
Topic
  User interface development for setting up, maintaining, and monitoring crawls
  Easy to use (also for non-computer scientists)
  Near-real-time information
Requirements
  Interest in doing cool things in the context of a research project
  A “feeling” for good design and user friendliness
  Programming skills in Java
Contact: Thomas Risse (L3S)

