iCrawl – Hiwi Jobs and Master Thesis


Context

iCrawl Project: a novel approach for the creation of high-quality Web archives
- Easy-to-use and extensible Web archive crawler framework
- Usable also by non-technicians

User Interface: key component to interact with the crawler
- Setting up crawls
- Maintaining and monitoring crawls
- Quality assurance of crawls

Thomas Risse, 08/12/18

Hiwi Job in the Context of Web Archiving

Topic
- User interface development for setting up, maintaining, and monitoring crawls
- Easy to use (also for non-computer scientists)
- Near-real-time information

Requirements
- Interest in doing cool things in the context of a research project
- A "feeling" for good design and user friendliness
- Programming skills in Java

Contact: Thomas Risse (L3S), risse@L3S.de

Master Thesis: Crawl Specification Wizard

Problem Statement
- The quality of a Web archive depends on the quality of the crawl specification
- Crawl specifications for focused crawls are complex and hard to define (initial starting points, good descriptions of terms, entities, etc.)
- Crawl specifications are similar to search engine queries but more complex

Aim of the Master Thesis
- Development of a semi-automatic tool that learns the intention of a crawl
- Based on a set of reference pages or on search engine results
- Iterative and interactive process
- Requires analysis and extraction of information from Web pages

Requirements
- Interest in doing cool things in the context of a research project
- A "feeling" for good design and user friendliness
- Programming skills in Java

Contact: Thomas Risse (L3S), risse@L3S.de
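
To make the thesis aim more concrete, one step of such a wizard could be sketched as follows: given the text of a few reference pages, suggest candidate terms for the crawl specification that the user can then accept or reject. This is only an illustrative sketch, not the iCrawl implementation. The class name `CrawlSpecSketch`, the method names, the stopword list, and the ranking by document frequency are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch: suggests crawl-specification terms from reference pages.
public class CrawlSpecSketch {

    // Tiny illustrative stopword list (an assumption, not a real resource).
    private static final Set<String> STOPWORDS =
            Set.of("the", "and", "for", "with", "that", "this", "are", "was");

    // Counts in how many reference pages each term occurs (document frequency).
    static Map<String, Integer> documentFrequency(List<String> pages) {
        Map<String, Integer> df = new HashMap<>();
        for (String page : pages) {
            Set<String> seen = new HashSet<>();
            // Split on anything that is not a letter or digit.
            for (String token : page.toLowerCase().split("[^\\p{L}\\p{Nd}]+")) {
                if (token.length() > 2 && !STOPWORDS.contains(token)) {
                    seen.add(token);
                }
            }
            for (String term : seen) {
                df.merge(term, 1, Integer::sum);
            }
        }
        return df;
    }

    // Returns the k terms shared by most pages; ties are broken alphabetically.
    static List<String> suggestTerms(List<String> pages, int k) {
        return documentFrequency(pages).entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed()
                        .thenComparing(Map.Entry.comparingByKey()))
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> referencePages = List.of(
                "Election results in Hannover",
                "Hannover election candidates",
                "Weather in Hannover");
        // Terms the wizard would propose to the user in this iteration.
        System.out.println(suggestTerms(referencePages, 2));
    }
}
```

In an iterative and interactive process, the user's accepted and rejected terms would feed back into the next round, for example by adding or removing reference pages. Extracting entities or the initial starting points would need additional analysis that this sketch does not cover.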