Geographic Web Information Retrieval Alexander Markowetz, University of Marburg Thomas Brinkhoff, FH Oldenburg Bernhard Seeger, University of Marburg.


2 Current Situation In Web-IR
- Everybody is online
- But never seen

3 Current Situation In Web-IR
- Queries are too short
- Result sets are too large
- You can effectively block your competitors
- Good results get buried
- Smaller results
- Ways to drill the iceberg

4 Solutions
- Personalized Search
- Dynamic/Interactive Search

5 Geographic Web-IR
- Location is the most personal property
- "All business is local"
- People already use the web geographically
  - "Yoga Brooklyn"
  - "Linux usergroup Frankfurt"
- And get poor results
- We are going to make that a lot better

6 How-Not-To
- Semantic Web
  - "If just everybody included geographic markup in their web pages"
- Two problems
  - Chicken-and-egg
  - Malicious webmasters
  - Meta tags, anyone?
- Bottom line
  - The Semantic Web is for "B2B" situations only.

7 How-To
- Modify traditional IR techniques to extract geographic markers
- Multigranular approach
- Extending basic Web-IR
  - Map pages to geographic positions (footprint)
  - Aggregate and cluster them
- Build applications
  - Geographic Search
  - Geographic Web-Mining

8 Geocoding
- Footprint
  - Geographic position of a web page
  - Set of points and polygons, associated with some amplitude
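The footprint described above can be sketched as a small data structure. This is only an illustration: the class names, the use of latitude/longitude pairs, and the reading of "amplitude" as a non-negative weight are assumptions, not details given in the slides.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Point:
    lat: float
    lon: float


@dataclass(frozen=True)
class Polygon:
    vertices: tuple  # tuple of Point objects, in drawing order


@dataclass
class Footprint:
    # Each entry pairs a shape (Point or Polygon) with an amplitude.
    # Assumption: the amplitude is a non-negative confidence weight.
    entries: list = field(default_factory=list)

    def add(self, shape, amplitude):
        if amplitude < 0:
            raise ValueError("amplitude must be non-negative")
        self.entries.append((shape, amplitude))

    def total_amplitude(self):
        return sum(amplitude for _, amplitude in self.entries)

    def strongest(self):
        """Return the shape with the highest amplitude, or None if empty."""
        if not self.entries:
            return None
        return max(self.entries, key=lambda entry: entry[1])[0]
```

A page that mentions one city strongly and a surrounding region weakly would then carry one high-amplitude point and one low-amplitude polygon.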

9 Preliminaries
- Basic IR assumptions can easily be extended to "geographic IR"
  - Radius-1 Hypothesis
  - Radius-2 Hypothesis (co-citation)
  - Intra-Site Hypothesis
    - Intra-subdomain
    - Intra-directory

10 Multigranularity
- Information extraction on different levels
  - Domain
  - Subdomain
  - Directory
  - File
- Need to aggregate
[Diagram labels: Dom, SDom, Dir, File]
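One way to picture the aggregation step is a weighted sum of the evidence found at each of the four levels. The sketch below is an assumption-laden illustration: the level weights, the simplified footprint (a dict from place name to amplitude), and the naive domain splitting are all hypothetical and not taken from the slides.

```python
from urllib.parse import urlparse

# Hypothetical weights: evidence found closer to the file counts more.
LEVEL_WEIGHTS = {"domain": 0.1, "subdomain": 0.2, "directory": 0.3, "file": 0.4}


def levels(url):
    """Split a URL into lookup keys for the four levels."""
    parts = urlparse(url)
    host = parts.netloc.lower()
    labels = host.split(".")
    domain = ".".join(labels[-2:])  # naive: ignores suffixes such as co.uk
    directory = host + parts.path.rsplit("/", 1)[0] + "/"
    return {
        "domain": domain,
        "subdomain": host,
        "directory": directory,
        "file": host + parts.path,
    }


def aggregate(url, evidence):
    """Combine per-level footprints into one footprint for a page.

    evidence maps (level, key) to a dict of place name -> amplitude.
    """
    combined = {}
    for level, key in levels(url).items():
        for place, amplitude in evidence.get((level, key), {}).items():
            weighted = LEVEL_WEIGHTS[level] * amplitude
            combined[place] = combined.get(place, 0.0) + weighted
    return combined
```

With these weights, a place name found in the file itself outweighs a place inferred only from the domain, but both contribute to the page's footprint.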

11 Sources
- On all levels
  - Names of places
  - Zip codes
  - Area codes
- On site level
  - Whois
  - Business directories
- Links
  - Density over a given area
  - Radius-1 and Radius-2
- Geospatial Mapping and Navigation of the Web, Kevin S. McCurley, 10th WWW, 2001
- Computing Geographical Scopes of Web Resources, J. Ding, L. Gravano, and N. Shivakumar, VLDB 2000
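Extracting the text-level markers listed above (place names, zip codes, area codes) might look roughly like the following. The lookup tables are tiny illustrative samples and the regular expressions assume German-style five-digit postal codes and area codes with a leading zero; a real system would need a full gazetteer and disambiguation, which this sketch does not attempt.

```python
import re

# Tiny illustrative lookup tables (assumed sample data, not a real gazetteer).
PLACES = {"marburg", "frankfurt"}
ZIP_TO_PLACE = {"35037": "marburg", "60311": "frankfurt"}
AREA_CODE_TO_PLACE = {"06421": "marburg", "069": "frankfurt"}

ZIP_RE = re.compile(r"\b\d{5}\b")        # five-digit postal code
AREA_RE = re.compile(r"\b0\d{2,4}\b")    # leading zero plus 2 to 4 digits
WORD_RE = re.compile(r"[a-zäöüß]+")


def extract_markers(text):
    """Count how often each known place is hinted at in the text."""
    hits = [w for w in WORD_RE.findall(text.lower()) if w in PLACES]
    hits += [ZIP_TO_PLACE[z] for z in ZIP_RE.findall(text) if z in ZIP_TO_PLACE]
    hits += [
        AREA_CODE_TO_PLACE[a]
        for a in AREA_RE.findall(text)
        if a in AREA_CODE_TO_PLACE
    ]
    counts = {}
    for place in hits:
        counts[place] = counts.get(place, 0) + 1
    return counts
```

The counts could then serve as raw amplitudes before evidence from Whois records, business directories, and links is added.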

12 Geographic Search
- A simple interface
- Not so exciting, but...
[Search form: Key Words, City, Street, State, Area code; SEARCH button]

13 Dynamic Geographic-IR
- Replacing the "next" button
[Interface mock-ups: buttons Closer, Continue, Wider, Next; radius options ½ mile, 1 mile, 2 miles, 5 miles, 10 miles, 25 miles, 100 miles]
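The "Closer" and "Wider" controls can be read as steps along the ladder of radii shown on the slide. The following is a minimal sketch under that assumption; the function names, the tuple layout of the results, and the use of the haversine great-circle distance are illustrative choices, not the authors' implementation.

```python
import math

RADII_MILES = [0.5, 1, 2, 5, 10, 25, 100]  # radius options from the slide
EARTH_RADIUS_MILES = 3958.8


def closer(radius):
    """Step to the next smaller radius, stopping at the smallest."""
    i = RADII_MILES.index(radius)
    return RADII_MILES[max(i - 1, 0)]


def wider(radius):
    """Step to the next larger radius, stopping at the largest."""
    i = RADII_MILES.index(radius)
    return RADII_MILES[min(i + 1, len(RADII_MILES) - 1)]


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within(results, center, radius):
    """Keep the names of results (name, lat, lon) inside the radius."""
    return [
        name
        for name, lat, lon in results
        if distance_miles(center[0], center[1], lat, lon) <= radius
    ]
```

Pressing "Wider" would re-run `within` with `wider(radius)`, while "Next" would page through results at the current radius.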

14 Locality
- Final ranking is a (linear) combination of importance and geographic distance.
- Chances are:
  - Amazon will still rank first, no matter where you are
  - Amazon is a "global bully"
- Idea:
  - Eliminate global bullies by computing importance differently
  - Give less weight to links that span a longer distance
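The two ideas on this slide, a linear combination for the final ranking and distance-damped link weights, can be illustrated as follows. The damping function, the `scale` and `alpha` parameters, and the use of a plain in-link sum instead of a full link analysis are assumptions made for the sketch.

```python
def link_weight(distance_miles, scale=50.0):
    """Weight of a link, shrinking as the link spans a longer distance."""
    return 1.0 / (1.0 + distance_miles / scale)


def local_importance(inlink_distances, scale=50.0):
    """Sum of distance-damped in-link weights.

    A simple stand-in for link analysis; each entry is the distance
    in miles between the linking page and the linked page.
    """
    return sum(link_weight(d, scale) for d in inlink_distances)


def final_score(importance, distance_to_user, alpha=0.5, scale=50.0):
    """Linear combination of importance and geographic closeness."""
    closeness = 1.0 / (1.0 + distance_to_user / scale)
    return alpha * importance + (1.0 - alpha) * closeness
```

Under these assumptions, a site with ten in-links that each span 1000 miles gets a lower importance than a shop with three in-links from 5 miles away, which is the intended effect on "global bullies".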

15 Evaluation
- Evaluating Web-IR is hard
- Evaluating geo-search is even harder
- Mistakes are hard to find

16 Impact of geo-IR
- Next generation Search Engine
- Location based Service
  - For cellphones under UMTS
- Move traffic from A&E
  - Local companies will get more traffic
- Increase Profits from Adwords
  - Smallest businesses will advertise online
  - Locally focused
- The "Leaflet-industry" will shrink

17 Geographic Web-Mining
- The web reflects human society.
  - Distorted
  - Delayed/Ahead
- A lot of interesting social questions can be answered by looking at a large web crawl
- You can save time and money compared to door-to-door surveys
- This is widely used
- But:
  - Most of them are of geographic nature

18 Example Queries
- Where in Germany are vintage sneakers a trend?
- Is there a fashion authority that is accepted in all regions of Germany?
- Do Britney and Madonna have the same audience?
- Draw a map of Germany with all sites about vintage sneakers.
- Find all fashion-sites that get a min of 1000 equally distributed links.
- Map the areas in Germany, where there are significantly more Sites for B. than for M.
- Precise Semantics?

19 Current Work
- Older Prototype
  - Metasearch on top of lycos.de
  - Screen-scrape & re-order
  - Whois only
  - Did very well

20 Current Work
- Current Prototype for Geographic Search
  - Limited to Germany (= .de domains)
  - Pages
  - Expected online by late summer
- In co-operation with
  - Yen-Yu Chen
  - Xiaohui Long
  - Torsten Suel
  - Polytechnic University, Brooklyn

21 Reinventing Web-IR
- Nearly no (academic) work in geo-IR
- Almost every aspect of Web-IR needs to be looked at again
  - Interfaces
  - Query processing
  - Index distribution
  - Link analysis
  - User profile analysis
  - Spam detection
- Even:
  - Other aspects of personalized search
  - Changes in the web

22 Thank you
- Any questions?