Wenxu Li & Ziming Zhai: Deepin Search

Motivation
- Google gives you the results that are best for everyone, but maybe not the best for you.
- Besides keyword match, you may also care about site speed, site quality, or the category a site belongs to.
- It would be great if users could create their own ranking methods.

Our Approach
1. Retrieve the first 24 results from Google.
2. Send requests to Amazon Alexa Services to get detailed information about each result URL.
3. Use our formula to calculate the score of each criterion for each URL.
4. Allow the user to change the weight of each criterion.
5. Re-rank the results based on the final scores.

Main Functions
- Customized rank: the user can use scroll bars to give weights to five criteria: Speed, Quality, Popularity, Date Created, Keyword Match.
- View detailed information for each URL: the user can view the general description of each URL, 3 months of traffic information, and related sites.
- Display results by category: we allow the results to be shown in categories.

Amazon Alexa Services
- URL Info
  - Related Links, Categories, LinksInCount
  - Rank, RankByCountry, RankByCity
  - UsageStats, Speed, Keyword, SiteData
  - ContactInfo, AdultContent, Language, OwnedDomains
- SitesLinkingIn
- CategoryListings: DMOZ; returns a list of sites within a given category
- Traffic History (since ...): Rank, Reach, PageView

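Calculate Ranking Scores
- Formula: $S = U_s S_{\text{speed}} + U_t S_{\text{time}} + U_q S_{\text{quality}} + U_p S_{\text{popularity}} + U_d S_{\text{google}}$, where each $U$ is the weight the user gives to that criterion
- Normalization: $S_{\text{quality}} = (\text{value} - \min) / (\max - \min)$
- Popularity score: $\text{Reach} \times \text{PageView}$
- Quality score: $0.5 \times S_{\text{LinksInCount}} \times S_{\text{PageView}}$
- $S_{\text{google}}$: dummy variable (keeps the Google ranking)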
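
The scoring slide can be read as a small re-ranking routine. The sketch below is one possible interpretation in Python, not the project's PHP code. It assumes that every raw value is "higher is better", that min-max normalization is applied to all five criteria (the slide shows it only for quality), and that the Google dummy variable is fed as `n - position` so that earlier Google results score higher. The names `min_max`, `popularity_raw`, `quality_score`, and `rerank` are hypothetical.

```python
# Criteria from the formula: S = sum of U_c * S_c over these five terms.
CRITERIA = ("speed", "time", "quality", "popularity", "google")


def min_max(values):
    """Normalize values to [0, 1] using (value - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:  # all values equal: avoid dividing by zero
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def popularity_raw(reach, page_view):
    """Popularity score from the slide: Reach * PageView."""
    return reach * page_view


def quality_score(s_links_in_count, s_page_view):
    """Quality score from the slide: 0.5 * S_LinksInCount * S_PageView."""
    return 0.5 * s_links_in_count * s_page_view


def rerank(urls, raw, weights):
    """Return (url, score) pairs sorted by weighted score, best first.

    raw maps each criterion to a list of values aligned with urls;
    weights maps each criterion to the user-chosen weight U.
    """
    normalized = {c: min_max(raw[c]) for c in CRITERIA}
    scores = [
        sum(weights[c] * normalized[c][i] for c in CRITERIA)
        for i in range(len(urls))
    ]
    # sorted() is stable, so ties keep the incoming (Google) order.
    order = sorted(range(len(urls)), key=lambda i: scores[i], reverse=True)
    return [(urls[i], scores[i]) for i in order]


# Hypothetical data for three results, in Google order.
urls = ["a.example", "b.example", "c.example"]
raw = {
    "speed": [1, 2, 3],
    "time": [5, 5, 5],
    "quality": [quality_score(0.2, 1.0), quality_score(1.0, 0.8), quality_score(0.5, 0.5)],
    "popularity": [popularity_raw(10, 1), popularity_raw(10, 3), popularity_raw(10, 2)],
    "google": [3, 2, 1],  # n - position, so the first Google hit is highest
}
only_google = {"speed": 0, "time": 0, "quality": 0, "popularity": 0, "google": 1}
only_speed = {"speed": 1, "time": 0, "quality": 0, "popularity": 0, "google": 0}
```

With all weight on the Google term the original order is preserved, which is the role the slide assigns to the dummy variable; shifting the weight to speed moves the fastest site to the top.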

Architecture

Implementation
- jQuery + PHP + MySQL
- AJAX + JSON + XML
- Hosted on GoDaddy
- Amazon Alexa: cost $1.6 so far ($0.15 per 1000 requests)
- Use a hash (inverted index) to index URLs
- Use a trie structure to organize URLs in categories
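
The last two implementation points can be illustrated with a short sketch. It is written in Python rather than the project's PHP, and it assumes that categories are slash-separated paths such as `Top/Computers/Internet` and that the trie is keyed on path segments; the slides do not say how the trie is keyed. The class `CategoryTrie`, the function `add_result`, and the example URLs are hypothetical.

```python
class CategoryTrie:
    """Trie keyed on category path segments; each node holds its URLs."""

    def __init__(self):
        self.children = {}
        self.urls = []

    @staticmethod
    def _parts(category_path):
        return [p for p in category_path.split("/") if p]

    def insert(self, category_path, url):
        node = self
        for part in self._parts(category_path):
            node = node.children.setdefault(part, CategoryTrie())
        node.urls.append(url)

    def urls_under(self, category_path=""):
        """Return all URLs stored at or below the given category."""
        node = self
        for part in self._parts(category_path):
            node = node.children.get(part)
            if node is None:
                return []
        found, stack = [], [node]
        while stack:  # iterative walk over the subtree
            current = stack.pop()
            found.extend(current.urls)
            stack.extend(current.children.values())
        return sorted(found)


def add_result(url_index, trie, url, info, categories):
    """Index one result: hash lookup by URL plus trie entries per category."""
    url_index[url] = info
    for category in categories:
        trie.insert(category, url)


url_index = {}  # plain dict used as the hash index (assumption)
trie = CategoryTrie()
add_result(url_index, trie, "http://a.example", {"speed": 1200}, ["Top/Computers/Internet"])
add_result(url_index, trie, "http://b.example", {"speed": 800}, ["Top/Computers"])
add_result(url_index, trie, "http://c.example", {"speed": 950}, ["Top/Arts"])
```

Looking up `trie.urls_under("Top/Computers")` collects both the URL filed directly under that category and the one filed under its `Internet` subcategory, which is the kind of grouping needed to display results by category.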

Performance
- Each query (everything on the fly):
  - 5*3 connections to Google
  - 24*5 connections to Amazon Alexa
- GoDaddy limits the number of connections, and each query actually makes more than 200 connection requests.
- Use Ajax to split one big task into 6 tasks, each dealing with only one kind of information.
- Store retrieved information in the database and update it regularly; this also saves money.
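
The effect of storing retrieved information can be sketched as a cache in front of the remote lookups. This Python sketch uses an in-memory dict as a stand-in for the project's MySQL table, and it treats "update regularly" as a maximum age per entry; both are assumptions. `CachedLookup`, `alexa_cost_usd`, and the `fetch` callback are hypothetical names, and no real Alexa API call is shown.

```python
import time


class CachedLookup:
    """Cache remote lookups per (url, kind) and refetch once they go stale."""

    def __init__(self, fetch, max_age_seconds, clock=time.time):
        self.fetch = fetch            # callable(url, kind) doing the remote request
        self.max_age = max_age_seconds
        self.clock = clock            # injectable so staleness can be tested
        self.store = {}               # (url, kind) -> (fetched_at, value)
        self.remote_calls = 0

    def get(self, url, kind):
        key = (url, kind)
        now = self.clock()
        hit = self.store.get(key)
        if hit is not None and now - hit[0] < self.max_age:
            return hit[1]             # fresh enough: no remote request
        value = self.fetch(url, kind)
        self.remote_calls += 1
        self.store[key] = (now, value)
        return value


def alexa_cost_usd(requests, price_per_1000=0.15):
    """Cost at the price quoted in the slides ($0.15 per 1000 requests)."""
    return requests * price_per_1000 / 1000


# 24 results times 5 kinds of information per uncached query.
requests_per_query = 24 * 5
cost_per_uncached_query = alexa_cost_usd(requests_per_query)
```

At the quoted price, the 120 Alexa requests of one uncached query cost roughly $0.018, so a repeated query answered from the store avoids both that cost and the corresponding connections against the host's limit.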

Demo