CMo: When Less Is More Yevgen Borodin Jalal Mahmud I.V. Ramakrishnan Context-Directed Browsing for Mobiles.

Miniaturization and Mobility

Mobile Web

Regular Web Sites

Happy Scrolling

Browsing Example

Mobile Browsing Problems
- Data Transfer Cost is High
- Connection is Slow
- Small Screens
- Lots of Scrolling
  - Time-Consuming
  - Strenuous
  - Tiring

Browsing With CMo

Architecture (diagram labels: CMo Proxy Server, Interface Manager, Context Analyzer, Browser Object, Geometric Analyzer)

First Problem: identifying significant frames
- CMo HTTP proxy
- Utilizes Mozilla to parse DOM
  - Get a tree of "frames"
  - Tag these by content: "link", "text", "image link", …
- Identify "maximal semantic blocks"
  - Discard leaves
  - Look for all X- or Y-aligned blocks

Context Collection: The Page is Segmented into 5 Blocks

Next Problem: identifying context of links
- User has clicked somewhere
- What is the context?
- Possible ideas
  - The text of the link itself
  - The surrounding text (in the HTML stream)
  - The surrounding text (on the page)
- CMo looks at the nearby text
  - … only if it has something to do with the link text

Next Problem: identifying context of links
- Link text parsed into 1-, 2-, 3-grams
- "Rice not ruling out talks with Iranians" ->
  - Rice, not, ruling, out, with, Iranians
  - Rice ruling, ruling out, …
  - Rice ruling out, ruling out talks, …
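The n-gram step can be sketched in a few lines of Python. This is only an illustration, not CMo's code: the function name `ngrams` and the `stopwords` parameter are hypothetical, and the slide does not say how tokens are filtered. Its bigram "Rice ruling" suggests that "not" was dropped, so the usage line below passes it as a stop word.

```python
from typing import Iterable, List, Set


def ngrams(text: str, max_n: int = 3, stopwords: Iterable[str] = ()) -> List[Set[str]]:
    """Return [1-grams, 2-grams, ..., max_n-grams] as sets of space-joined strings.

    The stop-word filter is an assumption; the slides do not specify one.
    """
    skip = {w.lower() for w in stopwords}
    tokens = [t for t in text.split() if t.lower() not in skip]
    return [
        {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
        for n in range(1, max_n + 1)
    ]


# Hypothetical usage on the slide's headline.
uni, bi, tri = ngrams("Rice not ruling out talks with Iranians", stopwords=["not"])
```

With "not" filtered out, `bi` contains "Rice ruling" and "ruling out", and `tri` contains "Rice ruling out" and "ruling out talks", as on the slide.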

Next Problem: identifying context of links
- Perform similar analysis on sibling blocks
- Calculate cosine similarity between m-sets
  - Cardinality of the intersection
  - Divided by the product of the square roots of the two sets' cardinalities
  - USA, news, sports | USA, world -> .4
  - USA, news | USA, world -> .5
  - USA, news | USA, news -> 1
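The set-based cosine measure described above is small enough to write out directly. The sketch below assumes each m-set is an ordinary Python `set` of strings; the name `set_cosine` is made up for illustration.

```python
import math
from typing import Set


def set_cosine(a: Set[str], b: Set[str]) -> float:
    """|a ∩ b| divided by sqrt(|a|) * sqrt(|b|); 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


# The three examples from the slide.
s1 = set_cosine({"USA", "news", "sports"}, {"USA", "world"})  # 1 / sqrt(6), about 0.41
s2 = set_cosine({"USA", "news"}, {"USA", "world"})            # 1 / 2
s3 = set_cosine({"USA", "news"}, {"USA", "news"})             # 2 / 2
```

Rounded to one decimal place, the three values agree with the slide's .4, .5, and 1.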

Context Collection (figure: blocks M1 and M2 in the two cases Cos(M1, M2) > T and Cos(M1, M2) < T)
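One plausible reading of the threshold test is that a sibling block joins the link's context only when its similarity to the link text exceeds T. The sketch below follows that reading. It is an assumption rather than the documented CMo procedure: the function names and the value of T are hypothetical, and each sibling is compared against the link terms only.

```python
import math
from typing import Iterable, Set


def set_cosine(a: Set[str], b: Set[str]) -> float:
    """Set-based cosine: |a ∩ b| / sqrt(|a| * |b|)."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def collect_context(link_terms: Set[str],
                    sibling_blocks: Iterable[Set[str]],
                    threshold: float) -> Set[str]:
    """Start from the link's terms and absorb siblings with Cos(M1, M2) > T."""
    context = set(link_terms)
    for block in sibling_blocks:
        if set_cosine(link_terms, block) > threshold:
            context |= block  # similar enough: merge into the context
    return context


# Hypothetical usage; T = 0.4 is an arbitrary illustrative value.
context = collect_context(
    {"USA", "news"},
    [{"USA", "world"}, {"weather", "radar"}],
    threshold=0.4,
)
```

Here the first sibling (similarity .5) is merged and the second (similarity 0) is left out.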

Last Problem: where to zoom at target
- Break target page into frames
- Compare each frame with context
- Metrics used:
  - Words, 2-, 3-grams matched exactly
  - Words, 2-, 3-grams that stem match
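The two metric families (exact and stemmed matches over words, 2-grams, and 3-grams) yield six counts per block. The sketch below computes them under stated assumptions: `toy_stem` is a crude suffix stripper standing in for a real stemmer such as Porter's, all names are hypothetical, and the stemmed counts include exact matches because the slides do not say whether they should be excluded.

```python
from typing import List, Set, Tuple


def toy_stem(word: str) -> str:
    """Very rough stand-in for a real stemmer: strip one common suffix."""
    w = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def _grams(tokens: List[str], n: int) -> Set[Tuple[str, ...]]:
    """All contiguous n-grams of a token list, as a set of tuples."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def match_features(context: str, block: str) -> Tuple[int, ...]:
    """Six counts: shared exact 1-, 2-, 3-grams, then shared stemmed 1-, 2-, 3-grams."""
    c_exact = context.lower().split()
    b_exact = block.lower().split()
    c_stem = [toy_stem(t) for t in c_exact]
    b_stem = [toy_stem(t) for t in b_exact]
    exact = [len(_grams(c_exact, n) & _grams(b_exact, n)) for n in (1, 2, 3)]
    stemmed = [len(_grams(c_stem, n) & _grams(b_stem, n)) for n in (1, 2, 3)]
    return tuple(exact + stemmed)


# Hypothetical usage: context words against the text of one target block.
features = match_features("ruling out talks", "Rice ruled out talk with Iranians")
```

In this example only "out" matches exactly, while stemming also aligns "ruling"/"ruled" and "talks"/"talk", so the stemmed counts are higher than the exact ones.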

Next Problem: where to zoom at target
- End up with a 6-tuple for each target block
- How to rank… Machine Learning!
- Supervised learning using SVM
  - Linear classifier
  - Maximizes distance from hyperplane (QP)
- 900 labeled examples, 100 unlabeled
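To make the ranking step concrete, here is a minimal pure-Python sketch of a linear SVM that scores 6-tuples and ranks blocks by score. It is not the authors' setup: it minimizes the regularized hinge loss by plain subgradient descent instead of solving the QP, and the training data, hyperparameters, and function names are all invented for illustration.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def train_linear_svm(X: List[Vector], y: List[int], lam: float = 0.01,
                     eta: float = 0.01, epochs: int = 300) -> Tuple[List[float], float]:
    """Subgradient descent on the L2-regularized hinge loss; labels are -1 or +1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [wj * (1.0 - eta * lam) for wj in w]  # regularization shrink
            if margin < 1.0:  # inside the margin: step toward the example
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
                b += eta * yi
    return w, b


def score(w: Vector, b: float, x: Vector) -> float:
    """Signed distance-like score; larger means more relevant."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def rank_blocks(w: Vector, b: float, blocks: List[Vector]) -> List[int]:
    """Indices of blocks ordered from highest to lowest score."""
    return sorted(range(len(blocks)), key=lambda i: score(w, b, blocks[i]), reverse=True)


# Made-up training data: 6-tuples of match counts, +1 = relevant block.
X = [(3, 2, 1, 4, 3, 2), (2, 1, 0, 3, 2, 1), (0, 0, 0, 1, 0, 0), (1, 0, 0, 1, 0, 0)]
y = [1, 1, -1, -1]
w, b = train_linear_svm(X, y)
order = rank_blocks(w, b, [(0, 0, 0, 1, 0, 0), (3, 2, 1, 4, 3, 2), (1, 0, 0, 1, 0, 0)])
```

The block with many n-gram matches ends up first in `order`. A production system would more likely use an established SVM library and a held-out set, as the slides describe.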

The Page is Segmented into 3 Blocks (figure: Features -> SVM -> Rank)

The Highest Ranking Block is Most Relevant! 0.8

- Exact Match of Context Words: Rice
- Exact Bigram Match: ruling talks
- Exact Trigram Match: Secretary State Condoleezza
- Match of Word Stems: rule
- Match of Stemmed Bigrams: talk Iranian
- Match of Stemmed Trigrams: Iranian offici confer

Experimental Setup
- Web Site Domains (5 Websites in Each)
  - News, Books, Consumer Electronics
  - Office Supplies, Informational
- 30 Graduate Students

Training SVM for Block Relevance
- Data Collection
  - Collected Pairs of Pages from 25 Web Sites
  - Labeled Data with Link, Context, Relevant Block
- Training SVM
  - Computed Features for 900 Pairs of Pages
  - Trained SVM Model with Feature Vectors
  - Used 100 Pages for Cross-Validation

Somewhat complicated procedure for training
- Classification of blocks on link targets
- Feeds back into the link context threshold

Evaluation
- Accuracy of Context Identification
- Accuracy of Relevant Block Identification
- Browsing Time with CMo vs. Regular Browser
- Number of Pen Taps with CMo

Evaluation: Context Collection
- Using 500 Web Pages from 25 Websites

Evaluation: Relevancy Detection
- SVM Model Trained Using 900 Page Pairs
- Testing Done with Remaining 100 Page Pairs

Evaluation
- Users perform news tasks such as (T1)
  - In Google News, find a given story
  - Click link to New York Times
  - Provide a specific piece of information contained in that story
- Other tasks were shopping-like (T8)
  - Go to Amazon
  - Click on "Pink ipod"
  - Determine its sales rank

Evaluation: Stylus Taps

Evaluation: Time

Future Work
- Porting CMo to Client Side
- Expand SVM Features
- Use Partitioning to Improve Segmentation
- Explore Navigation Options

Contributions
- Using Context to Find Relevant Information
- Saving Users Browsing Time
- Reducing the Number of Stylus Taps
- Conveying the Richness of Web Pages

Questions?