LinkSelector: Select Hyperlinks for Web Portals Prof. Olivia Sheng Xiao Fang School of Accounting and Information Systems University of Utah.


1 LinkSelector: Select Hyperlinks for Web Portals. Prof. Olivia Sheng, Xiao Fang. School of Accounting and Information Systems, University of Utah.

2 Agenda. Introduction; Problem definition (Hyperlink Selection); Solution (LinkSelector); Evaluation; Collaboration.

3 Introduction. Size of the WWW: more than 3 billion web pages (Google.com, 2001); 1 million pages added daily (Lawrence and Giles, 1999). How to find information on the Web: using search engines (best coverage 38.3%) (Lawrence and Giles, 1999); clicking through hyperlinks.

4 Introduction. [Figure: clicking through hyperlinks. Web Page 1 shows a product category list (A to F). Clicking on A opens Web Page 2, the product list for category A (A1 to A5). Clicking on A2 opens Web Page 3, the page for product A2 (price: 1000, detailed description).]

5 Introduction. A portal page is a specific web page that serves as the entrance to a website. A portal page is important and consists mainly of hyperlinks.

6 Introduction. A web portal is a personalized entrance to a website (e.g., My Yahoo!). Default web portal / portal page: most My Yahoo! users never customize their default web portals (Manber et al., 2000).

7 Introduction. Homepage of a website / portal page.

8 Introduction. Not all hyperlinks in a website can be placed on its portal page. Hyperlinks on a portal page are selected from a hyperlink pool, which is a set of hyperlinks pointing to top-level web pages (e.g., the hyperlinks in a site index page).

9 Portal page

10 Hyperlink pool

11 Portal page

12 Hyperlink pool

13 Introduction. Number of hyperlinks in a portal page: one to several dozen (e.g., 14 in My Yahoo!) (Nielsen, 1999). Number of hyperlinks in a hyperlink pool: one to several hundred (e.g., 102 in My Yahoo!).

14 Introduction. An exhaustive search is too computationally expensive (e.g., ). The current practice of hyperlink selection is expert selection: it is based on domain experts' experience, is subjective, and is slower to adapt.
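To give a sense of why exhaustive search is impractical, here is a minimal sketch that counts the candidate portal pages. It assumes the My Yahoo! figures quoted earlier in the deck (14 hyperlinks chosen from a pool of 102); the example the slide itself gave is missing from the transcript, so this pairing is an assumption.

```python
from math import comb

# Assumed example sizes: the My Yahoo! figures quoted earlier in the deck.
pool_size = 102    # hyperlinks in the hyperlink pool
portal_size = 14   # hyperlinks placed on the portal page

# Each candidate portal page is an unordered choice of portal_size links.
candidate_pages = comb(pool_size, portal_size)
print(f"{candidate_pages:.2e} candidate portal pages to evaluate")
```

Each candidate would also have to be scored against the web log, so the count above is a lower bound on the work involved.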

15 Introduction. Our approach is based on two kinds of patterns. Web access patterns extracted from a web log: objective (web surfers' actual visiting behaviors). Web structural patterns extracted from an existing website: objective and dynamically adaptive.

16 Hyperlink Selection. Metrics to measure the quality of a portal page: effectiveness, efficiency, and usage. The quality of a portal page is measured using a web log. A web log can be divided into sessions.

17 Hyperlink Selection. Effectiveness: the percentage of the user-sought top-level web pages that can be easily accessed from a portal page. Efficiency: the usefulness of the hyperlinks placed in a portal page. Usage: how often a portal page is visited.

18 Hyperlink Selection. Given the hyperlink pool of a website, HP, and the number of hyperlinks to be placed in the portal page of the website, N, where N < |HP|: construct the portal page by selecting N hyperlinks from the hyperlink pool HP. Objective: optimize the effectiveness, efficiency, and usage of the resulting portal page.

19 LinkSelector. LinkSelector is based on relationships between hyperlinks in a hyperlink pool: structure relationships and access relationships.

20 LinkSelector: Structure Relationship. [Figure: Web page 1, Web page 2, and Web page 3, containing hyperlinks L1 to L8.] Structure relationship: L1 → L2, where L1 is the initial hyperlink and L2 is the terminal hyperlink. Other structure relationships: L1 → L4, L1 → L6, L1 → L8, L3 → L5, L3 → L7.
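The relationships listed on this slide can be reproduced with a small sketch. It assumes one reading of the diagram that the transcript does not confirm: Web page 1 contains L1 and L3, L1 points to Web page 2 (which contains L2, L4, L6, L8), and L3 points to Web page 3 (which contains L5, L7). Under that reading, a structure relationship links an initial hyperlink to every hyperlink on the page it points to. The dictionaries and the function name are hypothetical.

```python
# Hypothetical encoding of the diagram (an assumed layout, not from the source).
page_links = {
    "page1": ["L1", "L3"],
    "page2": ["L2", "L4", "L6", "L8"],
    "page3": ["L5", "L7"],
}
link_target = {"L1": "page2", "L3": "page3"}  # page each hyperlink points to


def structure_relationships(page_links, link_target):
    """Return (initial, terminal) pairs: terminal sits on the page initial points to."""
    pairs = []
    for initial, target_page in link_target.items():
        for terminal in page_links.get(target_page, []):
            pairs.append((initial, terminal))
    return pairs


relationships = structure_relationships(page_links, link_target)
for initial, terminal in relationships:
    print(f"{initial} -> {terminal}")
```

In practice the two dictionaries would come from parsing the existing website rather than being written by hand.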

21 LinkSelector: Access Relationship. A k-HS denotes a hyperlink set with k hyperlinks; e.g., {L1, L2} is a 2-HS. The support of a k-HS is the percentage of sessions in which the hyperlinks in the k-HS are accessed together. Example: if L1 and L2 are accessed together in 20 sessions out of a total of 100 sessions, then the support of the 2-HS {L1, L2} is 20%.
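The support definition can be sketched in a few lines. The function name `support` and the representation of a session as a set of hyperlink labels are assumptions made for illustration; the toy data mirrors the 20-out-of-100 example on the slide.

```python
def support(hyperlink_set, sessions):
    """Fraction of sessions in which every hyperlink of the set is accessed."""
    wanted = set(hyperlink_set)
    if not sessions:
        return 0.0
    hits = sum(1 for session in sessions if wanted <= set(session))
    return hits / len(sessions)


# Toy log: 100 sessions, 20 of which access both L1 and L2.
sessions = [{"L1", "L2"}] * 20 + [{"L1"}] * 30 + [{"L3"}] * 50
s = support({"L1", "L2"}, sessions)
print(s)  # 0.2, i.e. 20%
```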

22 LinkSelector: Access Relationship. Definition: For a k-HS, where, there exists an access relationship among the hyperlinks in the k-HS if and only if its support is greater than a predefined threshold. Example: if threshold = 0.15 and the support of the 2-HS {L1, L2} is 0.2, then there exists an access relationship between hyperlinks L1 and L2, and the support of the relationship is 0.2.
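As an illustration of the threshold test, the sketch below finds every k-HS whose support exceeds the threshold. It uses brute-force counting over each session rather than an optimized frequent-itemset algorithm, and the function name and session representation are assumptions.

```python
from collections import Counter
from itertools import combinations


def access_relationships(sessions, threshold, k=2):
    """Map each k-HS with support strictly above threshold to its support."""
    counts = Counter()
    for session in sessions:
        # Count every k-sized subset of the hyperlinks accessed in this session.
        for hs in combinations(sorted(set(session)), k):
            counts[hs] += 1
    total = len(sessions)
    return {hs: n / total for hs, n in counts.items() if n / total > threshold}


# Toy log: {L1, L2} has support 0.2, {L1, L3} has support 0.1.
sessions = [{"L1", "L2"}] * 20 + [{"L1", "L3"}] * 10 + [{"L4"}] * 70
found = access_relationships(sessions, threshold=0.15)
print(found)  # {('L1', 'L2'): 0.2}
```

Only {L1, L2} clears the 0.15 threshold, matching the slide's example.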

23 LinkSelector. Discover structure relationships: parse the existing website. Discover access relationships: data preprocessing (web log cleaning, session identification), followed by association rule mining (Agrawal and Srikant, 1994).
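Session identification can be sketched as follows. The slides do not say how sessions were identified; this example assumes a common heuristic, in which a visitor's requests are split whenever the gap between consecutive requests exceeds 30 minutes. The record layout `(visitor, timestamp_seconds, url)` and the timeout value are both assumptions.

```python
from collections import defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds; assumed inactivity cutoff, not from the slides


def identify_sessions(records, timeout=SESSION_TIMEOUT):
    """Group (visitor, timestamp, url) records into per-visitor sessions."""
    by_visitor = defaultdict(list)
    for visitor, timestamp, url in records:
        by_visitor[visitor].append((timestamp, url))

    sessions = []
    for hits in by_visitor.values():
        hits.sort()  # chronological order within one visitor
        current = [hits[0][1]]
        for (prev_ts, _), (ts, url) in zip(hits, hits[1:]):
            if ts - prev_ts > timeout:
                # Long pause: close the current session and start a new one.
                sessions.append(current)
                current = []
            current.append(url)
        sessions.append(current)
    return sessions


records = [
    ("a", 0, "/x"),
    ("a", 100, "/y"),
    ("a", 5000, "/z"),  # more than 30 minutes after the previous request
    ("b", 50, "/x"),
]
sessions = identify_sessions(records)
print(sessions)
```

The resulting sessions are the input on which supports and access relationships are computed.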

24 LinkSelector

25 Evaluation: Summary of Data. Hyperlink pool: site-index page of the UA website, 110 links.

26 Evaluation: Summary of Data. Web log: collected from the UA web server in Sep. 2001. 10 M records (raw) → 4.2 M records (clean). 344 K sessions in total: 262 K sessions → training data (23 days); 82 K sessions → testing data (7 days).

27 Evaluation. Average improvement: 12.7%. Improvement decreases from 22.1% to 8.4%. Average number of sessions per day: 11.5 K.

28 Evaluation. Group II relationship: 0.2% of the training sessions. Group I relationship: /shared/sports-entertain.shtml → /shared/athletics.shtml

29 Evaluation. Average improvement: 17.0%. Improvement decreases from 30.2% to 9.4%. Each day, 605 more user-sought top-level web pages can be easily accessed from the portal page constructed using LinkSelector than from those constructed using the other two approaches.

30 Evaluation. Average improvement: 16.9%. Improvement decreases from 30.2% to 9.3%.
