Adaptive News Access Daniel Billsus Presented by Chirayu Wongchokprasitti.


Introduction
- The WWW is a common source for news access, anywhere and anytime.
- The constant availability of updated news content now overloads readers.
- Adaptive web technology can help discover relevant content from thousands of sources.

Types of Adaptive News Access
- News Personalization
- Adaptive News Navigation
- Contextual Recommendations
- News Aggregation

News Personalization
- Dynamic content
  - News stories are released and updated continuously.
  - Content-based methods are a good fit for news personalization.
  - Content-based methods predict a user's interest based on text alone.
- Changing interests
  - A user's interests tend to change frequently.
  - A user model should be able to adjust to new interests quickly.
  - Learning under a changing target concept is known as the concept drift problem.

News Personalization (cont.)
- Multiple interests
  - Users are interested in different news topics.
  - A user model must be capable of representing multiple interests.
  - k-nearest-neighbor (kNN) methods are good choices.
- Novelty
  - A new "unknown" story is considered most interesting.
  - A new story that is too close to what the user has previously accessed is classified as a "known" story.

News Personalization (cont.)
- Avoiding tunnel vision
  - Personalization should not get in the way of finding important novel information.
- Editorial input
  - A user model ranks stories by a prediction function.
  - Retaining editorial input is an important feature for news organizations.
  - This ensures that users will still get to see the top n stories.
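
One way to picture the editorial-input point is to pin the first n stories and personalize only the remainder. The sketch below assumes the "top n stories" are the first n entries of the editors' ordering, and that a prediction function has already produced a score per story; the function name and the data layout are hypothetical.

```python
def blend_editorial(editorial_order, scores, n):
    """Keep the first n editorially ranked stories in place, then sort the
    remaining stories by predicted interest (highest first).

    editorial_order: story ids in the editors' order (assumed input).
    scores: story id -> predicted interest; missing ids count as 0.0.
    """
    pinned = editorial_order[:n]
    # sorted() is stable, so stories with equal scores keep editorial order.
    rest = sorted(editorial_order[n:],
                  key=lambda story: scores.get(story, 0.0),
                  reverse=True)
    return pinned + rest
```

With n = 0 this reduces to pure personalization, and with n equal to the list length it reduces to the static editorial order.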

News Personalization (cont.)
- Brittleness
  - A single action, intentional or not, should not have a radical effect on a user model.
- Availability of meta-tags
  - News personalization algorithms usually cannot rely on the availability of meta-tags.

News Personalization (cont.)

Adaptive News Navigation
- The objective is to simplify access to relevant content.
- This technique analyzes a user's access patterns to determine the position of menu items within a menu hierarchy.
- This approach is well suited to mobile applications because of their limited screen space.
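
A minimal sketch of the idea, assuming a single flat menu level and a plain log of selected items (the slides describe a full menu hierarchy and do not specify the promotion rule, so the frequency ordering here is an illustrative stand-in):

```python
from collections import Counter


def reorder_menu(menu_items, access_log):
    """Move frequently selected menu items toward the top.

    menu_items: items in their default order.
    access_log: one entry per past selection (hypothetical log format).
    """
    counts = Counter(access_log)
    # Counter returns 0 for items never selected; the stable sort keeps
    # the default order among items with equal counts.
    return sorted(menu_items, key=lambda item: -counts[item])
```

Applying the same reordering at each level of a hierarchy would shorten the path to frequently used sections, which is the effect the slide describes.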

Adaptive News Navigation (cont.)
- On average, the number of menu-selection and scroll operations was reduced by over 50%.
- However, this approach does not provide any news recommendations.

Contextual Recommendations
- This approach treats the information currently displayed on the screen as an expression of the user's current interests.
- The system extracts the text on the user's screen and uses it to retrieve related content.
- Statistical term-weighting techniques are used to identify informative terms.
- Blinkx is a publicly available contextual recommender.
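
The term-weighting step can be sketched as follows. This is an assumed tf-idf variant scored against a small background collection, not the weighting used by any particular product; the tokenizer and function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def informative_terms(screen_text, background_docs, k=3):
    """Return the k terms of screen_text with the highest tf-idf weight.

    Terms that occur in every background document get weight 0, so
    common words such as "the" drop out of the query.
    """
    tf = Counter(tokenize(screen_text))
    doc_sets = [set(tokenize(doc)) for doc in background_docs]
    n_docs = len(doc_sets)

    def idf(term):
        df = sum(term in doc for doc in doc_sets)
        return math.log((1 + n_docs) / (1 + df))

    weights = {term: count * idf(term) for term, count in tf.items()}
    # Sort by weight (descending), breaking ties alphabetically.
    return sorted(weights, key=lambda term: (-weights[term], term))[:k]
```

The returned terms would then serve as the query for retrieving related stories.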

Contextual Recommendations (cont.)

News Aggregation
- News aggregators are services that aggregate content from many news sources and then adapt to the current news landscape as a whole.
- The services use RSS (Rich Site Summary) feeds to provide links to available content.
- A news aggregation implementation can use statistical term-weighting and text similarity techniques.
- Google News is one of these services.
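
As a rough illustration of the text-similarity step, the sketch below groups headlines from different sources that appear to cover the same story. It uses Jaccard word overlap and a greedy single pass as simple stand-ins; the threshold value and all names are assumptions, and real aggregators are likely to use weighted terms and more careful clustering.

```python
import re


def words(text):
    """Return the set of lowercase alphabetic tokens in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_headlines(headlines, threshold=0.4):
    """Greedily assign each headline to the first group whose first
    headline is similar enough, or start a new group otherwise."""
    groups = []
    for headline in headlines:
        for group in groups:
            if jaccard(words(headline), words(group[0])) >= threshold:
                group.append(headline)
                break
        else:
            groups.append([headline])
    return groups
```

The size of each group gives a crude signal of how widely a story is being covered across sources.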

News Aggregation (cont.)

Case Study
- Adaptive News Personalization for Mobile Content Access
- Learning User Models for News Access
- Evaluation

Adaptive News Personalization for Mobile Content Access
- The constraints of mobile information access make personalization important for producing usable applications.
- A mobile news system personalizes the ordering of its news sections so that the most relevant stories are displayed at the top.

Adaptive News Personalization for Mobile Content Access (cont.)

Learning User Models for News Access
- The system uses a machine learning approach to build a simple model of each user's interests.
- A combination of similarity-based and Bayesian methods balances learning and adapting quickly to changing interests against avoiding brittleness.

Learning User Models for News Access (cont.)
- These two algorithms form a multi-strategy learning approach that learns two separate user models:
  - a short-term interests user model
  - a long-term interests user model
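
The slides do not say how the two models are combined. One plausible arrangement, shown here purely as an assumption, is to consult the short-term model first and fall back to the long-term model when the short-term model has nothing to say about a story. Both models are passed in as callables that return a score in [0, 1] or None; these interfaces are hypothetical.

```python
def predict_interest(story, short_term, long_term, default=0.5):
    """Score a story with the short-term model, falling back to the
    long-term model, and finally to a neutral default score."""
    score = short_term(story)
    if score is not None:
        return score
    score = long_term(story)
    return default if score is None else score
```

Keeping the models behind a common callable interface lets each be updated or replaced independently.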

Learning User Models for News Access (cont.)
- The purpose of the short-term model:
  - First, it should contain information about recently read events, so that stories belonging to the same thread can be identified.
  - Second, it should allow identification of stories that the user already knows.
- The k-nearest-neighbor algorithm (kNN) is used to achieve the desired functionality:
  - Convert news stories to tf-idf (term-frequency/inverse-document-frequency) vectors.
  - Use the cosine similarity measure to quantify the similarity of two vectors.
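
The two steps above can be sketched as follows. The tf-idf smoothing, the thresholds (`min_sim`, `known_sim`), the value of k, and the similarity-weighted average are all assumptions made for illustration; the slides name only tf-idf vectors, cosine similarity, and kNN.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse tf-idf vector (term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    return [
        {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0)
         for t, c in Counter(toks).items()}
        for toks in tokenized
    ]


def cosine(u, v):
    """Cosine similarity of two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def short_term_score(new_story, rated_stories, k=3, min_sim=0.2, known_sim=0.9):
    """Classify a new story against recently rated stories.

    rated_stories: list of (text, rating) pairs, rating in [0, 1].
    Returns ("no_neighbors", None), ("known", None), or ("scored", value).
    """
    texts = [text for text, _ in rated_stories] + [new_story]
    vectors = tfidf_vectors(texts)
    query = vectors[-1]
    sims = [(cosine(query, vec), rating)
            for vec, (_, rating) in zip(vectors, rated_stories)]
    neighbors = sorted((s for s in sims if s[0] >= min_sim), reverse=True)[:k]
    if not neighbors:
        return ("no_neighbors", None)
    if neighbors[0][0] >= known_sim:
        # Too close to something already read: treat as a known story.
        return ("known", None)
    total = sum(sim for sim, _ in neighbors)
    return ("scored", sum(sim * rating for sim, rating in neighbors) / total)
```

A "no_neighbors" result is the natural point at which to hand the story to a longer-term model.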

Learning User Models for News Access (cont.)
- The purpose of the long-term model is to model a user's general preferences.
- The system periodically selects informative words for each news category from a large sample of stories.
- The goal of the feature selection process is to select informative words that recur over a long period of time.
- A naïve Bayesian classifier is used to assess the probability of stories being interesting.
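
A small sketch of such a classifier, assuming Boolean word-presence features (a Bernoulli naïve Bayes model) with Laplace smoothing. The feature words are supplied by the caller because the slides do not describe the selection procedure in detail; the class and method names are hypothetical.

```python
import math
import re


def tokenize(text):
    """Return the set of lowercase alphabetic tokens in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


class NaiveBayesInterest:
    """Bernoulli naive Bayes over a fixed list of feature words."""

    def __init__(self, feature_words):
        self.features = list(feature_words)
        self.counts = {label: {w: 0 for w in self.features}
                       for label in (True, False)}
        self.totals = {True: 0, False: 0}

    def update(self, text, interesting):
        """Record one rated story (interesting is True or False)."""
        present = tokenize(text)
        self.totals[interesting] += 1
        for word in self.features:
            if word in present:
                self.counts[interesting][word] += 1

    def prob_interesting(self, text):
        """Return P(interesting | feature words present or absent)."""
        present = tokenize(text)
        n = self.totals[True] + self.totals[False]
        log_post = {}
        for label in (True, False):
            # Laplace-smoothed class prior and per-word likelihoods.
            lp = math.log((self.totals[label] + 1) / (n + 2))
            for word in self.features:
                p = (self.counts[label][word] + 1) / (self.totals[label] + 2)
                lp += math.log(p if word in present else 1.0 - p)
            log_post[label] = lp
        # Normalize in log space for numerical stability.
        top = max(log_post.values())
        weights = {label: math.exp(lp - top) for label, lp in log_post.items()}
        return weights[True] / (weights[True] + weights[False])
```

Because every rated story shifts the counts only slightly, a single accidental click has limited effect, which fits the brittleness concern raised earlier in the deck.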

Learning User Models for News Access (cont.)

Evaluation
- The authors summarize the results of two studies that compare personalized information access to static access.
- First, the "alternating sessions" experiment quantifies the difference between static and adaptive information access.
  - Half of the users received news ordered by the system's user modeling approach.
  - The other half received news in static order from the source.

Evaluation (cont.)
- The average display rank of selected stories was 6.7 in the static mode and 4.2 in the adaptive mode (based on 50 users who selected 340 stories out of 1882 headlines).
- Analysis of the distribution of selected stories:
  - In the static mode, 68.7% of the selected stories were on the top two headline screens.
  - In the adaptive mode, 86.7% were on the top two.
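
Both measures are straightforward to compute from a log of the display ranks at which stories were selected. The sketch below assumes 1-based ranks and a fixed number of headlines per screen; the slides do not state the screen size, so `headlines_per_screen` is a hypothetical parameter.

```python
def average_rank(selected_ranks):
    """Mean display rank (1 = top of the list) of the selected stories."""
    return sum(selected_ranks) / len(selected_ranks)


def share_on_top_screens(selected_ranks, headlines_per_screen, screens=2):
    """Fraction of selected stories shown on the first `screens` screens."""
    cutoff = headlines_per_screen * screens
    on_top = sum(rank <= cutoff for rank in selected_ranks)
    return on_top / len(selected_ranks)
```

A lower average rank and a higher top-screen share both indicate that interesting stories were displayed closer to the top.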

Evaluation (cont.)

- Second, the "alternating stories" experiment displays stories selected by both the adaptive and the static modes on the same screen.
- Advantages:
  - The system still adapts to the user's interests.
  - It allows a direct comparison between the two selection strategies.

Evaluation (cont.)
- The difference was not as pronounced as in the "alternating sessions" experiment.
- The average display rank of selected stories was 5.8 in the static mode and 5.27 in the adaptive mode.
- Analysis of the distribution of selected stories:
  - In the static mode: 75.57%
  - In the adaptive mode: 80.44%
- Users are more likely to select adaptive stories (19.02%) than static ones (13.26%), which amounts to a 43.44% increase in selected content.
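
The 43.44% figure is the relative increase of the adaptive selection rate over the static one, which can be checked directly from the two percentages on the slide:

```latex
\[
\frac{19.02 - 13.26}{13.26} = \frac{5.76}{13.26} \approx 0.4344 = 43.44\%
\]
```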

Evaluation (cont.)

- In summary, the "alternating sessions" and "alternating stories" experiments show that adaptive information access performs better than static access.
- The "alternating sessions" experiment showed that adaptive ordering helps shift interesting stories toward the beginning of personalized lists.
- The "alternating stories" experiment showed that the system can order content so that its top-ranked items have a significantly higher chance of being selected than statically ranked ones.

Recent Trends and Systems
- Podcasting
  - Online audio distribution of news content.
  - Collaborative filtering techniques are applicable to podcast recommendation.
- Personalization and the Blogosphere
  - The Blogosphere refers to the set of all weblogs.
  - Some systems support personalized blog access, such as Findory.com and NewsGator.com.
- News Zeitgeist
  - Zeitgeist is a German word that means "the spirit (Geist) of the time (Zeit)".
  - The goal is to automatically identify the most popular topics of the current Blogosphere.

Conclusions & References
- We need new technology to help leverage the full potential of web-based news distribution.

[1] Billsus, D. (2005). Adaptive News Access
[2] Billsus, D., & Pazzani, M. (2000). User Modeling for Adaptive News Access. User Modeling and User-Adapted Interaction, 10(2/3):
[3] Chiu, B., & Webb, G. (1998). Using decision trees for agent modeling: improving prediction performance. User Modeling and User-Adapted Interaction, 8,

Questions or Comments?