Selected New Training Documents to Update User Profile Abdulmohsen Algarni and Yuefeng Li and Yue Xu CIKM 2010 Hao-Chin Chang Department of Computer Science.


1 Selected New Training Documents to Update User Profile
Abdulmohsen Algarni, Yuefeng Li, and Yue Xu
CIKM 2010
Hao-Chin Chang
Department of Computer Science & Information Engineering, National Taiwan Normal University
2011/06/07

2 Outline
Introduction
Pattern-based model
Adaptive information filtering
Experiment
Conclusion

3 Introduction
A filtering task indicates to the user which documents might be of interest to them
– Deciding which ones are really relevant is left entirely to the user
The aim of an Information Filtering (IF) model is to re-rank the incoming set of documents based on a profile of the user's topic
Major studies in this area fall into two main groups
– The first deals with extracting knowledge from user feedback in order to build the user's profile
– The second deals with how effectively and efficiently a user profile can be updated with new feedback, in order to follow changes in the user's interests and improve the quality of the filtering system

4 Introduction
Relevance Feature Discovery (RFD)
High-level features (patterns) and low-level features (terms) are extracted from the initial training documents
The high-level features include both positive and negative patterns
The low-level term weights are evaluated according to both their specificity and their distributions in the high-level features

5 Introduction
To deal with adaptive issues in an IF model, there are two main areas of focus
– The first involves updating the user's profile with new information to follow changes in the user's interests
– The second involves updating the user profile to solve nonmonotonic problems
  • The training documents are about “Agent”
  • An IF system may return information objects such as “Intelligent Agent”, “Property Agent”, “Software Agent”
  • Previous matching decisions (e.g. considering “Property Agent” as relevant) may conflict with the user's actual information need (e.g. the user is only interested in “Software Agent”, so “Property Agent” is non-relevant)
How to solve nonmonotonic problems
– The first issue is how to select documents that contain new knowledge the system does not yet have
– The second issue is how to evaluate and update the knowledge base with the new knowledge in an efficient way

6 Introduction
Adaptive Relevance Features Discovery (ARFD)
First is the ability of the IF system to extract different knowledge for different users on different topics of interest
Second is the ability to update and review the weights of features in the hypothesis space model when new feedback is received

7 Pattern-based model
We use a pattern-based model to extract features from relevance feedback
This is different from the usual definition, where a pattern consists of distinct terms and duplicate terms are removed
$coverset(\{t_3, t_4, t_6\}, d) = \{dp_2, dp_3, dp_4\}$
$sup_a(\{t_3, t_4, t_6\}, d) = 3$
$sup_r(\{t_3, t_4, t_6\}, d) = 3/6 = 0.5$
Closed patterns: , ,
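The covering set and the two support values for the pattern {t3, t4, t6} can be checked with a short sketch. The six paragraphs below are hypothetical: the slide does not list them, and they are chosen only so that the counts agree with the slide's numbers. The function names are illustrative as well.

```python
from typing import Dict, Set

# Hypothetical document d with six paragraphs (dp1..dp6), each reduced to
# its set of terms. These contents are an assumption made for illustration.
paragraphs: Dict[str, Set[str]] = {
    "dp1": {"t1", "t2"},
    "dp2": {"t3", "t4", "t6"},
    "dp3": {"t3", "t4", "t5", "t6"},
    "dp4": {"t3", "t4", "t5", "t6"},
    "dp5": {"t1", "t2", "t6", "t7"},
    "dp6": {"t1", "t2", "t6", "t7"},
}


def coverset(pattern: Set[str], doc: Dict[str, Set[str]]) -> Set[str]:
    """Return the names of all paragraphs that contain every term of the pattern."""
    return {name for name, terms in doc.items() if pattern <= terms}


def sup_a(pattern: Set[str], doc: Dict[str, Set[str]]) -> int:
    """Absolute support: the number of paragraphs covering the pattern."""
    return len(coverset(pattern, doc))


def sup_r(pattern: Set[str], doc: Dict[str, Set[str]]) -> float:
    """Relative support: the fraction of paragraphs covering the pattern."""
    return sup_a(pattern, doc) / len(doc)


pattern = {"t3", "t4", "t6"}
cover = coverset(pattern, paragraphs)
absolute = sup_a(pattern, paragraphs)
relative = sup_r(pattern, paragraphs)
```

With these assumed paragraphs, `cover` is {dp2, dp3, dp4}, `absolute` is 3, and `relative` is 3/6 = 0.5, matching the values on the slide.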

8 Pattern-based model
For a given term t, its weight is derived from the patterns discovered in the positive text documents
The specificity of a given term t is evaluated over the training set $D = D^+ \cup D^-$
Finally, the initial weights of the terms are revised

9 Adaptive information filtering
Document Selection
– New feedback can be divided into two main categories
  • The first is documents that explain more about what the user needs on the same topic
  • The second is documents that cover a new area or topic, which indicates that the user has changed topic of interest; this case is out of our scope
The system uses the following ranking function
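The ranking formula itself is not reproduced in this transcript. As a rough illustration only, the sketch below assumes one simple form: a document's score is the sum of the weights of the profile terms it contains, with negative weights standing in for terms drawn from negative patterns. The scoring rule, the weights, and all names here are assumptions, not the authors' function.

```python
from typing import Dict, List, Set, Tuple


def rank_score(doc_terms: Set[str], weights: Dict[str, float]) -> float:
    """Assumed score: sum the weights of the profile terms present in the document."""
    return sum(w for term, w in weights.items() if term in doc_terms)


def rank_documents(
    docs: Dict[str, Set[str]], weights: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Return (doc_id, score) pairs sorted from highest to lowest score."""
    scored = [(doc_id, rank_score(terms, weights)) for doc_id, terms in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical profile for a user interested in "Software Agent".
weights = {"software": 2.0, "agent": 1.0, "property": -1.5}

# Hypothetical incoming documents, each reduced to a set of terms.
docs = {
    "d1": {"software", "agent"},
    "d2": {"property", "agent"},
    "d3": {"intelligent", "agent"},
}

ranking = rank_documents(docs, weights)
order = [doc_id for doc_id, _ in ranking]
```

Under these assumed weights the order is d1, d3, d2: the "Property Agent" document drops to the bottom because its negative term outweighs the shared term "agent".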

10 Adaptive information filtering
Knowledge Extraction and Merging in the Adaptive RFD model (ARFD)
(1) Mine features and functions from the initial (or base) training set D_b
(2) SNTSelect selects target documents D_s from a new training set D_n in order to remove redundant documents
(3) Mine features and functions from the target documents D_s
(4) Merge the features and functions discovered from both the initial training set and the selected target documents
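The four steps above can be laid out as a small pipeline. This is a structural sketch only: every function body is a placeholder assumption (document-frequency weights for mining, an "unseen term" test for selection, weight addition for merging) and does not reproduce the slide's SPMining, SNTSelect, or NRevision procedures.

```python
from collections import Counter
from typing import Dict, List, Set

Doc = Set[str]
Features = Dict[str, float]


def mine_features(docs: List[Doc]) -> Features:
    """Placeholder miner: weight a term by the fraction of documents containing it."""
    if not docs:
        return {}
    counts = Counter(term for doc in docs for term in doc)
    return {term: count / len(docs) for term, count in counts.items()}


def select_targets(new_docs: List[Doc], base: Features) -> List[Doc]:
    """Placeholder selection: keep documents with at least one term unseen in the base features."""
    return [doc for doc in new_docs if any(term not in base for term in doc)]


def merge_features(base: Features, new: Features) -> Features:
    """Placeholder merge: add the new weights onto the base weights."""
    merged = dict(base)
    for term, weight in new.items():
        merged[term] = merged.get(term, 0.0) + weight
    return merged


# Hypothetical base training set D_b and new training set D_n.
D_b = [{"software", "agent"}, {"agent", "system"}]
D_n = [{"software", "agent"}, {"mobile", "agent"}]

base = mine_features(D_b)            # step (1)
D_s = select_targets(D_n, base)      # step (2): drops the redundant first document
new = mine_features(D_s)             # step (3)
profile = merge_features(base, new)  # step (4)
```

Keeping selection as its own step means only the documents in D_s are mined again, which is the efficiency argument behind removing redundant documents before the second mining pass.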

11 Experiment

12 Experiment

13 Conclusion
We proposed an adaptive information filtering system called adaptive relevance features discovery
The main aim of this method is to efficiently revise and update the weights of extracted features in the vector space using new training documents, in order to solve the nonmonotonic problem
The combination of the knowledge will be tested to ensure that it helps to solve the nonmonotonic problem

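14 P-value
By definition, the P-value is the smallest significance level at which the null hypothesis H0 can be rejected, given the sample data at hand
The significance level is the upper bound on the Type I error probability that we tolerate when performing a test. The smaller the significance level, therefore, the smaller the rejection region
So if H0 can be rejected at a particular significance level based on the current data, the significance level can be lowered; but if it is lowered too far, the current data point may be pushed out of the rejection region, and H0 can no longer be rejected
The P-value is the significance level relaxed just enough to reject H0 and then tightened as far as possible, to the point where H0 can barely still be rejected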
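The "smallest significance level that still rejects H0" reading can be made concrete with a minimal sketch. It assumes a one-sided (upper-tail) z-test with a standard normal test statistic; the slide does not say which test was used, so the test and the statistic value below are illustrative assumptions.

```python
from statistics import NormalDist


def p_value_upper_tail(z: float) -> float:
    """P-value of an upper-tail z-test: P(Z >= z) under the standard normal."""
    return 1.0 - NormalDist().cdf(z)


def rejects_h0(z: float, alpha: float) -> bool:
    """H0 is rejected at level alpha exactly when the P-value does not exceed alpha."""
    return p_value_upper_tail(z) <= alpha


z = 2.0                          # hypothetical observed test statistic
p = p_value_upper_tail(z)        # roughly 0.023

reject_at_5_percent = rejects_h0(z, 0.05)   # alpha above p: H0 is rejected
reject_at_1_percent = rejects_h0(z, 0.01)   # alpha below p: H0 is not rejected
reject_at_p = rejects_h0(z, p)              # alpha equal to p: still rejected
```

Lowering alpha from 0.05 keeps rejecting H0 until alpha falls below `p`, which is why the P-value is the smallest level at which the observed data still rejects H0.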

15 SPMining

16 HLFmining

17 NRevision

18 SNTselect

