Presentation is loading. Please wait.

Presentation is loading. Please wait.

Deep Learning Powered In- Session Contextual Ranking using Clickthrough Data Xiujun Li 1, Chenlei Guo 2, Wei Chu 2, Ye-Yi Wang 2, Jude Shavlik 1 1 University.

Similar presentations


Presentation on theme: "Deep Learning Powered In- Session Contextual Ranking using Clickthrough Data Xiujun Li 1, Chenlei Guo 2, Wei Chu 2, Ye-Yi Wang 2, Jude Shavlik 1 1 University."— Presentation transcript:

1 Deep Learning Powered In- Session Contextual Ranking using Clickthrough Data Xiujun Li 1, Chenlei Guo 2, Wei Chu 2, Ye-Yi Wang 2, Jude Shavlik 1 1 University of Wisconsin – Madison 2 Microsoft Bing, Redmond, Washington, U.S.A. 12/12/2014

2 Outline Project Goal and Motivation  Background & Goal  Problem Definition Methodology  Feature Generation  Semantic Features  Click-Based Features Experimental Results  Offline Evaluation Future Work

3 Goal Incorporate some expressive signals (e.g. topic, domain) from in- session query history into the ranking, to re-rank the results

4 Example Query: the dangerous days of daniel x Altrelda wikithe dangerous days of daniel x Altrelda wiki URLOriginal RankControlRankerScore http://en.wikipedia.org/wiki/The_Dangerous_Days_of_Daniel_X110.922 http://fanon.wikia.com/wiki/The_Dangerous_Days_of_Daniel_X29.2745 http://www.amazon.com/The-Dangerous-Days-Daniel-X/dp/031611970937.444 http://danielx.wikia.com/wiki/The_Dangerous_Days_of_Daniel_X_(novel)46.7845 http://danielx.wikia.com/wiki/Daniel_X56.4136 http://www.goodreads.com/series/49946-daniel-x65.5571 http://en.wikipedia.org/wiki/Daniel_X:_Watch_the_Skies75.2691 http://www.jamespatterson.com/books_danielX.php84.6821 Example 1. Current Ranking But this Wikipedia url is an UnSat page clicked by the user in the same session? Query 2: the dangerous days of daniel x This is the 6 th query in the same session.

5 Approach Correlate users clicks with their satisfaction Propose a set of fine-grained semantic features to explore the relations between in-session history queries, clicked URLs and current query, URLs, e.g. o The similarity between previous queries and current URL o The similarity of previous sat/unsat clicked URLs of last query to current URL Employ the semantic models (e.g. XCode, DSSM, CDSSM) to measure the similarity Incorporate these features into the ranking, to re-rank the results

6 Problem 1.,…, are the history queries in one session. 2.,…, are the URLs in one impression page for query, it contains Sat and unSat clicks. 3. is the current query. 4.,…, are the URLs in the impression page for query based on the current ranking algorithm. Problem: we will re-rank, …, based on some signals (i.e. topic, domain) derived from the history in- session clickthrough data.

7 Outline Project Goal and Motivation  Background & Goal  Problem Definition Methodology  Feature Generation  Semantic Features  Click-Based Features Experimental Results  Offline Evaluation Future Work

8 Features Engineering – fine-grained features FeaturesLevelDescriptionCategoryFeature Name Semantic Features URL-URL The similarity between the urls of previous queries and urls of current query Sat9 raw features UnSat9 raw features Query-URL The similarity between the previous queries and urls of current query Sat5 raw features UnSat5 raw features Weighted URL-URL The weighted similarity between the urls of previous queries and urls of current query Sat9 raw features UnSat9 raw features Click-Based Features URL Level The domain and history click statistics on the current url of current query History Click 1.clickcount_url_total 2.clickcount_url_sat 3.clickcount_url_unsat Domain Click 1.domain_click_total 2.domain_click_sat 3.domain_click_unsat Impression Level The Sat/UnSat statistics on the in- session history queries Clicked url 1.sat_click_urls 2.unsat_click_urls Clicked count 1.sat_click_count 2.unsat_click_count Last click 1.last_click_dwelltime

9 Semantic Features Three Distance to leverage the similarity for semantic features 1.dis(, ): The distance between query and, say. 2.dis(, ): The distance between i-th url of query and j-th url of query. 3.dis(, ): The distance between query and j-th url of query.

10 Semantic Features Semantic distance calculation ModelDistanceRangeSemantics XCodeHamming distance[0, 512] 0 is the most similar, 512 is the least similar DSSMCosine distance[-1, 1] -1 is the least similar, 1 is the most similar CDSSMCosine distance[-1, 1] -1 is the least similar, 1 is the most similar

11 XCode Map every word, query, url into an intent space. Measure semantic similarities of words, phrases and entities in an intent space. Maximize Mutual Information between Queries and URLs Trained on all of Bing search logs with 100b samples 512 bits X-Code Word: (0 1 1 1 0 1…) URL: (0 1 1 0 0 0…) Query: (0 1 1 0 0 1…) Clicked URLs: www.teslamotors.com www.wikipedia.orgwww.wikipedia.org … Queries: Tesla car Tesla bio … Maximize Mutual Information between Queries and URLs Trained on all of Bing search logs with 100b samples Maximize Mutual Information of queries and URLs on Bing logs * Borrowed from AIM Project

12 Deep Structured Semantic Model (DSSM) Representation: use deep neural networks to extract abstract semantic representations Learning: maximize the similarity between the query and the doc Scalability: use sub-word unit (e.g., letter-ngram) as raw input so that to handle an unlimited vocabulary (word hashing)

13 DSSM Double Model Training DSSM DSSM provides a way to encode how well contextual information is matched in the semantic level. Map queries/titles/docs … to a common low dimensional semantic space using DNN Leverage contextual information, not depend only on context terms Training DataSource ModelTarget Model Query ModelTitle Model Query ModelBrokenURL Model

14 DSSM Distance  Three Distance Calculation Calculate DSSM Distance between queries Calculate DSSM Distance between URLs Calculate DSSM Distance between query and URL Query Model Title Model Title Model Title Model Query Model Cosine Distance

15 Convolutional DSSM (Preliminary Trial) Difference between CDSSM and DSSM Sliding window: capture the contextual feature for a word. Convolutional layer: project each word within a context window to a local contextual feature vector. Max pooling layer: extract the most salient local features to form a fixed-length global contextual feature vector. CDSSM Training Pre-Training on Cosmos GPU Training.

16 Offline Experiment Self-Join Reducer Log Data Pair Distance Calculation {XCode, DSSM, CDSSM} Feature Data Generation {curQuery, Features} Training Data (1 week) Validation Data (1 day) Test Data (1 week) Input Features FastRank Feature Selection Tree Ensemble ReRanker Evaluator Baseline Ranker Indicator Feature Diagram 1. Feature Data Generation Diagram 2. ReRanking Evaluation

17 Outline Project Goal and Motivation  Background & Goal  Problem Definition Methodology  Feature Generation  Semantic Features  Click-Based Features Experimental Results  Offline Evaluation Future Work

18 Offline Performance Individual Features: XCode, Click-Based, DSSM For Semantic Features, DSSM (Query-Title) is the best. For Click-Based Features, it is also better, and easy to be flighted. More info in the table 1 of the Appendix.table 1 TimeToSuccess: The dwelling time to meet the first SatClick in the session. Session Success Position: defined by the position of the first SatClick in the session. Session Success Position SatClick RateUnSatClick Rate TimeToSuccess (sec) TimeToSuccess From Quickback

19 Offline Performance Features Combination: Semantic Features + Click-Based Features Semantic features and Click-Based features can provide different information to boost the performance. More info in the table 2 in the Appendix.table 2 TimeToSuccess (sec) Session Success Position TimeToSuccess From Quickback SatClick RateUnSatClick Rate

20 Click-Based Features Recap. LevelCategoryFeaturesSemanticsFunction URL Level History Click 1. clickcount_url_total 2.clickcount_url_sat 3.clickcount_url_unsat how many sat/unsat clicks for the current url in the previous queries in this session. Improve TimeToSuccess, SessionSuccessPosition Domain Click 1.domain_click_total 2.domain_click_sat 3.domain_click_unsat how many sat/unsat clicks for the domain of current url in the previous queries in this session Increase the SatClick Rate and decrease UnSatClick Rate Impression Level Clicked url 1.sat_click_urls 2.unsat_click_urls Some statistics from in- session history queries Clicked count 1.sat_click_count 2.unsat_click_count

21 Offline Performance Domain Click v.s. History Click Domain Click: Decrease the UnSat click History Click: Improve TimeToSuccess, SessionSuccessPosition More info in the table 4 in the Appendix.table 4 TimeToSuccess (sec) Session Success Position TimeToSuccess From Quickback UnSatClick Rate SatClick Rate

22 Example Query: the dangerous days of daniel x Altrelda wikithe dangerous days of daniel x Altrelda wiki URLOriginal RankCranker ScoreRank DeltaControlRankerScoreExperimentalRankerPositiondomain_click_totaldomain_click_satdomain_click_unsat http://fanon.wikia.com/wiki/The_Dangerous_Days_of_Daniel_X29.0423+19.27451000 http://en.wikipedia.org/wiki/The_Dangerous_Days_of_Daniel_X17.823710.9222505 http://danielx.wikia.com/wiki/The_Dangerous_Days_of_Daniel_X_(novel)47.0838+16.78453000 http://www.goodreads.com/series/49946-daniel-x66.1145+25.55714000 http://danielx.wikia.com/wiki/Daniel_X55.180906.41366550 http://www.amazon.com/The-Dangerous-Days-Daniel-X/dp/031611970935.0326-37.4447505 http://www.jamespatterson.com/books_danielX.php82.9647+14.68218505 http://en.wikipedia.org/wiki/Daniel_X:_Watch_the_Skies76.09865.26915000 Example 1. Five unSat clicks on Wikipedia in session log This is the 6 th query in the same session. In the same session, 2 nd query (“ the dangerous days of daniel x ”), there are 5 unsat clicks for Wikipedia page.

23 Appendix Evaluation:  Table 1. Individual Features  Table 2. Features Combination  Table 3. Domain Click v.s. History Click Features

24 Offline Performance FeaturesTimeToSuccess (sec) Session Success Position TimeToSuccess From Quickback (sec) SatClick Rate UnSatClick Rate Coverage All XCode-0.0159 [-0.0267%]​​ -0.0011 [-0.0483%] ​-0.0053 [-0.0194%] 0.0014 [0.1626%] 0.0004 [0.2923%] 0.3288 All Click- Based -0.0203 [-0.0343%]​ -0.0018 [-0.0758%]​ ​​-0.0038 [-0.0141%] ​0.0019 [0.2178%] ​0.0005 [0.3932%] 0.3288 DSSM (Query- Title Model) -0.0282 [-0.0476%]​​ -0.0026 [-0.1079%]​ ​-0.0067 [-0.0251%] 0.0016 [0.1897%] 0.0004 [0.3458%]​ 0.3193 DSSM (Query- BrokenURL Model) ​​-0.0194 [-0.0328%] ​-0.0018 [-0.0751%] ​-0.0047 [-0.0176%] 0.0014 [0.1734%] 0.0004 [0.3223%]​ 0.3202 CDSSM (Query-Title) -0.0255 [-0.0430%] -0.0024 [-0.1031%] -0.0051 [-0.0191%] 0.0016 [0.1909%] 0.0004 [0.3690%] 0.3193 Table 1. Individual Features

25 Offline Performance FeaturesTimeToSuccess (sec) Session Success Position TimeToSuccess From Quickback (sec) SatClick Rate UnSatClick Rate Coverage XCode + Click-0.0299 [-0.0504%]​ ​-0.0025 [-0.1049%] -0.0070 [-0.0259%]​ 0.0022 [0.2420%]​ 0.0005 [0.3825%]​ 0.3288 DSSM (Query- Title) + Click ​-0.0366 [-0.0617%]​ ​-0.0035 [-0.1492%]​ ​​-0.0059 [-0.0220%] 0.0022 [0.2625%] 0.0005 [0.4541%]​ 0.3193 DSSM (Query- BrokenURL) + Click ​​-0.0286 [-0.0483%] ​​-0.0029 [-0.1209%] ​-0.0044 [-0.0163%] 0.0021 [0.2478%]​​​ 0.0005 [0.4369%] 0.3202 XCode + Click + DSSM (Query-Title) ​-0.0499 [-0.0841%] -0.0045 [-0.1911%] ​-0.0109 [-0.0403%]​ 0.0025 [0.2821%] 0.0005 [0.4133%] CDSSM(Query -Title) + Click -0.0379 [-0.0639%] -0.0037 [-0.1548%] -0.0060 [-0.0224%] 0.0022 [0.2656%] 0.0005 [0.4567%] 0.3193 Table 2. The feature combination

26 Offline Performance FeaturesTimeToSuccess (sec) Session Success Position TimeToSuccess From Quickback (sec) SatClick Rate UnSatClick Rate CoverageUnSatClick Swap 6 Click- Based -0.3933 [-0.5938%]​ -0.0357 [-1.2221%]​ -0.0806 [-0.2815%]​ 0.0118 [1.3100%] 0.0025 [1.0925%] 0.79% 286 -> 307 (+21) 3 domain Click -0.1971 [-0.2976%] -0.0147 [-0.5044%] -0.0758 [-0.2648%] 0.0077 [0.8529%] 0.0014 [0.6157%] 0.79%249 - > 225 (-24) 3 history Click -0.3637 [-0.5490%]​ -0.0357 [-1.2246%] -0.0487 [-0.1701%] 0.0086 [0.9562%] 0.0030 [1.2915%] 0.79%128 -> 213 (+85) Table 3. Domain Clicks and History Clicks Features


Download ppt "Deep Learning Powered In- Session Contextual Ranking using Clickthrough Data Xiujun Li 1, Chenlei Guo 2, Wei Chu 2, Ye-Yi Wang 2, Jude Shavlik 1 1 University."

Similar presentations


Ads by Google