
1 Learning User Behaviors for Advertisements Click Prediction
Chieh-Jen Wang & Hsin-Hsi Chen
National Taiwan University, Taipei, Taiwan

2 SIGIR 2011 workshop: Internet Advertising
Introduction
* The commercial value of advertisements on the web depends on whether users click on them
* Predicting potential advertisement clicks before the target advertisements are displayed is important for:
  - advertisement recommendation
  - advertisement placement
  - presentation pricing
* Problem specification
  - Given the current search session (q_1, q_2, ..., q_(i-1)), predict whether there is an ad click event when query q_i is submitted.

3 Related Work
* Advertisement click prediction models
  - Feature representation
      text features (Richardson et al., 2007)
      demographic features (Cheng & Cantú-Paz, 2010)
      mouse trajectory features (Guo & Agichtein, 2010)
  - Machine learning algorithms
      logistic regression (Richardson, Dominowska, & Ragno, 2007)
      maximum entropy (Cheng & Cantú-Paz, 2010)
      support vector machines (Broder et al., 2008)
      conditional random fields (Guo & Agichtein, 2010)

4 Related Work
* User search intent
  - navigational, informational, and transactional (Broder, 2002)
  - noncommercial/commercial and navigational/informational (Ashkan et al., 2009)
  - research and purchase (Guo & Agichtein, 2010)
  - receptive and not receptive (Guo & Agichtein, 2010)
      "receptive": an advertisement click is expected in a future search within the current session
      "not receptive": no future advertisement clicks are expected within the current session

5 Overview (figure)

6 Overview (figure)

7 Microsoft AdCenter Logs
* Time: 2007-08-10 to 2007-11-01 (84 days)
* The Microsoft AdCenter logs include:
  - 101 million impressions
  - 7.82 million clicks
  - 40.6 million sessions (5.06 million sessions contain at least one click)
* An impression is a single search results page, described by a set of attributes
* A session is a run of search engine usage in which consecutive searches are at most 10 minutes apart, with a total session length of no more than 8 hours
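The session definition above can be sketched in a few lines. This is a minimal illustration and not the authors' code: the function name `split_sessions`, the use of `datetime` timestamps for one user, and the treatment of both limits as inclusive are assumptions.

```python
from datetime import datetime, timedelta

MAX_GAP = timedelta(minutes=10)     # longest allowed pause between searches
MAX_SESSION = timedelta(hours=8)    # longest allowed total session length

def split_sessions(timestamps):
    """Group one user's impression timestamps into sessions.

    A new session starts when the gap to the previous impression exceeds
    MAX_GAP or when the session would become longer than MAX_SESSION.
    """
    sessions = []
    for ts in sorted(timestamps):
        current = sessions[-1] if sessions else None
        if (current is not None
                and ts - current[-1] <= MAX_GAP
                and ts - current[0] <= MAX_SESSION):
            current.append(ts)
        else:
            sessions.append([ts])
    return sessions

# Hypothetical example: impressions at 0, 5, 14 and 30 minutes past 09:00.
start = datetime(2007, 8, 10, 9, 0, 0)
example = [start + timedelta(minutes=m) for m in (0, 5, 14, 30)]
example_sessions = split_sessions(example)
```

Here the first three impressions fall into one session because each gap is under 10 minutes, while the 16-minute pause before the last impression opens a second session.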

8 Data Purification
* For promotional purposes, some queries are issued and some advertisements are clicked by software robots
* Filter criteria
  - issuing queries more than 7 times in any 10-second interval
  - issuing queries from two distinct places at the same time
  - clicking an advertisement more than once in any 5-second interval
  - duplicated impression IDs
* Data partition
  - Training: sessions in the first 56 days that contain at least one advertisement click
  - Testing: sessions in the last 28 days
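The two rate-based filter criteria can be checked with a sliding window. The sketch below is illustrative only: `exceeds_rate` is a hypothetical helper, timestamps are assumed to be seconds as numbers, and the interval is assumed to be closed (events exactly 10 seconds apart count as being in the same interval).

```python
from collections import deque

def exceeds_rate(timestamps, max_events, window_seconds):
    """Return True if more than max_events fall in any window of window_seconds."""
    window = deque()
    for ts in sorted(timestamps):
        window.append(ts)
        # Drop events that are too old to share a window with the newest one.
        while ts - window[0] > window_seconds:
            window.popleft()
        if len(window) > max_events:
            return True
    return False

# Hypothetical examples (timestamps in seconds).
burst_queries = [0, 1, 2, 3, 4, 5, 6, 7]        # 8 queries within 7 seconds
steady_queries = [0, 2, 4, 6, 8, 10, 12, 14]    # 8 queries, one every 2 seconds
is_query_robot = exceeds_rate(burst_queries, max_events=7, window_seconds=10)
is_steady_robot = exceeds_rate(steady_queries, max_events=7, window_seconds=10)
is_click_robot = exceeds_rate([0, 3], max_events=1, window_seconds=5)
```

The same helper covers both criteria: `max_events=7, window_seconds=10` for queries and `max_events=1, window_seconds=5` for clicks on one advertisement.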

9 Experiment Datasets

|                            | Training | Testing |
|----------------------------|----------|---------|
| # of sessions (clicks)     | 3.12M    | 1.42M   |
| # of sessions (non-clicks) | 0        | 10.61M  |
| # of click impressions     | 3.75M    | 1.73M   |
| # of non-click impressions | 6.92M    | 37.41M  |

10 Overview (figure)

11 Feature Extraction
* Feature representation
  - Every impression q_i (1 ≤ i ≤ n) in session s = (q_1, q_2, ..., q_(i-1), q_i, q_(i+1), ..., q_n) is represented as a feature vector built from:
  - q_i itself (Current Impression Level)
  - the first impression q_1 (First Impression Level)
  - the n-th previous impression q_(i-n) (Previous n Impression Level)
  - all the contextual impressions q_1, q_2, ..., q_(i-1) in s (Contextual Impression Level)
* Labeling
  - click if impression q_i contains at least one advertisement click, otherwise non-click
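The four levels and the labeling rule can be illustrated as follows. This is only a sketch: the dictionary layout of an impression, the field name `ad_clicks`, and the function `impression_views` are hypothetical and not taken from the slides.

```python
def impression_views(session, i, n=1):
    """Return the four views of impression q_i (1-based index i) and its label.

    session: list of impression dicts, each with a hypothetical "ad_clicks" count.
    n: how far back the Previous n Impression Level looks.
    """
    current = session[i - 1]                               # Current Impression Level
    first = session[0]                                     # First Impression Level
    previous = session[i - 1 - n] if i - n >= 1 else None  # q_(i-n), if it exists
    context = session[:i - 1]                              # q_1 .. q_(i-1)
    label = "click" if current["ad_clicks"] > 0 else "non-click"
    return current, first, previous, context, label

# Hypothetical three-impression session.
example_session = [
    {"query": "digital camera", "ad_clicks": 0},
    {"query": "cheap digital camera", "ad_clicks": 1},
    {"query": "camera reviews", "ad_clicks": 0},
]
views_q3 = impression_views(example_session, 3, n=1)
views_q1 = impression_views(example_session, 1, n=1)
```

For the first impression of a session there is no previous impression and the context is empty, so downstream feature extraction has to handle `None` and an empty list.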

12 Feature Extraction from Current Impression Level
* These features aim to capture query information, users' intent, and the similarity between the current query and the previous one
* QC (query category)
  - 14 categories (excluding "Regional" and "World") on the 2nd level of the Open Directory Project (ODP) ontology represent query categories
* QIntent (query intent)
  - 4,020 intent clusters are learned from the MSN Search Query Log excerpt (Wang et al., 2010)
  - QIntent is specified by the distribution of the top 100 similar intent clusters

| Feature  | Description |
|----------|-------------|
| QP       | Position of q_i in s, i.e., i |
| #QT      | Number of query terms in q_i |
| QT       | Query terms in q_i |
| IsURLQ   | 1 if the query in q_i is in the form of a URL, and 0 otherwise |
| QDMA     | DMA-level user location ID of q_i |
| Qtype    | Type of query in q_i: informational, navigational, or transactional |
| QC       | ODP categories of the query in q_i |
| QIntent  | Intent type of the query in q_i |
| QSim     | Cosine similarity between the query terms in q_i and q_(i-1) |
| QOverlap | Overlap between the query terms in q_i and q_(i-1) |
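QSim and QOverlap can be sketched as below. The slides do not say how terms are weighted or how overlap is counted, so raw term frequencies for the cosine and the number of shared distinct terms for the overlap are assumptions, as are the function names.

```python
import math
from collections import Counter

def cosine_similarity(terms_a, terms_b):
    """Cosine similarity between two queries, using raw term-frequency vectors."""
    a, b = Counter(terms_a), Counter(terms_b)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty query is treated as dissimilar to everything
    return dot / (norm_a * norm_b)

def term_overlap(terms_a, terms_b):
    """Number of distinct terms shared by two queries (assumed overlap measure)."""
    return len(set(terms_a) & set(terms_b))

# Hypothetical consecutive queries q_(i-1) and q_i.
previous_query = "cheap flights".split()
current_query = "cheap hotels".split()
qsim = cosine_similarity(current_query, previous_query)
qoverlap = term_overlap(current_query, previous_query)
```

With one shared term out of two in each query, the cosine similarity is 0.5 and the overlap is 1.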

13 Feature Extraction from First Impression Level
* These features aim to capture the initial search goal of a session

| Feature  | Description |
|----------|-------------|
| FQ       | Query terms in q_1 |
| TimeToFQ | Time duration (in seconds) between q_1 and q_i |

14 Feature Extraction from Previous n Impression Level
* These features aim to capture the advertisement click information of the n-th previous impression
* In our experiments, n is set to 1 and 2

| Feature      | Description |
|--------------|-------------|
| PNP_n        | Page number of the result page of q_(i-n) |
| #AdP_n       | Number of advertisements displayed in the result page of q_(i-n) |
| IsClickP_n   | 1 if there is at least one advertisement click in q_(i-n), and 0 otherwise |
| T#ClickP_n   | Total number of clicked advertisements in q_(i-n) |
| ClickRP_n    | Ranks of the clicked advertisements in the result page of q_(i-n) |
| ClickDNP_n   | URL domain names of the clicked advertisements in the result page of q_(i-n) |
| AdCP_n       | ODP categories of the clicked advertisements in q_(i-n) |
| AdIntentP_n  | Intent types of the clicked advertisements in q_(i-n) |
| TimeToP_n    | Time duration (in seconds) between q_(i-n) and q_i |
| #Adoverlap   | Overlap of displayed advertisements between q_(i-n) and q_(i-(n+1)) |

15 Feature Extraction from Contextual Impression Level

| Feature     | Description |
|-------------|-------------|
| T#Ad        | Total advertisements reported in q_1, q_2, ..., q_(i-1) |
| T#Click     | Total number of clicked advertisements in q_1, q_2, ..., q_(i-1) |
| CTR         | Advertisement click-through ratio before q_i = total clicked ads divided by total ads before q_i |
| T#Ad@m      | Total number of advertisement reports at rank m of q_1, q_2, ..., q_(i-1), where m = 1, 2, ..., 8 |
| T#Click@m   | Total number of advertisement clicks at each rank of q_1, q_2, ..., q_(i-1) |
| CTR@m       | Click-through ratio for each rank at q_1, q_2, ..., q_(i-1) |
| T#ConClick  | Total number of advertisements clicked in q_1, q_2, ..., q_(i-1) |
| ConClick    | i-j, where q_j, q_(j+1), ..., q_(i-1) contain clicked advertisements continuously |
| NearClick   | i-j, where q_j is the nearest impression containing clicked advertisements |
| CTQC        | ODP categories of queries in q_1, q_2, ..., q_(i-1) |
| CTQIntent   | Intent types of queries in q_1, q_2, ..., q_(i-1) |
| CTAdC       | ODP categories of clicked advertisements in q_1, q_2, ..., q_(i-1) |
| CTAdIntent  | Intent types of clicked advertisements in q_1, q_2, ..., q_(i-1) |
| CTIntentDis | Intents of clicked advertisements in q_1, q_2, ..., q_(i-1) after disambiguation |
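A few of the contextual features (T#Ad, T#Click, CTR, ConClick, NearClick) can be computed from the click history as sketched here. The input layout, a list of `(ads_shown, ads_clicked)` pairs for q_1 ... q_(i-1), and the choice of 0 when no earlier impression has a click are assumptions made for illustration.

```python
def contextual_features(history):
    """Compute count-based contextual features for impression q_i.

    history: list of (ads_shown, ads_clicked) pairs for q_1 .. q_(i-1),
             in session order (a hypothetical layout).
    """
    total_ads = sum(shown for shown, _ in history)
    total_clicks = sum(clicked for _, clicked in history)
    ctr = total_clicks / total_ads if total_ads else 0.0

    # ConClick: length of the unbroken run of clicked impressions ending at q_(i-1).
    con_click = 0
    for _, clicked in reversed(history):
        if clicked == 0:
            break
        con_click += 1

    # NearClick: distance i-j to the most recent impression q_j with a click.
    near_click = 0  # assumed value when no earlier impression has a click
    for distance, (_, clicked) in enumerate(reversed(history), start=1):
        if clicked > 0:
            near_click = distance
            break

    return {"T#Ad": total_ads, "T#Click": total_clicks, "CTR": ctr,
            "ConClick": con_click, "NearClick": near_click}

# Hypothetical history of three impressions before q_4.
example_features = contextual_features([(8, 0), (8, 1), (4, 2)])
```

In this example 3 of 20 displayed advertisements were clicked, the last two impressions both contain clicks (ConClick = 2), and the nearest clicked impression is q_3 (NearClick = 1).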

16 Feature Extraction from Contextual Impression Level
* These features represent a sequence of users' behaviors
* The weight of the intent types of submitted queries (CTQIntent) and clicked advertisements (CTAdIntent) in the access history is defined as: (equation not included in the transcript)
  - P_m is the probability of the type m intent
  - w_j denotes a query or a clicked advertisement in q_j
* Weight of ODP categories (CTQC & CTAdC)
  - Jelinek-Mercer smoothing
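The slide names Jelinek-Mercer smoothing for the ODP category weights but the transcript does not preserve the formula. The sketch below shows only the textbook form of the technique, a linear interpolation between the observed distribution and a background distribution. The parameter `lam`, the background distribution, and the function name are assumptions and may differ from what the authors used.

```python
def jelinek_mercer(counts, background, lam=0.5):
    """Textbook Jelinek-Mercer smoothing.

    p(c) = (1 - lam) * p_observed(c) + lam * p_background(c)

    counts:     observed category counts in the session history
    background: background probability of each category (assumed to sum to 1)
    lam:        interpolation weight in [0, 1] (an assumed parameter)
    """
    total = sum(counts.values())
    smoothed = {}
    for category, p_background in background.items():
        p_observed = counts.get(category, 0) / total if total else 0.0
        smoothed[category] = (1 - lam) * p_observed + lam * p_background
    return smoothed

# Hypothetical category counts from q_1 .. q_(i-1) and a background distribution.
history_counts = {"Shopping": 3, "Sports": 1}
background_dist = {"Shopping": 0.5, "Sports": 0.25, "News": 0.25}
category_weights = jelinek_mercer(history_counts, background_dist, lam=0.2)
```

The effect is that a category never seen in the session, such as "News" here, still receives a small non-zero weight from the background distribution.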

17 Overview (figure)

18 Click Prediction Model
* Four learning algorithms
  - Conditional Random Fields (CRF)
  - Support Vector Machine (SVM)
      kernel functions: RBF and linear
      parameter optimization: grid search for c and g
  - Decision Tree (DT)
      C4.5 tree
  - Back-Propagation Neural Network (BPN)
      Hidden layer = 2
      Learning rate = 0.8
      Momentum = 0.2
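The grid search over the SVM parameters c and g can be sketched generically. The slides do not give the grid or the scoring procedure, so the exponentially spaced ranges and the `toy_score` function (a stand-in for something like cross-validation accuracy) are hypothetical.

```python
import itertools
import math

def grid_search(score, log2_c_values, log2_g_values):
    """Try every (c, g) pair on a log2 grid and return (best_score, c, g)."""
    best = None
    for log2_c, log2_g in itertools.product(log2_c_values, log2_g_values):
        c, g = 2.0 ** log2_c, 2.0 ** log2_g
        s = score(c, g)
        if best is None or s > best[0]:
            best = (s, c, g)
    return best

def toy_score(c, g):
    """Hypothetical score surface that peaks at c = 2^3 and g = 2^-5.

    In practice this would train an SVM with (c, g) and return a
    validation score.
    """
    return -((math.log2(c) - 3) ** 2 + (math.log2(g) + 5) ** 2)

# Assumed grid: c in 2^-5 .. 2^15 and g in 2^-15 .. 2^3, in steps of 2^2.
best_score, best_c, best_g = grid_search(toy_score, range(-5, 16, 2), range(-15, 4, 2))
```

An exponential grid is a common choice because SVM accuracy usually varies with the order of magnitude of c and g rather than with small absolute changes.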

19 Feature Selection Algorithm
* Random Subspace Method (RS)
  - an ensemble classifier that consists of several classifiers
  - prediction is made by a majority vote of the classifiers
* F-Score (FS) & Information Gain (IG)
  - greedy inclusion algorithm
  - retain a number of the best terms or features for use by the classifier
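The random subspace method can be illustrated with a small ensemble in which each member is trained on a random subset of the features and the final prediction is a majority vote. The nearest-centroid base learner is a stand-in chosen to keep the sketch self-contained (the experiments use CRF and SVM), and all names and parameter values here are hypothetical.

```python
import random
from collections import Counter

def train_centroid(rows, labels, features):
    """Nearest-centroid base learner restricted to the given feature indices."""
    sums, counts = {}, Counter(labels)
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * len(features))
        for k, f in enumerate(features):
            acc[k] += row[f]
    centroids = {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}

    def predict(row):
        def squared_distance(lab):
            return sum((row[f] - c) ** 2 for f, c in zip(features, centroids[lab]))
        return min(centroids, key=squared_distance)
    return predict

def random_subspace(rows, labels, n_models, subspace_size, seed=0):
    """Train n_models base learners, each on a random feature subset."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    models = [
        train_centroid(rows, labels, rng.sample(range(n_features), subspace_size))
        for _ in range(n_models)
    ]

    def predict(row):
        votes = Counter(model(row) for model in models)
        return votes.most_common(1)[0][0]  # majority vote
    return predict

# Hypothetical toy data with four numeric features per impression.
toy_rows = [[0, 0, 0, 0], [1, 1, 0, 1], [9, 9, 9, 9], [10, 8, 9, 10]]
toy_labels = ["non-click", "non-click", "click", "click"]
ensemble = random_subspace(toy_rows, toy_labels, n_models=5, subspace_size=2)
```

Because every member sees a different slice of the feature space, the ensemble is less sensitive to any single noisy feature than one classifier trained on all features.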

20 SIGIR 2011 workshop: Internet Advertising Overview

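21 Performance of Advertisements Click Prediction
All features (Prec, Rec, and F1 are reported separately for the non-click type and the click type):

| Model        | Acc    | Non-click Prec | Non-click Rec | Non-click F1 | Click Prec | Click Rec | Click F1 |
|--------------|--------|----------------|---------------|--------------|------------|-----------|----------|
| Guess        | 0.9559 |                | 1.0000        | 0.9780       | 0          | 0         | 0        |
| MM           | 0.6917 | 0.9586         | 0.7081        | 0.8334       | 0.0505     | 0.3369    | 0.1937   |
| CRF          | 0.8469 | 0.9798         | 0.8575        | 0.9186       | 0.1663     | 0.6167    | 0.3915   |
| DT           | 0.8706 | 0.9666         | 0.8955        | 0.9311       | 0.1270     | 0.3296    | 0.2283   |
| BPN          | 0.8750 | 0.9672         | 0.8998        | 0.9335       | 0.1344     | 0.3375    | 0.2359   |
| SVM (RBF)    | 0.8809 | 0.9679         | 0.9054        | 0.9366       | 0.1451     | 0.3481    | 0.2466   |
| SVM (Linear) | 0.8781 | 0.9675         | 0.9028        | 0.9351       | 0.1399     | 0.3431    | 0.2415   |

* Metrics
  - accuracy (Acc), precision (Prec), recall (Rec), and F-measure (F1)
* Baselines
  - guessing the majority class (non-click) is one baseline
  - Markov Model (MM), formulated by query transition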
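The metrics can be computed as sketched below. One observation on the reported numbers: each F1 entry in the table equals the arithmetic mean of its Prec and Rec (for the CRF click type, (0.1663 + 0.6167) / 2 = 0.3915), whereas the F-measure is conventionally the harmonic mean, which gives about 0.2620 for the same row. The function names are illustrative.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall for one class from its confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def f1_harmonic(precision, recall):
    """Conventional F1: harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

def arithmetic_mean(precision, recall):
    """Arithmetic mean, which reproduces the F1 column of the table."""
    return (precision + recall) / 2

# CRF, click type, taken from the table.
crf_click_prec, crf_click_rec = 0.1663, 0.6167
crf_click_harmonic = f1_harmonic(crf_click_prec, crf_click_rec)
crf_click_arithmetic = arithmetic_mean(crf_click_prec, crf_click_rec)
```

Either way, the ranking of the models on the click type is unchanged, since CRF has both the highest click precision and the highest click recall in the table.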

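22 Performance of Feature Selection
Feature selection (Prec, Rec, and F1 are reported separately for the non-click type and the click type):

| Model     | Acc    | Non-click Prec | Non-click Rec | Non-click F1 | Click Prec | Click Rec | Click F1 |
|-----------|--------|----------------|---------------|--------------|------------|-----------|----------|
| CRF(ALL)  | 0.8469 | 0.9798         | 0.8575        | 0.9186       | 0.1663     | 0.6167    | 0.3915   |
| CRF(RS15) | 0.8457 | 0.9797         | 0.8563        | 0.9180       | 0.1648     | 0.6145    | 0.3897   |
| CRF(RS25) | 0.8493 | 0.9801         | 0.8598        | 0.9199       | 0.1696     | 0.6210    | 0.3953   |
| CRF(RS35) | 0.8511 | 0.9803         | 0.8615        | 0.9209       | 0.1721     | 0.6242    | 0.3982   |
| CRF(RS45) | 0.8504 | 0.9802         | 0.8609        | 0.9205       | 0.1711     | 0.6230    | 0.3971   |
| CRF(FS)   | 0.8473 | 0.9799         | 0.8579        | 0.9189       | 0.1670     | 0.6175    | 0.3923   |
| CRF(IG)   | 0.8479 | 0.9799         | 0.8585        | 0.9192       | 0.1678     | 0.6186    | 0.3932   |
| SVM(ALL)  | 0.8809 | 0.9679         | 0.9054        | 0.9366       | 0.1451     | 0.3481    | 0.2466   |
| SVM(RS15) | 0.8796 | 0.9677         | 0.9042        | 0.9359       | 0.1426     | 0.3457    | 0.2442   |
| SVM(RS25) | 0.8811 | 0.9679         | 0.9057        | 0.9368       | 0.1456     | 0.3486    | 0.2471   |
| SVM(RS35) | 0.8813 | 0.9679         | 0.9058        | 0.9369       | 0.1459     | 0.3488    | 0.2474   |
| SVM(RS45) | 0.8815 | 0.9679         | 0.9060        | 0.9370       | 0.1463     | 0.3492    | 0.2477   |
| SVM(FS)   | 0.8811 | 0.9679         | 0.9056        | 0.9368       | 0.1455     | 0.3485    | 0.2470   |
| SVM(IG)   | 0.8812 | 0.9679         | 0.9058        | 0.9368       | 0.1458     | 0.3488    | 0.2473   |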

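23 Top-10 Important Features

| Rank | F-Score: Feature | FL | RI     | Information Gain: Feature | FL | RI     |
|------|------------------|----|--------|---------------------------|----|--------|
| 1    | QT               | CI | 1      | QT                        | CI | 1      |
| 2    | CTAdIntent       | CT | 0.7751 | CTIntentDis               | CT | 0.6284 |
| 3    | CTIntentDis      | CT | 0.6498 | CTQIntent                 | CT | 0.5268 |
| 4    | CTQIntent        | CT | 0.5092 | T#ClickP_1                | PI | 0.4128 |
| 5    | FQ               | FI | 0.3557 | CTR                       | CT | 0.2884 |
| 6    | IsClickP_1       | PI | 0.3222 | T#Ad                      | CT | 0.2612 |
| 7    | CTR              | CT | 0.3052 | ConClick                  | CT | 0.2475 |
| 8    | T#ClickP_1       | PI | 0.2943 | CTAdIntent                | CT | 0.2386 |
| 9    | ConClick         | CT | 0.2688 | NearClick                 | CT | 0.2179 |
| 10   | NearClick        | CT | 0.2568 | Qtype                     | CI | 0.2082 |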

24 Conclusion and Future Work
* We explore the effects of various intent-related features on advertisement click prediction
* The CRF model performs significantly better than the two baselines and SVM
* When the random subspace method is used for feature selection, the precision of click prediction increases from 0.1663 to 0.1721
* In the future, we plan to expand our model to consider fine-grained user intent and user interactions
* In addition, we will extend this approach to predict which advertisements will be clicked

25 Thank You
Q & A

