Ariel Fuxman, Panayiotis Tsaparas, Kannan Achan, Rakesh Agrawal (2008) - Akanksha Saxena 1.


1 Ariel Fuxman, Panayiotis Tsaparas, Kannan Achan, Rakesh Agrawal (2008) - Akanksha Saxena 1

2 Motivation
• The main source of income for search engines is web search advertising, which places relevant advertisements together with the search engine results.
• Given a specific keyword, advertisers bid for it, and the winner of the auction has her ads displayed as sponsored links next to the search results.

3 Keyword Generation Problem
• The problem of identifying an appropriate set of keywords for a specific advertiser.
• Also known as keyword research.
• Examples of existing tools: Google’s AdWords Keyword Tool, Overture/Yahoo! Keyword Selector Tool, and Microsoft adCenter Labs’ Keyword Group Detection.

4 Solution – Query-Click Logs
• Query-click logs record the queries that users pose to the search engine and the documents that they click in response.
• Clicks define a strong association between queries and URLs.
• This association is used to find the queries that are related to the interests of the advertisers.

5 Example
• Suppose the owner of the shoes.com online store launches an ad campaign.
• Most of the queries that lead to clicks on shoes.com come from users interested in buying shoes.
• The query-click log maps queries to the clicked documents (URLs).

6 Problem Definition
Input:
• A search engine click log L that consists of triples ⟨q, u, f_qu⟩, where q ∈ Q is a query (Q is the set of all queries), u ∈ U is the URL of a document (U is the set of all URLs), and f_qu is the number of times that users issued query q and clicked on URL u. The click log is treated as a weighted bipartite graph G, known as the click graph: Q and U are the two partitions of the graph, and for every record in the log there is an edge (q, u) ∈ E with weight f_qu.
• A set of concepts C = {c_1, …, c_k}. The concepts represent abstract themes that the advertiser is interested in, which can be either general (e.g. shoes) or specific (e.g. running shoes).
• A seed set S ⊆ U × C of URLs in the click log that are manually assigned to the concepts in C. The seed set consists of pairs ⟨u, c⟩, where u ∈ U is a URL and c ∈ C is the concept label assigned to it.
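The input described on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `build_click_graph`, the toy log entries, and the `.example` host names are hypothetical and do not come from the slides.

```python
from collections import defaultdict

def build_click_graph(log):
    """Aggregate (query, url, count) triples into a weighted bipartite graph.

    Returns two adjacency maps so the graph can be walked in either direction:
    query -> {url: f_qu} and url -> {query: f_qu}.
    """
    q_to_u = defaultdict(dict)
    u_to_q = defaultdict(dict)
    for q, u, f in log:
        # Sum the counts in case the same (query, url) pair appears twice.
        q_to_u[q][u] = q_to_u[q].get(u, 0) + f
        u_to_q[u][q] = u_to_q[u].get(q, 0) + f
    return dict(q_to_u), dict(u_to_q)

# Hypothetical toy click log L of (q, u, f_qu) triples.
log = [
    ("running shoes", "shoes.com", 30),
    ("buy sneakers", "shoes.com", 10),
    ("running shoes", "runnersworld.example", 5),
    ("marathon training", "runnersworld.example", 20),
]
concepts = {"shoes"}                # the concept set C
seeds = {("shoes.com", "shoes")}    # the seed set S, a subset of U x C

q_to_u, u_to_q = build_click_graph(log)
```

Storing both directions is a design choice made here because a walk on a bipartite graph alternates between the query side and the URL side.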

7 Problem Definition cont.
• Output: given the click graph G, the concepts C and the seed set S, the goal of the keyword generation problem is to populate the concepts in C with queries from Q.
• These queries are then offered as keyword suggestions to the advertisers that are interested in the specific concept.

8 A Random Walk Algorithm
• For a query q ∈ Q, compute the affinity of q to a seed node s ∈ S as the probability that a random walk starting from q ends up at node s.
• Similarly, the affinity of q to a concept class c ∈ C is the probability that a random walk starting from q ends up at any seed node of class c.

9 ARW cont.
• l_q (or l_u) denotes the random variable for the concept label of query q (or URL u).
• P(l_q = c) is the probability that a random walk starting from q is absorbed at some node of class c.
• α is the probability of making a transition from any node in the graph to the absorbing node of the null class.
• γ is a threshold below which class probabilities are discarded to increase efficiency.
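The quantities on these two slides can be put together in a minimal sketch of an absorbing random walk. It assumes that transition probabilities are proportional to click counts, that seed URLs are absorbing with probability 1 for their own class, and that a fixed number of sweeps is enough; the function name, the default values of `alpha`, `gamma` and `iters`, and the toy graph are hypothetical, and the slides do not give the exact update rule.

```python
def absorbing_random_walk(q_to_u, u_to_q, seeds, alpha=0.01, gamma=1e-4, iters=50):
    """Estimate P(l = c) for every query and URL in a click graph.

    seeds maps a seed URL to its concept. alpha is the probability of jumping
    to the null-class absorbing node; class probabilities below gamma are
    dropped (assumed pruning rule).
    """
    def propagate(neighbours, p_other):
        total = sum(neighbours.values())
        out = {}
        for node, f in neighbours.items():
            for c, p in p_other.get(node, {}).items():
                # Survive the null-class jump, move to a neighbour, then be
                # absorbed in class c from there.
                out[c] = out.get(c, 0.0) + (1 - alpha) * (f / total) * p
        return {c: p for c, p in out.items() if p >= gamma}

    p_url = {u: ({seeds[u]: 1.0} if u in seeds else {}) for u in u_to_q}
    p_query = {q: {} for q in q_to_u}
    for _ in range(iters):
        p_query = {q: propagate(nb, p_url) for q, nb in q_to_u.items()}
        p_url = {u: ({seeds[u]: 1.0} if u in seeds else propagate(nb, p_query))
                 for u, nb in u_to_q.items()}
    return p_query, p_url

# Hypothetical toy click graph: query -> {url: click count}.
q_to_u = {
    "running shoes": {"shoes.com": 30, "runnersworld.example": 5},
    "buy sneakers": {"shoes.com": 10},
    "marathon training": {"runnersworld.example": 20},
}
u_to_q = {}
for q, urls in q_to_u.items():
    for u, f in urls.items():
        u_to_q.setdefault(u, {})[q] = f

p_query, p_url = absorbing_random_walk(q_to_u, u_to_q, {"shoes.com": "shoes"})
ranked = sorted(p_query, key=lambda q: p_query[q].get("shoes", 0.0), reverse=True)
```

In this toy graph, `ranked` orders the queries by their affinity to the "shoes" concept, so a query whose clicks all land on the seed URL comes first.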

10 Markov Random Fields (MRF)
• An MRF is an undirected graph in which each node is associated with a random variable and the edges model the pairwise relationships between the random variables.
• The Markov assumption is the defining property of an MRF: the value of a random variable is conditionally independent of the rest of the graph, given the values of all its neighbors.

11 Gaussian Markov Random Fields
• Uses a continuous relaxation of the discrete labels: the class labels are real numbers in the [0,1] interval.
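One common way to realize such a continuous relaxation is a harmonic-style update, in which every unlabeled node repeatedly takes the click-weighted average of its neighbors while seed nodes stay clamped. The slide does not spell out the exact model, so the sketch below is an assumption rather than the authors' formulation; the function name, the negative seed, and the toy edges are hypothetical.

```python
def harmonic_labels(edges, clamped, iters=200):
    """Relaxed labels in [0, 1] for a single concept.

    edges maps an undirected pair (a, b) to its weight; clamped maps seed
    nodes to fixed label values. Unlabeled nodes start at 0.5.
    """
    neigh = {}
    for (a, b), w in edges.items():
        neigh.setdefault(a, {})[b] = w
        neigh.setdefault(b, {})[a] = w
    x = {n: clamped.get(n, 0.5) for n in neigh}
    for _ in range(iters):
        for n, nb in neigh.items():
            if n in clamped:
                continue  # seed labels never change
            # A weighted average of values in [0, 1] stays in [0, 1].
            x[n] = sum(w * x[m] for m, w in nb.items()) / sum(nb.values())
    return x

# Hypothetical toy graph: one positive seed URL and one negative seed URL.
edges = {
    ("buy sneakers", "shoes.com"): 10,
    ("running shoes", "shoes.com"): 30,
    ("running shoes", "blog.example"): 5,
    ("used cars", "blog.example"): 5,
    ("used cars", "cars.example"): 20,
}
x = harmonic_labels(edges, {"shoes.com": 1.0, "cars.example": 0.0})
```

A value of `x[q]` close to 1 would then be read as strong membership of query `q` in the concept, and a value close to 0 as weak membership.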

12 Variational Inference and Mean Field Algorithm
• Labels are discrete, unlike in the Gaussian Markov Random Field model.
• The goal of variational inference is to approximate the true, intractable posterior distribution P with a tractable distribution P̂ that has a simpler form.

13 Experiment Result
• 20 categories in total.
• ARW has the highest micro-averaged relevance.
• Snippets has the higher micro-average for 5 out of 20 categories.
• Mean field has the higher average for 2 categories.

14 Conclusion
• An approach to keyword generation that leverages the information available in search engine click logs.
• The approach requires minimal effort on the part of the advertisers.
• Promising experimental results demonstrate that these algorithms can scale to large query logs and produce high-quality results.

15 Thank you !!

