1 0 WikiMania08, July 18, 2008 Alexandria, Egypt Recent Developments at Yahoo! in Search & Mobile, and Future Challenges Research Usama Fayyad Chief Data Officer & Executive VP Yahoo! Inc.

2 Research 1 Overview About Yahoo! and its business Yahoo! Mobile Philosophy OneSearch 2.0 Challenges in Mobile Search Some words about search advertising Examples of Search Evolution at Yahoo! Concrete examples of the changes that are relevant to Social Web Concluding thoughts

3 Research 2 Globally, Internet Users Number Over 1 Billion Source: IDC, December Internet Users in Millions:

4 Research 3 Yahoo! is the #1 Destination on the Web 73% of the U.S. Internet population uses Yahoo! – Over 500 million users per month globally! Global network of content, commerce, media, search and access products 100+ properties including mail, TV, news, shopping, finance, autos, travel, games, movies, health, etc. 25 terabytes of data collected each day… and growing Representing thousands of cataloged consumer behaviors More people visited Yahoo! in the past month than: Use coupons Vote Recycle Exercise regularly Have children living at home Wear sunscreen regularly Sources: Mediamark Research, Spring 2004 and comScore Media Metrix, February Data is used to develop content, consumer, category and campaign insights for our key content partners and large advertisers

5 Research 4 Yahoo! Data – A league of its own… GRAND CHALLENGE PROBLEMS OF DATA PROCESSING TRAVEL, CREDIT CARD PROCESSING, STOCK EXCHANGE, RETAIL, INTERNET Y! Data Challenge Exceeds others by 2 orders of magnitude

6 Research 5 What About Yahoo! Mobile? Fast growing initiative that is one of the company's priorities for the future Great success in distribution –signed deals with 29 carriers, and therefore it's accessible to 600 million subscribers, who are now under contract. –OneSearch is Yahoo!'s mobile search application, launched 13 months ago. Just launched OneSearch 2.0 Marco Boerries, EVP of Mobile at Yahoo!: No one has ever amassed that kind of distribution in that short a period of time.

7 Research 6 Mobile Device Internet Penetration Will Eclipse the PC * U.N. Telecommunications Agency, Sept 07 ** Informa, Nov 07 1 Billion people across the world use the Internet * 3.3 Billion people across the world are mobile service subscribers (that's half the global population)**

8 Research 7 Yahoo!'s Global Mobile Reach 16.9 Million Unique Users Per Month In The U.S. Alone (Chart: unique users per month, in millions, in the USA for Yahoo!, Google, MSN, AOL)

9 Research 8 = PC Search Mobile Search Mobile Search Built for the Consumer

10 Research 9 The Mobile Use Case is Different Give me Answers, Entertainment, Images…

11 Research 10 Y! oneSearch Changed the Game Answers Instead of Web Links. Relevant, Complete Results

12 Research 11 Yahoo! Mobile Approach to Search OneSearch is a special federated search engine –Analyses Concept and Intent of the query against a large collection of vertical backends Web, News, Images, Finance, etc… UGC such as Wikipedia and Yahoo! Answers –Aggregates results from verticals and blends to optimize to user query and to device used for query Goal is to minimize clicks by taking user to results around tasks Query sources: –Browsers: WAP/XHTML –Java app interface for Yahoo! Go –SMS text messaging for Yahoo! Mobile SMS
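
The federated approach on this slide (analyse intent, query several vertical backends, blend the results) can be pictured with a small sketch. This is an illustrative toy, not Yahoo!'s implementation: the backend functions, the keyword-based intent weights, and the `blend` function are all hypothetical names made up for this example, and `max_results` merely stands in for "optimize to the device used".

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical vertical backends: each maps a query to (title, relevance) pairs.
Backend = Callable[[str], List[Tuple[str, float]]]

def weather_backend(query: str) -> List[Tuple[str, float]]:
    # Toy vertical that only answers queries mentioning "weather".
    return [("Mendocino forecast: fog", 0.9)] if "weather" in query else []

def web_backend(query: str) -> List[Tuple[str, float]]:
    # Toy general web vertical that always returns one generic link.
    return [(f"Web page about {query}", 0.5)]

VERTICALS: Dict[str, Backend] = {"weather": weather_backend, "web": web_backend}

def intent_weights(query: str) -> Dict[str, float]:
    """Toy intent analysis: boost a vertical whose name appears in the query."""
    return {name: (2.0 if name in query else 1.0) for name in VERTICALS}

def blend(query: str, max_results: int = 3) -> List[str]:
    """Query every vertical, weight by inferred intent, return the top titles."""
    query = query.lower()
    weights = intent_weights(query)
    scored = []
    for name, backend in VERTICALS.items():
        for title, relevance in backend(query):
            scored.append((weights[name] * relevance, title))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored[:max_results]]

results = blend("Mendocino weather")
```

With a small `max_results` (a small screen), the direct answer from the matching vertical is shown ahead of the generic web link, which is the "results, not links" goal in miniature.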

13 Research 12 Approach Be as Open as possible on interfaces Fundamentally believe the mobile OS market will remain fragmented from a platforms perspective for quite a while –Windows Mobile only reached 30M users after more than 7 years of effort Provide an environment to allow users to program to one target platform and let Yahoo! bear the effort of making it run on wide range of devices Focus on the highest value apps for users today involving access to on-line world (less on client apps) Return results and not links

14 Research 13 Yahoo! Mobile Products * M:Metrics, October 2008 **All Yahoo! Mobile services are free. Check with your wireless carrier about data plan charges that may apply. Yahoo! Go 3.0 Yahoo! oneSearch Yahoo! Home Page Yahoo! onePlace Yahoo! oneConnect

15 Research 14 OneSearch 2.0 OneSearch is being opened up to all publishers and content owners so they can write rich metadata that will be returned as part of results, rather than just a link –Similar to Yahoo!'s Search Monkey service for the Internet. More about this later… Three new major upgrades –Search Assist: The search box will predict what you are typing. –Voice input: Users can search by speaking into the device instead of typing (provided by Vlingo) –The search box will be integrated into the home screen of the phones.
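
As a rough illustration of what "the search box will predict what you are typing" could mean, the sketch below does prefix completion over a sorted list of known queries. It is an assumption-laden toy: the query list and the `complete` function are hypothetical, suggestions come back in alphabetical order, and a production system would presumably rank by popularity and context instead.

```python
import bisect
from typing import List

def complete(prefix: str, sorted_queries: List[str], limit: int = 3) -> List[str]:
    """Return up to `limit` known queries that start with `prefix`."""
    # Binary search finds the first query that sorts at or after the prefix.
    start = bisect.bisect_left(sorted_queries, prefix)
    matches: List[str] = []
    for candidate in sorted_queries[start:]:
        if not candidate.startswith(prefix):
            break  # sorted order: no later query can match either
        matches.append(candidate)
        if len(matches) == limit:
            break
    return matches

# Hypothetical query log, kept sorted so binary search applies.
QUERIES = sorted([
    "car rental finland",
    "mars surface images",
    "mendocino weather",
    "nikon coolpix",
])

suggestions = complete("m", QUERIES)
```

On a phone keypad each avoided keystroke matters, so even this simple lookup shows why completion is listed as an "easier input" feature.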

16 Research 15 Providing more relevant content Unlocking the power of the semantic web Turning web search results into answers Better answers OneSearch 2.0

17 Research 16 Contextual recommendations Predictive text completion Easier, faster input Easier input OneSearch 2.0

18 Research 17 Personalized to your voice Search for anything Speak your search Voice input OneSearch 2.0

19 Research 18 Supports text & voice Gateway to the Internet Persistent 1-click access Always there OneSearch 2.0

20 19 Internet Use on Mobile vs. PC Research

21 20 Mobile Use Today, we believe Internet use on PC is about 10x that of Mobile Mobile is faster growing, in all regions There are more than 3x as many mobiles today as Internet users globally –But most phones are not data capable yet The world today: –We are learning from the web, and attempting to figure out what makes sense for mobile users –Trying to work with smartphone users as they represent the early adopters

22 Research 21 Classical web search user needs Informational (~25%) – want to learn about something Navigational (~40%) – want to go to that page Transactional (~35%) – want to do something (web-mediated) –Access a service –Downloads –Shop Gray areas –Find a good hub –Exploratory search: see what's there Example queries: Low hemoglobin, United Airlines, Mendocino weather, Mars surface images, Nikon CoolPix, Car rental Finland Broder 2002, A Taxonomy of web search
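
To make the three categories concrete, here is a deliberately naive rule-based classifier. It is only a sketch under stated assumptions: the site list and the cue words are hypothetical, and the rules are not taken from Broder's paper, which classified queries through user surveys and log analysis rather than keyword rules.

```python
KNOWN_SITES = {"united airlines"}              # hypothetical list of site/brand names
TRANSACTION_WORDS = {"buy", "download", "shop"}  # hypothetical cue words

def classify(query: str) -> str:
    """Assign a query to one of the three classical need categories."""
    q = query.lower().strip()
    if q in KNOWN_SITES:
        return "navigational"      # user wants to reach a specific site
    if TRANSACTION_WORDS & set(q.split()):
        return "transactional"     # user wants to do something
    return "informational"         # default: user wants to learn something

labels = [classify(q) for q in ("United Airlines", "buy Nikon CoolPix", "Low hemoglobin")]
```

The gray areas on the slide (hub finding, exploratory search) are exactly the queries that simple rules like these cannot place.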

23 Research 22 What about on Mobile No good classification Several studies that cover –Query frequency distribution –Words per query –Characters per query Categorization by query type into traditional categories: –Adult and Entertainment, Autos, Consumer Goods, Finance, Government & politics, Sports, Technology, Travel, etc… Best known studies by –Kamvar and Baluja (2006 and 2007) –Yi, Maghoul, and Pedersen (2008) Good quantitative statistics, little on qualitative purpose-driven analysis (early days still)

24 Research 23 What do We Believe about Mobile Queries We believe it is a different distribution than the query distribution for PC users –Bias towards shorter queries (data contradicts that: 2.6 words per query, same # chars as PC) –Difficulty of query entry is a significant hurdle –Much higher location-based activity –Much more task-oriented than exploration or research Notifications add a whole new push dimension –Trigger alerts (stocks, news, auctions) –Location-based (geo-driven) –Event-based (calendar entries such as travel alerts, flight delays, etc.) Can learn much more about user intent and hence eventually more promising for advertising

25 Research 24 Implications and Challenges Task-orientation Specialized content packaging Locality Inference from queries Locality Inference from device (LBS) Minimize typing and round-trips: get results, not just links –Less room to display SERP + other accessories Monetization strategies to fund this model still not decided –Advertising –Subscription to premium services –Revenue share on leads –Pay per usage of special high-value areas In the meantime, the web, and Search are evolving…

26 Research 25 Even Larger Challenges Modeling Social Media and use of mobile in social settings on the go –Understanding UGC –Classifying, categorizing, organizing UGC and folksonomy A different problem of search -- Semantics of content are critical, especially if we are to target –Intent –Task-orientation –Motion dimension (distance to target of search) –Push and notifications –Understanding the physical world (common sense): what is close? Business hours? Holidays? Web Content growing, changing, diversifying, fragmenting Truly leveraging the notification abilities and finding new everyday uses – far more versatile a space than PC Long-term memory (state) for long-running tasks and queries

27 26 A Tale of Two Search Engines Research

28 27 Algorithmic results = Audience (-$) Advertisements = Monetization (+$)

29 Research 28 Algorithmic vs. Ad Search Analogous to classical separation of editorial vs commercial content Technical underpinnings: –Some commonalities (IR, ML) –Many differences (incentives, spam, mechanism design)

30 Research 29 The two engines The Web Ad indexes Web spider Indexer Indexes Search User

31 Research : The Yahoo! Directory Apply human expertise and editorial to organize web sites What worked –Practical, Navigable –Trustworthy, Authoritative What didn't –Scalability –Granularity –Etc.

32 Research : Altavista (Inktomi, Lycos, etc.) Automate the process of acquiring pages; use information retrieval techniques to return pages that contain a particular term What worked –Scalable (query for IBM returns 40M pages) –Simple –Granular What didn't –Scalability a double-edged sword –Ranking and relevance poor –Not authoritative (spam, irrelevance, etc.)
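
"Return pages that contain a particular term" is classically done with an inverted index. The sketch below is a minimal version, assuming whitespace tokenization and a tiny hypothetical page collection; it is not a description of how any of the named engines were built.

```python
from collections import defaultdict
from typing import Dict, List, Set

def build_index(pages: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each lowercased term to the set of page URLs containing it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for url, text in pages.items():
        for term in text.lower().split():
            index[term].add(url)
    return index

def search(index: Dict[str, Set[str]], term: str) -> List[str]:
    """Return every page containing the term, in URL order (no ranking)."""
    return sorted(index.get(term.lower(), set()))

# Hypothetical crawled pages.
PAGES = {
    "a.example": "IBM mainframe history",
    "b.example": "cheap IBM laptops",
    "c.example": "espresso machines",
}
INDEX = build_index(PAGES)
hits = search(INDEX, "IBM")
```

Every matching page comes back with equal standing, which is the "ranking and relevance poor" weakness the slide names: term matching scales, but it says nothing about which of the 40M pages matters.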

33 Research 32 c : PageRank (Google, Yahoo) Use topology (link structure) of the web to confer authority What works –Relevance is greatly improved –Navigational query is born (query for IBM gets me to What doesn't –Homogeneity of results (no personalization) means no subjective queries – webmasters vote by proxy for everyone – and their answer is the only answer –System easily gamed by spammers – leads to arms race
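
The idea of using link structure to confer authority can be sketched with the textbook power-iteration form of PageRank. This is a simplified illustration, not any engine's production ranking: the damping factor of 0.85, the iteration count, and the three-page graph are assumptions, and the function expects every link target to also appear as a key.

```python
from typing import Dict, List

def pagerank(links: Dict[str, List[str]], damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Power iteration over a link graph given as page -> list of outlinks."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share (the random jump).
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical graph: a and b both link to c, and c links back to a.
GRAPH = {"a": ["c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(GRAPH)
```

Page `c` ends up with the highest score because two pages link to it, and `b` the lowest because nothing links to it. The same mechanism explains the slide's caveats: links are votes cast by webmasters, so everyone sees the same ranking, and manufactured links can game it.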

34 Research 33 Meanwhile, On the Money Front… Sponsored search ranking: (morphed into Yahoo!) –Your search ranking depended on how much you paid –Auction for keywords: casino was expensive! 1998+: Link-based ranking pioneered by Google –Blew away all early engines except Inktomi –Great user experience in search of a business model –Meanwhile Goto/Overture's annual revenues were nearing $1 billion Result: Google added sponsored search ads to the side, independent of search results –2003: Yahoo follows suit, acquiring Overture (for paid placement) and Inktomi (for search) The Monetization Mechanisms… Conversion of marketplace mechanisms in 2007

35 Research 34 Search query Ad

36 Research 35 Questions for the audience Do you think an average user knows the difference between sponsored search links and algorithmic search results? Do you think an average user knows there are sponsored links on the page? Do you think a user knows where a sponsored link would navigate to upon a click?

37 Research 36 How it works Advertiser Landing page Sponsored search engine I want to bid $5 on canon camera I want to bid $2 on cannon camera Engine decides when/where to show this ad. Engine decides how much to charge advertiser on a click. Ad Index

38 Research 37 Engine: Three sub-problems 1.Retrieve ads matching query 2.Order the ads 3.Pricing on a click-through IR Econ
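
The three sub-problems can be lined up as a tiny pipeline. The sketch below is illustrative only: the ad index contents, the exact-match retrieval, and the reserve price are hypothetical, and the pricing step uses a generalized second-price rule (each ad pays the bid of the ad ranked below it), which is an assumption of this example rather than something the slide specifies.

```python
from typing import Dict, List, Tuple

Ad = Tuple[str, float]  # (advertiser, bid in dollars)

# Hypothetical ad index: keyword phrase -> ads bidding on that phrase.
AD_INDEX: Dict[str, List[Ad]] = {
    "canon camera": [("shop_a", 5.00), ("shop_b", 2.00), ("shop_c", 3.50)],
}

def retrieve(query: str) -> List[Ad]:
    """Sub-problem 1 (IR): find ads matching the query (exact match only here)."""
    return list(AD_INDEX.get(query.lower().strip(), []))

def order(ads: List[Ad]) -> List[Ad]:
    """Sub-problem 2: order the ads, here simply by bid, highest first."""
    return sorted(ads, key=lambda ad: ad[1], reverse=True)

def price_per_click(ranked: List[Ad], reserve: float = 0.10) -> List[Ad]:
    """Sub-problem 3 (Econ): charge each ad the bid of the ad just below it."""
    prices = []
    for position, (advertiser, _bid) in enumerate(ranked):
        if position + 1 < len(ranked):
            next_bid = ranked[position + 1][1]
        else:
            next_bid = reserve  # last slot pays the assumed reserve price
        prices.append((advertiser, next_bid))
    return prices

charges = price_per_click(order(retrieve("Canon camera")))
```

Exact matching also shows why retrieval is an IR problem in its own right: a bid on the misspelled "cannon camera" would never be found for this query without some query rewriting or fuzzy matching.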

39 Research 38 Ads go in slots like these

40 Research 39 Higher slots get more clicks

41 Research Order the ads Most generally, composite IR+Econ score … for today's talk, focus on Econ Original GoTo/Overture scheme: –Order by bid

42 Research 41 Economic ordering Bid and revenue ordering: two forms of ordering by an econ score Does revenue ordering maximize revenue? No – advertisers react to ordering scheme, by changing their bid behavior! Lahaie+Pennock ACM EC 2007 –Family of schemes bridging Bid and Revenue ordering –Game-theoretic analysis Edelman, Ostrovsky, Schwarz 2006
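
One common way to express a family of schemes between bid ordering and revenue ordering is to score each ad by its bid times its click-through rate raised to an exponent. The sketch below follows that idea; the exponent name `s`, the ad data, and the function are assumptions of this example and should not be read as the exact formulation in the cited papers.

```python
from typing import List, Tuple

Ad = Tuple[str, float, float]  # (name, bid in dollars, estimated click-through rate)

def rank_ads(ads: List[Ad], s: float) -> List[str]:
    """Order ads by bid * ctr**s.

    s = 0 ignores click-through rate (pure bid ordering);
    s = 1 orders by expected revenue per impression (bid * ctr).
    """
    ranked = sorted(ads, key=lambda ad: ad[1] * ad[2] ** s, reverse=True)
    return [name for name, _bid, _ctr in ranked]

# Hypothetical ads: one bids high but is rarely clicked, one the reverse.
ADS = [("high_bid", 5.00, 0.01), ("high_ctr", 2.00, 0.05)]

by_bid = rank_ads(ADS, s=0.0)
by_revenue = rank_ads(ADS, s=1.0)
```

The two settings produce opposite orderings for the same ads. The slide's warning still applies to both: these scores are computed from today's bids, and advertisers will change their bids once they see which rule is in force.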

43 Research 42 A new convergence Monetization and economic value an intrinsic part of system design –Not an afterthought –Mistakes are costly! Computing meets humanities like never before – sociology, economics, anthropology …

44 43 Towards Getting Things Done… vs. Searching Research

45 44 Example I want to book a vacation in Tuscany. StartFinish

46 Research 45 Example 2 Loved the vacation, want to make that sweet Italian coffee at home –Search: making good espresso –Browse: –Study: Temperature surfing a Rancilio Silvia –Price comparison: –Vendor comparison –Purchase from –Frothing milk tutorial –Cleaning and maintenance –Purchase grouphead brush and Urnex

47 Research 46 Loved the vacation, want to make that sweet Italian coffee at home

48 Research 47 Trends in task complexity Dawn of search: –Navigational queries –Pockets of information Today: –Increasing migration of content online –New forms of media only available online –Infrastructure for payments and reputation sufficient for many users

49 Research 48 Things to notice Long-running user goals Search as hub: –start there –return for resource discovery and at task boundaries –traverse the web broadly to complete task Web services integrated into task

50 49 Content Growth Research

51 50 Content trends [Ramakrishnan and Tomkins 2007]

52 Research 51 Metadata trends [Ramakrishnan and Tomkins 2007]

53 52 Content Complexity Research

54 53 Content ownership Content consumption is fragmenting – nobody owns more than 10% of WW PVs No single place will own all the content Best of breed processing will operate on the web version (?) Value transitions to ecosystem

55 Research 54 Content access is fragmenting

56 Research 55 Content itself is fragmenting

57 Research 56 Evolution of Social Media Although the traditional notion of portal and web content is still attracting growing audiences The original notion of publishing content to attract audiences is changing fast –As people discover the fact that the Internet is an Interactive Medium –The uses of the Internet enter areas we could not imagine a short time ago A new notion of publishing is fast emerging –The opportunity of user-generated content

58 Research 57 Challenges in social media How do we use these tags for better search? What's the ratings and reputation system? How do you cope with spam? The bigger challenge: where else can you exploit the power of the people? What are the incentive mechanisms?

59 58 Evolution is starting The Search Interface Research

60 59 What does this mean for search? Few changes through 2005 Entering period of massive change to handle more complex content Rich media, aggregation, simple task analysis, etc Moving beyond the stateless query/response paradigm Personalization theory

61 Research 60 Rich media and search assistance

62 Research 61 Structured aggregation

63 Research 62 Simple task-focused queries

64 Research 63 Google Base

65 64 Open Ecosystems

66 Research 65 Structured data on the Web Structured databases power a vast majority of pages on the web –Certainly ecommerce catalogs etc. –But also user generated content (e.g. blogs) Content owners open to exposing structure, but don't see how and why –Microformats adoption at an all-time high –Yet, it's produced much more than is consumed Experiments with pure structured data aggregation have met with mixed success –Google Base, Freebase, even Co-op

67 Research 66 What have we announced? Yahoo! Search Monkey: API for publishers to push metadata and structure to search engine Wide-ranging support for semantic web standards Vocabulary to surface structure and semantics Community Tools to evolve standards and vocabulary

68 Research 67 Search as Killer App for Data Web Publishers and search engine collaborate Users see richer search experience Accomplish their tasks faster and more effectively Example: abstracts surfacing structured content

69 Research 68 babycenter epicurious Search results of the future LinkedIn webmd Gawker New York Times

70 Research 69 Search results of the future babycenter epicurious LinkedIn webmd New York Times Gawker

71 Research 70 Comprehensive support for emerging semantic web standards ++ Microformats –hCard, hEvent, hReview, hAtom, XFN –More as they get adopted RDFa and eRDF markup OpenSearch –+extensions to return structured data Atom/RSS Feeds –+extensions to embed structured data markup (crawl) apis (pull) push
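
To show what consuming a microformat involves, the sketch below pulls a few hCard properties out of HTML with Python's standard `html.parser`. It is a minimal illustration and not SearchMonkey's extraction pipeline: the `HCardParser` class and sample markup are made up for this example, only the `fn`, `org` and `tel` properties are handled, and nested or multi-valued properties are ignored.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional

class HCardParser(HTMLParser):
    """Collect the text of elements whose class names a chosen hCard property."""
    PROPERTIES = {"fn", "org", "tel"}

    def __init__(self) -> None:
        super().__init__()
        self.cards: List[Dict[str, str]] = []
        self._current_property: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "vcard" in classes:
            self.cards.append({})  # "vcard" is the hCard root class
        for name in classes:
            if name in self.PROPERTIES and self.cards:
                self._current_property = name

    def handle_data(self, data):
        # Store the first non-blank text found inside a property element.
        if self._current_property and data.strip():
            self.cards[-1][self._current_property] = data.strip()
            self._current_property = None

    def handle_endtag(self, tag):
        self._current_property = None

def extract_hcards(html: str) -> List[Dict[str, str]]:
    parser = HCardParser()
    parser.feed(html)
    return parser.cards

SAMPLE = ('<div class="vcard"><span class="fn">Jane Doe</span> '
          '<span class="org">Example Corp</span></div>')
cards = extract_hcards(SAMPLE)
```

The publisher only adds class names to markup that already exists, and the consumer gets named fields instead of free text. That small cost on the publishing side is what makes crawl-based markup a plausible route for structure to reach a search engine.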

72 Research 71 Vocabulary to surface structure dataRSS provides a common framework for embedding structured data –Use with RDFa, eRDF or OpenSearch –Preferred Vocabulary includes Atom, Dublin Core Creative Commons FOAF, GeoRSS, MediaRSS RDF, RDFS, RDF Review vCal, vCard

73 Research 72 Community Tools We're seeding the Vocabulary and Standards Support We'll evolve both of these with the help of the Web Community Yahoo! Groups: used to collect contributor and community suggestions, feedback, etc… Suggestions Board to vote on changes

74 Research 73 Implications for publishers? Yahoo! open search platform does not modify ranking Richer abstracts may provide more information to users and draw higher quality/quantity of clicks We want rich abstracts that give users a better experience –We don't want misleading abstracts

75 Research 74 The whole story User needs becoming more complex Content growing, changing, diversifying, fragmenting Search responding by increase in sophistication Value migrating to ecosystem Unlock the value by enabling interoperability – expose semantics

76 Research 75 Subjective Queries The kinds of queries that rely on domain expertise… Do you know a reputable plumber in Atlanta? Where is the cool nightlife in Soho? What political blogs do you think I'd enjoy reading? Where can I buy a cool pair of boots? These kinds of queries are ill-served by today's search engines, but are ironically the most valuable (i.e. transactional queries).

77 Research 76

78 Research 77

79 Research 78

80 Research 79 No definitive answer Unverifiable answer Community consensus

81 Research 80 Incentives Legitimate?

82 Research 81 Where is the Science? Which questions are legitimate? What is the incentive system? How do we validate answers? What is the role of the community? What is the reputation system?

83 Research 82 What are the challenges? Community of users – Social system Incentives and reputations –Economic system Poorly phrased, grammatically limited queries –Language analysis Improving user experience from past data –Data mining

84 83 These are early days… Back to Business Research

85 84 Advertising: Brand and DR Knowledge of users & their behavior throughout the purchase funnel can grow brand & direct response revenue Awareness Purchase Consideration > $200B Brand Advertising Market > $200B Direct Response Market Most time & activity is in consideration & engagement, but there are limited metrics & reach strategies

86 85 Why is search-related advertising so powerful? A question for the Audience:

87 Research 86 It is all about Inferring User Intent Users type 2.8 keywords –Note the nonsensical use of average –Average query returns > 600K matches! We get an idea of intent Coupled with immediacy (recency) – an amazing matching engine – 10x to 100x click through rate over banner ads

88 Research 87 Do I know this user's intent?

89 Research 88 Brand Ads and Search Ads Interact! Is ad search strategy enough for a direct marketer? Do brand ads play a role in search advertising? Harris Direct Case Study Awareness Purchase Consideration

90 Research 89 Case Study: Harris Direct Viewing These Ads Had This Effect On: Brand Favorability –Up 32% Purchase Intent –Up 15% Aided Brand Awareness –Up 7%

91 Research 90 Case Study: Harris Direct People who saw display ads were 61% more likely to search on related topics… …and drove 139% more clicks on algorithmic and sponsored links… …specifically driving 249% more sponsored search clicks … …and driving 91% more activity on the website.

92 91 Inventing the new sciences of the Internet Yahoo! Research Research

93 92 New Science? The Internet touches all of our lives: personal, commercial, corporate, educational, government, etc… Yet many of the basic notions we talk about: –Search, Community, Personalization, Engagement, Interactive Content, Information Navigation, Computational Advertising –Are not at all understood, or well-defined –These are not disciplines that academia or any industry research labs focus on…

94 Research 93 Areas of Research Information Navigation and Advanced Search –We are in the early days of search and retrieval –Inferring intent –New ways of extracting entities and objects Community: –How do you know what to believe on the Internet? –Trust models on-line and trust propagation –What makes communities thrive? Wither? –Social media, tagging, image and video sharing Microeconomics: a new generation of economics driven by massive interactions –Auction marketplaces –The web as a new LEI of activities and economies Computational Advertising –Targeting and matching sciences, Inferring user intent –Pricing models (CPM, CPC, CPA, CPL, etc…) –Large-scale optimization and yield management
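
The pricing models in the last bullet group can be compared by converting each to an effective cost per thousand impressions (eCPM). The sketch below uses the standard definitions (CPM is paid per thousand impressions, CPC per click, CPA per completed action); the function names and the example rates are hypothetical.

```python
def ecpm_from_cpc(cpc: float, ctr: float) -> float:
    """Expected revenue per 1000 impressions when paid per click."""
    return 1000.0 * cpc * ctr

def ecpm_from_cpa(cpa: float, ctr: float, conversion_rate: float) -> float:
    """Expected revenue per 1000 impressions when paid per completed action."""
    return 1000.0 * cpa * ctr * conversion_rate

# Hypothetical campaign: $0.50 per click at a 2% click-through rate,
# versus $20 per action when 2% of viewers click and 5% of clickers convert.
cpc_value = ecpm_from_cpc(cpc=0.50, ctr=0.02)
cpa_value = ecpm_from_cpa(cpa=20.0, ctr=0.02, conversion_rate=0.05)
```

Putting every model on the same per-impression scale is what lets a yield-management system compare a CPM banner, a CPC text ad, and a CPA offer competing for one slot. The estimates of click-through and conversion rates are where the targeting and matching sciences come in.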

95 Research 94 Concluding Thoughts (1) The notion of corpus and publishing is changing fundamentally We still do not have the basic sciences to understand what is happening and what needs to happen to combine the new capabilities The problem of mobile search is different, but poorly understood The web is changing, content sources are fragmenting and changing –the source distribution is radically changing –Publisher – consumer divide is becoming fuzzy Search engine interface is finally changing to adapt –Much of the change came from worrying about mobile search

96 Research 95 Concluding Thoughts (2) The view that Search is everything is LIMITED (at best) –Economics of publishing and advertising –Users do not differentiate ad and content –Behavioral data is the most powerful –Nothing predicts behavior like behavior Monetization and economic value an intrinsic part of system design –Not an afterthought –Mistakes are costly! Computing meets humanities like never before – sociology, economics, anthropology … A more holistic view of Search and Information Navigation is needed

97 96 Thank You! & Questions? Research

98 97 No time to cover today Micro-Economics of the Web –Auction marketplaces –Marketplace and Exchange Design –The economics of Engineering IT Decisions Computational Advertising –Targeting and matching sciences –Inferring user intent –Pricing models (CPM, CPC, CPA, CPL, etc…) –Large-scale optimization and yield management
