WHAT AND HOW CHILDREN SEARCH ON THE WEB Sergio Duarte Torres, Ingmar Weber.


1 WHAT AND HOW CHILDREN SEARCH ON THE WEB Sergio Duarte Torres, Ingmar Weber

6 Motivation

8 Goals of this work Identify and quantify search struggle of young users Retrace stages of child development through their web searches

9 What data was used? US Yahoo! search logs from May to August of 2010 Cleaning steps: User wise: Logs from users without Yahoo! accounts were removed Query wise: Queries issued by a single user were removed Queries with personally identifiable information Non alpha-numerical single token queries Why the cleaning? What could be advantages/disadvantages?
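The cleaning steps on this slide could be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' pipeline: the `(user_id, query)` record layout, the use of `None` for users without an account, and the reading of "non alpha-numerical single token" as "one token with no letters or digits" are all hypothetical. The personally-identifiable-information filter is omitted because the slide does not say how it was done.

```python
from collections import defaultdict


def clean_log(records):
    """Filter (user_id, query) pairs following the slide's cleaning steps.

    Assumption: user_id is None for users without an account.
    """
    # User wise: drop records from users without an account.
    logged_in = [(u, q.strip().lower()) for u, q in records if u is not None]

    # Query wise: find how many distinct users issued each query.
    users_per_query = defaultdict(set)
    for user, query in logged_in:
        users_per_query[query].add(user)

    def is_non_alnum_single_token(query):
        # Assumed reading: exactly one token, with no letter or digit in it.
        tokens = query.split()
        return len(tokens) == 1 and not any(ch.isalnum() for ch in tokens[0])

    return [
        (user, query)
        for user, query in logged_in
        if len(users_per_query[query]) > 1        # issued by more than one user
        and not is_non_alnum_single_token(query)  # not a symbol-only token
    ]
```

Dropping queries seen from only one user trades coverage of rare queries for privacy, which is one way to start answering the slide's question about advantages and disadvantages.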

10 An aside about the data Users under 13 years old required the consent of a responsible adult to register at Yahoo! (costs $.50) Some people may lie about their age… General trends are expected to be robust to noise People may lie about their age but … usually they tend to make themselves appear older Where do you think millions of children lie about their age? http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/3850/3075

11 Data segmentation Users grouped based on their reported birth year Age estimated as: 2010 – Birth year The following age buckets were created: 6-7: early elementary 8-9: readers 10-12: advanced readers 13-15: teenagers 16-18: mature teenagers >18: grown ups
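The segmentation rule can be written down in a few lines. The sketch below only restates the slide (age = 2010 minus birth year, then a bucket lookup); the function name and the `None` return for ages below 6 are assumptions, since the slide does not say how younger users were handled.

```python
LOG_YEAR = 2010  # year of the search logs, as stated on the slide

# (lowest age, highest age, label) for each bucket on the slide
AGE_BUCKETS = [
    (6, 7, "early elementary"),
    (8, 9, "readers"),
    (10, 12, "advanced readers"),
    (13, 15, "teenagers"),
    (16, 18, "mature teenagers"),
]


def age_bucket(birth_year, log_year=LOG_YEAR):
    """Map a reported birth year to one of the slide's age buckets."""
    age = log_year - birth_year
    for low, high, label in AGE_BUCKETS:
        if low <= age <= high:
            return label
    if age > 18:
        return "grown ups"
    return None  # under 6: not covered by the slide's buckets (assumption)
```

Because only the birth year is known, the estimated age can be off by up to a year, so users near a bucket boundary may land in the neighbouring group.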

12 Data characteristics
Data set size        Below 10 years old   Above 10 years old
Volume of queries    >100K                >1M
Number of users      >10K                 >100K

13 Methodology: Micro- vs. Macro-Averages User A: 100x cooking 10x science User B: 1x cooking 5x science User C: 2x cooking 10x science Micro avg.: cooking = (100+1+2)/(100+10+1+5+2+10) = 0.80 Macro avg.: cooking = (100/110 + 1/6 + 2/12) / 3 = 0.41 People search mostly for cooking. True? False?
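The two averages on this slide can be reproduced with a short sketch. The dictionary layout (user to per-topic query counts) and the function names are illustrative assumptions; the numbers are the ones from the slide.

```python
def micro_average(counts, topic):
    """Share of ALL queries that are about `topic` (heavy users dominate)."""
    topic_total = sum(user.get(topic, 0) for user in counts.values())
    grand_total = sum(sum(user.values()) for user in counts.values())
    return topic_total / grand_total


def macro_average(counts, topic):
    """Mean of each user's own share for `topic` (every user weighs the same)."""
    shares = [user.get(topic, 0) / sum(user.values()) for user in counts.values()]
    return sum(shares) / len(shares)


# Query counts per user, taken from the slide.
counts = {
    "A": {"cooking": 100, "science": 10},
    "B": {"cooking": 1, "science": 5},
    "C": {"cooking": 2, "science": 10},
}

print(round(micro_average(counts, "cooking"), 2))  # 0.8
print(round(macro_average(counts, "cooking"), 2))  # 0.41
```

User A alone issues 110 of the 128 queries, which is why the micro-average suggests that people search mostly for cooking while two of the three users search mostly for science.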

14 Methodology: Detecting Navigational Queries facebook, yahoo mail, google,... How would you do it? Editorial judgments Ask human judges to mark queries as navigational Drawbacks? Click entropy Look at the diversity of the results clicked in response Drawbacks? String similarity heuristics Try to find query as substring in clicked domain Drawbacks?
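Two of the three options on this slide are easy to sketch. The code below is a minimal illustration, not the method used in the study: the function names, the use of Shannon entropy in bits, and the whitespace-stripping rule in the substring check are assumptions.

```python
import math
from collections import Counter


def click_entropy(clicked_urls):
    """Shannon entropy (bits) of the URLs clicked for one query.

    Low entropy means most users click the same result, which hints at a
    navigational query.
    """
    counts = Counter(clicked_urls)
    total = sum(counts.values())
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def query_in_domain(query, clicked_domain):
    """String heuristic: is the query (letters and digits only) a substring
    of the clicked domain?"""
    squashed = "".join(ch for ch in query.lower() if ch.isalnum())
    return bool(squashed) and squashed in clicked_domain.lower()
```

The substring check also shows one of the drawbacks the slide asks about: `query_in_domain("facebook", "www.facebook.com")` matches, but `query_in_domain("yahoo mail", "mail.yahoo.com")` does not, because the word order differs. Click entropy, in turn, needs enough clicks per query to be reliable.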

15 Search Difficulty Outline 1. Query length 2. Natural language usage 3. Click position bias 4. Other signs of click position bias 5. Children exposed to adult content 6. Time spent on web results 7. Sessions characteristics

16 Query length Increasing query length through the age groups Slightly bigger gap for non-navigational queries Greater ambiguity in children's queries

17 Natural language usage (I) Questions instead of queries what is the only immortal animal? Modal queries I don’t want to go to school Factual queries describe the parts of a cell Superlative queries the fastest dog Targeted queries for kids car photos for kids

18 Natural language usage (II) Greater NL usage at younger ages Teenagers' behavior closer to children's than to adults' behavior

19 Click position bias Other explanations?

20 Clicks on ads Children aged 6-9 more likely to click on ads! Evidence of disorientation during the search process

21 How to evaluate search success using click data? How would you do it?

22 Time spent on web results Click duration as a signal of search success. Hassan et al (2010) WSDM ‘10 Short click (0-10 secs): Unsuccessful click Long click (≥ 100 secs): Successful click
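The thresholds on this slide translate directly into a small classifier. This is a sketch: the function name and the "unlabeled" category for clicks between 10 and 100 seconds are assumptions, since the slide only defines the two extremes.

```python
def classify_click(dwell_seconds):
    """Label a click by how long the user stayed on the clicked result."""
    if dwell_seconds < 0:
        raise ValueError("dwell time cannot be negative")
    if dwell_seconds <= 10:
        return "short"      # treated as an unsuccessful click
    if dwell_seconds >= 100:
        return "long"       # treated as a successful click
    return "unlabeled"      # between the two thresholds (assumption)
```

Comparing the share of short and long clicks per age group then gives a rough success signal without needing explicit relevance judgments.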

23 Children exposed to adult content Likelihood of accidental click on adult content: Click on adult content is short and the action is immediately reverted by a click on a non-adult content
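The definition of an accidental click can be sketched as a scan over the clicks of one session. The `Click` record, the function name, and the 10-second cutoff for "short" (borrowed from the click-duration slide) are assumptions for illustration; the slide does not give the exact threshold.

```python
from typing import List, NamedTuple


class Click(NamedTuple):
    is_adult: bool        # whether the clicked result is adult content
    dwell_seconds: float  # time spent before the next action


def accidental_adult_clicks(clicks: List[Click], short_threshold: float = 10.0) -> List[int]:
    """Return indexes of adult clicks that look accidental: the click is
    short and the very next click goes to non-adult content."""
    hits = []
    for i in range(len(clicks) - 1):
        current, following = clicks[i], clicks[i + 1]
        if (
            current.is_adult
            and current.dwell_seconds <= short_threshold
            and not following.is_adult
        ):
            hits.append(i)
    return hits
```

An adult click at the very end of a session is never counted here, because there is no following click to show that the action was reverted.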

24 Sessions characteristics (I) Shorter sessions in young users Jump to adulthood also occurs in the group of users from 19 to 25

25 Sessions characteristics (II) Query refinding c q q’ q What do refinding queries indicate?

26 Sessions characteristics (III) Click refinding q c c’ c

27 Sessions characteristics (IV) Shorter sessions?

28 Tracing children's development on the web: Outline 1. What do children search for? 2. What entities are children interested in? 3. Does the reading level of the clicks vary across ages and education?

29 Classifying queries into topics

30 “sigir 2011”? computers_and_internet/programming_and_development Classifying queries into topics

31 What do children search for? Children and teenager groups have few dominant topics Adults have more diverse query topics Also due to smaller vocabulary

32 Gender differences (I) Which topic is most responsible for gender differences?

33 Gender differences (II)

34 What entities are children interested in? Queries mapped to Wikipedia entities using site search on wikipedia.org/wiki
Query                                            Entity
facebook, facebook login                         en.wikipedia.org/wiki/Facebook
back to school clothes, london schol uniforms    en.wikipedia.org/wiki/School_uniform
Hummus recipe, ideal protein                     en.wikipedia.org/wiki/Hummus
How to map web queries to Wikipedia pages?

35 What entities are children interested in? (10-12)

36 What entities are adults interested in? (40+)

37 What entities are children interested in? Greater use of child-oriented entities at young ages

38 Does the reading level of the clicks vary across ages? Based on Google reading level classification 70% (kids) vs 50% (adults) of clicks classified as basic

39 Does the reading level of the clicks vary across ages? (II) Reading level also varies according to education level Education level of adults according to US census CIKM 2011. Glasgow, 26 of October

40 Getting demographics from US census
Gender: Male Birth year: 1978 ZIP code: 95054
cheap holidays
US Census Data factfinder.census.gov
Expected income: $31k Expected education: 45% BA Race distribution: 38% w, 47% A
Label (Q,D) with $31k, 45% BA,...

41 Conclusions Clear behavioral differences between children and adults Although not clear-cut between teenagers and children Sudden jump to adulthood from 19 to 25 years old Stronger click position bias for children, including ads Assistance of question queries Understanding concerns expressed in their queries

42 THANK YOU FOR YOUR ATTENTION

