Ideas for Integrating Browsing and Search in the CDL, Marti Hearst, SIMS, UC Berkeley

1 1 Ideas for Integrating Browsing and Search in the CDL
Marti Hearst
SIMS, UC Berkeley
http://www.sims.berkeley.edu/~hearst

2 2 Goals
Seamlessly integrate browsing and searching
–Give users a “browsing the shelves” feeling
–Allow them to discover new things
–Mix and match different concepts in the query
–Do this in an intuitive, unconfusing interface
Avoid empty search results

3 3 Faceted Metadata
Time/Date, Topic, Role, GeoRegion

4 4 There are many ways to do it wrong
Examples:
–Melvyl online catalog: no way to browse enormous category listings
–Audible.com, BooksOnTape.com, and BrillianceAudio: no way to browse a given category and simultaneously select unabridged versions
–Amazon.com: has finally gotten browsing over multiple kinds of features working; this is a recent development, but it is still restricted in what can be added to the query

19 19 The Flamenco Project
Incorporating Faceted Hierarchical Metadata into Interfaces for Large Collections
Key Goals:
–Support integrated browsing and keyword search
  Provide an experience of “browsing the shelves”
–Add power and flexibility without introducing confusion or a feeling of “clutter”
–Allow users to take the path most natural to them
Method:
–User-centered design, including needs assessment and many iterations of design and testing
Yee, Swearingen, Li, Hearst, Faceted Metadata for Image Search and Browsing, Proceedings of CHI 2003.

20 20 Some Challenges
Users don’t like new search interfaces.
How to show lots more information without overwhelming or confusing?
Our approach:
–Integrate the search seamlessly into the information architecture.
–Use proper HCI methodologies.
–Use faceted metadata

21 21 Example of Faceted Metadata: Medical Subject Headings (MeSH)
Facets
1. Anatomy [A]
2. Organisms [B]
3. Diseases [C]
4. Chemicals and Drugs [D]
5. Analytical, Diagnostic and Therapeutic Techniques and Equipment [E]
6. Psychiatry and Psychology [F]
7. Biological Sciences [G]
8. Physical Sciences [H]
9. Anthropology, Education, Sociology and Social Phenomena [I]
10. Technology and Food and Beverages [J]
11. Humanities [K]
12. Information Science [L]
13. Persons [M]
14. Health Care [N]
15. Geographic Locations [Z]

22 22 Each Facet Has Hierarchy
1. Anatomy [A]
   Body Regions [A01]
   Musculoskeletal System [A02]
   Digestive System [A03]
   Respiratory System [A04]
   Urogenital System [A05]
   ……
2. [B]
3. [C]
4. [D]
5. [E]
6. [F]
7. [G]
8. Physical Sciences [H]
9. [I]
10. [J]
11. [K]
12. [L]
13. [M]

23 23 Descending the Hierarchy
1. Anatomy [A]
   Body Regions [A01]
      Abdomen [A01.047]
      Back [A01.176]
      Breast [A01.236]
      Extremities [A01.378]
      Head [A01.456]
      Neck [A01.598]
      ….
   Musculoskeletal System [A02]
   Digestive System [A03]
   Respiratory System [A04]
   Urogenital System [A05]
   ……
2. [B]
3. [C]
4. [D]
5. [E]
6. [F]
7. [G]
8. Physical Sciences [H]
9. [I]
10. [J]
11. [K]
12. [L]
13. [M]

24 24 Descending the Hierarchy
1. Anatomy [A]
   Body Regions [A01]
      Abdomen [A01.047]
      Back [A01.176]
      Breast [A01.236]
      Extremities [A01.378]
      Head [A01.456]
      Neck [A01.598]
      ….
   Musculoskeletal System [A02]
   Digestive System [A03]
   Respiratory System [A04]
   Urogenital System [A05]
   ……
2. [B]
3. [C]
4. [D]
5. [E]
6. [F]
7. [G]
8. Physical Sciences [H]
   Electronics
   Astronomy
   Nature
   Time
   Weights and Measures
   ….
9. [I]
10. [J]
11. [K]
12. [L]
13. [M]
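
Descending a facet hierarchy like the MeSH one above can be sketched in a few lines. The tree numbers and labels below are the excerpt shown on the slides; the helper names (parent, children, path) are illustrative and not part of MeSH or Flamenco. The sketch relies on the fact that a MeSH tree number encodes its own ancestry: dropping the last dotted segment (or, for a top-level node such as A01, keeping only the facet letter) gives the parent.

```python
# Excerpt of MeSH tree numbers and labels, as listed on the slides.
MESH = {
    "A": "Anatomy",
    "A01": "Body Regions",
    "A01.047": "Abdomen",
    "A01.176": "Back",
    "A01.236": "Breast",
    "A01.378": "Extremities",
    "A01.456": "Head",
    "A01.598": "Neck",
    "A02": "Musculoskeletal System",
    "A03": "Digestive System",
    "A04": "Respiratory System",
    "A05": "Urogenital System",
    "H": "Physical Sciences",
}


def parent(code):
    """Return the tree number one level up, or None for a facet root."""
    if "." in code:
        return code.rsplit(".", 1)[0]  # A01.047 -> A01
    if len(code) > 1:
        return code[0]                 # A01 -> A
    return None                        # A is a facet root


def children(code, tree=MESH):
    """Return the tree numbers directly below `code`, in sorted order."""
    return sorted(c for c in tree if parent(c) == code)


def path(code, tree=MESH):
    """Return the labels from the facet root down to `code`."""
    labels = []
    while code is not None:
        labels.append(tree[code])
        code = parent(code)
    return " > ".join(reversed(labels))


print(children("A"))    # the five subcategories of Anatomy in this excerpt
print(path("A01.456"))  # the chain of labels leading to Head
```

Storing only the tree number per node keeps the hierarchy implicit, which is convenient for a read-only vocabulary; a system that edits its hierarchies would more likely store explicit parent links.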

25 25 The Approach
Assign faceted metadata to content items
Allow users to navigate through the faceted metadata in a flexible manner
Organize search results according to the faceted metadata so navigation looks similar throughout
Give previews of next choices
Allow access to previous choices
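
The approach above can be illustrated with a minimal sketch. This is not Flamenco's implementation: the items, facet names, and function names (refine, previews, back_up) are hypothetical, and facets are kept flat rather than hierarchical to stay short. It shows items carrying facet values, refinement by any combination of facets, previews of the next choices as counts, and a way back to previous choices.

```python
from collections import Counter

# Hypothetical content items; facet names and values are illustrative only.
ITEMS = [
    {"id": 1, "Media": "woodcut", "Location": "US", "Date": "1930s"},
    {"id": 2, "Media": "woodcut", "Location": "US", "Date": "1940s"},
    {"id": 3, "Media": "woodcut", "Location": "Japan", "Date": "1930s"},
    {"id": 4, "Media": "etching", "Location": "US", "Date": "1930s"},
]


def refine(items, selections):
    """Keep the items that match every selected facet value."""
    return [
        it for it in items
        if all(it.get(facet) == value for facet, value in selections.items())
    ]


def previews(items, selections):
    """Count, per unselected facet, how many current results carry each value.

    Only values that occur in the current result set are listed, so any
    choice offered to the user leads to at least one item.
    """
    counts = {}
    for it in refine(items, selections):
        for facet, value in it.items():
            if facet == "id" or facet in selections:
                continue
            counts.setdefault(facet, Counter())[value] += 1
    return counts


def back_up(selections, facet):
    """Drop one facet constraint, returning to a broader result set."""
    return {f: v for f, v in selections.items() if f != facet}


selections = {"Media": "woodcut"}
print([it["id"] for it in refine(ITEMS, selections)])
print(previews(ITEMS, selections))
```

Because the previews are computed from the current result set, every link shown to the user has a nonzero count, which is one way to meet the goal of avoiding empty search results.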

26 26 The Flamenco Interface
Hierarchical facets
Chess metaphor
–Opening
–Middle game
–End game
Tightly Integrated Search

36 36 What is Tricky About This?
It is easy to do it poorly
–Yahoo directory structure
It is hard to be not overwhelming
–Most users prefer simplicity unless complexity really makes a difference
It is hard to “make it flow”
–Can it feel like “browsing the shelves”?

37 37 Using HCI Methodology
Identify Target Population
–Architects, city planners
Needs assessment.
–Interviewed architects and conducted contextual inquiries.
Lo-fi prototyping.
–Showed paper prototype to 3 professional architects.
Design / Study Round 1.
–Simple interactive version. Users liked metadata idea.
Design / Study Round 2:
–Developed 4 different detailed versions; evaluated with 11 architects; results somewhat positive but many problems identified. Matrix emerged as a good idea.
Metadata revision.
–Compressed and simplified the metadata hierarchies

38 38 Using HCI Methodology
Design / Study Round 3.
–New version based on results of Round 2
–Highly positive user response
Identified new user population/collection
–Students and scholars of art history
–Fine arts images
Study Round 4
–Compare the metadata system to a strong, representative baseline

39 39 Most Recent Usability Study
Participants & Collection
–32 Art History Students
–~35,000 images from SF Fine Arts Museum
Study Design
–Within-subjects
  Each participant sees both interfaces
  Balanced in terms of order and tasks
–Participants assess each interface after use
–Afterwards they compare them directly
Data recorded in behavior logs, server logs, paper-surveys; one or two experienced testers at each trial.
Used 9 point Likert scales.
Session took about 1.5 hours; pay was $15/hour
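
A within-subjects design that is "balanced in terms of order and tasks" is commonly set up by counterbalancing. The slides do not describe the actual assignment scheme, so the following is only a sketch of one standard way to do it; the task-set labels "A" and "B" and the function name counterbalance are assumptions made for illustration.

```python
from itertools import cycle, product

INTERFACES = ("FC", "Baseline")
TASK_SETS = ("A", "B")  # hypothetical labels for two matched task sets


def counterbalance(n_participants):
    """Rotate participants through every interface-order x task-order pairing."""
    interface_orders = [INTERFACES, INTERFACES[::-1]]
    task_orders = [TASK_SETS, TASK_SETS[::-1]]
    conditions = cycle(product(interface_orders, task_orders))
    plan = []
    for pid, (ifaces, tasks) in zip(range(1, n_participants + 1), conditions):
        # Each session pairs one interface with one task set, in order of use.
        plan.append({"participant": pid, "sessions": list(zip(ifaces, tasks))})
    return plan


for row in counterbalance(32)[:4]:
    print(row)
```

With 32 participants and four conditions, each combination of interface order and task order is used by eight people, so neither seeing an interface first nor a particular task set is confounded with the interface being evaluated.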

40 40 The Baseline System
Floogle
Take the best of the existing keyword-based image search systems

41 41 Comparison of Common Image Search Systems
System      Collection          # Results/page   Categories?   # Familiar
Google      Web                 20               No            27
AltaVista   Web                 15               No            8
Corbis      Photos              9-36             No            8
Getty       Photos, Art         12-90            Yes           6
MS Office   Photos, Clip art    6-100            Yes           N/A
Thinker     Fine arts images    10               Yes           4
BASELINE    Fine arts images    40               Yes           N/A

42 42 sword

46 46 Evaluation Quandary
How to assess the success of browsing?
–Timing is usually not a good indicator
–People often spend longer when browsing is going well.
  Not the case for directed search
–Can look for comprehensiveness and correctness (precision and recall) …
–… But subjective measures seem to be most important here.
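
For reference, the two objective measures named above have standard definitions: precision is the fraction of retrieved items that are relevant, and recall is the fraction of relevant items that were retrieved. A minimal sketch follows; the function name and the example image IDs are illustrative and not taken from the study.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved items.

    precision = |retrieved & relevant| / |retrieved|
    recall    = |retrieved & relevant| / |relevant|
    An empty denominator yields 0.0 rather than raising an error.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Illustrative IDs: 4 images gathered, 3 relevant ones exist, 2 overlap.
precision, recall = precision_recall([1, 2, 3, 4], [2, 4, 6])
print(precision, recall)
```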

47 47 Hypotheses
We attempted to design tasks to test the following hypotheses:
–Participants will experience greater search satisfaction, feel greater confidence in the results, produce higher recall, and encounter fewer dead ends using FC over Baseline
–FC will be perceived to be more useful and flexible than Baseline
–Participants will feel more familiar with the contents of the collection after using FC
–Participants will use FC to create multi-faceted queries

48 48 Four Types of Tasks
–Unstructured (3): Search for images of interest
–Structured Task (11-14): Gather materials for an art history essay on a given topic, e.g.
  Find all woodcuts created in the US
  Choose the decade with the most
  Select one of the artists in this period and show all of their woodcuts
  Choose a subject depicted in these works and find another artist who treated the same subject in a different way.
–Structured Task (10): compare related images
  Find images by artists from 2 different countries that depict conflict between groups.
–Unstructured (5): search for images of interest

49 49 Other Points
Participants were NOT walked through the interfaces.
The wording of Task 2 reflected the metadata; not the case for Task 3
Within tasks, queries were not different in difficulty (t’s 0.05 according to post-task questions)
Flamenco is an order of magnitude slower than Floogle on average.
–In task 2 users were allowed 3 more minutes in FC than in Baseline.
–Time spent in tasks 2 and 3 was significantly longer in FC (about 2 min more).

50 50 Results
Participants felt significantly more confident they had found all relevant images using FC (Task 2: t(62)=2.18, p<.05; Task 3: t(62)=2.03, p<.05)
Participants felt significantly more satisfied with the results (Task 2: t(62)=3.78, p<.001; Task 3: t(62)=2.03, p<.05)
Recall scores:
–Task 2a: In Baseline 57% of participants found all relevant results, in FC 81% found all.
–Task 2b: In Baseline 21% found all relevant, in FC 77% found all.

51 51 Post-Interface Assessments All significant at p<.05 except simple and overwhelming

52 52 Perceived Uses of Interfaces Baseline FC

53 53 Post-Test Comparison
                                           Baseline   FC
Which Interface Preferable For:
Find images of roses                       15         16
Find all works from a given period         2          30
Find pictures by 2 artists in same media   1          29
Overall Assessment:
More useful for your tasks                 4          28
Easiest to use                             8          23
Most flexible                              6          24
More likely to result in dead ends         28         3
Helped you learn more                      1          31
Overall preference                         2          29

54 54 Facet Usage
Facets driven largely by task content
–Multiple facets 45% of time in structured tasks
For unstructured tasks,
–Artists (17%)
–Date (15%)
–Location (15%)
–Others ranged from 5-12%
–Multiple facets 19% of time
From end game, expansion from
–Artists (39%)
–Media (29%)
–Shapes (19%)

55 55 Qualitative Observations
Baseline:
–Simplicity, similarity to Google a plus
–Also noted the usefulness of the category links
FC:
–Starting page “well-organized”, gave “ideas for what to search for”
–Query previews were commented on explicitly by 9 participants
–Commented on matrix prompting where to go next
  3 were confused about what the matrix shows
–Generally liked the grouping and organizing
–End game links seemed useful; 9 explicitly remarked positively on the guidance provided there.
–Often get requests to use the system in future

56 56 Study Results Summary
Overwhelmingly positive results for the faceted metadata interface.
Somewhat heavy use of multiple facets.
Strong preference over the current state of the art.
This result not seen in similarity-based image search interfaces.
Hypotheses are supported.

57 57 New Features
Save groups of images and searches
“Find Similar Images”

58 58 Advantages
Users have a feeling of control
Users can predict what will happen
–Not true of statistical ranking or clustering
Adding new items to the system changes the behavior in understandable ways
Users have flexibility
–In ordering of operations
–In combining of operations

59 59 Thank you!
Marti Hearst
flamenco.berkeley.edu
www.sims.berkeley.edu/~hearst

