
1 Interfaces for Querying Collections

2 Information Retrieval Activities
Selecting a collection
–Lists, overviews, wizards, automatic selection
Submitting a request
–Queries & expressiveness
–Graphical interfaces
–Natural language
Examining the response
–Next class

3 Simple Query Interface

4 Complex Query Interface

5 Primary HCI Styles
Command language
Form filling
Menu selection
Direct manipulation
Natural language
Others?

6 Boolean Queries
Most commercial full-text retrieval systems (until recently) supported only Boolean queries.
Many studies show users have difficulty with Boolean expressions
–AND and OR are not used the way "and" and "or" are used in English: “cats and dogs”, “tea or coffee”
–Syntax specifying nesting is often cryptic
The Boolean model does not include ranking
–Earlier systems used reverse chronological order

7 Web-based Boolean Queries
Search engines based on Boolean or extended Boolean engines needed to make their systems usable by the Web audience
Reduce expressiveness for ease of use
–Use “all the words” and “any of the words”
–Boolean-based search engines added the + prefix
Ranking performed using statistical algorithms and Web-specific heuristics
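
The simplified Web forms above map directly onto Boolean operators. The following is a minimal sketch over a toy in-memory index, not any search engine's actual implementation. The function names, the sample documents, and the reading of "+" as "this term is required" are assumptions made for illustration; ranking of the optional terms is left out.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def all_the_words(index, terms):
    """Conjunction (Boolean AND): documents containing every term."""
    sets = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*sets) if sets else set()

def any_of_the_words(index, terms):
    """Disjunction (Boolean OR): documents containing at least one term."""
    result = set()
    for t in terms:
        result |= index.get(t.lower(), set())
    return result

def plus_prefix_query(index, query):
    """Assumed reading of '+': marked terms are required, others optional.

    If any term is required, only the required terms restrict the result;
    otherwise the query behaves like "any of the words".
    """
    tokens = query.split()
    required = [t[1:] for t in tokens if t.startswith("+") and len(t) > 1]
    optional = [t for t in tokens if not t.startswith("+")]
    if required:
        return all_the_words(index, required)
    return any_of_the_words(index, optional)

# Hypothetical sample collection
DOCS = {1: "cats and dogs", 2: "cats only", 3: "dogs only", 4: "tea or coffee"}
INDEX = build_index(DOCS)

print(all_the_words(INDEX, ["cats", "dogs"]))     # documents with both terms
print(any_of_the_words(INDEX, ["cats", "dogs"]))  # documents with either term
print(plus_prefix_query(INDEX, "+cats dogs"))     # "cats" required
```

Note how the English phrase "cats and dogs" would usually be understood as "anything about cats or about dogs", which corresponds to the disjunction here, not the conjunction.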

8 Command Line Search
Command line interfaces for search
Example queries from Melvyl:
–FIND PA darwin and TW species or TW descent
–FIND TW Mt St. Helens AND DATE 1981

9 Command Line Search Still in use …

10 Form and Menus
Melvyl

11 Faceted Queries
Boolean queries often return too many or too few results
–Conjunctions reduce sets too quickly
–Disjunctions grow sets too quickly
Solution:
–Try out smaller queries to see if they have an appropriately sized set of results
–Combine the smaller queries that are successful into a larger query.
Example:
1. (osteoporosis OR “bone loss”)
2. (drugs OR pharmaceuticals)
3. (prevention OR cure)
4. 1 AND 2 AND 3
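
The example can be sketched in a few lines: each facet is evaluated on its own so its result size can be checked, and only then are the facets intersected. This is an illustrative sketch; the sample documents, the function names, and the use of plain substring matching are assumptions, not part of the original slides.

```python
# Hypothetical sample collection
DOCS = {
    "d1": "new drugs for osteoporosis prevention",
    "d2": "bone loss and pharmaceuticals: is there a cure",
    "d3": "osteoporosis risk factors",
    "d4": "drugs that cure headaches",
}

def facet_results(docs, facet):
    """A facet is a disjunction: a document matches if any alternative occurs."""
    return {
        doc_id
        for doc_id, text in docs.items()
        if any(alt.lower() in text.lower() for alt in facet)
    }

def faceted_query(docs, facets):
    """Return the conjunction of all facets plus each facet's own result size."""
    per_facet = [facet_results(docs, facet) for facet in facets]
    sizes = [len(result) for result in per_facet]
    combined = set.intersection(*per_facet) if per_facet else set()
    return combined, sizes

FACETS = [
    ["osteoporosis", "bone loss"],   # facet 1
    ["drugs", "pharmaceuticals"],    # facet 2
    ["prevention", "cure"],          # facet 3
]

combined, sizes = faceted_query(DOCS, FACETS)
print(sizes)             # size of each smaller query, inspected before combining
print(sorted(combined))  # documents matching 1 AND 2 AND 3
```

Reporting the per-facet sizes is what lets a searcher see which facet is too narrow or too broad before committing to the combined query.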

12 Post-Coordinate or Quorum Ranking
Results are first ranked based on how many facets of the query they match.
Faceted search with quorum ranking allows specifying each concept in multiple ways while still ranking by the number of concepts included in the document.
A further extension is to allow users to weight each facet.
–Found on the web to help balance different goals of search (e.g. selecting a car or house)
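
A rough sketch of quorum ranking with optional facet weights follows. The scoring rule (sum of the weights of the matched facets), the substring matching, and the sample data are assumptions chosen to keep the example short.

```python
# Hypothetical sample collection and facets
DOCS = {
    "d1": "new drugs for osteoporosis prevention",
    "d2": "bone loss and pharmaceuticals: is there a cure",
    "d3": "osteoporosis risk factors",
    "d4": "drugs that cure headaches",
}
FACETS = [
    ["osteoporosis", "bone loss"],
    ["drugs", "pharmaceuticals"],
    ["prevention", "cure"],
]

def quorum_rank(docs, facets, weights=None):
    """Rank documents by the (weighted) number of facets they match.

    A facet counts once no matter how many of its alternatives occur.
    Documents matching no facet are dropped; ties are broken by document id.
    """
    if weights is None:
        weights = [1.0] * len(facets)
    scored = []
    for doc_id, text in docs.items():
        lowered = text.lower()
        score = sum(
            weight
            for facet, weight in zip(facets, weights)
            if any(alt.lower() in lowered for alt in facet)
        )
        if score > 0:
            scored.append((doc_id, score))
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored

print(quorum_rank(DOCS, FACETS))                   # unweighted: count of facets
print(quorum_rank(DOCS, FACETS, [5.0, 1.0, 1.0]))  # first facet weighted higher
```

Unlike the strict conjunction, documents that match only some facets still appear, just lower in the list.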

13 Result Size Problem Occurs with Web Search Too

14 Graphical Query Specification
Graphical interfaces can be static, direct manipulation, or combine the two.
Direct manipulation
–Continuous representation of objects
–Physical actions replace complex syntax
–Rapid, incremental, reversible operations on objects
–Immediate feedback on actions

15 Graphical Boolean Queries
Graphical queries are more accurate and faster than command-line queries in some studies
Venn diagrams are a common graphical approach
–Limit to three elements in conjunction
VQuery
–Lets users draw ellipses to create their own queries

16 VQuery

17 Process-Based Graphs
Can graphically represent the query as a process of selection.
Filter-flow model presents a set of filters.
–One attribute and set of potential values per filter, multiple values treated as disjunction
–Branches in flow indicate disjunctions
–Serialized filters indicate conjunctions
Fewer errors made with filter-flow than with SQL
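
The filter-flow semantics can be sketched as composable predicates: a filter accepts a record if its attribute takes any of the chosen values, filters in series must all accept, and parallel branches need only one to accept. The helper names and the sample records are hypothetical and only illustrate the model described above.

```python
def make_filter(attribute, allowed_values):
    """One filter: one attribute, several allowed values (a disjunction)."""
    allowed = set(allowed_values)
    return lambda record: record.get(attribute) in allowed

def in_series(*filters):
    """Serialized filters: a record must pass every one (conjunction)."""
    return lambda record: all(f(record) for f in filters)

def in_parallel(*branches):
    """Branches in the flow: a record may pass any one (disjunction)."""
    return lambda record: any(b(record) for b in branches)

def run_flow(flow, records):
    """Return the names of the records that reach the end of the flow."""
    return [r["name"] for r in records if flow(r)]

# Hypothetical sample records
RECORDS = [
    {"name": "A", "dept": "sales", "city": "Austin"},
    {"name": "B", "dept": "sales", "city": "Boston"},
    {"name": "C", "dept": "research", "city": "Austin"},
    {"name": "D", "dept": "support", "city": "Denver"},
]

# (dept is sales OR research) AND (city is Austin)
serial = in_series(
    make_filter("dept", ["sales", "research"]),
    make_filter("city", ["Austin"]),
)

# (dept is sales AND city is Boston) OR (dept is support)
branched = in_parallel(
    in_series(make_filter("dept", ["sales"]), make_filter("city", ["Boston"])),
    make_filter("dept", ["support"]),
)

print(run_flow(serial, RECORDS))
print(run_flow(branched, RECORDS))
```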

18 Filter-Flow

19 Block-diagram Visualization
Users arrange blocks to specify query.
STARS
–Users initially type in natural language query
–Query terms are turned into blocks
–Blocks are then arranged into query
–Blocks in same row represent conjunction
–Blocks in same column represent disjunction
–Allows for previewing the query results by simple rearrangement of blocks
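
One possible reading of the row and column rules is that each row of blocks is a conjunction and stacked rows are alternatives. The sketch below evaluates such a layout. It is an interpretation for illustration only, not a description of how STARS itself works; the layout encoding, function name, and sample texts are assumptions.

```python
def matches_layout(text, rows):
    """Evaluate a block layout against a document.

    rows is a list of rows; each row is a list of terms.
    Terms in the same row are ANDed; the rows themselves are ORed.
    Empty rows are ignored.
    """
    words = set(text.lower().split())
    return any(
        bool(row) and all(term.lower() in words for term in row)
        for row in rows
    )

# Hypothetical layout: (osteoporosis AND drugs) OR (bone AND drugs)
LAYOUT = [
    ["osteoporosis", "drugs"],
    ["bone", "drugs"],
]

# Rearranging blocks changes the query: "drugs" removed from the first row
REARRANGED = [
    ["osteoporosis"],
    ["bone", "drugs"],
]

print(matches_layout("new drugs for osteoporosis", LAYOUT))
print(matches_layout("osteoporosis risk factors", LAYOUT))
print(matches_layout("osteoporosis risk factors", REARRANGED))
```

Because the layout is just data, re-running the match after moving a block is cheap, which is the property that makes previewing results by rearrangement practical.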

20 STARS

21 Magic Lenses
Lenses act as filters on an overview visualization.
–Disjunction is represented by independent lenses
–Conjunction is expressed by placing multiple lenses over one another
–Lenses can include additional information
  Where the term must appear
  Term frequency requirements
  Switches to use stemming
  …

22 Magic Lenses

23 Phrases and Proximity
Specifying phrases and proximity constraints can substantially improve precision.
Phrase search is often used in the context of the Web.
–But the phrase must be literal
–“President Lincoln” does not match “President Abraham Lincoln”
Proximity constraints allow for more general queries
–Example: LEXIS-NEXIS “white w/3 house” means “white within three words of house”
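
The difference between literal phrase matching and a proximity constraint can be sketched as follows. The tokenizer, the function names, and the decision to measure distance as a difference in token positions in either order are assumptions; the exact counting rules of the LEXIS-NEXIS operator are not reproduced here.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def phrase_match(text, phrase):
    """True only if the phrase tokens occur contiguously and in order."""
    tokens = tokenize(text)
    target = tokenize(phrase)
    n = len(target)
    if n == 0:
        return False
    return any(tokens[i:i + n] == target for i in range(len(tokens) - n + 1))

def within(text, term_a, term_b, k):
    """True if term_a and term_b occur at most k token positions apart.

    Order is ignored, which is an assumption of this sketch.
    """
    tokens = tokenize(text)
    a, b = term_a.lower(), term_b.lower()
    positions_a = [i for i, t in enumerate(tokens) if t == a]
    positions_b = [i for i, t in enumerate(tokens) if t == b]
    return any(
        abs(i - j) <= k
        for i in positions_a
        for j in positions_b
        if i != j
    )

TEXT = "President Abraham Lincoln lived in the White House"

print(phrase_match(TEXT, "President Lincoln"))      # literal phrase fails
print(within(TEXT, "president", "lincoln", 3))      # proximity succeeds
print(within("the white and very old house", "white", "house", 3))
```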

24 Natural Language and Free Text Queries
Many systems treat a question as a bag of words
Natural language processing can be used to try to better determine the information need.
–Extract noun (and verb) phrases
–Find noun (and verb) phrases in same sentence
Ask.com uses sites preselected to answer particular question forms.
–Need to recognize type of question
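
To make the bag-of-words treatment concrete, the sketch below reduces a question to term counts and scores documents by term overlap. The stopword list, the overlap score, and the sample texts are assumptions for illustration; no phrase extraction or question-type recognition is attempted.

```python
import re
from collections import Counter

# Small, hypothetical stopword list
STOPWORDS = {"a", "an", "the", "is", "of", "what", "who", "how",
             "do", "does", "in", "to", "for"}

def bag_of_words(text):
    """Count the non-stopword tokens, discarding order and punctuation."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)

def overlap_score(question, document):
    """Number of question term occurrences that the document also contains."""
    q = bag_of_words(question)
    d = bag_of_words(document)
    return sum(min(count, d[term]) for term, count in q.items())

QUESTION = "What is the capital of France?"

print(overlap_score(QUESTION, "Paris is the capital of France"))
print(overlap_score(QUESTION, "France exports wine"))

# Word order is lost, so these two very different statements look identical
print(bag_of_words("dog bites man") == bag_of_words("man bites dog"))
```

The last line shows the limitation that motivates the phrase-based processing listed on the slide: a bag of words cannot tell who did what to whom.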

25 Ask.com

