ASSISTED BROWSING THROUGH SEMISTRUCTURED DATA


PROBLEM

The development of the RDF standard highlights the fact that a great deal of useful information is in the form of semistructured data: objects connected by relations fitting no rigorous schema. Information is inherently semistructured in nature: while it usually has an inherent structure behind it, it always has exceptions. To make use of this information it is important to be able to search and browse semistructured repositories.

APPROACH

Information searches generally involve dialogues between users and computers. Users respond to intermediate results by modifying previous queries to further narrow the relevant search spaces. When modifications are suggested, users need only choose among the many routes to take next, rather than finding the routes themselves. Since users often follow a common set of strategies for searching, we have built several agents, each capable of suggesting routes following one strategy. Given any starting point, including an intermediate search result, together these agents enumerate several routes users can take next.

GENERIC STRATEGIES

We provide agents implementing generic search strategies that are useful at any kind of starting point. These strategies include:

Attribute Selection: Pick other items that share certain attributes with the item at the starting point.
Follow Links: Pick items explicitly connected to the starting item.
Containing Collections: Enumerate collections containing the starting item.
Similarly Visited: Recall items that have in the past been viewed at around the same time as the starting item.

COLLECTION-SPECIFIC STRATEGIES

Collections are a necessary concept for information management and a required mechanism for returning search results. We provide agents implementing strategies specifically tuned for aiding searches that start at collections. These strategies include:

Collection Refinement: Narrow the starting collection by grouping terms present in that collection's member elements.
Similar Collections: Find other collections containing elements similar to some elements of the starting collection (using the vector-space model).

The screenshot below shows suggestions for refining a collection of documents based on their types and due dates. When new metadata is made available (e.g. classification of these documents into projects), the system will pick it up and make new suggestions appropriately.

USER INTERFACE

Browsing suggestions are made available at all times in a collapsible docking pane. Convenient suggestions are included in context menus (screenshot above). Suggestions are grouped by strategy. Appropriate widgets are used based on the nature of the suggestion (e.g. a date range selector for refining a collection by date). As in web searches, users can revisit intermediate steps of the search.

VECTOR-SPACE MODEL

The vector-space model is used to represent collections so that traditional information retrieval algorithms can be leveraged. The model consists of vectors representing documents, with each dimension representing a term (an attribute-value pair). Similarity between documents becomes the dot product of these vectors.

Vineet Sinha - vineet@ai.mit.edu
David Karger - karger@ai.mit.edu
David Huynh - dfhuynh@ai.mit.edu
http://haystack.lcs.mit.edu/
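The vector-space representation and the Collection Refinement strategy described above can be illustrated with a minimal Python sketch. This is not the Haystack implementation: the item names, the attribute-value pairs, and the helper functions `to_vector`, `dot`, `collection_vector`, and `refinement_suggestions` are all hypothetical, chosen only to show how attribute-value pairs can serve as vector dimensions and how grouping those terms yields candidate refinements. The similarity is the raw dot product, as stated in the text; term weighting or length normalization, which the text does not mention, is left out.

```python
from collections import Counter, defaultdict

# Hypothetical collection: each item is a set of (attribute, value) pairs.
docs = {
    "report": {("type", "document"), ("due", "2003-05-01")},
    "memo": {("type", "document"), ("due", "2003-06-01")},
    "photo": {("type", "image"), ("due", "2003-05-01")},
}


def to_vector(pairs):
    """Sparse vector with one dimension per attribute-value pair (weight 1)."""
    return Counter(pairs)


def dot(u, v):
    """Dot product of two sparse vectors stored as Counters."""
    return sum(weight * v[term] for term, weight in u.items())


def collection_vector(items):
    """Assumed collection representation: the sum of its members' vectors."""
    total = Counter()
    for pairs in items:
        total.update(to_vector(pairs))
    return total


def refinement_suggestions(collection):
    """Group members by shared term; each group is one way to narrow the collection."""
    groups = defaultdict(list)
    for name, pairs in collection.items():
        for term in pairs:
            groups[term].append(name)
    # Keep only terms that select a proper subset of the collection.
    return {t: names for t, names in groups.items() if len(names) < len(collection)}


# Item-to-item similarity: number of shared attribute-value pairs.
report_memo = dot(to_vector(docs["report"]), to_vector(docs["memo"]))
memo_photo = dot(to_vector(docs["memo"]), to_vector(docs["photo"]))

# Collection-to-collection similarity, as in the Similar Collections strategy.
text_docs = collection_vector([docs["report"], docs["memo"]])
images = collection_vector([docs["photo"]])
collection_similarity = dot(text_docs, images)

suggestions = refinement_suggestions(docs)
```

In this sketch the report and the memo share one term (their type), the memo and the photo share none, and the refinement suggestions group the three items by type and by due date, mirroring the document-type and due-date refinements mentioned above. A newly added attribute, such as a project classification, would show up as additional groups without any change to the code.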

