The LINDI Project: Linking Information for New Discoveries. UIs for building and reusing hypothesis-seeking strategies. Statistical language analysis techniques.

The LINDI Project: Linking Information for New Discoveries
Two Main Thrusts:
- UIs for building and reusing hypothesis-seeking strategies
- Statistical language analysis techniques for extracting propositions

LINDI: Target Components
1. Special UI for retrieving appropriate documents
2. Language analysis on documents to detect causal relationships between concepts
3. Probabilistic representation of concepts and relationships
4. UI + user: hypothesis creation

Design Goals of LINDI UI
Support for the development of extended search strategies:
1. Text filtering and manipulation tool to help the development of strategies
2. Text visualization and analysis tool to help the formulation of hypotheses

The User Interface
- A general search interface should support:
  - History
  - Context
  - Comparison
  - Operators: intersection, union, slicing
  - Operator reuse
  - Visualization (where appropriate)
- We have an initial implementation.
- It needs lots of work.

Scenario: Explore Functions of a Gene
- Objective
  - Determine the functions of a newly sequenced gene, Gene X.
- Known facts
  - Gene X is co-expressed (activated in the same cells) with Genes A, B, and C.
  - The relationships of Genes A, B, and C to certain types of diseases (from the medical literature).
- Question
  - What types of diseases is Gene X related to?

Explore Functions of New Gene X (slide adapted from K. Patel)
[Diagram. Labels: Medical Literature; Query; Gene-A Keywords; Gene-B Keywords; Gene-C Keywords; Projection; Intersection; Mapping; Slicing; Keywords; Possible Function for Gene-X]

Architecture of LINDI UI
- Data Layer
- Annotation Layer
- User Interface Layer

Data Layer
- Purpose
  - Hide the different formats of text collections
- Components
  - Data: abstractions representing the records of a text collection
  - Operations: performed on the data
- Data
  - A set of records
  - Each record is a set of tuples with types
- Operations
  - Union, intersection, projection, mapping
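The data model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the LINDI implementation: the names `record`, `union`, `intersection`, `projection`, and `mapping` are hypothetical, the gene and heading values are placeholders, and "mapping" is assumed here to mean applying a function to every record.

```python
from typing import Callable, FrozenSet, Set, Tuple

TypedTuple = Tuple[str, str]      # (type, value), e.g. ("mesh", "heading-1")
Record = FrozenSet[TypedTuple]    # a record is a set of typed tuples
DataSet = Set[Record]             # a data set is a set of records


def record(*pairs: TypedTuple) -> Record:
    """Build one record from (type, value) pairs."""
    return frozenset(pairs)


def union(a: DataSet, b: DataSet) -> DataSet:
    return a | b


def intersection(a: DataSet, b: DataSet) -> DataSet:
    return a & b


def projection(data: DataSet, keep_type: str) -> DataSet:
    """Keep only the tuples of one type; drop records left empty."""
    projected = {frozenset(t for t in rec if t[0] == keep_type) for rec in data}
    projected.discard(frozenset())
    return projected


def mapping(data: DataSet, fn: Callable[[Record], Record]) -> DataSet:
    """Assumed meaning of 'mapping': apply fn to every record."""
    return {fn(rec) for rec in data}


# Placeholder data: two collections described by made-up headings.
docs_a = {record(("gene", "A"), ("mesh", "heading-1")),
          record(("gene", "A"), ("mesh", "heading-2"))}
docs_b = {record(("gene", "B"), ("mesh", "heading-2"))}

# Headings shared by both collections after projecting away the gene tuples.
shared = intersection(projection(docs_a, "mesh"), projection(docs_b, "mesh"))
```

Because records are immutable sets, union and intersection reduce to Python's built-in set operators, and the result of any operation is again a data set that can feed the next operation.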

Annotation Layer
- Purpose
  - Associate each data set with the operations that produced it (history)
  - History is a first-class object
- Advantages
  - Streamline a sequence of operations
  - Reuse operations
  - Parameterize operations
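One way to picture history as a first-class object is a recorded list of steps that can be replayed with different parameter values. The sketch below is an assumption about how such a layer could look, not a description of the LINDI code; `Step`, `History`, `then`, `run`, and the toy `query` and `project` functions are all hypothetical names, and the document values are placeholders.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Step:
    name: str
    fn: Callable[..., Any]                      # called as fn(previous_result, **params)
    params: Dict[str, Any] = field(default_factory=dict)


@dataclass
class History:
    steps: List[Step] = field(default_factory=list)

    def then(self, name: str, fn: Callable[..., Any], **params: Any) -> "History":
        """Return a new history with one more recorded step."""
        return History(self.steps + [Step(name, fn, params)])

    def run(self, start: Any, **overrides: Any) -> Any:
        """Replay every step; overrides replace recorded parameters of the same name."""
        result = start
        for step in self.steps:
            params = dict(step.params)
            params.update({k: v for k, v in overrides.items() if k in step.params})
            result = step.fn(result, **params)
        return result


def query(docs, gene):
    return [d for d in docs if gene in d["genes"]]


def project(docs, attr):
    return {value for d in docs for value in d[attr]}


docs = [{"genes": {"A"}, "mesh": {"heading-1", "heading-2"}},
        {"genes": {"B"}, "mesh": {"heading-2"}}]

strategy = History().then("query", query, gene="A").then("project", project, attr="mesh")
mesh_a = strategy.run(docs)             # replay as recorded
mesh_b = strategy.run(docs, gene="B")   # reuse the same steps with a new parameter
```

Keeping the steps as data is what makes the three advantages cheap: a sequence is streamlined by running it in one call, reused by calling `run` again, and parameterized by passing overrides.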

User Interface
- This version completed Aug 10, 2000
  - Designed by Marti Hearst and Hao Chen
  - Code written by Hao Chen
- Direct manipulation of information objects and access operations
  - Query
  - Intersection
  - Union
  - Mapping
  - Slicing
- Record and reuse of past operations
- Parameterization of operations
- Streamlining of operations

Initial Palette

Query Structure Determined by Collection Type

Query Operation Results

Projection Operation and Subsequent Results

Parameterized Query: Repeat operations with different values (GA, GB, GC)

Intersection over Projected Attribute

Example Interaction with UI Prototype
1. Query on gene names
2. Project out only the MeSH headings
3. Intersect the results
4. Map to create a ranking
5. Slice out the top-ranked
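The five steps can be walked through on toy data. Everything in this sketch is illustrative: the documents and heading names are made up, the function names are hypothetical, and ranking shared headings by how often they occur is an assumption, since the slides do not say how the ranking is computed.

```python
from collections import Counter

# Placeholder collection: each document lists the genes it mentions and its headings.
docs = [
    {"genes": {"A"}, "mesh": ["heading-1", "heading-2"]},
    {"genes": {"A"}, "mesh": ["heading-2"]},
    {"genes": {"B"}, "mesh": ["heading-2", "heading-3", "heading-1"]},
    {"genes": {"C"}, "mesh": ["heading-1", "heading-2"]},
    {"genes": {"C"}, "mesh": ["heading-2", "heading-3"]},
]


def query(collection, gene):
    """Step 1: retrieve the documents that mention a gene."""
    return [d for d in collection if gene in d["genes"]]


def project(records, attribute):
    """Step 2: keep only the values of one attribute."""
    return [value for r in records for value in r[attribute]]


# Steps 1 and 2, repeated for each co-expressed gene.
per_gene = {g: project(query(docs, g), "mesh") for g in ("A", "B", "C")}

# Step 3: intersect the projected headings across genes.
shared = set.intersection(*(set(h) for h in per_gene.values()))

# Step 4: map shared headings to a score (assumed: total occurrences) and rank them.
counts = Counter(h for headings in per_gene.values() for h in headings if h in shared)
ranking = sorted(shared, key=lambda h: (-counts[h], h))

# Step 5: slice out the top-ranked heading.
top = ranking[:1]
```

With this data, the headings common to all three genes are ranked by frequency, and the slice keeps the single most frequent one as a candidate association for Gene X.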

Second Version of UI
- LINDI Miner
- Circa May 2002
  - Designed by Marti Hearst
  - Implemented by Melody Ivory
- Emphasizes reusing the results of prior text analysis
- See lindi-miner.ppt

The Language Analysis Component
- Goal: extract propositions from text and make inferences
- Why extract propositions from text?
  - Text is how knowledge at the propositional level is communicated
  - Text is continually being created and updated by the outside world

Example: Etiology
- Given
  - medical titles and abstracts
  - a problem (incurable rare disease)
  - some medical expertise
- Find causal links among titles
  - symptoms
  - drugs
  - results

Traditional Semantic Grammars
- Example (Burton & Brown 79)
  - Interpreting "What is the current thru the CC when the VC is 1.0?"
  - Grammar rules: := when := what is := := is := VC
  - Resulting semantic form: (RESETCONTROL (STQ VC 1.0) (MEASURE CURRENT CC))
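A hand-written semantic grammar of this kind can be approximated with a single pattern whose pieces are semantic categories rather than syntactic ones. The sketch below is not Burton and Brown's grammar: the nonterminal names are not legible in this transcript, so the category names (`MEASUREMENT`, `SETTING`, `REQUEST`) and the `interpret` function are assumptions chosen only to reproduce the semantic form shown for the example sentence.

```python
import re

# Hypothetical semantic categories (assumed names, not the original nonterminals).
MEASUREMENT = r"the (?P<quantity>\w+) (?:thru|through|across|at) the (?P<part>\w+)"
SETTING = r"the (?P<control>\w+) is (?P<value>[\d.]+)"

# A request is a simple "what is <measurement>" followed by "when <setting>".
REQUEST = re.compile(rf"what is {MEASUREMENT} when {SETTING}\??$", re.IGNORECASE)


def interpret(sentence):
    """Return the semantic form for a matching request, or None if it does not parse."""
    m = REQUEST.match(sentence.strip())
    if m is None:
        return None
    measure = f"(MEASURE {m['quantity'].upper()} {m['part'].upper()})"
    setting = f"(STQ {m['control'].upper()} {m['value']})"
    return f"(RESETCONTROL {setting} {measure})"


form = interpret("What is the current thru the CC when the VC is 1.0?")
unparsed = interpret("Why is the current high?")
```

The second call returns `None`, which illustrates the brittleness of this style: any phrasing the rules did not anticipate simply fails to parse.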

Example: Statistical Semantic Grammar
- To detect causal relationships between medical concepts
  - Title: Magnesium deficiency implicated in increased stress levels.
  - Interpretation: related-to
  - Inference:
    - Increase(stress, decrease(mg))

Statistical Semantic Grammars
- Empirical NLP has made great strides
  - But it has mainly been applied to syntactic structure
- Semantic grammars are powerful, but
  - Brittle
  - Time-consuming to construct
- Idea:
  - Use what we now know about statistical NLP to build up a probabilistic grammar
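One standard way to turn a grammar into a probabilistic one is to estimate each rule's probability by relative frequency: the number of times the rule was observed divided by the number of times its left-hand side was expanded. The slides do not say how LINDI estimates its probabilities, so the sketch below is a generic illustration; the rule inventory, the category names `RELATION` and `CONCEPT`, and the counts are invented toy values.

```python
from collections import Counter

# Toy, made-up observations of rule uses: (left-hand side, right-hand side).
observed_rules = [
    ("RELATION", ("CONCEPT", "implicated in", "CONCEPT")),
    ("RELATION", ("CONCEPT", "implicated in", "CONCEPT")),
    ("RELATION", ("CONCEPT", "causes", "CONCEPT")),
    ("RELATION", ("CONCEPT", "associated with", "CONCEPT")),
]


def estimate_rule_probabilities(rules):
    """Relative-frequency estimate: count(rule) / count(rules with the same left-hand side)."""
    rule_counts = Counter(rules)
    lhs_counts = Counter(lhs for lhs, _ in rules)
    return {rule: n / lhs_counts[rule[0]] for rule, n in rule_counts.items()}


probabilities = estimate_rule_probabilities(observed_rules)
```

Unlike the hand-built grammar, a rule that is rarely seen is not rejected outright; it just receives a low probability, which is what makes the approach less brittle.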
