
1 START: Natural Language Access to Information
Boris Katz, Gary Borchardt, Sue Felshin, Jimmy Lin, Jerome McFarland, Ali Ibrahim, Luciano Castagnola, Baris Temelkuran, Aaron Fernandes, Alp Simsek, Jonathan Wolfe, Matthew Bilotti
MIT Artificial Intelligence Lab

2 I had a dream... Library of Congress ?

3 Reality
What we can do: Understand ordinary sentences and questions
What we can’t do (yet):
1. Full-text NL understanding is still beyond reach: common sense implication, intersentential reference, summarization
2. Not all information is language; most Web resources are not textual: maps and images, sound and video, multimedia
Web resources are distributed across numerous non-traditional databases

4 Bridging the Gap
Library of Congress +
"In 1492, Columbus sailed the ocean blue."
"An object at rest tends to remain at rest."
"Four score and seven years ago our forefathers brought forth"

5 The Solution: Natural Language Annotations
Annotations bridge the gap between our ability to analyze natural language sentences and our desire to access the huge amount of data available in our libraries and on the Web.
Annotations are collections of natural language sentences and phrases that describe the content of various information segments.
START analyzes these annotations, creates the necessary representational structures, and produces special pointers to the information segments summarized by the annotations.

6 Natural Language Annotations
Annotation (from the annotator): “Mars’s year is long.”
Stored in the START knowledge base as: <year is long>, <year related-to Mars>
Questions (from the user): “How long is the Martian year?” “How long is a year on Mars?” “How many days are in a Martian year?” …
Information segment: "... one Mars year lasts 687 Earth days."
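The idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two annotation triples come from the slide, but the `KnowledgeBase` class, its method names, and the hand-written question triple are hypothetical, and no parser is run.

```python
from dataclasses import dataclass, field

# A triple is (subject, relation, object), as in the slide's "year is long".
Triple = tuple[str, str, str]


@dataclass
class KnowledgeBase:
    """Hypothetical store mapping annotation triples to information segments."""
    entries: list[tuple[frozenset[Triple], str]] = field(default_factory=list)

    def annotate(self, triples: set[Triple], segment: str) -> None:
        # Keep the annotation's triples together with a pointer to the segment.
        self.entries.append((frozenset(triples), segment))

    def lookup(self, query: set[Triple]) -> list[str]:
        # A segment is returned when every query triple occurs in its annotation.
        return [seg for triples, seg in self.entries if query <= triples]


kb = KnowledgeBase()
# Triples for the annotation "Mars's year is long." (taken from the slide).
kb.annotate(
    {("year", "is", "long"), ("year", "related-to", "Mars")},
    "One Mars year lasts 687 Earth days.",
)

# Hand-written triple standing in for the parsed question
# "How long is the Martian year?" (an assumption, not parser output).
question = {("year", "related-to", "Mars")}
print(kb.lookup(question))
```

The point of the design is that several differently worded questions can reduce to the same triples, so one short annotation can serve all of them.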

7 Parsing
A chain of reactions converts each molecule of glucose into two smaller molecules of pyruvate.
[Figure: parse tree of the sentence above, with node labels S, NP, VP, PP, N, V, det, prep, noun, and quantity.]

8 Ternary expressions (T-expressions)
A chain of reactions converts each molecule of glucose into two smaller molecules of pyruvate.
[Figure: the T-expressions for the sentence above. Links: converts, into, related-to, quantifier, quantity, is. Nodes: chain, reactions, molecule, glucose, molecules, pyruvate, each, two, smaller. Surviving fragment of the nested expression: "into molecules-5>".]

9 T-expression Representation
List of node-link-node triples
Nouns, adjectives are nodes
Links cover:
  relationships between verbs and their arguments
  fundamental semantic relationships: “is-a” (for equality, membership, and subclass relationships), “related-to” (for possessives, etc.)
  modification of nouns: “quantifier”, “quantity”, “is” (for adjectives)
  prepositions
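A minimal sketch of this representation, assuming Python named tuples as the container. The `TExpr` class is hypothetical, and the seven triples are one plausible reading of the diagram labels for the glucose sentence; START's actual internal form (for example, its node indices) may differ.

```python
from typing import NamedTuple, Union


class TExpr(NamedTuple):
    """A node-link-node triple; the subject may itself be a nested triple."""
    subject: Union[str, "TExpr"]
    relation: str
    obj: str

    def __str__(self) -> str:
        return f"<{self.subject} {self.relation} {self.obj}>"


# "A chain of reactions converts each molecule of glucose
#  into two smaller molecules of pyruvate."
core = TExpr("chain", "converts", "molecule")
assertion = [
    TExpr(core, "into", "molecules"),              # preposition on the verb triple
    TExpr("chain", "related-to", "reactions"),     # "chain of reactions"
    TExpr("molecule", "related-to", "glucose"),    # "molecule of glucose"
    TExpr("molecules", "related-to", "pyruvate"),  # "molecules of pyruvate"
    TExpr("molecule", "quantifier", "each"),
    TExpr("molecules", "quantity", "two"),
    TExpr("molecules", "is", "smaller"),           # adjective modification
]

for t in assertion:
    print(t)
```

Nesting the verb triple inside the "into" triple is what lets a preposition attach to the whole verb relation rather than to a single noun.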

10 S-rules for Structural Variation
S-rule for the Property Factoring alternation:
  someone1 emotional-reaction-verb someone2 with something
  someone1’s something emotional-reaction-verb someone2
In T-expression form: <<someone1 emotional-reaction-verb someone2> with something> plus <something related-to someone1> corresponds to <something1 emotional-reaction-verb someone2> plus <something1 related-to someone1>.
Example:
  The president impressed the country with his determination.
  The president’s determination impressed the country.
Emotional reaction verbs: surprise, stun, amaze, startle, impress, please, embarrass, annoy, etc.
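The alternation above can be sketched as a rewrite over triples. This is an illustration, not START's rule engine: the function name and tuple encoding are assumptions, verbs are given in base form, and tense handling is omitted.

```python
# Verb class listed on the slide.
EMOTIONAL_REACTION_VERBS = {
    "surprise", "stun", "amaze", "startle",
    "impress", "please", "embarrass", "annoy",
}


def property_factoring(triples):
    """Derive <something verb someone2> from
    <<someone1 verb someone2> with something> and <something related-to someone1>.
    """
    derived = []
    for subj, rel, obj in triples:
        # Look for a "with" link whose subject is a nested verb triple.
        if rel != "with" or not isinstance(subj, tuple):
            continue
        someone1, verb, someone2 = subj
        if verb not in EMOTIONAL_REACTION_VERBS:
            continue
        # The rule applies only if the property belongs to someone1.
        if (obj, "related-to", someone1) in triples:
            derived.append((obj, verb, someone2))
    return derived


# "The president impressed the country with his determination."
facts = [
    (("president", "impress", "country"), "with", "determination"),
    ("determination", "related-to", "president"),
]
print(property_factoring(facts))
```

With the derived triple added to the knowledge base, a question phrased in either form of the alternation can match the same stored sentence.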

11 Sample Assertion
A chain of reactions converts each molecule of glucose into two smaller molecules of pyruvate.
[Figure: the T-expressions for this assertion, with links converts, into, related-to, quantifier, quantity, and is among the nodes chain, reactions, molecule, glucose, molecules, pyruvate, each, two, and smaller.]

12 Sample Query
How are the glucose molecules converted into pyruvate molecules?
[Figure: the T-expressions for this query, with links converts, into, and related-to among the nodes something, molecules, glucose, and pyruvate.]

13 Matching
[Figure: the Matcher compares T-expressions from the query with T-expressions from the assertion. The assertion side shows the glucose and pyruvate T-expressions; the query side contains the node "something". Key: Input Processing, Query Processing.]
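A rough sketch of the matching step, assuming that the query node "something" acts as a wildcard. The function names are hypothetical, and the node labels `molecule#1` and `molecule#2` are normalized by hand so that singular and plural forms line up; START's own lexical matching is not reproduced here.

```python
WILDCARD = "something"


def match(pattern, fact, bindings):
    """Recursively match a query term against an assertion term."""
    if pattern == WILDCARD:
        # The wildcard must bind consistently across the whole query.
        if WILDCARD in bindings and bindings[WILDCARD] != fact:
            return None
        return {**bindings, WILDCARD: fact}
    if isinstance(pattern, tuple) and isinstance(fact, tuple):
        if len(pattern) != len(fact):
            return None
        for p, f in zip(pattern, fact):
            bindings = match(p, f, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == fact else None


def answer(query, assertions):
    """Return wildcard bindings if every query triple matches some assertion."""
    bindings = {}
    for q in query:
        for a in assertions:
            result = match(q, a, bindings)
            if result is not None:
                bindings = result
                break
        else:
            return None  # this query triple has no counterpart
    return bindings


assertions = [
    (("chain", "converts", "molecule#1"), "into", "molecule#2"),
    ("chain", "related-to", "reactions"),
    ("molecule#1", "related-to", "glucose"),
    ("molecule#2", "related-to", "pyruvate"),
]
# "How are the glucose molecules converted into pyruvate molecules?"
query = [
    (("something", "converts", "molecule#1"), "into", "molecule#2"),
    ("molecule#1", "related-to", "glucose"),
    ("molecule#2", "related-to", "pyruvate"),
]
print(answer(query, assertions))
```

The binding found for the wildcard identifies which stored assertion answers the query; generating the English reply from it is a separate step.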

14 A. Reply by Generating
Query: How are the glucose molecules converted into pyruvate molecules?
[Figure: the matched ternary expressions are passed to the Generator, which produces the displayed answer.]
Answer: A chain of reactions converts each molecule of glucose into two smaller molecules of pyruvate.

15 Reply by Generating: Example

16 B. Reply from annotation
Query: Show me a picture of Cog.
Ternary expressions: <picture related-to Cog>
[Figure: the ternary expressions point to an annotated resource; START finds the resource and shows it as the displayed answer.]

17 Reply from annotation: Example

18 C. Reply from annotation with script
Query: Who directed Gone with the Wind?
T-exps: <any-person directs any-IMDb-movie>
Script (IMDb): get m/Details? match regexp...
[Figure: the T-exps point to the script; START finds the resource, runs the script, and shows the displayed answer.]
Displayed answer: Gone with the Wind (1939) was directed by George Cukor, Victor Fleming, and Sam Wood. Source: The Internet Movie Database
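A hedged sketch of how a parameterized annotation might dispatch to a script. Everything named here (`IMDB_MOVIES`, `directors_script`, `SCHEMAS`, `reply`) is hypothetical. The slide's real script fetches a page from IMDb and applies a regular expression; this stand-in returns the slide's own answer data so that the example runs offline.

```python
# Hypothetical lexicon of movie titles known to the system (an assumption).
IMDB_MOVIES = {"Gone with the Wind"}


def directors_script(title):
    """Stand-in for the slide's script, which queries IMDb and applies a regexp.

    The result is canned (taken from the slide), so no network access is needed.
    """
    canned = {
        "Gone with the Wind": (1939, ["George Cukor", "Victor Fleming", "Sam Wood"]),
    }
    return canned[title]


# Parameterized annotation "any-person directs any-IMDb-movie", paired with a script.
SCHEMAS = [(("any-person", "directs", "any-IMDb-movie"), directors_script)]


def reply(query):
    """Answer a query triple by running the script of the first matching schema."""
    _subject, relation, obj = query
    for (_, schema_rel, schema_obj), script in SCHEMAS:
        # The schema matches when the relations agree and the object is a known movie.
        if relation == schema_rel and schema_obj == "any-IMDb-movie" and obj in IMDB_MOVIES:
            year, names = script(obj)
            listed = ", ".join(names[:-1]) + ", and " + names[-1]
            return f"{obj} ({year}) was directed by {listed}."
    return None


# "Who directed Gone with the Wind?" as a hand-written query triple.
print(reply(("something", "directs", "Gone with the Wind")))
```

One schema of this kind covers every movie in the lexicon, which is why a single annotation plus a script can stand in for a large external database.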

19 Reply from annotation with script: Example

21 Uniform Access
[Figure: NL questions go to START, which returns multimedia responses. START sends queries to Omnibase, which returns data. Sources shown: NASA, POTUS, Webster, U.S. Census, IMDb.]
Local knowledge base of ternary expressions
Core vocabulary
Uniform interface to multiple database formats (Web, text, etc.)
Integration time independent of size of database
Extended lexicon

22 How START works
[Figure: system diagram. Components: Web browser; START (Parser, Matcher, Generator, database of T-exps); native knowledge (annotations, scripts); Omnibase (external knowledge, scripts); WWW sources (Potus, IMDb, World Factbook, U.S. Census). Flow labels: English input, T-exps, T-exps from KB, English, HTML.]

23 Multi-Modal Interaction
Q. "I'd like to speak to Trevor."
Q. "Is Trevor in his office?"
A. "Trevor is in his office but he is on the phone."
A. "Trevor is in his office but he is talking to Boris now."
A. "Trevor is in his office; however, he doesn't want to be disturbed until 2pm."

