GTRI.ppt-1 NLP Technology Applied to e-discovery Bill Underwood Principal Research Scientist “The Current Status and.


1 GTRI.ppt-1 NLP Technology Applied to e-discovery
Bill Underwood, Principal Research Scientist
william.underwood@gtri.gatech.edu
“The Current Status and Future of Search and Retrieval Technology”
WG1 Mid-Year Meeting, Cambridge, Maryland, April 21-22, 2002

2 GTRI.ppt-2 Research Sponsored by ERA Program of NARA
Application of Natural Language Processing Technology to effectively:
- Summarize Series of Presidential e-records
- Identify FOIA exemptions and PRA restrictions in Presidential e-records
- Search for e-records relevant to a FOIA request
- Search for e-records in massive collections in support of legal discovery

3 GTRI.ppt-3 NLP Methods in Document Retrieval
- Morphological processing
- Identifying words
- Parsing-Linguistic representation
- Word sense disambiguation
- Represent, identify and exploit semantic relationships
- Conceptual indexing
- Matching concepts in query to conceptual index
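The last two items on this slide, conceptual indexing and matching query concepts against the index, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and does not come from the presentation: the toy `CONCEPT_LEXICON`, the suffix list used as a crude stand-in for morphological processing, and the sample records. Word sense disambiguation and parsing are not attempted.

```python
import re
from collections import defaultdict

# Hypothetical lexicon mapping base word forms to concept labels.
CONCEPT_LEXICON = {
    "car": "VEHICLE", "automobile": "VEHICLE", "truck": "VEHICLE",
    "buy": "PURCHASE", "purchase": "PURCHASE", "acquire": "PURCHASE",
}

# Crude morphological processing: try the word as-is, then with a suffix removed.
SUFFIXES = ("", "s", "es", "ed", "d", "ing")


def concepts_in(text):
    """Return the set of concept labels found in a piece of text."""
    found = set()
    for token in re.findall(r"[a-z]+", text.lower()):
        for suffix in SUFFIXES:
            if token.endswith(suffix):
                base = token[: len(token) - len(suffix)]
                if base in CONCEPT_LEXICON:
                    found.add(CONCEPT_LEXICON[base])
                    break
    return found


def build_index(records):
    """Build a conceptual index: concept label -> set of record ids."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for concept in concepts_in(text):
            index[concept].add(record_id)
    return index


# Made-up sample records, for illustration only.
RECORDS = {
    "r1": "The agency purchased two trucks.",
    "r2": "Memo on buying office chairs.",
    "r3": "Automobiles were inspected.",
}
INDEX = build_index(RECORDS)


def search(query):
    """Return ids of records that contain every concept named in the query."""
    query_concepts = concepts_in(query)
    if not query_concepts:
        return []
    hits = set.intersection(*(INDEX.get(c, set()) for c in query_concepts))
    return sorted(hits)


print(search("acquire a car"))
```

The query shares no surface word with record `r1`, yet it matches because both map to the same two concepts. That is the benefit conceptual indexing aims for over plain term matching.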

4 GTRI.ppt-4 Current Weaknesses of NLP in Information Retrieval
NLP methods of document retrieval have failed to perform better than Boolean and statistical methods. Why?
- Broad nature of retrieval tasks
- Lack of weighting scheme for compound terms
- Poor word sense disambiguation for documents and queries
- Need to handle verbs as well as nouns and noun phrases
- Poor POS tagging
- Need better parsing algorithms and grammars
- Inadequate handling of negation

5 GTRI.ppt-5 Advanced NLP Methods Applied to PERPOS Research Tasks
- Morphological analysis
- Word sense disambiguation
- Larger lexicon
- Domain-dependent lexicons
- Information extraction to identify classes of words
- Template filling to identify communication acts of records (nominate, request information, provide information)
- Learning and identification of document types
- Method of reasoning with negation in NL
- Conceptual taxonomy
- Rule-based reasoning
- Question answering technology

6 GTRI.ppt-6 Plausible, Hybrid Approach to Investigating e-discovery
- Formulate the e-discovery task not just in search terms but also in terms of the complaint itself, including the parties and laws involved.
- Express the kinds of evidence that would enable one to prove the case as a series of questions or if-then rules drawn from precedent cases and experience.
- Use a COTS text retrieval system with Boolean queries and statistical methods to retrieve documents using key terms related to the case.
- Use contextual knowledge with questions and NLP methods (e.g., question answering) to review the retrieved documents and determine more precisely those relevant to the case, i.e., those that would represent evidence.
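The two-stage shape of this approach, a broad Boolean retrieval followed by a rule-based review, can be sketched as follows. This is a minimal illustration under stated assumptions, not the system described in the presentation: the in-memory `boolean_retrieve` function stands in for a COTS retrieval system, a single regular-expression rule stands in for NLP and question-answering review, and the documents, rule name, and all identifiers are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class Document:
    doc_id: str
    text: str


@dataclass
class EvidenceRule:
    """An if-then rule: if the condition holds, the document may be evidence."""
    name: str
    condition: Callable[[Document], bool]


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def boolean_retrieve(docs, all_of=(), any_of=(), none_of=()):
    """Stage 1: Boolean AND / OR / NOT filter over key terms."""
    hits = []
    for doc in docs:
        terms = tokenize(doc.text)
        if not all(t in terms for t in all_of):
            continue
        if any_of and not any(t in terms for t in any_of):
            continue
        if any(t in terms for t in none_of):
            continue
        hits.append(doc)
    return hits


def review(docs, rules):
    """Stage 2: keep only documents that satisfy at least one evidence rule."""
    relevant = {}
    for doc in docs:
        matched = [rule.name for rule in rules if rule.condition(doc)]
        if matched:
            relevant[doc.doc_id] = matched
    return relevant


# Made-up corpus and rule, for illustration only.
CORPUS = [
    Document("d1", "Please terminate the supply contract before the audit."),
    Document("d2", "The contract renewal lunch is on Friday."),
    Document("d3", "Quarterly audit schedule attached."),
]
RULES = [
    EvidenceRule(
        name="instruction_to_terminate",
        condition=lambda d: re.search(
            r"\b(terminate|cancel)\w*\b.*\bcontract\b", d.text.lower()
        ) is not None,
    ),
]

candidates = boolean_retrieve(CORPUS, all_of=("contract",))
relevant = review(candidates, RULES)
print([d.doc_id for d in candidates], relevant)
```

Stage 1 is deliberately broad so that potentially relevant documents are not missed. Stage 2 applies the case-specific knowledge and narrows the candidates to those that plausibly represent evidence.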

