A Natural Language Interface for Crime-related Spatial Queries Chengyang Zhang, Yan Huang, Rada Mihalcea, Hector Cuellar Department of Computer Science.

1 A Natural Language Interface for Crime-related Spatial Queries Chengyang Zhang, Yan Huang, Rada Mihalcea, Hector Cuellar Department of Computer Science and Engineering University of North Texas ISI 2009 Presentation

2 Outline
> Motivation
  Related Work
  Proposed Method
  System Evaluation

3 Motivation
The databases and query interfaces hosted by federal and state justice departments are heterogeneous and complicated.

4 Motivation
Need tools for crime-related spatial queries.
- Find a police office near the school [1]
- Find a house in a neighborhood with low crime rate [2]

5 Motivation
Neither web forms nor keyword search has the expressive power and flexibility desired in crime-related spatial queries. But natural language does!
- No need for training
- No need for a proprietary user interface or an esoteric formal language like SQL or XQuery
- Ideal for ad hoc real-time queries in emergency conditions

6 Our Contributions
- We propose a method to translate crime-related natural language spatial queries into spatial data queries.
- We implement a prototype query system.
- Experiments show that the system achieves results significantly better than those obtained by using Google Maps.

7 Outline
> Motivation
> Related Work
  Proposed Method
  System Evaluation

8 CSCE 5290 Related Work
Syntax-based methods [3-4] use templates or grammar rules to map natural language sentences onto database schemas.
- Simple but not scalable
- May sometimes lead to serious errors
Semantic parsing algorithms [5-9] preserve syntactic dependencies but also seek to enforce semantic constraints over the possible mappings.
- The quality of the mapping is significantly improved
- The PRECISE system in [9] focused on high precision only

9 Related Work
A lambda-calculus encoding can be used as the intermediate representation between natural language and database queries [10].
- A training corpus is used to derive lexicons and grammars for the specific domain
- The approach was found to lead to good results
The structure of XML documents can be used to match natural language parse trees [11].
- Identifies a meaningful lowest common ancestor structure (MLCAS) from the tree structure
- Includes an interactive component to receive help from the user when formulating the query

10 Outline
> Motivation
> Related Work
> Proposed Method
  System Evaluation

11 System Framework

12 1. Part-of-Speech Tagging
For POS tagging, we employ the classic Viterbi algorithm.
- A dynamic programming framework coupled with a Markov assumption
- Efficient and widely used
- Trained on the manually labeled Penn Treebank dataset
Running example: I wish to find a police department within 2 miles of a law court
POS tagging: I/NP wish/VB to/IN find/VB a/DT police/NN department/NN within/IN 2/CD miles/NNS of/IN a/DT law/NN court/NN
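The Viterbi decoding step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the three-tag model and every probability below are made-up toy values (a real tagger would estimate them from the Penn Treebank), and the names `viterbi` and `floor` are hypothetical.

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most likely tag sequence for `words` under a bigram HMM."""

    def lp(p):
        # Log-probability with a small floor so unseen events do not give log(0).
        return math.log(max(p, floor))

    # best[i][t]: log-probability of the best tag path ending in tag t at word i.
    best = [{t: lp(start_p.get(t, 0)) + lp(emit_p[t].get(words[0], 0)) for t in tags}]
    back = [{}]
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            # Markov assumption: only the previous tag matters for the transition.
            prev, score = max(
                ((p, best[i - 1][p] + lp(trans_p[p].get(t, 0))) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + lp(emit_p[t].get(words[i], 0))
            back[i][t] = prev

    # Follow the back-pointers from the best final tag.
    last = max(tags, key=lambda t: best[-1][t])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]


# Toy model (assumed values, for illustration only).
TAGS = ["VB", "DT", "NN"]
START = {"VB": 0.6, "DT": 0.3, "NN": 0.1}
TRANS = {
    "VB": {"DT": 0.7, "NN": 0.2, "VB": 0.1},
    "DT": {"NN": 0.9, "VB": 0.05, "DT": 0.05},
    "NN": {"NN": 0.4, "VB": 0.3, "DT": 0.3},
}
EMIT = {
    "VB": {"find": 0.8, "court": 0.2},
    "DT": {"a": 1.0},
    "NN": {"court": 0.7, "find": 0.3},
}

tagged = viterbi(["find", "a", "court"], TAGS, START, TRANS, EMIT)
print(tagged)  # ['VB', 'DT', 'NN']
```

Working in log space avoids numeric underflow on longer sentences, and the table of back-pointers is what makes the search dynamic programming rather than enumeration of all tag sequences.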

13 2. Semantic Parsing
In semantic parsing, we identify three types of “key words” using the parse tree:
- Target object
- Spatial predicate
- Reference object
Example parse tree:

14 2. Semantic Parsing
Running example: I wish to find a police department within 2 miles of a law court
Semantic parsing:
- Target object: police department
- Spatial predicate: within 2 miles
- Reference object: law court
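To make the three roles concrete, here is a rough sketch that recovers them from the tagged running example. It is a simplification: the slides describe working on a parse tree, whereas this sketch substitutes flat rules over the POS tags. The preposition list `SPATIAL_PREPS` and the function name `extract_keywords` are hypothetical.

```python
# Hypothetical list of prepositions treated as spatial predicates.
SPATIAL_PREPS = {"within", "near", "inside", "outside"}


def extract_keywords(tagged):
    """tagged: list of (word, tag) pairs. Returns the three key phrases, or None."""
    # Locate the first preposition that signals a spatial predicate.
    idx = next(
        (i for i, (w, t) in enumerate(tagged) if t == "IN" and w.lower() in SPATIAL_PREPS),
        None,
    )
    if idx is None:
        return None

    # Target object: the run of nouns immediately before the preposition.
    start = idx
    while start > 0 and tagged[start - 1][1].startswith("NN"):
        start -= 1
    target = [w for w, _ in tagged[start:idx]]

    # Spatial predicate: the preposition plus an optional "<number> <unit>".
    end = idx + 1
    if end < len(tagged) and tagged[end][1] == "CD":
        end += 1
        if end < len(tagged) and tagged[end][1].startswith("NN"):
            end += 1
    predicate = [w for w, _ in tagged[idx:end]]

    # Reference object: the first run of nouns after the predicate.
    while end < len(tagged) and not tagged[end][1].startswith("NN"):
        end += 1
    stop = end
    while stop < len(tagged) and tagged[stop][1].startswith("NN"):
        stop += 1
    reference = [w for w, _ in tagged[end:stop]]

    return {
        "target": " ".join(target),
        "predicate": " ".join(predicate),
        "reference": " ".join(reference),
    }


# The tagged running example, copied from the POS tagging slide.
sentence = (
    "I/NP wish/VB to/IN find/VB a/DT police/NN department/NN "
    "within/IN 2/CD miles/NNS of/IN a/DT law/NN court/NN"
)
tokens = [tuple(tok.rsplit("/", 1)) for tok in sentence.split()]
result = extract_keywords(tokens)
print(result)
```

On the running example this yields the same three phrases listed on the slide. Flat rules like these break on nested or coordinated phrases, which is one reason to use a parse tree instead.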

15 3. Schema Matching
In schema matching, we try to match the target and reference spatial objects against the backend spatial database using:
- Table names
- Attribute names
- Content of the database
We then perform a spatial join for each retrieved candidate pair based on the spatial predicate.
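The matching and join steps might look roughly like the sketch below. Everything in it is assumed for illustration: the in-memory `DB`, its table and column names, the planar coordinates in miles, and plain substring matching. The actual system runs against a spatial database whose schema is not shown in the slides.

```python
import math

# Hypothetical toy "database": table name -> rows. Coordinates are planar, in miles.
DB = {
    "police_department": [
        {"name": "Central Station", "x": 0.0, "y": 0.0},
        {"name": "North Station", "x": 0.0, "y": 5.0},
    ],
    "facility": [
        {"name": "County Law Court", "type": "law court", "x": 1.0, "y": 1.0},
        {"name": "City Library", "type": "library", "x": 0.5, "y": 5.0},
    ],
}


def normalize(text):
    return text.lower().replace("_", " ")


def match_objects(phrase, db):
    """Collect rows whose table name, attribute name, or content matches the phrase."""
    phrase = normalize(phrase)
    hits = []
    for table, rows in db.items():
        if phrase in normalize(table):
            hits.extend(rows)  # table-name match: every row is a candidate
            continue
        for row in rows:
            attr_hit = any(phrase in normalize(key) for key in row)
            content_hit = any(
                isinstance(value, str) and phrase in normalize(value)
                for value in row.values()
            )
            if attr_hit or content_hit:
                hits.append(row)
    return hits


def within_join(targets, references, max_dist):
    """Spatial join: keep (target, reference) pairs no farther apart than max_dist."""
    return [
        (t["name"], r["name"])
        for t in targets
        for r in references
        if math.hypot(t["x"] - r["x"], t["y"] - r["y"]) <= max_dist
    ]


targets = match_objects("police department", DB)
references = match_objects("law court", DB)
pairs = within_join(targets, references, max_dist=2.0)
print(pairs)  # [('Central Station', 'County Law Court')]
```

The nested loop in `within_join` is quadratic in the number of candidates. A spatial database would normally answer the same predicate through a spatial index.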

16 Outline
> Motivation
> Related Work
> Proposed Method
> System Evaluation

17 Query Interface

18 Experimental Evaluation
- The database contains real spatial data obtained from the City of Denton: 32 tables, including crime-related objects such as police offices and law courts
- Gold standard: human-prepared answers for 30 different crime-related queries
- Baseline: top 10 answers from Google Maps
Result:
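The transcript does not say which metric was used to compare against the gold standard. One common choice for this kind of setup is precision and recall over the top-k returned answers, sketched below as an assumption. The function name `precision_recall` and the sample answers are hypothetical and are not data from the study.

```python
def precision_recall(returned, gold, k=10):
    """Precision and recall of the top-k returned answers against a gold answer set."""
    # Drop duplicate answers while keeping rank order, then cut to the top k.
    top = list(dict.fromkeys(returned))[:k]
    hits = sum(1 for answer in top if answer in gold)
    precision = hits / len(top) if top else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


# Made-up answers for a single query, for illustration only.
returned = ["A", "B", "C", "D"]
gold = {"A", "C", "E"}
p, r = precision_recall(returned, gold)
print(p, r)
```

Averaging these per-query values over all 30 queries would give one system-level score for each of the two systems being compared.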

19 Summary
We proposed a method for building a natural language interface to spatial database queries. The prototype system demonstrated the effectiveness of our approach on crime-related spatial queries. In future work, we plan to extend the system by increasing the dataset size and improving the accuracy of the tagging and parsing algorithms. We will also collect more user queries and improve system performance based on a larger evaluation dataset.

20 References
1. http://maps.google.com/
2. http://maps.met.police.uk/
3. I. Androutsopoulos, G. Ritchie, and P. Thanisch, “Natural language interfaces to databases – an introduction,” Journal of Natural Language Engineering, vol. 1, no. 1, 1995.
4. W. Woods, R. Kaplan, and B. Webber, “The Lunar sciences natural language information system,” Bolt Beranek and Newman, Tech. Rep., 1972.
5. R. Ge and R. J. Mooney, “A statistical semantic parser that integrates syntax and semantics,” in Proceedings of the Ninth Conference on Computational Natural Language Learning (CoNLL-2005), Ann Arbor, MI, Jul. 2005, pp. 9–16.
6. R. J. Kate and R. J. Mooney, “Using string-kernels for learning semantic parsers,” in Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics (COLING/ACL-06), Sydney, Australia, Jul. 2006, pp. 913–920.
7. R. J. Mooney, “Learning for semantic parsing,” in Computational Linguistics and Intelligent Text Processing: Proceedings of the 8th International Conference, CICLing 2007, Mexico City, A. Gelbukh, Ed. Berlin: Springer Verlag, 2007, pp. 311–324.
8. Y. Wong and R. J. Mooney, “Learning for semantic parsing with statistical machine translation,” in Proceedings of Human Language Technology Conference / North American Chapter of the Association for Computational Linguistics Annual Meeting (HLT-NAACL-06), New York City, NY, 2006, pp. 439–446.
9. A. Popescu, A. Armanasu, and O. Etzioni, “Modern natural language interfaces to databases: Composing statistical parsing with semantic tractability,” in Proceedings of the 20th International Conference on Computational Linguistics (COLING 2004), Geneva, Switzerland, 2004.
10. L. Zettlemoyer and M. Collins, “Learning to map sentences to logical form: Structured classification with probabilistic categorial grammars,” in Proceedings of the Twenty First Conference on Uncertainty in Artificial Intelligence (UAI-05), 2005.
11. Y. Li, H. Yang, and H. Jagadish, “NaLIX: an interactive natural language interface for querying XML,” in Proceedings of SIGMOD 2005, Baltimore, MD, 2005.

