Automatic indexing and retrieval of crime-scene photographs
Katerina Pastra, Horacio Saggion, Yorick Wilks
NLP group, University of Sheffield
Scene of Crime Information System (SOCIS)

Cambridge 2002

Outline
> Application Scenario
> Project Overview
> SOCIS features
> Text-based approaches
> Using NLP:
  > The Indexing mechanism
  > The Retrieval mechanism
> Preliminary system evaluation
> Links

Crime Scene Documentation: Current Practices
> Scene of Crime Officers:
  - attend crime scene
  - photograph the scene
  - collect evidence (package and label items)
  - write reports and create indexed photo-album(s)
  - case-files piled in storage rooms

Examples

IT support for CSI
> Crime Investigation requires:
  - Fast and accurate retrieval of case-related info (and therefore efficient classification of this info)
  - Identification of “patterns” among cases
> IT support for Crime Investigation:
  - Governmental agencies’ systems (HOLMES)
  - Commercial systems (LOCARD, SOCRATES)
  (Crime Management and Administration Systems)
Needed: “Intelligent” support for Crime Investigation

Project Overview
> Domain: Scene of Crime Investigation (SOC)
> Scenario: Use of digital photography and speech to populate a central police database with case-related information
> Objective: Creation of a prototype system that allows for intelligent indexing and retrieval of crime photographs

SOCIS features
- Access through the web (JSP application)
- Storage of case documentation & meta-information in central database
- Automatic indexing of photographs
- Automatic retrieval of photographs
- Automatic population of official forms

“view of deceased with computer cable removed”

Text-based image indexing & retrieval: approaches
- Manual assignment of keywords
- Automatic extraction of keywords (statistics +/ semantic expansion) [Smeaton’96, Sable’99, Rose’00]
- Extraction of logical form representations (syntactic relations and concept classification) [Rowe’99]
Precision and recall increase as indexing terms go beyond keywords and capture relational info

Text-based image indexing & retrieval: problems
Consider:
- “view to the loft” vs. “view into loft”
- “position of baby with no bedding”
- “position of baby with bedding removed”
> keyword barrier
> syntactic relations need to be complemented with semantic information
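One reading of the caption pairs above can be illustrated with a minimal sketch. The stopword list and the `keywords` helper are assumptions made for illustration only; the slides do not say how keyword indexing was done in the systems being compared.

```python
# Hypothetical stopword list; the slides do not specify one.
STOPWORDS = {"to", "the", "into", "of", "with"}

def keywords(caption: str) -> set:
    """Return the caption's content words (lowercased, stopwords dropped)."""
    return {word for word in caption.lower().split() if word not in STOPWORDS}

# Different spatial relations ("to" vs. "into"), identical keyword sets.
loft_a = keywords("view to the loft")
loft_b = keywords("view into loft")
print(loft_a == loft_b)

# Similar scene descriptions, yet the keyword sets differ in exactly
# the words that carry the "absence" meaning.
bed_a = keywords("position of baby with no bedding")
bed_b = keywords("position of baby with bedding removed")
print(sorted(bed_a ^ bed_b))
```

Under these assumptions, keywords alone cannot separate the two loft captions, and relating the two bedding captions requires knowing that “no” and “removed” express a similar relation, which is semantic rather than lexical information.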

Indexing-Retrieval Mechanism
Pipeline of processing resources: tokeniser -> sentence splitter -> POS tagger -> lemmatizer -> NE recognizer -> parser -> discourse interpreter (+ triple extraction layer)
[Diagram labels: captions; free text query; OntoCrime + KB; indexing terms (ARG1 REL ARG2); query triples (ARG1 REL ARG2); matching]

Corpus and Domain Model
- 1200 captions from 350 different crime cases dealt with by South Yorkshire Police (text files)
- 65 captions (transcribed speech experiment)
Different lengths but same characteristics: phrasal constructions, named entities, meta-info, what and where references
Domain model = OntoCrime and knowledge base
Role = selection restrictions for triples’ arguments and semantic expansion for retrieval

Triple Extraction
- 17 relations: AND, AROUND, MADE-OF, OF, ON, WITHOUT etc.
- Form of triples: ARG1 REL ARG2
- Restrictions and filters for arguments
- Rules for captions with multiple relations
- Inferences restricted to certain cases

Triple Extraction examples
- “body on floor surrounded by blood”
- “shot of footprint on top of bar”
- “photograph from behind bar of body on floor”
- “bottle, gun and ashtray on table”
- “footprint with zigzag and target on chair”
Triples:
  blood AROUND floor
  blood AROUND body
  body ON floor
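As a rough illustration of the ARG1 REL ARG2 form, the sketch below reproduces the three triples listed above from the caption “body on floor surrounded by blood”. It is a toy under stated assumptions: SOCIS extracts triples with a parser, a discourse interpreter and OntoCrime, whereas this example swaps in two hypothetical regular-expression patterns and one hypothetical inference rule that are not taken from the slides.

```python
import re

# Hypothetical surface patterns; each maps a regex match to (ARG1, REL, ARG2).
PATTERNS = [
    (re.compile(r"(\w+) on (\w+)"),
     lambda m: (m.group(1), "ON", m.group(2))),
    (re.compile(r"(\w+) surrounded by (\w+)"),
     lambda m: (m.group(2), "AROUND", m.group(1))),
]

def extract_triples(caption):
    """Return a list of (ARG1, REL, ARG2) triples found in the caption."""
    text = caption.lower()
    triples = []
    for pattern, build in PATTERNS:
        for match in pattern.finditer(text):
            triple = build(match)
            if triple not in triples:
                triples.append(triple)
    # Assumed inference rule: if X is ON Y and Z is AROUND Y,
    # then Z is also AROUND X.
    for x, rel, y in list(triples):
        if rel != "ON":
            continue
        for z, rel2, target in list(triples):
            inferred = (z, "AROUND", x)
            if rel2 == "AROUND" and target == y and inferred not in triples:
                triples.append(inferred)
    return triples

for triple in extract_triples("body on floor surrounded by blood"):
    print(" ".join(triple))
```

Surface patterns like these break down quickly: on “shot of footprint on top of bar” the first pattern would pair “footprint” with “top” rather than “bar”, which suggests why argument restrictions, filters and rules for multiple relations are listed as part of the extraction layer.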

Retrieval Mechanism
- Allow for free text query
- Extract relational facts from the query
- Match the query triples with the indexing triples of each captioned photograph
- Allow for exact match of arguments or class info
  [Slide labels: ARG1, RELATION, ARG2; Class:]
- If no triples can be extracted, keyword matching takes place with semantic expansion if needed
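The matching steps above can be sketched as follows. Everything named here is hypothetical: `CLASS_OF` is a tiny stand-in for OntoCrime, the index layout is invented for the example, and the fallback always applies semantic expansion, whereas the slides say it is applied only if needed.

```python
# Hypothetical stand-in for OntoCrime: maps a term to its class.
CLASS_OF = {"gun": "weapon", "knife": "weapon",
            "table": "furniture", "chair": "furniture"}

def arg_matches(query_arg, index_arg):
    """True on an exact match, or if the query names the indexed term's class."""
    return query_arg == index_arg or CLASS_OF.get(index_arg) == query_arg

def triple_matches(query, indexed):
    """Relations must be identical; arguments may match exactly or by class."""
    q1, q_rel, q2 = query
    i1, i_rel, i2 = indexed
    return q_rel == i_rel and arg_matches(q1, i1) and arg_matches(q2, i2)

def retrieve(query_triples, query_keywords, index):
    """index maps a photo id to (indexing triples, caption text)."""
    if query_triples:
        return [pid for pid, (triples, _) in index.items()
                if all(any(triple_matches(q, t) for t in triples)
                       for q in query_triples)]
    # Fallback: keyword matching, expanding class names to their members.
    expanded = set(query_keywords)
    for term, cls in CLASS_OF.items():
        if cls in query_keywords:
            expanded.add(term)
    return [pid for pid, (_, caption) in index.items()
            if expanded & set(caption.lower().split())]

INDEX = {
    "photo1": ([("gun", "ON", "table")], "gun on table"),
    "photo2": ([("body", "ON", "floor")], "body on floor"),
}

print(retrieve([("weapon", "ON", "table")], [], INDEX))  # class-level match
print(retrieve([], ["weapon"], INDEX))                   # keyword fallback
```

In this sketch both queries select only `photo1`: the first because “weapon” is the assumed class of “gun”, the second because the keyword “weapon” is expanded to its member terms before being compared with caption words.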

Preliminary Evaluation
- An evaluation of the indexing mechanism on 600 captions pointed to refinements of the rules (80% accuracy in extracting and inferring triples)
- Preliminary usability evaluation with real users: relational information was considered an intuitive way of forming queries for image retrieval
- Future work: overall evaluation of free text query for image retrieval

Conclusions
- Could the SOCIS approach be ported to other domains?
- Thorough testing and experimentation needed
- However, it is a corpus-driven approach: not just an alternative image indexing/retrieval approach, but the one dictated by a real application
For more information on SOCIS: