CrimeLink Explorer: Using Domain Knowledge to Facilitate Automated Crime Link Analysis. Lt. Jennifer Schroeder, Tucson Police Department; Jie Xu, University of Arizona. June 2, 2003

Presentation transcript:

CrimeLink Explorer: Using Domain Knowledge to Facilitate Automated Crime Link Analysis
Lt. Jennifer Schroeder, Tucson Police Department
Jie Xu, University of Arizona
June 2, 2003

Agenda Review of Problem: Link Analysis In Law Enforcement Problem Literature Review System & Heuristic Design User Study Design Demo User Study Results & Conclusions Q & A

Link Analysis in Law Enforcement
Extremely valuable, but extremely time consuming for investigators (sometimes months are spent constructing a large network)
Can uncover valuable investigative leads
Usually only conducted in high-profile cases that justify the resource expenditure
Can be very complex (high branching factors, especially among repeat offenders)

Data Sources for Link Analysis
Police incident reports (Police RMS)
  –Often largest source of data for analysis
  –Link based on co-occurrence in an incident
  –Analysts must examine each report to determine the strength of the link
  –Must be searched across multiple jurisdictions
Field interviews
Phone records
Financial information
Intelligence information (sometimes stored in databases)
Interviews with witnesses, suspects, confidential informants

An example
Eddie “Smith” is in 18 incident reports
These incidents contain a total of 152 entities that are potential branches:
  –31 People
  –11 Vehicles
  –57 Locations
  –2 Organizations
  –1 Property item
  –1 Weapon
This complexity is at a depth of one!
Imagine the task for crime analysts to search each of these possible branches to create a large, multi-level link chart

Obstacles to Link Analysis (LA) Automation
Lack of integration/data consolidation
High branching factors cause information overload
Investigators must manually analyze every link to determine relevance
No domain-specific way to automate analysis for relevance of links

Proposed Approach
Use concept space to extract associations from incident records
Focus on domain-specific heuristics to provide accurate link assessment
Use a shortest-path algorithm to find the best path between individuals of interest
Incorporate the approach into a prototype system with visualization of resulting paths
Conduct a user study to evaluate the system

Literature Review
Link Analysis
  –Anacapa Charting
  –Free Text Association Searches (NLP)
  –Watson
  –COPLINK Detect
Domain Knowledge Incorporation
  –Expert Systems
  –Bayesian Networks
Shortest-Path Algorithms

Domain Knowledge Incorporation
Expert Systems
Bayesian Networks
Law Enforcement Specific Research

System Design
(Diagram; component labels: Incident Reports; Concept Space; Heuristics (crime types, shared address, shared phone); Co-occurrence Weights; Heuristic Weights; Association Path Search (shortest-path algorithm); Graphical User Interface)

Experimental Database
The dataset must contain real data so that crime investigators will be engaged and interested in the results
The dataset must contain sufficient amounts of data for association paths between a reasonable number of subjects to exist
Approximately 20 months of incident reports were extracted
Age, gender, race, addresses, and phone numbers of persons involved in the incidents were also extracted
Simple data consolidation on name for prototype

Heuristic Design Goals
Provide a weighting scheme for links that more accurately reflects the judgment of human analysts
Weights should be understandable to law enforcement users
Improved weights should be used for shortest-path calculations

Heuristic Design
Incorporated the most important information considered by human analysts:
  –Relationship between crime type and person roles
  –Shared addresses or telephone numbers
  –Repeated co-occurrence in incident reports
Employed a scale, familiar to users (used in RMS queries)
Logarithmic transformation of link weight used to compute shortest path during searches

Crime Type and Person Role
We constructed a matrix and assigned scores to role combinations in each of the crime types
To construct the crime type/role matrix we interviewed sergeants from Homicide, Aggravated Assault, Robbery, Fraud, Auto Theft, Sexual Assault, Child Sexual Abuse, and Domestic Violence
Crime type/role combinations were assigned weights based on experts' estimates of the likelihood of association for that combination
Person roles used in the TPD dataset include: Victim, Witness, Suspect, Arrestee, and Other

Co-occurrence
Goal was to capture judgments of analysts when looking at repeated co-occurrences of entities
Analyzed a random sample of 40 incident reports, counting the number of times each pair of persons co-occurred
Read supporting narrative reports for each incident to determine whether an association was important
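
The pair counting described on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `count_cooccurrences` and the sample incident data are hypothetical, and each incident report is assumed to be reducible to a list of person identifiers.

```python
from collections import Counter
from itertools import combinations


def count_cooccurrences(incidents):
    """Count how many incident reports each unordered pair of persons shares."""
    counts = Counter()
    for persons in incidents:
        # Deduplicate and sort so that (a, b) and (b, a) map to the same key.
        for pair in combinations(sorted(set(persons)), 2):
            counts[pair] += 1
    return counts


# Hypothetical incident reports, each listing the persons involved.
incidents = [
    ["A", "B", "C"],
    ["A", "B"],
    ["B", "C"],
    ["A", "B"],
]
pair_counts = count_cooccurrences(incidents)
```

A count per pair is only the input to the analysts' judgment; mapping counts to association probabilities still depends on the manual review of narrative reports described above.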

Co-occurrence probability distribution
Co-occurrence count    Association probability (%)
(earlier rows not recoverable)
4                      100

Heuristic Function
Investigators may rely more on crime type/role and shared associations, but a high co-occurrence weight can outweigh a low association weight
First value calculated based on summed crime-type/person-role relationship, shared address, and shared phone values
Second value based on association probability of co-occurrence counts
Maximum(0.85 × (crime-type/person-role score) + (shared phone score) + (shared address score), 100 × (association probability based on co-occurrence counts))
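
One possible reading of this heuristic function is sketched below. It assumes the operators lost from the slide's formula are multiplication and addition, that the 0.85 factor applies only to the crime-type/person-role score, and that the association probability is expressed as a fraction between 0 and 1. The function name `link_weight` and its parameter names are hypothetical.

```python
def link_weight(role_score, phone_score, address_score, cooccurrence_prob):
    """Return the larger of the two candidate values for a link.

    Assumed reading of the slide, not a confirmed implementation:
      first value  = 0.85 * role score + shared phone score + shared address score
      second value = 100 * association probability from co-occurrence counts
    """
    heuristic_value = 0.85 * role_score + phone_score + address_score
    cooccurrence_value = 100 * cooccurrence_prob
    return max(heuristic_value, cooccurrence_value)


# Hypothetical scores: a weak role-based link dominated by frequent co-occurrence.
weak_role_strong_cooccurrence = link_weight(40, 0, 0, 1.0)
# Hypothetical scores: a strong role-based link with little co-occurrence.
strong_role_weak_cooccurrence = link_weight(80, 10, 5, 0.2)
```

Taking the maximum reflects the first bullet of the slide: a high co-occurrence value can override a low role-based value, and the reverse.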

Association Path Search
Used Dijkstra’s shortest-path algorithm (1959) to address the search complexity problem
Conventional shortest-path algorithms could not be used directly to solve the problem of identifying the strongest association between a pair of persons (Xu & Chen 2000)
A logarithmic transformation was made on association weights
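
The role of the logarithmic transformation can be illustrated with a short sketch. It assumes, beyond what the slide states, that link weights are rescaled to the interval (0, 1] and that the strength of a path is the product of its link weights. Under those assumptions, maximizing the product is the same as minimizing the sum of -log(weight), which Dijkstra's algorithm can handle because every transformed cost is non-negative. The function name `strongest_path` and the sample graph are hypothetical.

```python
import heapq
import math


def strongest_path(graph, source, target):
    """Find the path with the largest product of link weights.

    graph: {node: {neighbor: weight}} with weights in (0, 1].
    Returns (path, strength), or (None, 0.0) if target is unreachable.
    """
    dist = {source: 0.0}   # accumulated sum of -log(weight)
    prev = {}              # predecessor of each node on its best path
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for neighbor, weight in graph.get(node, {}).items():
            # -log(weight) >= 0 for weights in (0, 1], as Dijkstra requires.
            candidate = d - math.log(weight)
            if candidate < dist.get(neighbor, math.inf):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    if target not in dist:
        return None, 0.0
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    # Undo the transformation to recover the product of weights.
    return path, math.exp(-dist[target])


# Hypothetical undirected association graph with rescaled weights.
graph = {
    "A": {"B": 0.9, "C": 0.5},
    "B": {"A": 0.9, "C": 0.8},
    "C": {"A": 0.5, "B": 0.8},
}
path, strength = strongest_path(graph, "A", "C")
```

In this sample graph the two-hop route through B (0.9 × 0.8) is stronger than the direct link (0.5), which a plain shortest-path search on raw weights would not report.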

User Study Questions
Can the automated link analysis approaches proposed (concept space approach, heuristic approach, and the shortest-path algorithm) help address the information overload and search complexity problems?
Can incorporated domain knowledge help identify associations between crime entities more accurately than the concept space approach?
Will domain experts perceive the automated link analysis approaches to be useful for crime investigation?

Hypotheses
H1: Subjects will achieve higher efficiency conducting an association path search with the prototype system than with the “single-level” link analysis tool
H2: Association paths found using heuristics will be more accurate than paths found using simple co-occurrence weight
H3: Subjects will perceive the heuristic approach to be more useful than the concept space approach for investigative work

Efficiency and Accuracy (H1 and H2)
Efficiency = the time a subject spends completing a given task
Accuracy = the average agreement scale a subject indicates on the weights of associations on a path
Usefulness = the average agreement scale is > 4, indicating positive assessment of usefulness

User Study Tasks
Task 1: Use COPLINK Detect to find the strongest association paths between those criminals.
Task 2: Use the concept space approach provided by the prototype system to find the strongest association paths, evaluate each association on the path, and indicate scales of agreement on the association weights.
Task 3: Given the same set of criminal names used in Task 2, use the heuristic approach to do the same.

Two or more names were entered to search for association paths

Returns are displayed in a network

Weak links can be removed to focus investigation where more information is needed

Clicking on a link displays information about origin and strength of link

Concept Space and heuristic values were compared by the users to assess comparative accuracy

H1, H2, H3
Two-tailed t-tests
H1 was supported (t = 11.47, p < 0.001)
H2 was supported (t = 2.04, p < 0.001)
H3 was supported (t = 2.35, p < 0.05)

Weighting Agreement Scale

Conclusions
The system evaluation focused on the approaches’ efficiency, accuracy, and usefulness
The three characteristics are desirable features of a sophisticated link analysis system
The experiment results demonstrated the potential of our approach to achieve these features using domain-specific heuristics

Future Work
Apply statistical analysis to NIBRS (National Incident-Based Reporting System) data for more accurate crime type/relationship weights
Extend heuristics to include common vehicles and common organization associations
Encode expert knowledge in Bayesian networks and incrementally learn new knowledge from crime data
Make interface improvements suggested by users
Improve data consolidation rules