1 RAPID: Representation and Analysis of Probabilistic Intelligence Data
Carnegie Mellon University
- PI: Prof. Jaime G. Carbonell / jgc@cs.cmu.edu / (412) 268-7279
- Dr. Eugene Fink / e.fink@cs.cmu.edu / (412) 268-6593
- Dr. Anatole Gershman / anatoleg@cs.cmu.edu / (412) 268-8259
DYNAMiX Technologies
- POC: Dr. Ganesh Mani / gmani@dynamixtechnologies.com / (412) 401-0121
- Mr. Dwight Dietrich / ddietrich@dynamixtechnologies.com / (724) 940-4304
PAINT

2 People
Carnegie Mellon
- Faculty: Jaime G. Carbonell, Eugene Fink, Anatole Gershman
- Students: Bin Fu, Diwakar Punjani, Andrew Yeager
DYNAMiX
- Principals: Dwight Dietrich, Ganesh Mani
- Engineers: Atul Bhandari, Jeremy Hermann, Veera Manda

3 Outline of the presentation
- RAPID functionality
- Preliminary demo
- Architecture and main components
- Integration with REALISM
- Current results and work plan

4 Analysis of uncertain intelligence
RAPID is a probabilistic reasoning engine for the analysis of dynamically evolving intelligence data.
Jigsaw analogy: RAPID will help
- Identify important holes
- Locate the most crucial missing pieces
- Insert these pieces
Knowledge sources: public domain, intelligence, inferences
Figure labels: initial knowledge, available knowledge, observable facts, hidden facts, intelligence results

5 Analysis of uncertain intelligence
RAPID will help intelligence analysts accomplish the following tasks:
- Draw probabilistic conclusions from available intelligence, including uncertain and missing data
- Identify potentially surprising developments
- Formulate and assess hypotheses
- Identify critical uncertainties
- Develop strategies for proactive collection of additional intelligence to resolve uncertainties, based on the analysis of cost/benefit trade-offs
Figure labels: massive new intelligence; filtering and processing of new intelligence; propagation of inferences; analysis of key indicators; development of intelligence-collection plans; intelligence collection; analysts

6 Underlying functionality
- Representation of uncertainty: novel representation of massive uncertain data, which supports fast matching and inferences
- Inferences from uncertain data: scalable inference mechanism for reasoning about uncertain intelligence
- Analysis of critical uncertainties: assessment of uncertain situations, evaluation of data utility, and identification of important missing data
- Proactive intelligence planning: evaluation of available probes and construction of optimized intelligence-collection plans
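The slides do not specify how probes are evaluated. As a minimal sketch of the cost/benefit idea, assume that a probe's benefit is the expected reduction in entropy over a set of competing hypotheses and that probes are chosen greedily by benefit per unit cost within a budget. The entropy criterion, the greedy rule, and every name and number below are illustrative assumptions, not RAPID's actual method.

```python
import math

def entropy(dist):
    """Shannon entropy in bits of a {hypothesis: probability} mapping."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def expected_information_gain(prior, likelihood):
    """Expected entropy reduction from one probe.

    likelihood[h][outcome] is P(outcome | h). Hypothetical structure.
    """
    outcomes = set()
    for per_hypothesis in likelihood.values():
        outcomes.update(per_hypothesis)
    expected_posterior_entropy = 0.0
    for outcome in outcomes:
        joint = {h: prior[h] * likelihood[h].get(outcome, 0.0) for h in prior}
        p_outcome = sum(joint.values())
        if p_outcome > 0:
            posterior = {h: v / p_outcome for h, v in joint.items()}
            expected_posterior_entropy += p_outcome * entropy(posterior)
    return entropy(prior) - expected_posterior_entropy

def plan_probes(prior, probes, budget):
    """Greedy plan: take probes by gain per unit cost while the budget lasts.

    Each gain is computed against the same prior, so interactions between
    probes are ignored (a simplifying assumption).
    """
    scored = []
    for name, (cost, likelihood) in probes.items():
        gain = expected_information_gain(prior, likelihood)
        scored.append((gain / cost, name, cost, gain))
    scored.sort(reverse=True)
    plan, spent = [], 0.0
    for _, name, cost, gain in scored:
        if gain > 1e-12 and spent + cost <= budget:
            plan.append(name)
            spent += cost
    return plan

# Illustrative values only: two hypotheses and two probes given as (cost, likelihood).
prior = {"H1": 0.5, "H2": 0.5}
probes = {
    "decisive": (2.0, {"H1": {"yes": 1.0, "no": 0.0}, "H2": {"yes": 0.0, "no": 1.0}}),
    "useless": (1.0, {"H1": {"yes": 0.5, "no": 0.5}, "H2": {"yes": 0.5, "no": 0.5}}),
}
gain_decisive = expected_information_gain(prior, probes["decisive"][1])
gain_useless = expected_information_gain(prior, probes["useless"][1])
plan = plan_probes(prior, probes, budget=3.0)
```

In this toy case the decisive probe removes one full bit of uncertainty and the useless probe removes none, so only the first enters the plan even though the second is cheaper.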

8 Preliminary demo
Uncertainty analysis and probe evaluation, integrated into Excel.

10 Architecture
Uncertain situation assessment and data-collection planning (an advanced API for integration with other systems):
- Uncertainty calculus and proactive probe planning: Excel extension for the analysis of uncertainty, probes, and proactive data collection. Advanced analysis of incomplete data, identification of critical uncertainties, evaluation and selection of probes, what-if analysis, and visualization.
- Scalable assessment of uncertain intelligence: relational database of uncertain data and inference rules. A large-scale database of incomplete and uncertain facts, uncertain inference rules, and hypotheses, which allows scalable planning of proactive data collection.
Analyst interface: optional user interface for integrated access to all system components, which extends the standard Excel interface.

11 Architecture
Processing of data streams: real-time matching of queries and inference rules against a massive stream of new data. Fast database operations on a stream of newly incoming data, and integration of this stream with the static database.
Uncertain situation assessment and data-collection planning:
- Uncertainty calculus and proactive probe planning: Excel extension for the analysis of uncertainty, probes, and proactive data collection
- Scalable assessment of uncertain intelligence: relational database of uncertain data and inference rules
Analyst interface
Figure labels (data flows): general intelligence collection; proactive intelligence collection; massive new intelligence; approved plans for proactive data collection; hypotheses, conclusions, and data-collection plans

12 Architecture
Processing of data streams: real-time matching of queries and inference rules against a massive stream of new data
Value-added reasoning tools
Uncertain situation assessment and data-collection planning:
- Uncertainty calculus and proactive probe planning: Excel extension for the analysis of uncertainty, probes, and proactive data collection
- Scalable assessment of uncertain intelligence: relational database of uncertain data and inference rules
Analyst interface
Figure labels (data flows): general intelligence collection; proactive intelligence collection; massive new intelligence; approved plans for proactive data collection; hypotheses, conclusions, and data-collection plans

13 Uncertainty calculus and proactive probe planning (Microsoft Excel)
- Uncertainty analysis: representation of probability distributions and qualitative uncertainty; uncertainty arithmetic
- Situation assessment: representation of data utility; tracking utility changes during data collection; identification of critical uncertainties
- Proactive probe planning: representation of probes; evaluation of probe utility; automated selection and launching of critical probes
- Contingency planning: what-if analysis of alternative future developments and data-collection plans based on an extension of Excel “scenarios”
Connected components: processing of data streams; value-added reasoning tools; uncertainty database; analyst interface
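The slides name "uncertainty arithmetic" without defining it. One simple reading, shown here only as a sketch, treats each uncertain value as a discrete probability distribution and combines two independent values by enumerating all pairs. The representation, the independence assumption, and the numbers are illustrative and are not taken from RAPID.

```python
from collections import defaultdict
import operator

def combine(x, y, op):
    """Distribution of op(X, Y) for independent discrete X and Y.

    x and y map value -> probability (hypothetical representation).
    """
    result = defaultdict(float)
    for vx, px in x.items():
        for vy, py in y.items():
            result[op(vx, vy)] += px * py
    return dict(result)

def mean(dist):
    """Expected value of a discrete distribution."""
    return sum(v * p for v, p in dist.items())

def probability(dist, predicate):
    """Total probability of the values that satisfy the predicate."""
    return sum(p for v, p in dist.items() if predicate(v))

# Illustrative values: two uncertain quantities, as two spreadsheet cells might hold.
a = {1: 0.5, 2: 0.5}
b = {10: 0.25, 20: 0.75}
total = combine(a, b, operator.add)
expected_total = mean(total)
chance_above_20 = probability(total, lambda v: v > 20)
```

The same `combine` function covers subtraction, multiplication, or comparison by passing a different operator, which is one way a spreadsheet formula over uncertain cells could be evaluated.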

14 Scalable assessment of uncertain intelligence
Figure labels:
- Uncertain facts
- Uncertain inference rules
- Goals, queries, and hypotheses
- Semantic network
- Inferred facts
- Learned inference rules
- Conflict detection
- Query matches
- Evaluation of hypotheses
- Critical uncertainties
- Prioritized plans for proactive data collection
- Manual entry, selection, and editing of knowledge
- Analyst interface
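The slides list uncertain facts, uncertain inference rules, and inferred facts but give no inference formula. The sketch below assumes one common simplification: a rule's support is its strength times the product of its premise probabilities (independent premises), and several derivations of the same conclusion are merged with a noisy-OR. These formulas, the rule format, and the fact names are assumptions for illustration only.

```python
def infer(facts, rules):
    """Single forward pass over uncertain rules.

    facts: {name: probability}
    rules: list of (premises, conclusion, strength), assumed acyclic and
           listed in dependency order (hypothetical format).
    """
    beliefs = dict(facts)
    for premises, conclusion, strength in rules:
        support = strength
        for premise in premises:
            support *= beliefs.get(premise, 0.0)
        previous = beliefs.get(conclusion, 0.0)
        # Noisy-OR: the conclusion fails only if every derivation fails.
        beliefs[conclusion] = 1.0 - (1.0 - previous) * (1.0 - support)
    return beliefs

# Illustrative values: two independent lines of evidence for C, then C implies D.
facts = {"A": 0.8, "B": 0.5}
rules = [
    (("A",), "C", 0.9),
    (("B",), "C", 0.6),
    (("C",), "D", 0.5),
]
beliefs = infer(facts, rules)
```

With these numbers the belief in C is 0.804 after both rules fire, and the belief in D is 0.402. A production system would also need cycle handling and a treatment of correlated evidence, which this sketch leaves out.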

15 Value-added reasoning tools
Each tool is listed with its function and, where shown, its data, which is part of the uncertainty database.
- ARGUS data explorer: identification of patterns and their gradual changes in massive data streams. Data: known patterns
- Contingency analysis: what-if analysis of alternative hypotheses, data-collection plans, and possible future developments. Data: alternative scenarios and their implications
- Markov reasoning: selection of most likely hypotheses and possible future developments. Data: Markov models
- Adversarial search: analysis of possible concealment and disinformation, and plans to prevent them. Data: adversarial goals and resources
- Entity co-reference: identification of syntactically different words that refer to the same objects
These tools are not essential for the core functionality.
Uncertainty calculus and proactive probe planning: Excel extension for the analysis of uncertainty, probes, and proactive data collection. The available intelligence data and inference rules are in Excel tables, and in the uncertainty database integrated with Excel.
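The entity co-reference tool is described only as identifying syntactically different words that refer to the same objects. As a rough sketch of the simplest case, the code below groups mentions whose normalized spellings are nearly identical, using the standard-library `difflib.SequenceMatcher`. The normalization, the 0.85 threshold, and the greedy grouping are assumptions; aliases and acronyms would need knowledge beyond string similarity.

```python
import re
from difflib import SequenceMatcher

def normalize(name):
    """Lowercase and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()

def same_entity(a, b, threshold=0.85):
    """True if two mentions are spelled almost the same after normalization."""
    ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return ratio >= threshold

def group_mentions(mentions, threshold=0.85):
    """Greedy grouping: attach each mention to the first cluster it matches."""
    clusters = []
    for mention in mentions:
        for cluster in clusters:
            if same_entity(mention, cluster[0], threshold):
                cluster.append(mention)
                break
        else:
            clusters.append([mention])
    return clusters

mentions = [
    "Carnegie Mellon University",
    "carnegie-mellon university",
    "DYNAMiX Technologies",
]
clusters = group_mentions(mentions)
```

Here the first two mentions normalize to the same string and fall into one cluster, while the third stays separate.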

16 Analyst interface
- Optional extension of the Excel interface
- Visualization and explanation of intelligence data, inferences, and data-collection plans

18 Integration goals
We will integrate the text-extraction system developed by HNC / Fair Isaac with the uncertainty-analysis system developed by CMU / DYNAMiX. The integrated system will support the following capabilities:
- Extraction of facts, relations, and causal links from natural-language documents
- Evaluation of given hypotheses
- Proactive information gathering
- Application to the analysis of Iranian nano-technology plans and capabilities

19 Inputs and outputs
REALISM
- Input: requirements and filters for the information extraction; natural-language documents; World-wide web
- Output: large structured tables of relevant facts and entities, which include uncertainty; inference-rule representation of relations and causal links, also including uncertainty
RAPID
- Input: tables of uncertain facts; uncertain inference rules; queries for specific data; analyst hypotheses
- Output: inferences from uncertain data; exact and approximate matches for given queries; hypothesis assessment; proactive plans for collecting additional data

20 Architecture
REALISM (HNC / Fair Isaac) and RAPID (CMU / DYNAMiX)
RAPID components:
- Analyst interface
- Uncertain situation assessment and data-collection planning: scalable assessment of uncertain intelligence; uncertainty calculus and proactive probe planning
Figure labels (data flows): information requests; topic filters; structured facts and entities; structured relations and causal links; hypotheses, conclusions, and data-collection plans

22 Initial results
- Detailed technical plan of uncertain situation assessment and proactive probe planning: architecture, functionality, and algorithms
- Uncertain intelligence scenario based on public data about Iranian nano-technology
- Preliminary prototype of situation assessment tools integrated with a relational database
- Preliminary prototype of a tool for the resolution of entity co-references
- Application of DYNAMiX Data Explorer to the nano-tech conference data provided by PAINT

23 Current work
- Uncertainty calculus, integrated with Excel
- Proactive probe planning
- Scalable uncertainty assessment, integrated with a relational database
- Integration with REALISM
- Initial analyst interface

24 Short-term plan
- March: prototype of uncertainty calculus
- March: prototype of probe-planning tools
- May: initial RAPID / REALISM integration
- June: initial analyst interface (extended Excel)
- July: prototype of uncertainty database

25 Long-term plan
- July 2008: uncertain situation assessment and proactive probe planning
- July 2009: discrimination among competing hypotheses and identification of critical uncertainties
- July 2009: fully integrated deployable prototype
- July 2010: advanced proactive-intelligence planning and learning of inference rules
- July 2011: value-added tools, which may include data-stream processing, entity co-reference, adversarial search, and Markov reasoning
- Jan 2012: fully integrated deliverable system
All versions of RAPID will demonstrate all main capabilities, with increasing functionality over time.

26 Evaluation
We expect that RAPID will provide a significant advantage over available off-the-shelf tools, such as standard spreadsheets and database systems. To support this claim, we plan to compare the productivity of analysts using RAPID with that of analysts who perform the same tasks using commercially available tools.
- Experimental group: use of RAPID
- Control group: use of standard tools

27 Evaluation
We expect that RAPID will provide a significant advantage over available off-the-shelf tools, such as standard spreadsheets and database systems. To support this claim, we plan to compare the productivity of analysts using RAPID with that of analysts who perform the same tasks using commercially available tools.
We will view RAPID as a success if it consistently outperforms the standard tools and the analysts report an overall positive experience of using it.

28 Adjustment of the earlier plan
We need to adjust the plan to the new budget. We will deliver the full core functionality, but we propose to reduce the work on value-added tools.
Reduced work
- Processing of data streams
- Advanced contingency analysis
- Analyst interface
Suspended work
- Predictive Markov models
- Analysis of adversarial actions

30 Appendices
- Previous work
- Empirical evaluation
- PAINT contributions

31 ARGUS
ARGUS project sponsored by DTO/ARDA: identification and tracking of novel patterns in massive databases and data streams.
Figure labels (processing steps): create background model; detect novel events; match historical data; re-cluster; generate profiles; update profiles
Figure labels (data products): data; background model; novel events; novel clusters; tracked events; new profiles; new alerts; analysts

32 ARGUS
- Estimate the density function at t0
- Grow the cluster for a period of Δt while reducing the weight of old records
- Estimate the new density function at t0 + Δt
- Compare the two estimates
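The four ARGUS steps above can be sketched in a few lines. The sketch assumes one-dimensional records, a Gaussian kernel density estimate, exponential down-weighting of old records with a fixed half-life, and the maximum absolute difference on a grid as the comparison. None of these choices is stated in the slides; they are placeholders for whatever ARGUS actually uses.

```python
import math

def decay_weights(timestamps, now, half_life):
    """Weight of each record halves every half_life time units of age."""
    return [0.5 ** ((now - t) / half_life) for t in timestamps]

def weighted_density(values, weights, x, bandwidth):
    """Weighted Gaussian kernel density estimate at point x."""
    norm = sum(weights) * bandwidth * math.sqrt(2.0 * math.pi)
    kernel_sum = sum(
        w * math.exp(-0.5 * ((x - v) / bandwidth) ** 2)
        for v, w in zip(values, weights)
    )
    return kernel_sum / norm

def density_change(records, t0, dt, grid, bandwidth=1.0, half_life=1.0):
    """Largest change in estimated density between t0 and t0 + dt.

    records is a list of (value, timestamp); at least one record must have
    a timestamp no later than t0, otherwise the first estimate is undefined.
    """
    def estimate(now):
        seen = [(v, t) for v, t in records if t <= now]
        values = [v for v, _ in seen]
        weights = decay_weights([t for _, t in seen], now, half_life)
        return [weighted_density(values, weights, x, bandwidth) for x in grid]

    before = estimate(t0)
    after = estimate(t0 + dt)
    return max(abs(a - b) for a, b in zip(before, after))

# Illustrative data: a stable cluster near 0, then new records near 5.
grid = [0.5 * i for i in range(-4, 13)]
stable = [(0.0, 0), (0.1, 1), (-0.1, 2)]
shifted = stable + [(5.0, 3), (5.0, 4)]
change_stable = density_change(stable, t0=2, dt=2, grid=grid)
change_shifted = density_change(shifted, t0=2, dt=2, grid=grid)
```

When no new records arrive, the decay rescales all weights equally and the normalized density is unchanged. New records in a different region produce a clear density change, which is the signal a re-clustering step would react to.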

33 ARGUS
Figure: re-clustering between t0 and t0 + Δt, with clusters labeled Respiratory Diseases and SARS, showing the density change.

34 RADAR
RADAR project sponsored by DARPA: analysis and management of volatile crisis situations based on uncertain data.
Figure labels (components): data elicitor; parser; optimizer; top-level control and learning; analysts
Figure labels (functions): process new data; update crisis-management plans; suggest data-collection strategies

35 RADAR
We have applied the system to repair a conference schedule after a crisis (loss of rooms).
Manual and auto repair, schedule quality:
- After crisis: 0.50
- Manual repair: 0.61
- Auto without elicitation: 0.72
- Auto with elicitation: 0.93
Figure: dependency of the schedule quality (0.72 to 0.93) on the number of questions (0 to 100, out of 1100).

36 RAPID
Unlike ARGUS, RAPID
- Represents and analyzes uncertainty
- Supports complex inferences
Unlike RADAR, RAPID
- Scales to massive intelligence datasets
- Analyzes complex “external” situations
- Develops intelligence-collection plans

38 Evaluation goals
We expect that RAPID will provide a significant advantage over available off-the-shelf tools, such as standard spreadsheets and database systems. To support this claim, we plan to compare the productivity of analysts using RAPID with that of analysts who perform the same tasks using commercially available tools.
- Experimental group: use of RAPID
- Control group: use of standard tools

39 Experimental setup
We expect to recruit retired intelligence analysts for the system evaluation, and ask them to perform several tasks based on given uncertain data:
- Identify the data most relevant to given tasks
- Evaluate the validity of given hypotheses
- Find relevant hidden patterns
- Identify critical missing data and propose a cost-effective plan for collecting this data

40 Performance measurements
We will measure the following main factors to evaluate the performance of analysts:
- Number of high-level tasks completed within the experiment time frame
- Accuracy of hypothesis evaluation
- Number and relevance of identified patterns
- Effectiveness and costs of data-collection plans
We will also ask analysts to complete a questionnaire on their overall experience.

41 Expected results
We will view the proposed work as a success if RAPID consistently outperforms the off-the-shelf tools in all four performance factors, the performance difference for each factor is statistically significant, and analysts report an overall positive experience of using the system.
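The slides do not name the significance test. One option that makes few distributional assumptions and suits small groups of analysts is an exact one-sided permutation test on the difference of group means, sketched below. The choice of test and the scores are illustrative assumptions, not part of the proposed evaluation.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(experimental, control):
    """Exact one-sided permutation test on the difference of means.

    Returns the fraction of all relabelings of the pooled scores whose
    mean difference is at least as large as the observed one.
    """
    pooled = list(experimental) + list(control)
    size = len(experimental)
    observed = mean(experimental) - mean(control)
    hits = 0
    total = 0
    for chosen in combinations(range(len(pooled)), size):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen_set]
        # Small tolerance guards against floating-point ties.
        if mean(group_a) - mean(group_b) >= observed - 1e-12:
            hits += 1
        total += 1
    return hits / total

# Placeholder scores for one performance factor (not real results).
rapid_scores = [9, 8, 10, 7]
standard_scores = [5, 6, 4, 7]
p_value = permutation_p_value(rapid_scores, standard_scores)
```

For these placeholder scores, 2 of the 70 possible relabelings are at least as extreme as the observed split, giving a p-value of about 0.029. Because the success criterion covers four performance factors, a real analysis would also need to decide how to handle multiple comparisons.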

42 RAPID / REALISM evaluation
Component utility: we will also evaluate the utility of REALISM and RAPID by comparing the productivity of subjects under the following three conditions:
- Use of the integrated system
- Use of REALISM without RAPID
- Use of RAPID without REALISM
Component evaluation: we will measure the following performance factors:
- Accuracy and completeness of text extraction
- Accuracy of hypothesis evaluation
- Effectiveness of data-collection plans
- Speed of each system component

44 Main contributions
- Representation of massive uncertain knowledge
- Automated discovery of causal relationships
- Fast probabilistic integration of all evidence
- Analysis of possible future developments
- Identification of critical uncertainties
- Planning of proactive intelligence gathering
Figure labels (components numbered 1 to 4 in the diagram): Data; Dynamic Simulation Models; Strategy Generation and Exploration; Response Options; Feedback

45 Inputs and outputs
Figure labels (data flow through RAPID):
- Uncertain intelligence and analyst opinions
- Massive stream of structured records
- Specific hypotheses
- New learned rules
- Data-search queries
- Query matches
- Evaluation of hypotheses
- Plans for proactive intelligence collection
- Uncertain situation assessment
- Inference rules
- Domain knowledge
- General intelligence collection
- Proactive intelligence collection

46 Inputs
From other PAINT components:
- Available intelligence data and its certainty
- Hypotheses about unknown factors
- Available domain knowledge
From analysts:
- Intelligence-analysis tasks and priorities
- Hypotheses and related opinions
- Responses to RAPID-generated probes
- Additional domain knowledge
From other sources:
- Databases with available intelligence
- Public databases with relevant data

47 Outputs
- Inferences from available uncertain data
- Evaluation of given hypotheses
- New hypotheses and their certainties
- Plans for proactive intelligence collection
- Learned inference rules

