UNCLASSIFIED 1 Searching for the Quantifiable, Scalable, Verifiable, and Understandable Quantitative Methods in Defense of National Security, 25 May 2010.

1 UNCLASSIFIED Searching for the Quantifiable, Scalable, Verifiable, and Understandable
Quantitative Methods in Defense of National Security, 25 May 2010
Dewey Murdick, Ph.D., Program Manager

2 UNCLASSIFIED Intelligence Advanced Research Projects Activity (IARPA)

3 UNCLASSIFIED Overview
• This is about taking real risk.
  – This is NOT about “quick wins”, “low-hanging fruit”, “sure things”, etc.
• CAVEAT: HIGH-RISK/HIGH-PAYOFF IS NOT A FREE PASS FOR STUPIDITY.
  – Competent failure is acceptable; incompetence is not.
• “Best and brightest”.
  – World-class PMs.
    o IARPA will not start a program without a good idea and an exceptional person to lead its execution.
  – Full and open competition to the greatest possible extent.
• Cross-community focus.
  – Address cross-community challenges
  – Leverage agency expertise (both operational and R&D)
  – Work transition strategies and plans
IARPA’s mission is to invest in high-risk/high-payoff research programs that have the potential to provide the U.S. with an overwhelming intelligence advantage over our future adversaries.

4 UNCLASSIFIED The “P” in IARPA is very important
• Technical and programmatic excellence are required
• Each Program will have a clearly defined and measurable end-goal, typically 3-5 years out.
  – Intermediate milestones to measure progress are also required
  – Every Program has a beginning and an end
  – A new program may be started that builds upon what has been accomplished in a previous program, but that new program must compete against all other new programs
• This approach, coupled with rotational PM positions, ensures that…
  – IARPA does not “institutionalize” programs
  – Fresh ideas and perspectives are always coming in
  – Status quo is always questioned
  – Only the best ideas are pursued, and only the best performers are funded.

5 UNCLASSIFIED The “Heilmeier Questions”
1. What are you trying to do?
2. How does this get done at present? Who does it? What are the limitations of the present approaches?
  – Are you aware of the state-of-the-art and have you thoroughly thought through all the options?
3. What is new about your approach? Why do you think you can be successful at this time?
  – Given that you’ve provided clear answers to 1 & 2, have you created a compelling option?
  – What does first-order analysis of your approach reveal?
4. If you succeed, what difference will it make?
  – Why should we care?
5. How long will it take? How much will it cost? What are your mid-term and final exams?
  – What is your program plan? How will you measure progress? What are your milestones/metrics? What is your transition strategy?

6 UNCLASSIFIED The Three Strategic Thrusts (Offices)
• Smart Collection: dramatically improve the value of collected data
  – Innovative modeling and analysis approaches to identify where to look and what to collect.
  – Novel approaches to access.
  – Innovative methods to ensure the veracity of data collected from a variety of sources.
• Incisive Analysis: maximizing insight from the information we collect, in a timely fashion
  – Advanced tools and techniques that will enable effective use of large volumes of multiple and disparate sources of information.
  – Innovative approaches (e.g., using virtual worlds, shared workspaces) that dramatically enhance insight and productivity.
  – Methods that incorporate socio-cultural and linguistic factors into the analytic process.
  – Estimation and communication of uncertainty and risk.
• Safe and Secure Operations: countering new capabilities of our adversaries that could threaten our ability to operate effectively in a networked world
  – Cybersecurity
    o Focus on future vulnerabilities
    o Approaches to advancing the "science" of cybersecurity, to include the development of fundamental laws and metrics
  – Quantum information science & technology

7 UNCLASSIFIED Program Manager Interest Areas by Office
[Figure: program manager interest areas grouped by office: smart collection, incisive analysis, safe and secure operations]

8 UNCLASSIFIED Concluding Thoughts on IARPA
• Technical Excellence & Technical Truth
  – Scientific Method
  – Peer/independent review
  – Full and open competition
• We are looking for outstanding PMs.
• How to find out more about IARPA:

9 UNCLASSIFIED
• Conference on Technical Information Discovery, Extraction & Organization
  – Mark Heiligman, IARPA PM, Mile-wide, Mile-deep (M2) Exploration
  – Held October 28-29, 2008, consisted of talks, breakout sessions, and open discussion
  – Attended by 30+ researchers, business intelligence, and government participants
• Facilitated an open and active discussion on current methods, challenges, and opportunities in:
  – Information Retrieval
  – Text Processing
  – Knowledge Discovery
  – Information Extraction
  – Social Network Analysis
  – Scientometrics
  – Information Visualization and
  – Closely related research domains
• Goal: Drive technical innovation and explore novel applications in the area of systematically mining the global technical literature for useful and non-obvious information and insights
This talk is a personal summary of the materials presented and discussed at the conference.

10 UNCLASSIFIED M2 Information Content
• Formal Presentations
  – Mile-wide, Mile-deep, Mark Heiligman, IARPA
  – Information Retrieval, Scientometrics/Text Mining, and Literature-related Discovery and Innovation, Ron Kostoff, MITRE
  – From Knowledge Mapping to Innovation Evolution, Hsinchun Chen, University of Arizona
  – Machine Learning for Extraction, Integration and Mining of Research Literature, Andrew McCallum, University of Massachusetts Amherst
  – Information Retrieval: The Path Ahead, Jamie Callan, Carnegie Mellon University
  – Sentiment Analysis from User Forums, Ronen Feldman, Hebrew University
  – The Accuracy of a Map of Science: Measurement & Implications, Richard Klavans, SciTech Strategies, Inc
  – Document Classification Using Nonnegative Matrix Factorization, Michael W. Berry, University of Tennessee, Knoxville
• Breakout Sessions & Open Discussion – richest idea content, and biggest contribution to what follows
• MITRE Summary:
  – A Two-step Analytic-workshop Process For Identifying Promising Research Opportunities, by Ronald Kostoff et al.

11 UNCLASSIFIED Problems
• Too Much Data / Diversity
  – Scale
  – Textual / Multimedia
  – Multilingual
  – Multiple Sources
• Too Complex
  – Motivation (Create / Disseminate)
  – Topics / Domains (# / Connectedness)
  – Shared Intentionally or Not
• Too Fast
  – Streaming
Example for Technical Topics: Scientific Literature, Patents, Conference Proceedings, Talks, Technical Blogs, S&T News, Social Media, Experimental Data, Computational Models / Code, Forecasts, Corporate Filings, Government Funding, Policy, Public Opinion, etc.

12 UNCLASSIFIED Weak Signals in Context
• Find weak signals
• Use weak signals within context for
  – Finding connections
  – Anomaly detection/rare events
  – Cultural meaning / implications
• Manage uncertainty
• Develop new standards for “ground truth”

13 UNCLASSIFIED Connecting Weak Signals
• Automated Connection Making / Knowledge Discovery
• Iterative information retrieval (IR), extraction (IE), and linkage identification
• Leveraging previous relevancy judgments and feedback
• Probabilistic linking of subjective qualities within text
• Goal: find high-value, low-signature information in context
[Figure labels: "Material processing method X may be interesting for property Y"; "Intriguing Rumors, Uncertain Source"; "Analyst"; "Analyst w/ Quantitative System"]

14 UNCLASSIFIED Enhancing Contextual Awareness
• Automatically
  – Leverage element characteristics in connection building process
  – Focused information augmentation from secondary sources
  – Characterize and apply to analogous situations
    o Network Behaviors and Features
    o Assessments of subjectivity (e.g., theme, sentiment)
• Goal: rapidly inform non-experts with context about a given area/issue
[Figure labels: "Context"; "S&T Literature"; "Where does this nugget of information fit?"; "Analyst"]

15 UNCLASSIFIED Identifying Outliers, Rare Events
• Automatically
  – Measuring and analyzing low-frequency indicators in group trends
  – Systematically identifying anomalies from records of interest and early-stage emerging technologies
  – Identifying rare events based on non-technical phrase association patterns
  – Extracting technical phrases of interest by targeting non-technical phrases such as sentiment, analysis, stylistics, etc.
  – Intelligent clustering techniques
• Goal: Identify significant rare events
[Figure labels: "Bank statements"; "Is Jim doing something illegal?"; "Analyst"]
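
One way to make "identifying anomalies from records of interest" concrete is a robust outlier test on a series of counts. The sketch below is an illustration only and is not taken from the talk: it applies the modified z-score (based on the median absolute deviation) to a hypothetical series of monthly phrase mentions. The function name `flag_outliers`, the sample data, and the conventional 3.5 threshold are all assumptions.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds threshold.

    The modified z-score uses the median and the median absolute deviation
    (MAD), so a single extreme value does not mask itself the way it can
    with a mean/standard-deviation test.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # At least half the values equal the median; the score is undefined,
        # so this sketch flags nothing.
        return []
    return [
        i
        for i, v in enumerate(values)
        if abs(0.6745 * (v - med) / mad) > threshold
    ]


# Hypothetical monthly counts of a technical phrase in a document stream.
monthly_mentions = [4, 5, 3, 6, 4, 5, 41, 4]
print(flag_outliers(monthly_mentions))  # prints [6]
```

A test like this only surfaces candidates; deciding whether a flagged record is a significant rare event still needs the surrounding context.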

16 UNCLASSIFIED Collaboration (Two Different Kinds)
• Common playground facilitating:
  – Large-scale data sharing
  – Data discovery annotation
  – Error corrections
  – Multi-source integration
  – Recall of what has been done in the past
• Measure collaboration
  – Recognize cultural differences
  – Discover key players
  – Process changes over time

17 UNCLASSIFIED Multilingual Methods
• Need algorithms that can process, filter, and analyze multilingual data
• Leverage domain-specific machine translation
• Compare and contrast translated and multilingual data for improvements in queries, trends, etc.
• Language translation is high cost
• Translation is not enough to understand meaning in non-English text
• Cultural information helps to understand social landscape, motivation, and production of scientists in S&T

18 UNCLASSIFIED No Black Boxes
• No algorithm black boxes
  – Shared environment for algorithm development
  – Success verifiable through indicator metrics
  – Output must be humanly comprehensible
• Human comprehension metrics:
    o Number of potential associations
    o Number of dimensions simultaneously analyzed
    o Steps to finding information
    o Amount of time to digest information
    o Amount of information at a time
    o Efficiency of user-driven tuning of level-of-detail
• Algorithmic output exportable to interactive tools

19 UNCLASSIFIED User-Friendly Displays for Data Analysis
• Interactive and multifaceted views of scientific landscape
  – Geo-location
  – Entity Networks
  – Topical Networks
• Environments that provide both contextual awareness and visualizations
  – Contextual information (Wikipedia style) provided when user encounters unfamiliar term or concept
• Interactive interfaces to pull out information

20 UNCLASSIFIED Metric Validation Processes
• User studies and human labeling to verify data in information extraction (IE) and NLP are costly
• Use hybrid methods (e.g., boosting)
• Leverage automatically processed information from an external source to validate output
• Automating identification of trusted sources to help validation process
• Validate results with historical studies, knowledge of current state, and forecasts
Serious Need for Novel Thinking
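
Validating output against an external source usually comes down to scoring a system's extractions against a reference set. As a minimal sketch, assuming the trusted source can be reduced to a set of reference terms, precision, recall, and F1 can be computed as below. The function name `precision_recall_f1` and the sample term lists are hypothetical and do not come from the talk.

```python
def precision_recall_f1(extracted, reference):
    """Score extracted items against a reference (gold-standard) set.

    precision = correct extractions / all extractions
    recall    = correct extractions / all reference items
    f1        = harmonic mean of precision and recall
    """
    extracted, reference = set(extracted), set(reference)
    true_pos = len(extracted & reference)
    precision = true_pos / len(extracted) if extracted else 0.0
    recall = true_pos / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical output of an extraction system and a hypothetical trusted list.
system_output = {"graphene", "carbon nanotube", "aerogel", "unobtainium"}
trusted_list = {"graphene", "carbon nanotube", "aerogel",
                "metamaterial", "perovskite"}

p, r, f = precision_recall_f1(system_output, trusted_list)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
# prints precision=0.75 recall=0.60 f1=0.67
```

The scores are only as good as the reference set, which is why the slide also calls for automating the identification of trusted sources.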

21 UNCLASSIFIED Things to Remember
• Track Uncertainty
  – Indicator metrics
  – Weak signals
• No black boxes
  – Human comprehensible output
• Provide clear view of evaluation metrics
  – Gold standards
  – Ground truth

22 UNCLASSIFIED Take Action
• Respond to an open BAA
• Chat with a Program Manager (PM)
• Come up with new ideas for programs, become a PM
• Provide information to open RFIs
