Analytic Standards: Common Standards for Evaluating the Quality of Analysis Dr. Brand Niemann Director and Senior Enterprise Architect – Data Scientist.

1 Analytic Standards: Common Standards for Evaluating the Quality of Analysis
Dr. Brand Niemann
Director and Senior Enterprise Architect – Data Scientist
Semantic Community
AOL Government Blogger
June 18, 2012

2 Background
OMG Financial and Government DTF Meetings, Reston, VA, March 21 & 22, 2012
Big Data Analytics: Finding the right needles in the Haystacks
Working session (cancelled): Interactive working session to define the charter and scope of this new working group within FDTF to focus on ‘Linked Semantic Networks’.
Semantics and Ontologies for the Intelligence Community: Working Toward Standards
I had the assignment to get the Catalyst Program Report and provide recommendations back on Analytic Standards: Common Standards for Evaluating the Quality of Analysis.
Teamed with David Webber: Rapid NIEM XML Exchange Design and UML Models.

3 New Developments
New Government Funding and Challenge Competitions
OMG OpenGovernment to Information Sharing Working Group Proposal from Cory Casanave, OMG Open Government Working Group, Chair, June 19th
OMG SmartData Coalition Kick-off, June 19th

4 New Government Funding and Challenge Competitions
Building a Digital Government: Federal CIO Steve VanRoekel
Innovation with Health Data: Federal CTO Todd Park
Big Data Initiatives: NIST: In-Q-Tel and NITRD:

5 Building a Digital Government

6 Building a Digital Government by Example
Each entry below gives the Report Reference Number, the Report Explanation, and My By Example.

Reference 2
Report Explanation: Digital information is information that the government provides digitally. Information, as defined in OMB Circular A-130, is any communication or representation of knowledge such as facts, data, or opinions in any medium or form, including textual, numerical, graphic, cartographic, narrative, or audiovisual forms.
My By Example: I have included facts, data, or opinions in any medium or form, including textual, numerical, graphic, cartographic, narrative, or audiovisual forms.

Reference 3
Report Explanation: Digital services include the delivery of digital information (i.e. data or content) and transactional services (e.g. online forms, benefits applications) across a variety of platforms, devices and delivery mechanisms (e.g. websites, mobile applications, and social media).
My By Example: I have not included transactional services in this example (see Reference 34), but this does work across a variety of platforms, devices and delivery mechanisms.

Reference 4
Report Explanation: Device-agnostic means a service is developed to work regardless of the user’s device, e.g. a website that works whether viewed on a desktop computer, laptop, smartphone, media tablet or e-reader.
My By Example: MindTouch and Spotfire work on multiple devices.

Reference 14
Report Explanation: For the purposes of this document, the term “content” will refer to all unstructured information, while the term “data” will refer to all structured information unless otherwise noted.
My By Example: I have integrated both unstructured and structured information.

Reference 15
Report Explanation: Web APIs are a system of machine-to-machine interaction over a network. Web APIs involve the transfer of data, but not a user interface.
My By Example: MindTouch and Spotfire provide both APIs and user interfaces.

Continues with 9 more Mappings of the Report Reference Numbers to My By Examples

7 Innovation with Health Data

8 Now and Next II Semantic Community II Comments Drupal 7 MindTouch TCS Publishing and Community Engagement Solr Lucene (MindTouch TCS) and Spotfire Faceted Search CKAN and (MindTouch TCS) On Demand Resources EC2 - GovCloud EC2 - Amazon Platform- and Data-as-a-Service GetHub Deki Script (MindTouch TCS) No Coding Agile Dynamic Case Management MindTouch, Spotfire, and Be Informed

9 OMG OpenGovernment to Information Sharing Working Group Proposal
It has been a while since we started the OMG OpenGovernment working group under the government domain task force. This seems to have suffered the same fate as the overall open government initiative – it isn’t what people are talking about today. What would people think about re-chartering this as the “model driven information sharing” working group as that theme seems more front and center, it is also more in line with NIEM and potential related standards.
Cory Casanave, Open Government Working Group, Chair
Comments:
I like the idea very much!
Richard Soley
Intriguing idea. Would be curious what others think. Thanks Cory for sparking the thread. Thanks.
Kshemendra Paul

10 Question and Answer
Question: How far do you think NIEM UML takes us towards building a useful bridge between open government and information sharing, and what do you think are the next steps?
Kshemendra Paul
Answer: Not very far. What about joining with this new OMG Smart Data SIG? It supports the current Building a Digital Government, Innovation with Health Data, and Big Data focus that is getting new funding.
Brand Niemann

11 OMG SmartData Coalition Kick-off, June 19th
SmartData Coalition will NOT be developing any standards.
Members will work to define and validate business usage scenarios and data innovation in a given domain or across domains.
We are hoping to get a business champion from major domains such as financial services, healthcare, insurance...who can understand and champion ‘why we need data semantics’ – Big Data is gaining traction BUT SmartData is where the future business value resides and without semantics we’ll not get there…or get there at a higher cost…
We’ll partner with OMG and other domain standards groups and identify available standards and gaps.
Of course, we’ll also partner with technical/data management standards experts to figure out how we might ‘dynamically connect’ structured and unstructured/semi-structured data and its semantics within an organization and public data.
As public and private sector face many similar challenges in deriving business value from data, we’ll also reach out to leaders from public sector to get their support.
We will refine the charter, roadmap, deliverables etc. during the June 19th working session.
Source: Harsh Sharma, June 12, 2012.

12 GovDTF Roadmap
U.S. Government WG Steering Committee manages the overall roadmap of the GovDTF. It is chaired by Bob Spangler of NARA and Rick Murphy of GSA.
This group has been operating in the context of the plenary. Perhaps we should push on expanding its independent activity.
Its membership needs to be brought up to date. I would like to see the ISE & DNI join this Steering Committee.
Members are appointed by the chairs guided by the consensus of the GovDTF membership.
Source: Larry Johnson, Government Information Sharing and Services DTF, Co-Chair, June 14, 2012.

13 SmartData Coalition: Draft Charter
Help organizations better leverage and monetize their ‘Data Assets’
Empower Business to ‘dynamically stitch together’ internal and external data to answer operational questions, meet regulatory needs, get competitive insights and seek additional revenue opportunities…
Champion the notion that focus on ‘Data Semantics*’ is the key to deriving better business value from data assets
Define and promote the role and certification for SmartData professional (SDP)
Define and validate business usage scenarios (domain and cross-domain) for SmartData
Define and publish ‘business framework/Core’ for SmartData
Partner with stakeholders from private & public sector, OMG and other standards bodies to define and validate the framework, namespace (URI?) alignment framework and required standards**
Research taxonomies of identifiers that will allow reasoning engines to access and process domain semantics in near real-time
Namespace/URI registry metamodel for industry standard semantics networks
Develop requirements for a logical data model for portfolio of Data Assets (“Depository of corporate data Assets’ Semantics”)
Chair(s): TBD
Footnotes:
* Semantics (business meaning, context, nuances, rules)
** SmartData SIG by itself will NOT develop standards
Source: Neville Teagarden, Lucid, Harsh Sharma, Citi, June 12, 2012.

14 My Perspective
Object Management Group Became The Home of Modeling Standards in 1989.
OMG Embraced Cloud Computing as a New Wave in 2009 and Formed the Cloud Standards Customer Council in 2011.
OMG Supported the Business Value of Data and Semantics in March 2012.
OMG Can Become the Home of SmartData and Semantics for Business. This will improve all of the modeling standards work to have better data and semantics in the business applications!
OMG Can Become the Home of Analytic Standards for Quality Analysis Based on SmartData and Semantics for Business.

15 Be Informed Has Joined OMG
My colleague Mills Davis provided the following SemTech 2012 take-aways on Be Informed (a new member of OMG):
Be Informed is perceived as a technology leader -- it has taken semantic, model-driven computing further than anyone else in the space, yet it is still RDF and RDF/S at the core!
Be Informed is the largest semantic technology pure play -- as measured by customers, employees, and annual revenues.
From a "crossing the chasm" perspective, Be Informed is perceived as already being established at the "bowling alley" stage -- it is capable of handling mainstream, enterprise class problems, and delivering solid value at every stage of the solution life cycle -- as shown by multiple customer stories and case study presentations.
The conference organizer (Tony Shaw) was recommending Be Informed as a "company on the way up", and offered them speaking opportunities at upcoming events.

16 Be Informed – They Really Got It!

17 It All Starts With The Heilmeier Questions
Each IARPA program starts with a good idea and a good person to lead it. Without both, IARPA will not start a program.
1. What are you trying to do?
2. How does this get done at present? Who does it? What are the limitations of present approaches? Are you aware of the state-of-the-art and have you thoroughly thought through all the options?
3. What is new about your approach? Why do you think you can be successful at this time? Given that you've provided clear answers to 1 & 2, have you created a compelling option? What does first-order analysis of your approach reveal?
4. If you succeed, what difference will it make? Why should we care?
5. How long will it take? How much will it cost? What are your mid-term and final exams? What is your program plan? How will you measure progress? What are your milestones/metrics? What is your transition strategy?
Source:

18 OMG Government DTF Meeting, Reston, VA, March 22, 2012
I attended this meeting and volunteered to work on an assignment for the next meeting. I obtained a report that was very helpful for that assignment, and I used the CIA World Fact Book to illustrate ten Catalyst functions (see next slides).

19 Analytic Transformation: Unleashing the Potential of a Community of Analysts
Linking Disparate and Dispersed Data to Aid Intelligence Discovery, Analysis, and Warning
What is Catalyst? Catalyst is a program to enable analysts to make discoveries in large amounts of intelligence data without succumbing to information overload.
How can Catalyst help us? Catalyst will introduce an all-source data-linking process into the traditional intelligence business model.
What is happening with Catalyst? A scaling experiment has been completed to support the design of common services for the Community.

20 Catalyst Knowledge Base Dashboard
This is a simple example of Catalyst! The CIA World Factbook is a simple example that is scaled up to 267 countries! Web Player

21 Gall’s Law
"A complex system that works is invariably found to have evolved from a simple system that worked. The inverse proposition also appears to be true: a complex system designed from scratch never works and cannot be made to work. You have to start over, beginning with a simple system." - John Gall, systems theorist
Key Points:
Gall's Law says that all complex systems that work evolved from simpler systems that worked.
If you want to build a complex system that works, build a simpler system first, and then improve it over time.
Gall's Law is why Prototypes and Iteration work so well when creating value.
Creating a complex system from scratch is sure to end in failure.
Questions for Consideration:
Are you trying to build a complex system from scratch?
Could you start with a simpler system that already works, then build upon it?

22 Analytic Transformation Catalyst Program Definitions
Entity: A representation of a thing in the real world, either concrete or abstract (e.g., Name).
Entity Extraction: The identification and classification of entities embedded in some kind of unstructured data, such as free text, an image, a video, etc. (People, Places, and Things).
Relationship Extraction: The identification and classification of object properties (relationships) embedded in some kind of unstructured data, such as free text, an image, a video, etc.
Semantic Integration: Integrate entities and their attributes and relationships to provide better data to work with in the knowledge base.
Entity Disambiguation: The association of two entities extracted from data as being two instances of the same real-world entity.
Knowledge Base: A collection of entities (instances) called quad stores, where each datum is a triple of an entity's property with value and the associated metadata.
Visualization: Interfaces to the integrated entities knowledge base like timeline or geographic displays of the entities that help the analyst understand the set as a whole.
Query: Interface that allows analysts to search the integrated entities knowledge base for entities of interest.
Analysis: Information made available to users so they can retrieve information about entities and detect patterns of interest to their mission.
Ontology/Data Model: The definitions of the classes and the properties of the classes.
Reference Data: Government databases that are openly available at no cost (e.g., CIA World Fact Book).
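The Knowledge Base definition above (each datum is an entity's property with its value plus associated metadata) can be pictured as a small in-memory quad store. This is a minimal sketch and not the Catalyst implementation: the `Quad` type, the function names, and the sample rows are all hypothetical and chosen only for illustration.

```python
from collections import namedtuple

# One datum: an entity, one of its properties, the value, and metadata
# about where the statement came from (all field names are assumptions).
Quad = namedtuple("Quad", ["entity", "prop", "value", "source"])


def build_knowledge_base(quads):
    """Index quads by entity so everything known about it can be retrieved."""
    kb = {}
    for quad in quads:
        kb.setdefault(quad.entity, []).append(quad)
    return kb


def query(kb, entity):
    """Return (property, value, source) tuples for one entity of interest."""
    return [(q.prop, q.value, q.source) for q in kb.get(entity, [])]


# Illustrative placeholder rows, not values taken from the Factbook.
quads = [
    Quad("Nauru", "type", "Country", "example-source"),
    Quad("Nauru", "region", "Oceania", "example-source"),
    Quad("Tuvalu", "type", "Country", "example-source"),
]
kb = build_knowledge_base(quads)
print(query(kb, "Nauru"))
```

Keeping the source alongside each property and value is what distinguishes a quad from a plain triple; a real store would also need the disambiguation step defined above before merging entities.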

23 Analytic Transformation Catalyst Program Example: CIA World Fact Book
Entity: CIA Subject Matter Experts
Entity Extraction: MindTouch and Excel
Relationship Extraction: MindTouch and Excel
Semantic Integration: MindTouch, Excel, and Spotfire
Entity Disambiguation: MindTouch and Excel
Knowledge Base: MindTouch, Excel, and Spotfire
Visualization: Spotfire
Query: MindTouch and Spotfire
Analysis: Spotfire
Ontology/Data Model: Be Informed
Reference Data: MindTouch

24 A CIA World Factbook Framework
The World Factbook provides information on the history, people, government, economy, geography, communications, transportation, military, and transnational issues for 267 world entities. Our Reference tab includes: maps of the major world regions, as well as Flags of the World, a Physical Map of the World, a Political Map of the World, and a Standard Time Zones of the World map.

25 Entity Extraction: MindTouch and Excel
Steps in Creating Country Sub-Pages:
Copy: wiki.toc(page.path) embedded inside double braces
Source: Add URL
Go to:
Click on New Page
Select Blank Page
Paste: {{wiki.toc(page.path)}}
Click on Nauru in Excel:
Copy URL:
Click on Expand All
Paste URL to Source:
Copy Nauru to Page Title
Save Page
Copy Nauru Page (carefully)
Edit Nauru Page
Delete Line Space at Top
Paste Nauru Page Below Source
Delete First and Expand All/Collapse All Rows
Delete Editing Icon (Yellow) and Text After :: and Make Header 1
Do the Same for the Eight Additional Editing Icons
Save the Page and Check to Make Sure there are Nine Items in the Table of Contents at the Top (there are a few Countries that have fewer than Nine)
Repeat the Process 277 More Times
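Because the steps above are repeated once per country, the page skeleton (the `{{wiki.toc(page.path)}}` snippet, a Source line, and the country name as page title) could be generated from the Excel list rather than typed by hand. The sketch below only builds the stub text; it does not call any MindTouch API, and the function names and the `example.org` URLs are placeholders, not the addresses used in the presentation.

```python
# Table-of-contents snippet quoted in the steps above.
TOC_SNIPPET = "{{wiki.toc(page.path)}}"


def page_stub(source_url):
    """Return the starting text of a country sub-page: TOC, then the Source line."""
    return f"{TOC_SNIPPET}\nSource: {source_url}\n"


def build_stubs(rows):
    """Map each page title (country name) to its stub text.

    rows is an iterable of (country, url) pairs, e.g. two columns
    exported from the Excel sheet.
    """
    return {country: page_stub(url) for country, url in rows}


# Hypothetical rows; the real URLs are not given in the slide.
rows = [
    ("Nauru", "https://example.org/nauru"),
    ("Tuvalu", "https://example.org/tuvalu"),
]
stubs = build_stubs(rows)
print(stubs["Nauru"])
```

The copy, paste, and header clean-up of the country content itself would still follow the manual steps; this only removes the repetitive page set-up.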

26 Entity Extraction: MindTouch and Excel

27 Entity Extraction: MindTouch and Excel

28 Relationship Extraction: MindTouch and Excel
One Table:
Two Columns:
Example: Column 1: Section and Column 2: URL
Note: A Column 3: Description could be in the URL
Example: See Slide 11
Three Columns:
Example: Column 1: Subject, Column 2: Object, and Column 3: Predicate
Note: This is the Semantic Web’s Linked Open Data Cloud as Linked Open Data for Network Analytics!
Example: See Semantic Medline
Four Columns:
Examples: Column 1: Subject, Column 2: Attribute, Column 3: From, and Column 4: To, or Column 1: City, Column 2: Country, Column 3: Longitude, and Column 4: Latitude
Note: This is the format for Spotfire’s Network Analytics Module developed for the CIA
Note: Also Multiple Tables for Federation of Data Sets with Spotfire Information Designer and Open Software Virtuoso.
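The three-column layout above (Subject, Object, Predicate) is enough to drive a simple network view. As a rough sketch, assuming the table has been exported as CSV with exactly those column headers, the rows can be read into triples and turned into an adjacency map. The sample rows, the `locatedIn` predicate, and the function names are hypothetical; this is not Spotfire's Network Analytics format or API.

```python
import csv
import io
from collections import defaultdict

# Illustrative three-column table in the column order given on the slide.
THREE_COLUMN_CSV = """Subject,Object,Predicate
Nauru,Oceania,locatedIn
Tuvalu,Oceania,locatedIn
"""


def read_triples(text):
    """Parse the CSV text into (subject, predicate, object) triples."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["Subject"], row["Predicate"], row["Object"]) for row in reader]


def neighbors(triples):
    """Build an undirected adjacency map: node -> set of linked nodes."""
    graph = defaultdict(set)
    for subject, _predicate, obj in triples:
        graph[subject].add(obj)
        graph[obj].add(subject)
    return graph


triples = read_triples(THREE_COLUMN_CSV)
graph = neighbors(triples)
print(sorted(graph["Oceania"]))
```

The four-column variants would need their own readers, since their third and fourth columns carry a range or coordinates rather than a predicate.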

29 Semantic Integration: MindTouch, Excel, and Spotfire
Web Player

30 Remaining Catalyst Functions
Entity Disambiguation: MindTouch and Excel. Work of CIA SMEs and see next slides on Query.
Knowledge Base: MindTouch, Excel, and Spotfire. See previous slides.
Visualization: Spotfire
Query: MindTouch and Spotfire. See next slides.
Analysis: Spotfire
Ontology/Data Model: Be Informed. See separate slides.
Reference Data: MindTouch
Note: The System of System Architecture and Process is: Semantic Index of Linked Data, Data Science Products, Data Science Library, and Dynamic Case Management.

31 Query: MindTouch and Spotfire
Google Chrome Browser: Find

32 Query: MindTouch and Spotfire
Spotfire Tools: Find and Filters Web Player

33 Analytic Standards: Common Standards for Evaluating the Quality of Analysis
What are Analytic Standards?
Analytic Standards govern the production and evaluation of national intelligence analysis. The standards are intended to guide the writing of intelligence analysis in all Intelligence Community (IC) analytic elements and should be included in analysis teaching modules and case studies.
The following five core principles serve as the nucleus of analytic standards:
Objectivity
Independent of political considerations
Timeliness
Informed by all relevant sources of information
Demonstrates proper standards of analytic tradecraft
The Office of Analytic Integrity and Standards (AIS) within the Office of the Director of National Intelligence is constantly working to build a network of analysts interested in learning new methods, connecting with other analysts using structured techniques, and learning from methodological experts both inside and outside of the IC.

34 Analytic Standards How can Analytic Standards help us?
Common standards across the IC leave no room for ambiguity, and provide clear, consistent guidance to analysts, managers, and trainers for the production of analytic products and processes. The five core principles set the standard by which analytic products can be measured using quantitative and qualitative methods. What is happening with Analytic Standards? AIS provides continuous feedback to IC elements on the quality of analytic tradecraft and recently published a report analyzing a sample of over 1,500 of the Community’s finished intelligence products from 2006 and To promote continuous learning and improvement, each IC analytic element is developing or refining its own in-house analytic tradecraft evaluation program to further advance understanding of the analytic standards and how to apply them.
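One simple way to picture measuring a product against the five core principles from slide 33 is a checklist that records a yes/no judgment per principle and reports the fraction met. This is only an illustrative sketch: the scoring rule, the function name, and the sample review are assumptions, and it does not represent how AIS actually evaluates products.

```python
# The five core principles as listed on slide 33.
PRINCIPLES = [
    "Objectivity",
    "Independent of political considerations",
    "Timeliness",
    "Informed by all relevant sources of information",
    "Demonstrates proper standards of analytic tradecraft",
]


def fraction_met(review):
    """Return the share of principles a reviewer judged as met.

    review maps each principle name to True or False; a missing
    principle is treated as an incomplete review, not as a failure.
    """
    missing = [p for p in PRINCIPLES if p not in review]
    if missing:
        raise ValueError(f"review is missing judgments for: {missing}")
    met = sum(1 for p in PRINCIPLES if review[p])
    return met / len(PRINCIPLES)


# Hypothetical review of one product: every principle met except timeliness.
sample_review = {p: True for p in PRINCIPLES}
sample_review["Timeliness"] = False
print(fraction_met(sample_review))
```

A yes/no checklist captures only the quantitative side; the qualitative judgment behind each answer would still have to be documented by the reviewer.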

35 Next Steps
I am building a team to work on the OMG project where each team member will have a short list of tools they are familiar with to apply to our NGA and other work.
Team (to date):
Kate Goodier
Elisa Kendall
Eric Little
Brand Niemann, Jr.
Brand Niemann Sr.
Joe Rockmore
My short list is:
Cambridge Semantics (Lee Feigenbaum)
Digital Reasoning (Eric von Eckartsberg)
Recorded Future (Jason Hines)
Semantic Insights Research Assistant (Chuck Rehberg)
Semantic Medline (Tom Rindflesch)
Spotfire (Jim Hawley)
NIST BIG DATA Workshop (June 13-14):
Neal Ziring, NSA, Technical Director of Information Assurance Directorate
Dr. Peter Highman, IARPA
14th SOA for Government Conference (October 3, 2012):
Eric Little (Orbis Technologies)
Kate Goodier (L3 Stratis)
George Strawn (OSTP NITRD)
Victor Pollara (Noblis)
Tom Rindflesch (NLM)
Mark Guilton (Cray Computer) or Steve Reinhardt (YARC Data)

36 Questions and Answers Dr. Brand Niemann
Director and Senior Data Scientist Semantic Community
