Graph Data Analytics Arka Mukherjee, Ph.D. Global IDs Resolving Complexity at an Enterprise Scale.

Presentation transcript:

Graph Data Analytics. Arka Mukherjee, Ph.D., Global IDs. Resolving Complexity at an Enterprise Scale.

© 2013 Global IDs. Proprietary.

Topics
1. The “Complex Data” Context
2. Current Challenges
3. Governance Methodology

The “Complex Data” Context

The Big Shift

The cost structure is unsustainable
The cost of managing information is going up exponentially.

The complexity growth is unmanageable
1. Complex data ecosystems
2. Highly dynamic
3. Limited traceability
4. Systemic risk: hard to measure
Financial Services Institutions

Question
How can enterprises handle the cost and complexity of managing complex data landscapes?

Global IDs Focus
To organize enterprise data landscapes

Global IDs: Product Suite

Challenges

The typical financial institution
Number of databases: > 1,000
Number of tables: > 200,000
Number of columns: > 2,000,000

Question
How can we understand the relationships across 2,000,000 attributes?
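
To give a sense of why this question is hard, the short calculation below counts how many attribute pairs a naive all-against-all comparison would have to examine. The 2,000,000 figure comes from the slide; the comparison rate is a purely hypothetical assumption chosen for illustration.

```python
from math import comb

# Figure from the slides: more than 2,000,000 columns (attributes).
NUM_ATTRIBUTES = 2_000_000

# Hypothetical assumption, not from the slides: one million
# attribute-pair comparisons per second.
ASSUMED_COMPARISONS_PER_SECOND = 1_000_000


def pairwise_comparisons(n: int) -> int:
    """Number of unordered pairs among n attributes: n * (n - 1) / 2."""
    return comb(n, 2)


pairs = pairwise_comparisons(NUM_ATTRIBUTES)
days = pairs / ASSUMED_COMPARISONS_PER_SECOND / 86_400  # seconds per day

print(f"{pairs:,} attribute pairs")
print(f"about {days:.1f} days at the assumed rate")
```

Under these assumptions a brute-force pass covers close to two trillion pairs, which suggests that some form of indexing or pruning is needed before relationships can be explored at this scale.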

Converging Data Variety
Data content: structured, unstructured, multi-structured

Converging Data Ecosystems
Data ecosystems: social data, enterprise data, machine data

Current Approaches Do Not Scale
Number of databases by enterprise size:
Small: > 1,000
Average: > 10,000
Large: > 100,000

A New Approach Is Required

Utilize Graph Structures for Governance

Graph Analytics: Use Cases

Key Challenges
1. Vast diversity and volume of metadata and data
2. Storage and indexing of metadata to facilitate search and navigation
3. Understanding the connection between different pieces of metadata (crosswalk)

Utilize Graph Structures for Storing Complex Data
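
The slides do not describe the storage model itself, so the following is only a minimal sketch of one way metadata could be held in a graph: databases, tables, and columns become nodes, and typed edges record containment and crosswalk links. The class `MetadataGraph`, the relation names `contains` and `same_as`, and all database, table, and column names are hypothetical.

```python
from collections import defaultdict


class MetadataGraph:
    """Tiny in-memory graph for metadata (illustrative only)."""

    def __init__(self):
        self.nodes = {}                # node id -> kind ("database", "table", "column")
        self.edges = defaultdict(set)  # source id -> {(relation, target id), ...}

    def add_node(self, node_id, kind):
        self.nodes[node_id] = kind

    def add_edge(self, source, relation, target):
        # Refuse edges that mention unknown nodes so typos surface early.
        for node_id in (source, target):
            if node_id not in self.nodes:
                raise KeyError(f"unknown node: {node_id}")
        self.edges[source].add((relation, target))

    def neighbors(self, node_id, relation=None):
        """Targets reachable in one hop, optionally filtered by relation."""
        return sorted(
            target
            for rel, target in self.edges.get(node_id, ())
            if relation is None or rel == relation
        )


graph = MetadataGraph()
graph.add_node("crm", "database")
graph.add_node("crm.customer", "table")
graph.add_node("crm.customer.cust_id", "column")
graph.add_node("crm.customer.email", "column")
graph.add_node("billing.account.contact_email", "column")

graph.add_edge("crm", "contains", "crm.customer")
graph.add_edge("crm.customer", "contains", "crm.customer.cust_id")
graph.add_edge("crm.customer", "contains", "crm.customer.email")
# A crosswalk link between two columns that live in different databases.
graph.add_edge("crm.customer.email", "same_as", "billing.account.contact_email")

print(graph.neighbors("crm.customer", "contains"))
print(graph.neighbors("crm.customer.email", "same_as"))
```

Keeping the relation as part of each edge lets containment and crosswalk links share one structure, so navigation and search can follow either kind of connection.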

Use Case 1: Enterprise Metadata Search with Hadoop

Use Case 2: Unstructured Data Integration

Use Case 3: Cross-Database Similarity Mapping
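
The slides do not say how similarity between databases is computed. As one possible illustration, the sketch below scores pairs of columns from different databases by the Jaccard overlap of their distinct values. Jaccard similarity is a technique swapped in here as an assumption, not the product's documented method, and the column names, sample values, and threshold are hypothetical.

```python
def jaccard(left, right):
    """Jaccard similarity of two value collections: |A & B| / |A | B|."""
    a, b = set(left), set(right)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_columns(columns, threshold=0.5):
    """Return (column, column, score) for cross-database pairs at or above threshold.

    `columns` maps "database.table.column" to an iterable of sample values.
    """
    names = sorted(columns)
    matches = []
    for i, left in enumerate(names):
        for right in names[i + 1:]:
            # Only compare columns that belong to different databases.
            if left.split(".")[0] == right.split(".")[0]:
                continue
            score = jaccard(columns[left], columns[right])
            if score >= threshold:
                matches.append((left, right, score))
    return matches


# Hypothetical sample values for three columns in two databases.
sample = {
    "crm.customer.email": ["a@x.com", "b@x.com", "c@x.com"],
    "billing.account.contact_email": ["a@x.com", "b@x.com", "d@x.com"],
    "billing.account.balance": ["10.00", "25.50"],
}

matches = similar_columns(sample, threshold=0.5)
for left, right, score in matches:
    print(f"{left} ~ {right}: {score:.2f}")
```

This compares every cross-database pair, so it only suits small samples; at the scale quoted in the slides the candidate pairs would need to be narrowed first.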

Use Case 4: Graph Analytics
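
The slide does not list which analytics are run. One simple example of a graph computation on metadata is grouping columns into connected components, so that every column linked directly or indirectly to another ends up in the same cluster. The sketch below assumes links are given as undirected column pairs; the column names are hypothetical.

```python
from collections import defaultdict, deque


def connected_components(edges):
    """Group nodes into connected components using breadth-first search."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbor in sorted(adjacency[node]):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(sorted(component))
    return components


# Hypothetical links between columns judged to hold the same kind of data.
links = [
    ("crm.customer.email", "billing.account.contact_email"),
    ("billing.account.contact_email", "web.signup.email"),
    ("crm.customer.cust_id", "billing.account.customer_ref"),
]

clusters = connected_components(links)
for cluster in clusters:
    print(cluster)
```

Each resulting cluster can be read as a candidate business concept (for example, "customer email") that spans several systems.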

Demo

Methodology

What we do
1. Scan
2. Analyze
3. Map / Organize
4. Govern

Automation

1: Scan
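
The slides do not show how scanning works. As a rough sketch of the idea, the example below reads a database's own catalog to list its tables and columns. An in-memory SQLite database stands in for an enterprise system, and the function name `scan_catalog` and the sample schema are hypothetical.

```python
import sqlite3


def scan_catalog(connection):
    """Return {table name: [(column name, declared type), ...]} for a SQLite database."""
    catalog = {}
    tables = connection.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        rows = connection.execute(f'PRAGMA table_info("{table}")').fetchall()
        catalog[table] = [(row[1], row[2]) for row in rows]
    return catalog


# Hypothetical schema used only to demonstrate the scan.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, email TEXT)"
)
connection.execute(
    "CREATE TABLE account (account_id INTEGER PRIMARY KEY, cust_id INTEGER, balance REAL)"
)

catalog = scan_catalog(connection)
for table, columns in catalog.items():
    print(table, columns)
```

Other database engines expose similar catalog views, so the same pattern could in principle be repeated per system to build an inventory of tables and columns.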

2: Semantic Analysis

3: Automate Semantic Mapping

4: Link the Data Landscape

Thank You!