Graph Data Analytics
Resolving Complexity at an Enterprise Scale
Arka Mukherjee, Ph.D.
Global IDs
© 2013 Global IDs Proprietary
Topics
1. The “Complex Data” Context
2. Current Challenges
3. Governance Methodology
The “Complex Data” Context
The Big Shift
The cost structure is unsustainable
The cost of managing information is going up exponentially.
The complexity growth is unmanageable
1. Complex data ecosystems
2. Highly dynamic
3. Limited traceability
4. Systemic risk: hard to measure
Financial Services Institutions
Question
How can enterprises handle the cost and complexity of managing complex data landscapes?
Global IDs Focus
To organize enterprise data landscapes
Global IDs: Product Suite
Challenges
The typical financial institution
# Databases > 1,000
# Tables > 200,000
# Columns > 2,000,000
Question
How can we understand the relationships across 2,000,000 attributes?
Converging Data Variety
Data content: structured, unstructured, multi-structured
Converging Data Ecosystems
Data ecosystems: social data, enterprise data, machine data
Current Approaches Do Not Scale
Size      # Databases
Small     > 1,000
Average   > 10,000
Large     > 100,000
A New Approach is Required
5 Utilize Graph Structures for Governance
Graph Analytics : Use Cases
Key Challenges
1. Vast diversity and volume of metadata and data
2. Storage and indexing of metadata to facilitate search and navigation
3. Understanding the connection between different pieces of metadata (Crosswalk)
Utilize Graph Structures for Storing Complex Data
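As an illustration only, the idea of storing metadata in a graph can be sketched in a few lines of Python. The slides do not describe the actual storage model, so everything here is an assumption: the `MetadataGraph` class, the `contains` and `crosswalk` relation names, and the sample database, table, and column names are all hypothetical.

```python
from collections import defaultdict, deque


class MetadataGraph:
    """Undirected, labeled graph of metadata objects (hypothetical structure)."""

    def __init__(self):
        # node -> set of (relation, neighbor) pairs
        self.edges = defaultdict(set)

    def add_edge(self, source, relation, target):
        # Store the edge in both directions so traversal can go either way.
        self.edges[source].add((relation, target))
        self.edges[target].add((relation, source))

    def connected(self, start):
        """Return every node reachable from start, using breadth-first search."""
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for _, neighbor in self.edges[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        return seen


# Hypothetical sample metadata: databases contain tables, tables contain columns.
graph = MetadataGraph()
graph.add_edge("crm_db", "contains", "crm_db.customer")
graph.add_edge("crm_db.customer", "contains", "crm_db.customer.cust_id")
graph.add_edge("billing_db", "contains", "billing_db.account")
graph.add_edge("billing_db.account", "contains", "billing_db.account.customer_id")
# A crosswalk edge links two columns that are assumed to describe the same thing.
graph.add_edge("crm_db.customer.cust_id", "crosswalk", "billing_db.account.customer_id")
graph.add_edge("hr_db", "contains", "hr_db.employee")

linked = graph.connected("crm_db.customer.cust_id")
print(sorted(linked))
```

In this sketch, following the crosswalk edge is what connects the two databases; `hr_db` has no such link and stays outside the result.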
Use Case 1: Enterprise Metadata Search with Hadoop
Use Case 2: Unstructured Data Integration
Use Case 3: Cross Database Similarity Mapping
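The slides do not say how similarity between databases is computed. One common way to approximate it, shown here purely as an assumed sketch, is to compare the sets of sampled values in two columns with the Jaccard coefficient. The function names, the 0.5 threshold, and the sample data below are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard coefficient: size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_columns(columns, threshold=0.5):
    """Find column pairs from different databases whose value sets overlap.

    columns maps a 'database.table.column' name to an iterable of sampled values.
    """
    matches = []
    for (name_a, vals_a), (name_b, vals_b) in combinations(sorted(columns.items()), 2):
        # Skip pairs that live in the same database; only cross-database pairs count.
        if name_a.split(".")[0] == name_b.split(".")[0]:
            continue
        score = jaccard(vals_a, vals_b)
        if score >= threshold:
            matches.append((name_a, name_b, score))
    # Highest similarity first.
    return sorted(matches, key=lambda m: -m[2])


# Hypothetical sampled values for four columns in two databases.
columns = {
    "crm.customer.cust_id": [101, 102, 103, 104],
    "crm.customer.country": ["US", "DE", "IN"],
    "billing.account.customer_id": [102, 103, 104, 105],
    "billing.account.balance": [10.5, 20.0, 0.0],
}

matches = similar_columns(columns)
for name_a, name_b, score in matches:
    print(f"{name_a} <-> {name_b}: {score:.2f}")
```

This sketch compares every pair of columns, so its cost grows quadratically with the number of columns. At the scale of millions of columns mentioned earlier, a pairwise loop like this would not be practical without some way of narrowing the candidate pairs first.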
Use Case 4: Graph Analytics
Demo
Methodology
What we do
1. Scan
2. Analyze
3. Map / Organize
4. Govern
Automation
1 : Scan
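The slides give no detail on how scanning works. As a minimal sketch of what a scan step might collect, the example below reads table and column metadata from a single SQLite database using Python's standard `sqlite3` module. The `scan_catalog` function and the sample tables are hypothetical, and a real enterprise scan would have to cover many database products, not just SQLite.

```python
import sqlite3


def scan_catalog(conn):
    """Return {table_name: [(column_name, declared_type), ...]} for a SQLite connection."""
    catalog = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
        info = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        catalog[table] = [(row[1], row[2]) for row in info]
    return catalog


# Hypothetical in-memory database standing in for a real source system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE account (account_id INTEGER, cust_id INTEGER, balance REAL)")

catalog = scan_catalog(conn)
for table, cols in catalog.items():
    print(table, cols)
```

The output of a scan like this (database, table, column, declared type) is the kind of raw metadata that later analysis and mapping steps would work from.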
2 : Semantic Analysis
3 : Automate Semantic Mapping
4 : Link the Data Landscape
Thank You!