Stop Data Wrangling, Start Transforming Data to Intelligence

Presentation transcript:

Stop Data Wrangling, Start Transforming Data to Intelligence BABAR BHATTI 09.21.17 DAMA linkedin.com/in/bbhatti @thebabar

EXCITING TIMES FOR DATA SCIENCE + AI
AI is for real
Market Adoption
Public Awareness
Software, Hardware Improvements
Data Everywhere
DIVERGENCE.AI

Phases of Analytics Work
Every analytics project has 4 phases.

Data Wrangling Is Costly
Source: Forbes Survey

DATA PREPARATION
Process of cleaning, structuring, and enriching raw data into a desired output for analysis.
Data Preparation as a Service vs. Do It Yourself vs. self-service products such as Alteryx, Datawatch, Tamr, Google/Trifacta, etc.
"Data Prep Is Data Access Without The Data Management Overhead" - Forrester

Data Prep Tools Accelerate Insights
Source: Forrester, Vendor Landscape, Data Preparation Tools, Feb 2016

DATA PREPARATION
3 stages and 12-step process to prepare data. Stages are Prepare, Enrich, and Publish.
Steps, in the order shown on the slide:
1. Import / Ingest Data
2. Cleanse and Normalize
3. Schema Detection
4. Duplicate Identification
5. Sensitive Data Discovery
6. Data Profiling
7. Data Classification
8. Data Enrichment
9. Attribute Extraction
10. Schema Discovery
11. Source / Target Definition
12. Export Formatting
Stage labels by step number: 1-5 PREPARE, 6-10 ENRICH, 11-12 PUBLISH.

USE CASES
Starting with Risk, Compliance, and Security
Risk, Compliance, and Security
Retail & Commerce
Customer Behavior Analytics
Churn Analysis
Customer 360

RISK, COMPLIANCE, AND SECURITY
Integrating our data science and cybersecurity capabilities
Detect Corporate Fraud: Wrangle comprehensive and complex data, such as multi-layered and multi-party emails or web chats, to better understand what constitutes deviant behavior.
Enable Information Security: Keep pace with the billions of security events your institution receives each day by empowering non-technical users to wrangle, investigate, and clean datasets faster.
Risk Modeling: Standardize and quantify structured and unstructured data types quickly to ensure accurate and replicable modeling results.
Improve Compliance: Track and isolate compliance-sensitive data, from transactions to emails, to ensure that industry and government standards are met.

RETAIL & COMMERCE
Supplier Onboarding and Product Integration
Supplier Onboarding
Integrate and map data from different suppliers into a single schema.
Identify and flag products that have identical attributes but differing or incorrect article numbers.
Enrich product information with attributes from other data sets, e.g. package dimensions, barcodes, etc.
Product Integration
Normalize key attributes such as color, weight, measurement, units, size, part numbers, etc.
Standardize all product and brand names.
Identify configurable attributes and cluster product variants.
Categorize products according to your own taxonomy and sub-categories.
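The supplier onboarding steps above (map to a single schema, normalize attributes, flag identical products with differing article numbers) can be sketched as follows. This is only an illustration under stated assumptions: the supplier names, field mappings, color synonyms, and sample records are all hypothetical and do not come from the talk.

```python
from collections import defaultdict

# Hypothetical per-supplier mappings from source field names to one target schema.
FIELD_MAPS = {
    "supplier_a": {"sku": "article_number", "title": "name",
                   "colour": "color", "weight_g": "weight_g"},
    "supplier_b": {"art_no": "article_number", "product": "name",
                   "color": "color", "weight_kg": "weight_kg"},
}

# Assumed synonym table used to standardize color names.
COLOR_SYNONYMS = {"navy": "blue", "crimson": "red"}


def to_common_schema(supplier, record):
    """Map one supplier record onto the shared schema and normalize its attributes."""
    mapping = FIELD_MAPS[supplier]
    out = {target: record[source] for source, target in mapping.items()}

    # Normalize weight to whole grams regardless of the unit the supplier used.
    if "weight_kg" in out:
        out["weight_g"] = round(float(out.pop("weight_kg")) * 1000)
    else:
        out["weight_g"] = round(float(out["weight_g"]))

    # Standardize color and product name spelling.
    color = out["color"].strip().lower()
    out["color"] = COLOR_SYNONYMS.get(color, color)
    out["name"] = " ".join(out["name"].split()).title()
    return out


def flag_conflicting_articles(products):
    """Return attribute groups that appear under more than one article number."""
    groups = defaultdict(set)
    for product in products:
        key = (product["name"], product["color"], product["weight_g"])
        groups[key].add(product["article_number"])
    return {key: sorted(numbers) for key, numbers in groups.items() if len(numbers) > 1}
```

Running both suppliers' records through `to_common_schema` and passing the results to `flag_conflicting_articles` yields the groups a reviewer would need to reconcile. A real pipeline would likely use fuzzier matching than exact attribute equality.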

DATA PREPARATION AND REPAIR
Stage One - Preparation
Statistical Profiling – standard statistical analysis of numerical data and frequency and term analysis of text data.
Cleansing, Normalization – removing non-essential characters, standardizing content such as dates.
Data Repair – identifying and, where possible, fixing inconsistencies in the data.
Data Enrichment – Knowledge Service based enrichments on related data.
Explicit Schema Detection – identifying the schema/metadata that is explicitly defined in header, field, tag, or other information.
Duplicate Identification – identifying duplicates in data.
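Several of the preparation steps above (cleansing, date normalization, duplicate identification, and statistical profiling) can be illustrated with a short standard-library Python sketch. It is not the service's implementation: the sample rows, field names, and accepted date formats are assumptions made up for the example.

```python
import re
import statistics
from collections import Counter
from datetime import datetime

# Hypothetical raw records; field names and values are invented for illustration.
RAW_ROWS = [
    {"name": "  Acme Corp. ", "signup": "2017-09-21", "amount": "120.50"},
    {"name": "acme corp", "signup": "09/21/2017", "amount": "120.50"},
    {"name": "Globex", "signup": "21.09.2017", "amount": "80"},
]

# Assumed set of date layouts the source data may use.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")


def clean_text(value):
    """Cleansing: lowercase, drop non-alphanumeric characters, collapse whitespace."""
    value = re.sub(r"[^a-z0-9 ]", "", value.lower())
    return " ".join(value.split())


def normalize_date(value):
    """Normalization: return an ISO 8601 date string, or None if no format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None  # unparseable values are left for a separate repair step


def prepare(rows):
    """Cleanse and normalize each row, then drop exact duplicates of the cleaned form."""
    seen = set()
    cleaned = []
    for row in rows:
        record = {
            "name": clean_text(row["name"]),
            "signup": normalize_date(row["signup"]),
            "amount": float(row["amount"]),
        }
        key = (record["name"], record["signup"], record["amount"])
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(record)
    return cleaned


def profile(rows):
    """Statistical profiling: basic numeric summary plus term frequency of names."""
    amounts = [row["amount"] for row in rows]
    return {
        "count": len(rows),
        "mean_amount": statistics.mean(amounts),
        "name_frequency": Counter(row["name"] for row in rows),
    }
```

Calling `profile(prepare(RAW_ROWS))` profiles the data after the two Acme rows have collapsed into one. Profiling the raw rows first and comparing the two results is a cheap way to see how much cleansing changed the data.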

SEMANTIC METADATA DISCOVERY, ENRICHMENT, AND CORRELATION
Stage Two - Enrichment
Classification, Attribute Extraction – identifying categories in the data and identifying characteristics of the data in terms of attributes, properties, and schemata.
Implicit Schema Detection – it is often possible to identify a schema from the instances associated with it, such as email address, postal address, name, date, time, etc. The service provides this out-of-the-box capability for many standard classes in structured and semi-structured data.
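Implicit schema detection, as described above, infers a column's type from its values rather than from a header. The sketch below shows one simple way to do that with pattern checks. It is an assumption-laden illustration, not the service's method: the detector patterns, the 0.8 match threshold, and all function names are hypothetical.

```python
import re
from datetime import datetime


def _is_iso_date(value):
    """True if the value parses as a YYYY-MM-DD date."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


# Deliberately loose, illustrative detectors; real data needs stricter rules.
DETECTORS = {
    "email": lambda v: re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v) is not None,
    "date": _is_iso_date,
    "time": lambda v: re.fullmatch(r"([01]\d|2[0-3]):[0-5]\d(:[0-5]\d)?", v) is not None,
}


def infer_column_type(values, threshold=0.8):
    """Return the first label whose detector matches at least `threshold` of the values."""
    values = [v.strip() for v in values if v and v.strip()]
    if not values:
        return "unknown"
    for label, matches in DETECTORS.items():
        hits = sum(1 for v in values if matches(v))
        if hits / len(values) >= threshold:
            return label
    return "unknown"


def infer_schema(columns):
    """Map each column name to its inferred type, given {name: [values]}."""
    return {name: infer_column_type(values) for name, values in columns.items()}
```

The threshold tolerates a few dirty values per column, so a mostly-email column with a handful of typos is still labeled as email.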

PUBLISHING
Stage Three - Publish
Source/Targets – the system supports a rich set of sources and targets including Oracle Storage Cloud, other external Cloud Stores, and URL sources.
Formats – the service provides the ability to export the curated datasets to commonly used formats, which enables downstream and on-premises BI, Analytics, and ETL processes.
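Exporting a curated dataset to commonly used formats can be illustrated with Python's standard library. The slide does not name specific formats, so CSV and JSON Lines are assumptions here, as are the function names.

```python
import csv
import io
import json


def to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json_lines(rows):
    """Serialize a list of dicts to JSON Lines, one object per line."""
    return "\n".join(json.dumps(row, sort_keys=True) for row in rows)


def publish(rows, fmt, fieldnames):
    """Dispatch to the exporter for the requested format ('csv' or 'jsonl')."""
    if fmt == "csv":
        return to_csv(rows, fieldnames)
    if fmt == "jsonl":
        return to_json_lines(rows)
    raise ValueError(f"unsupported format: {fmt}")
```

Returning text instead of writing files keeps the exporters independent of the target, so the same output can go to a local file, a cloud store, or an HTTP upload.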

NEXT STEPS - DATA PREPARATION ASSESSMENT
Complexity (Size vs Sources) and Transformation (Enhancement vs Enrichment)
[Figure: two quadrant charts. Complexity chart: Data Size (Small to Large) against # of Sources (Tables) (Few to Many), with quadrants SIMPLE, DIVERSIFIED, BIG, COMPLEX. Transformation chart: Data Enrich (General to Unique) against Data Enhance (Few to Many), with quadrants SIMPLE, DIVERSIFIED, BIG, ADVANCED.]
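One possible reading of the complexity chart is a two-axis classification: data size (small or large) against number of sources (few or many). The sketch below encodes that reading. The quadrant assignment is inferred from the slide's labels, and the cutoff values are purely hypothetical, since the slide gives no numbers.

```python
def complexity_quadrant(size_gb, source_count, size_cutoff_gb=100.0, source_cutoff=10):
    """Classify a dataset into one of four quadrants.

    The cutoffs are illustrative placeholders, not values from the talk.
    """
    large = size_gb >= size_cutoff_gb
    many = source_count >= source_cutoff
    if large and many:
        return "Complex"
    if large:
        return "Big"
    if many:
        return "Diversified"
    return "Simple"
```

In practice the cutoffs would be set during the assessment itself, based on what the team's infrastructure handles comfortably.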

ENGAGEMENT MODEL - Example
Assess, Provision, and Deliver Data Preparation Service
Assess
Two-day Data Complexity Assessment
Provision
Team (Shared or Dedicated)
Data Preparation Infrastructure Setup (One-time or Ongoing)
Adhere to corporate information access policies over network
Deliver
Share transformed data

THANK YOU
BABAR BHATTI 09.21.17 DAMA linkedin.com/in/bbhatti @thebabar