Architecture Services Big Data – Big Changes Business Analyst Professional Development Day September 2013.



Contents

• What is Advanced Analytics & Big Data? Business Intelligence, Advanced Analytics and Big Data seem to be used synonymously, but they are different and build on each other from a maturity perspective.
• Big Data & Analytics Continuum: leveraging “Big Data” should be done on a stable foundation (examples).
• Skills of the Data Analyst / Scientist: new skills and levels of maturity, certifications and training.

What is Advanced Analytics?

Spectrum: Hindsight, Current Sight, Foresight

• Standard Reports: What happened? When did it happen?
• Ad hoc Reports: How many? How often? Where?
• Query Drilldown: Where exactly is the problem? How do I find the answers?
• Alerts: When should I react? What actions are needed now?
• Statistical Analysis: Why is this happening? What opportunities am I missing?
• Forecasting: What if these trends continue? How much is needed? When will it be needed?
• Predictive Analytics: What will happen next? How will it affect my business?
• Optimization: How can we get better? What is the best decision?

Advanced Analytics comprises both Business Intelligence technologies and complex analytic practices. These are used to uncover relationships and patterns within large volumes of historical data, which can in turn be used to predict future behavior and events or to improve operational results.

What is Big Data?

Definition: “Dealing with information management challenges that don’t natively fit with traditional approaches to handling the problem.” – Tom Deutsch (IBM)

Characteristics: Volume, Variety, Velocity (and sometimes Veracity and Value)

Where has Nationwide been, and where can we go?

Industry estimates suggest that 80% of enterprise data is in unmodeled or unstructured forms, where it is nearly inaccessible and traditional modeling does not fit. Text extraction techniques applied to large and varied volumes of data, such as SEC filings, can be combined with traditional BI data to create new structured metrics for analysis and exploration. Text is also trapped in large description fields in our operational data stores, such as the Claims DW.

• Internal data captured or streamed today in systems and data warehouses (e.g., policy admin, claims)
• NEW internal data not previously captured (e.g., s, clickstream, mobile, telematics, unstructured notes from agents or claims adjusters)
• NEW external data from non-traditional sources (e.g., internet, social networks, demographic, local economy, price elasticity, mobile location stream, localized competitor intelligence)

Comprehensive advanced analytics have been built around marketing, product and pricing, and other areas of the business. They are mostly disconnected, and some use rudimentary technologies that are inefficient and focused mainly on data movement rather than on getting value out of the data.

Big Data & Analytics Continuum

Layers and their capabilities:

• Cognitive: Reasoning, Learning, Natural Language
• Prescriptive: Optimization, Rules, Constraints
• Predictive: Machine Learning, Forecasting, Statistical Analysis
• Descriptive: Alerts & Drill Down, Ad hoc Reports, Standard Reports
• Information Layer: Big Data Platforms, Content Management, RDBMS and Integration

Questions along the continuum (axis: Business Value):

• What is the most likely answer? What is the right question?
• What’s the next best action?
• What will happen when and why?
• What could happen?
• What if these trends continue?
• What has happened and why?
• How many, how often, who & where?
• How do I integrate new data sources?
• How is data managed and stored?

When entering the Big Data space, be cautious of your foundational competencies. Information Management capabilities such as data integration, extensible data modeling, data quality and data governance become even more important when dealing with these new, uncertain, high volume data sources. Additionally, to achieve the full ROI, you must have mature analytics methodology, appropriately skilled resources and technology.

Use Case – Machine Learning: Advanced Analytics, Structured

Selected Results: Accrual Score (Bankruptcy) Prediction

The machine learning technique called Support Vector Machine (SVM) was selected. This supervised learning technique takes a set of factors in a training set of labeled results and constructs a model.

Cross Business Interest: Freedom Specialty Insurance, Enterprise Applications, Investments. NF opportunities are just beginning to be explored.

Open source R was chosen to accelerate the model development process for the intern. Several external R packages were added to complete the SVM capability in R as a desktop tool. Supplemental data preparation of the S&P financial data was handled with various scripts and spreadsheets. The project will provide knowledge transfer to Freedom Specialty, where they currently intend to implement it in SAS.

Model Validation Results (Jan 2013), Further Optimization Pending

Class      Precision   Recall   F1 score
Positive   0.81        0.70     0.75
Negative   0.74        0.83     0.78
Accuracy: 0.77

Although Freedom’s project was a predictive modeling effort, the business is eager to pursue analyzing the “fine print” of unstructured text in filings and media reports, looking for red flags to help triage the workload for analysts.

Principle: Start with solid advanced analytics capabilities and add “Big Data” for added ROI.
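The F1 scores in the validation results can be checked from the reported precision and recall, since F1 is their harmonic mean. The Python sketch below is only an illustration of that arithmetic; the project itself used R, and the function name `f1_score` is an assumption made for this example, not something from the slides.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision, recall and F1 exactly as reported on the slide (Jan 2013).
reported = {
    "positive": {"precision": 0.81, "recall": 0.70, "f1": 0.75},
    "negative": {"precision": 0.74, "recall": 0.83, "f1": 0.78},
}

for label, m in reported.items():
    computed = f1_score(m["precision"], m["recall"])
    print(f"{label}: computed F1 = {computed:.2f}, reported F1 = {m['f1']:.2f}")
```

Both computed values round to the figures on the slide, so the reported metrics are internally consistent.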

Use Case – Speech Analytics: Volume, Variety (Unstructured)

Hypothesis: Determine if there are certain words used more prevalently during a first notice of loss call which would indicate a fraudulent claim.

• Convert first notice of loss call history to text and store in big data platform.
• Associate call text into two categories: those that resulted in fraud and those that did not.
• Mine data for word patterns. Determine if there are differences in word usage between fraudulent and non-fraudulent claims.
• Build model / rules to execute against call in real time using streaming technology.

This will result in false positives! Should be combined with claims, billing, contact history to enhance accuracy of model.

Principle: “Big data” does not replace your existing analytics using your structured data warehouse. Big Data is simply an additional data set which enhances an existing set of capabilities and should not be used out of context.
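The word-pattern mining and rule-building steps of this use case could be sketched roughly as follows. This is a minimal Python illustration and not the approach actually used: the transcripts are invented toy sentences, and the helper names (`tokenize`, `word_rates`, `distinctive_words`, `flag_call`) and thresholds are assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case a transcript and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def word_rates(transcripts):
    """Fraction of transcripts in which each word appears at least once."""
    counts = Counter()
    for transcript in transcripts:
        counts.update(set(tokenize(transcript)))
    return {word: n / len(transcripts) for word, n in counts.items()}


def distinctive_words(fraud, legit, min_gap=1.0):
    """Words whose rate in fraud calls exceeds the legit rate by min_gap."""
    fraud_rates = word_rates(fraud)
    legit_rates = word_rates(legit)
    gaps = {w: r - legit_rates.get(w, 0.0) for w, r in fraud_rates.items()}
    return sorted(w for w, gap in gaps.items() if gap >= min_gap)


def flag_call(text, flagged_words, threshold=2):
    """Simple rule: flag a call containing at least `threshold` flagged words."""
    hits = set(tokenize(text)) & set(flagged_words)
    return len(hits) >= threshold


# Invented toy transcripts, labeled by claim outcome (hypothetical data).
fraud_calls = [
    "my car was stolen last night and I need cash quickly",
    "the car was stolen and I need the cash payout quickly",
]
legit_calls = [
    "I hit a pole in the parking lot this morning",
    "a tree branch fell on my car last night",
]

words = distinctive_words(fraud_calls, legit_calls)
print(words)
print(flag_call("someone stolen my bike and I need cash", words))
```

On this toy data the distinctive list also picks up common words such as "and" and "was", which is one way the false positives mentioned above arise. That is why the rule output should be combined with claims, billing and contact history rather than used alone.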

New Roles, New Skills

Data Analyst / Data Scientist

• What is Data Analysis?
• How do you recognize patterns in data?
• What is the process for inspecting the data?
• How do you identify data cleansing and transformation rules?
• Why / How do you visualize your findings and information?
• How do you manage, manipulate and query large, complex data on Hadoop as an analyst?
• What statistical model is most appropriate for the problem scenario?
• What other type of model is appropriate?

Types of Tools Used

• R
• SPSS
• Tableau
• Data Mining tools such as Teradata Miner
• Hadoop implementation specific tools such as BigSQL & BigSheets (IBM)

Other Considerations

• Certifications: Certified Analytics Professional from INFORMS
• Nationwide / IBM Client Center for Advanced Analytics

Appendix

More Terminology to Learn

Classes of Advanced Analytics Problems:

• Regression
• Classification
• Clustering
• Forecasting
• Optimization
• Simulation
• Sparse Data Inference
• Anomaly Detection
• Natural Language Processing
• Intelligent Data Design

With a wide range of advanced modeling techniques:

ARMA, CART, CIR++, Compression Nets, Decision Trees, Discrete Time Survival Analysis, D-Optimality, Ensemble Model, Gaussian Mixture Model, Genetic Algorithm, Gradient Boosted Trees, Hierarchical Clustering, Kalman Filter, K-Means, KNN, Linear Regression, Logistic Regression, Monte Carlo Simulation, Multinomial Logistic Regression, Neural Networks, Optimization (LP, IP, NLP), Poisson Mixture Model, Restricted Boltzmann Machine, Sensitivity Trees, SVD / A-SVD / SVD++, SVM, Projection on Latent Structures, Spectral Graph Theory

Big Data Analytics – The Landscape

The technologies that deal with big data problems are broad and diverse; it is not just Hadoop.

Touchpoints – Just Two Use Cases