Appliance-based architectures for high performance data intensive applications Session at Silicon India Rajgopal Kishore Vice President and Global Head.

Presentation transcript:

Appliance-based architectures for high performance data intensive applications
Session at Silicon India
Rajgopal Kishore, Vice President and Global Head of BI & Analytics, HCL Technologies

Agenda
- State of data
- Challenges
- Need of the day
- Rise of the machines
- Features & Advantages
- Key Players

What all these Applications have in Common
- Federal: cyber defense, fraud analysis, watch list analysis
- Internet / Social Media: user behavioral analysis, graph analysis, pattern analysis, context-based clickstream analysis
- Retail: packaging optimization, consumer buying patterns, advertising and attribution analysis
- Telecommunications: service personalization, Call Data Record (CDR) analysis, network analysis
- Financial Services and Insurance: credit and risk analysis, value at risk calculation, fraud analysis
- Common Use Cases: forecasting, modeling, customer segmentation, clickstream analysis
- Speed: frequent analysis of all data, with insights in seconds or minutes
- Scale: analysis that must scale from terabytes to petabytes of data
- Richness: deep data exploration; ad hoc, interactive analysis rather than simple reports

State of data
- Data driven business: businesses have been collecting information all the time
- Mine more == Collect more (and vice versa)
- Challenges

Data driven business
- Applications
  - Social Data, Blogs, Video clips, Product Listings
  - ERP, CRM, Databases, Internal Applications, Customer/Consumer facing products
  - Mobile
- Context
  - Web, Customers, Products, Business Systems, Process and Services
- Support Systems
  - CRM, SOA, Recommendation Systems/Processes, Data warehouses, Business Intelligence, BPM

Mine more, Collect More
- Drivers: ROI, Customer Retention, Product Affinity, Market Trends, Research Analysis, Customer/Consumer Analytics
- Data Intensive Processes: Clustering, Classification, Build Relationship, Regression
- Types: Structured, Semi-structured, Unstructured

Challenges
- Growth is constant
- Application complexities
- Workload Requirements
- Data growth
- Infrastructure
- Meet SLAs
- Delivery
- ROI
- Reduce Risk

Need of the day
- System that can handle high-volume data
- System that can perform complex, analytical operations
- Scalable
- Rapid Accessibility
- Rapid Deployment
- Highly Available
- Fault Tolerant
- Secure

Rise of the machines
“A data warehouse appliance is an integrated system, which has hardware (processors and storage) and software (operating systems and database system) components, specifically optimized for data warehousing”

Features
- Designed to do one thing and one thing only
- Processing optimized to handle high volumes of data
- Data is processed in parallel operations (mostly massively parallel operating units)
- System is resilient to data growth and operations
- Highly tolerant to hardware and database failures
- Highly available
- Server units operate in isolation, so risk is local or reduced
- Pre-tuned for high query performance

Advantages
- Integrated architecture
- More reporting and analytical capabilities
- Flexibility
- Less management (tuning and optimization)
- Operational BI
- Cost Reductions

Key Players

Continuing Challenge
While traditional DW appliances sped up data access by 100x, processing times remained a challenge. There are two ways out of this:
- Take data closer to processing: in-memory!
- Take the processing closer to data: in-database!
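The "processing closer to data" idea can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for a warehouse; the transactions table, its columns, and the function names are hypothetical and not taken from the presentation. The first function pulls every row to the application before aggregating, the second pushes the aggregation into the database so only one row per account is moved.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory SQLite database with a hypothetical transactions table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE transactions (account TEXT, amount REAL)")
    rows = [("A", 100.0), ("A", 50.0), ("B", 20.0), ("B", 5.0), ("C", 7.5)]
    conn.executemany("INSERT INTO transactions VALUES (?, ?)", rows)
    return conn


def totals_client_side(conn):
    """Move every row to the application, then aggregate there."""
    totals = {}
    for account, amount in conn.execute("SELECT account, amount FROM transactions"):
        totals[account] = totals.get(account, 0.0) + amount
    return totals


def totals_in_database(conn):
    """Push the aggregation into the database; only one row per account is returned."""
    query = "SELECT account, SUM(amount) FROM transactions GROUP BY account"
    return dict(conn.execute(query))
```

Both functions return the same totals; the difference is how much data crosses the boundary between database and application, which is what matters at terabyte scale.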

Look at this scenario…
- Complex processing on a large dataset of a bank using Teradata: 17 hours
- Same processing using Teradata’s SAS APIs: 3 minutes (about 340x faster)

An Example of such a DW Appliance: Aster Data
- 100% Processing In-database
  - 100% of analytics processing runs in-database, so processing is co-located with data
  - Eliminates the need for massive data movement
- Automatic Parallelization
  - Automatically parallelizes applications using Aster’s integrated analytics engines and SQL-MapReduce
  - Parallelization is key for processing large volumes of data
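A rough sketch of the partition, process in parallel, then combine pattern behind this kind of parallelization is shown below. It is not Aster's SQL-MapReduce API: it uses threads on a single machine as a stand-in for worker nodes, and the function names and the (user, event) row layout are assumptions made for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_workers):
    """Split rows into n_workers chunks (round-robin), one per worker."""
    return [rows[i::n_workers] for i in range(n_workers)]


def count_events(chunk):
    """Map step: each worker counts event types in its own chunk only."""
    return Counter(event for _user, event in chunk)


def parallel_event_counts(rows, n_workers=4):
    """Run the map step on all chunks concurrently, then merge the partial counts."""
    chunks = partition(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(count_events, chunks))
    total = Counter()
    for partial in partials:  # reduce step: combine per-worker results
        total.update(partial)
    return total
```

Because each worker only touches its own chunk, the map step needs no coordination; only the small partial results are combined at the end.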

Aster Data Analytic Foundation (1 of 2)
Examples of Business-Ready SQL-MapReduce Functions
Modules, with select examples of delivered, business-ready SQL-MapReduce functions:
- Path Analysis: discover patterns in rows of sequential data
  - nPath: complex sequential analysis for time series analysis and behavioral pattern analysis
  - Sessionization: identifies sessions from time series data in a single pass over the data
- Statistical Analysis: high-performance processing of common statistical calculations
  - Correlation: calculation that characterizes the strength of the relation between different columns
  - Regression: performs linear or logistic regression between an output variable and a set of input variables
- Relational Analysis: discover important relationships among data
  - Basket analysis: creates configurable groupings of related items from transaction records in a single pass
  - Graph analysis: finds the shortest path from a distinct node to all other nodes in a graph
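As an illustration of what single-pass sessionization means, here is a minimal sketch in plain Python. It is not the Aster function; the (user, timestamp) input layout, the 30-minute timeout default, and the function name are assumptions. It expects events already sorted by user and then by timestamp.

```python
def sessionize(events, timeout=1800):
    """Assign session numbers in one pass.

    events: iterable of (user, timestamp_seconds), sorted by user, then timestamp.
    A new session starts when the user changes or the gap exceeds `timeout` seconds.
    Returns a list of (user, timestamp_seconds, session_id) tuples.
    """
    result = []
    last_user, last_ts, session_id = None, None, 0
    for user, ts in events:
        if user != last_user:
            session_id = 1            # first event seen for this user
        elif ts - last_ts > timeout:
            session_id += 1           # gap too long: start a new session
        result.append((user, ts, session_id))
        last_user, last_ts = user, ts
    return result
```

Only the previous row is remembered, which is why the data can be processed in one pass regardless of its size.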

Aster Data Analytic Foundation (2 of 2)
Examples of Business-Ready SQL-MapReduce Functions
Modules, with select examples of delivered, business-ready SQL-MapReduce functions:
- Text Analysis: derive patterns in textual data
  - Text Processing: counts occurrences of words, identifies roots, and tracks relative positions of words and multi-word phrases
  - Text Partition: analyzes text data over multiple rows
- Cluster Analysis: discover natural groupings of data points
  - k-Means: clusters data into a specified number of groupings
  - Minhash: buckets high-dimensional items for cluster analysis
- Data Transformation: transform data for more advanced analysis
  - Unpack: extracts nested data for further analysis
  - Multicase: case statement that supports row match for multiple cases
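To give a feel for the MinHash idea mentioned above, the following is a small, generic sketch and not Aster's implementation. The helper names, the use of SHA-1 with a seed prefix as the hash family, and the default of 64 hash functions are all assumptions. Items with similar signatures are likely to have a high set overlap, so signatures can be used to bucket candidates before a more expensive clustering step.

```python
import hashlib


def _hash(seed, token):
    """One member of a simple hash family: SHA-1 of 'seed:token', truncated to 64 bits."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.sha1(data).digest()[:8], "big")


def minhash_signature(items, num_hashes=64):
    """For each hash function, keep the minimum hash value over all items.

    `items` must be non-empty; an empty collection raises ValueError from min().
    """
    return tuple(min(_hash(seed, item) for item in items) for seed in range(num_hashes))


def estimated_jaccard(sig_a, sig_b):
    """The fraction of matching signature positions estimates the Jaccard similarity."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)
```

The signature has a fixed length no matter how many features an item has, which is what makes the technique attractive for high-dimensional data.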

Example: nPath Function for time-series analysis
Uncovering patterns in sequential steps
- What this gives you:
  - Pattern detection via a single pass over data
  - Allows you to understand any trend that needs to be analyzed over a continuous period of time
- Example use cases:
  - Web analytics: clickstream, golden path
  - Telephone calling patterns
  - Stock market trading sequences
- nPath in Use: Marketing Attribution
  - Complete Aster Data Application:
    - Sessionization required to prepare data for path analysis
    - nPath identifies marketing touches that drove revenue
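The kind of sequential pattern matching described above can be sketched in plain Python. This does not use nPath syntax; it is an illustration under assumptions: events arrive as (user, timestamp, action) tuples sorted by user and timestamp, each action is mapped to a one-letter symbol, and a regular expression plays the role of the path pattern. The symbol table and function name are hypothetical.

```python
import re
from itertools import groupby

# Hypothetical mapping from action names to one-letter symbols.
SYMBOLS = {"search": "S", "view": "V", "purchase": "P"}


def matching_paths(events, pattern=r"SV+P"):
    """Find users whose event sequence contains the given path pattern.

    events: (user, timestamp, action) tuples, sorted by user, then timestamp.
    The default pattern means: a search, one or more views, then a purchase.
    Returns {user: [matched symbol strings]} for users with at least one match.
    """
    matches = {}
    for user, group in groupby(events, key=lambda e: e[0]):
        # Encode this user's ordered actions as a string of symbols.
        path = "".join(SYMBOLS.get(action, "?") for _u, _t, action in group)
        found = [m.group(0) for m in re.finditer(pattern, path)]
        if found:
            matches[user] = found
    return matches
```

Each user's events are read once and in order, which mirrors the single-pass property listed on the slide.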

Example: Basket Generator Function
Extensible market basket analysis
- What this gives you:
  - Creates groupings of related items via a single pass over data
  - Allows you to increase or decrease basket size with a single parameter change
- Example use cases:
  - Retail market basket analysis
  - People who bought x also bought y
- Basket Generator in Use
  - Complete Aster Data Application:
    - Evaluate effectiveness of marketing programs
    - Launch customer recommendations feature
    - Evaluate and improve product placement
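A minimal sketch of basket generation follows, assuming input rows of (transaction_id, item) and a hypothetical function name; it is not the Aster function itself. Items are grouped per transaction, and every combination of `basket_size` items that were bought together is counted, so changing that one parameter changes the size of the groupings.

```python
from collections import Counter, defaultdict
from itertools import combinations


def basket_counts(rows, basket_size=2):
    """Count how often each group of `basket_size` items appears in the same transaction.

    rows: iterable of (transaction_id, item).
    Returns a Counter keyed by alphabetically sorted item tuples.
    """
    baskets = defaultdict(set)
    for txn, item in rows:            # one pass to group items by transaction
        baskets[txn].add(item)
    counts = Counter()
    for items in baskets.values():
        for combo in combinations(sorted(items), basket_size):
            counts[combo] += 1
    return counts
```

A "people who bought x also bought y" feature could then rank, for a given item x, the pairs containing x by their counts.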

Example: Unpack Function
Transforming hidden data into analyst accessible columns
- What this gives you:
  - Translates unstructured data from a single field into multiple structured columns
  - Allows business analysts access to data with standard SQL queries
- Example use cases:
  - Sales data
  - Stock transaction logs
  - Gaming play logs
- Unpack in Use: Pricing Analysis
  - Complete Aster Data Application:
    - Text processing required to transform/unpack third party sales data
    - Sessionization required to prepare data for path analysis
    - Statistical analysis of pricing
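The unpacking step can be pictured with a short sketch, assuming rows are dictionaries and the packed field is a delimiter-separated string. The field names, the "|" delimiter, and the function name are hypothetical; the actual Unpack function and its parameters are not shown in the slides.

```python
def unpack(rows, field, columns, delimiter="|"):
    """Split one packed text field into several named columns.

    rows: list of dicts; `field` holds a delimiter-separated string.
    columns: names for the unpacked values, in order.
    Returns new dicts with `field` replaced by the named columns.
    """
    unpacked = []
    for row in rows:
        parts = row[field].split(delimiter)
        if len(parts) != len(columns):
            raise ValueError(f"expected {len(columns)} values, got {len(parts)}")
        new_row = {k: v for k, v in row.items() if k != field}
        new_row.update(zip(columns, parts))   # values stay strings; cast as needed
        unpacked.append(new_row)
    return unpacked
```

For example, `unpack([{"id": 1, "raw": "widget|9.99|US"}], "raw", ["product", "price", "region"])` yields a row with separate product, price, and region columns that an analyst could then query directly.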

Questions!