Supercharging Analytics on Big Data: Announcing 1000+ MapReduce-ready Advanced Analytic Functions. June 21st, 2010.

Presentation transcript:

Supercharging Analytics on Big Data
Announcing 1000+ MapReduce-ready Advanced Analytic Functions
June 21st, 2010

Confidential and proprietary. Copyright © 2010 Aster Data Systems

Aster Data's Solution: A Data-Analytics Server for Big Data Management
1. A highly scalable MPP database running on commodity hardware
2. An integrated analytics engine that uniquely leverages MapReduce for rich, scalable big data analytics
Rich, advanced analytics on large data volumes

Examples of Advanced Analytic Applications
Federal: cyber defense, fraud analysis, watch list analysis
Internet / Social Media: user behavioral analysis, graph analysis, pattern analysis, context-based clickstream analysis
Retail: packaging optimization, consumer buying patterns, advertising and attribution analysis
Telecommunications: service personalization, Call Data Record (CDR) analysis, network analysis
Financial Services and Insurance: credit and risk analysis, value at risk calculation, fraud analysis
Common Use Cases: forecasting, modeling, customer segmentation, clickstream analysis

What All These Applications Have in Common
Speed: frequent analysis of all data, with insights in seconds to minutes
Scale: analysis that must scale from terabytes to petabytes of data
Richness: deep data exploration; ad hoc, interactive analysis rather than simple reports

Aster Data: Big Data Analytics & Bringing MapReduce to the Enterprise
Extensive Suite of Ready Functions
- Extensive suite of pre-built, MapReduce-enabled advanced analytics functions, e.g. time-series, clustering, graph, and market basket analysis
100% Processing In-database
- 100% of analytics processing runs in-database, so processing is co-located with the data
- Eliminates the need for massive data movement
Automatic Parallelization
- Automatically parallelizes applications using Aster's integrated analytics engines and SQL-MapReduce
- Parallelization is key for processing large volumes of data
Easily Usable by Business Analysts
- Ultra-simple formulation of advanced queries by coupling SQL with MapReduce
- Brings the power of MapReduce to any business analyst with SQL skills

New: Expanded Suite of MapReduce-ready Analytics Totaling 1000+ Functions
Business Analyst Ready: 30+ SQL-MapReduce functions, fully parallelized and available as part of the 'Aster Analytic Foundation' library
- Example functions include: text processing, k-Means cluster analysis, Unpack data transformations
Power User Functions: 40+ MapReduce-ready, automatically parallelized packages of functions, available in Java or C
- All functions are available in native languages, without the learning curve of a separate procedural language
- Example functions include: Monte Carlo simulation, histograms, linear algebra, statistics

Aster Data Analytic Foundation (1 of 2): Examples of Business-Ready SQL-MapReduce Functions
Path Analysis: discover patterns in rows of sequential data
- nPath: complex sequential analysis for time series analysis and behavioral pattern analysis
- Sessionization: identifies sessions from time series data in a single pass over the data
Statistical Analysis: high-performance processing of common statistical calculations
- Correlation: calculation that characterizes the strength of the relation between different columns
- Regression: performs linear or logistic regression between an output variable and a set of input variables
Relational Analysis: discover important relationships among data
- Basket analysis: creates configurable groupings of related items from transaction records in a single pass
- Graph analysis: finds the shortest path from a distinct node to all other nodes in a graph
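The slide describes Sessionization only at a high level. As a rough illustration of the idea (not Aster Data's implementation), the Python sketch below assigns session numbers to click events by starting a new session whenever a user is inactive for longer than a timeout. The function name, the `(user_id, timestamp)` input shape, and the 1800-second default are all assumptions made for this example.

```python
from itertools import groupby
from operator import itemgetter


def sessionize(events, timeout=1800):
    """Assign a session number to each (user_id, unix_timestamp) event.

    timeout is a hypothetical inactivity gap, in seconds, after which a
    new session starts. Returns (user_id, timestamp, session) tuples.
    """
    out = []
    ordered = sorted(events)  # order by user, then by time
    for user, rows in groupby(ordered, key=itemgetter(0)):
        session, last_ts = 0, None
        for _, ts in rows:
            # A long enough gap since the previous event opens a new session.
            if last_ts is not None and ts - last_ts > timeout:
                session += 1
            out.append((user, ts, session))
            last_ts = ts
    return out


clicks = [("u1", 0), ("u1", 100), ("u1", 5000), ("u2", 50)]
sessions = sessionize(clicks)
```

Once the events are sorted, each user's rows are scanned exactly once, which mirrors the "single pass over the data" framing on the slide.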

Aster Data Analytic Foundation (2 of 2): Examples of Business-Ready SQL-MapReduce Functions
Text Analysis: derive patterns in textual data
- Text Processing: counts occurrences of words, identifies roots, and tracks relative positions of words and multi-word phrases
- Text Partition: analyzes text data over multiple rows
Cluster Analysis: discover natural groupings of data points
- k-Means: clusters data into a specified number of groupings
- Minhash: buckets high-dimensional items for cluster analysis
Data Transformation: transform data for more advanced analysis
- Unpack: extracts nested data for further analysis
- Multicase: case statement that supports row match for multiple cases
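To make the Minhash entry more concrete, here is a minimal Python sketch of the general MinHash technique: each item set is reduced to a short signature, and the fraction of matching signature positions approximates the Jaccard similarity of the sets. This is a generic illustration, not the Aster Data function; the function names, the use of MD5 as the hash family, and the signature length are assumptions.

```python
import hashlib


def minhash_signature(items, num_hashes=16):
    """Return a MinHash signature for a non-empty collection of items.

    Each of the num_hashes "hash functions" is simulated by prefixing
    the item with a seed before hashing (an illustrative choice).
    """
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int(hashlib.md5(f"{seed}:{item}".encode()).hexdigest(), 16)
            for item in items
        ))
    return tuple(signature)


def estimated_similarity(sig_a, sig_b):
    """Fraction of positions where two equal-length signatures agree."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


a = minhash_signature({"shoes", "socks", "laces"})
b = minhash_signature({"shoes", "socks", "hat"})
similarity = estimated_similarity(a, b)
```

Items whose signatures agree in many positions can then be placed in the same bucket as candidates for clustering.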

Example: nPath Function for Time-Series Analysis
Uncovering patterns in sequential steps
What this gives you:
- Pattern detection via a single pass over the data
- Allows you to understand any trend that needs to be analyzed over a continuous period of time
Example use cases:
- Web analytics: clickstream, golden path
- Telephone calling patterns
- Stock market trading sequences
nPath in Use: Marketing Attribution
Complete Aster Data Application:
- Sessionization required to prepare data for path analysis
- nPath identifies marketing touches that drove revenue
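The marketing attribution use case can be sketched in a few lines of Python. The example below is not nPath and does not use its syntax; it only illustrates the underlying idea of scanning each user's time-ordered events and collecting the marketing touches that preceded a purchase. The event names, the tuple layout, and the function name are hypothetical.

```python
from collections import defaultdict


def touches_before_purchase(events):
    """Map each purchasing user to the touch sequences that led to a purchase.

    events: iterable of (user_id, timestamp, event_name) tuples, where the
    event name "purchase" (an assumed label) marks a conversion.
    """
    by_user = defaultdict(list)
    for user, ts, name in events:
        by_user[user].append((ts, name))

    paths = {}
    for user, rows in by_user.items():
        touches = []
        for _, name in sorted(rows):  # walk the user's events in time order
            if name == "purchase":
                paths.setdefault(user, []).append(tuple(touches))
                touches = []  # start collecting touches for the next purchase
            else:
                touches.append(name)
    return paths


log = [
    ("u1", 1, "email"),
    ("u1", 2, "search_ad"),
    ("u1", 3, "purchase"),
    ("u2", 1, "banner"),
]
paths = touches_before_purchase(log)
```

Users who never purchase, such as `u2` here, produce no path, so only touches that preceded revenue are reported.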

Example: Basket Generator Function
Extensible market basket analysis
What this gives you:
- Creates groupings of related items via a single pass over the data
- Allows you to increase or decrease basket size with a single parameter change
Example use cases:
- Retail market basket analysis
- People who bought x also bought y
Basket Generator in Use
Complete Aster Data Application:
- Evaluate effectiveness of marketing programs
- Launch customer recommendations feature
- Evaluate and improve product placement
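As a hedged illustration of what a basket generator does, the Python sketch below counts how often each combination of items appears together across transactions, with the basket size controlled by a single parameter as the slide describes. It is a generic sketch, not the Aster Data function; `generate_baskets`, `basket_size`, and the sample orders are assumptions.

```python
from collections import Counter
from itertools import combinations


def generate_baskets(transactions, basket_size=2):
    """Count co-occurring item combinations of the given size.

    transactions: iterable of item lists, one list per transaction.
    """
    counts = Counter()
    for items in transactions:
        # Deduplicate and sort so each combination has one canonical form.
        for combo in combinations(sorted(set(items)), basket_size):
            counts[combo] += 1
    return counts


orders = [["milk", "bread", "eggs"], ["milk", "bread"], ["bread", "eggs"]]
pairs = generate_baskets(orders, basket_size=2)
triples = generate_baskets(orders, basket_size=3)
```

Changing `basket_size` from 2 to 3 is the only edit needed to move from pairs to triples, and each transaction is read once.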

Example: k-Means Function
One call for clustering items into natural segments
What this gives you:
- Organizes data into groupings or clusters based on shared attributes
- Allows you to understand natural segments
Example use cases:
- Marketing segmentation
- Fraud detection
- Computer vision: object recognition
k-Means in Use: Contact Center
Complete Aster Data Application:
- Text processing required to prepare data for customer support analysis
- k-Means identifies hot product issues for proactive response
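For readers unfamiliar with k-Means, a minimal single-machine Python sketch of the standard Lloyd's iteration follows. It is not the parallel SQL-MapReduce implementation the slide refers to; the function signature, the naive choice of the first k points as initial centroids, and the iteration cap are assumptions for illustration. It requires Python 3.8 or later for `math.dist`.

```python
import math


def kmeans(points, k, iterations=20):
    """Cluster points (tuples of floats) into k groups with Lloyd's algorithm."""
    centroids = [tuple(p) for p in points[:k]]  # naive init: first k points
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, k=2)
```

In a parallel setting, the assignment step is the natural part to distribute, since each point can be assigned independently of the others.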

Example: Unpack Function
Transforming hidden data into analyst-accessible columns
What this gives you:
- Translates unstructured data from a single field into multiple structured columns
- Allows business analysts access to data with standard SQL queries
Example use cases:
- Sales data
- Stock transaction logs
- Gaming play logs
Unpack in Use: Pricing Analysis
Complete Aster Data Application:
- Text processing required to transform/unpack third-party sales data
- Sessionization required to prepare data for path analysis
- Statistical analysis of pricing
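The unpacking idea can be shown with a small Python sketch that splits one delimited field into named columns. This is only an illustration of the concept, not the Aster Data Unpack function or its arguments; the function name, the `|` delimiter, the column names, and the sample record are all hypothetical.

```python
def unpack(rows, field, columns, sep="|"):
    """Split a packed string field into separate named columns.

    rows: list of dicts; field: key holding the packed string;
    columns: names for the unpacked values, in order.
    Values are left as strings; type conversion is out of scope here.
    """
    out = []
    for row in rows:
        values = row[field].split(sep)
        # Copy every column except the packed one, then add the new columns.
        unpacked = {key: value for key, value in row.items() if key != field}
        unpacked.update(zip(columns, values))
        out.append(unpacked)
    return out


raw = [{"id": 1, "payload": "AAPL|100|187.50"}]
table = unpack(raw, "payload", ["symbol", "quantity", "price"])
```

After unpacking, each value sits in its own column, so it could be filtered or aggregated with ordinary queries rather than string parsing.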

PLUS: Announcing Additional Partners
4 new analytic application development partners building on Aster Data nCluster
- Fuzzy Logix: in-database quantitative library DB Lytix™, including mathematical and statistical methods, data mining algorithms, and Monte Carlo simulation techniques
- Cobi Systems: end-to-end analytic applications across financial services and retail
- Impetus: big data management applications integrating Aster Data nCluster and Hadoop
- Ermas Consulting: in-database SAS and R applications

Aster Data & Fuzzy Logix: Advancing In-Database Analytics on Big Data
Large Data Volume / Fast Processing / High Accuracy
- Balancing large volumes of data, throughput, and accuracy has always been a challenge; typically one or more of these is sacrificed for practical considerations.
- Fuzzy Logix provides an analytical platform on Aster Data nCluster, using SQL-MR, on which all three objectives can be achieved simultaneously.
- Traditional constraints of data analysis are almost non-existent on this platform.
Powered by in-database analytics on Aster Data nCluster

Introducing DB Lytix™ on Aster Data nCluster
Runs in-database and uses SQL-MapReduce for high-performance analytics on big data volumes
"DB Lytix is the most noteworthy in-database analytics tool" (Forrester Report, Nov 2009)
Analytical Functions in DB Lytix

Aster Data: Big Data Management & Analytics
- Stores and analyzes TBs to PBs of data
- Highly scalable, massively parallel DBMS
- Runs on commodity servers with incremental scaling
- Enables a new class of analytics and data-rich applications