WHT/082311 HPCC Systems Flavio Villanustre VP, Products and Infrastructure HPCC Systems Risk Solutions.

Presentation transcript:

Strata 2012 Keynote

INTRODUCTION

LexisNexis Risk Solutions
• More than 15 years of Big Data experience
• Provides information solutions to enterprise customers
• Generates about $1.4 billion in revenue
• Has been using the HPCC Systems platform for over 10 years

HPCC Systems
• Launched in June 2011
• Open source, enterprise-proven distributed Big Data analytics platform
• Helps enterprises manage Big Data at every step in the Complete Big Data Value Chain

THE COMPLETE BIG DATA VALUE CHAIN

• Collection: structured, unstructured and semi-structured data from multiple sources
• Ingestion: loading vast amounts of data onto a single data store
• Discovery & Cleansing: understanding format and content; clean up and formatting
• Integration: linking, entity extraction, entity resolution, indexing and data fusion
• Analysis: intelligence, statistics, predictive and text analytics, machine learning
• Delivery: querying, visualization, real time delivery on enterprise-class availability

[Diagram: Collection > Ingestion > Discovery & Cleansing > Integration > Analysis > Delivery]

MACHINE LEARNING IN BIG DATA

• How do you extract value from big data?
  • You surely can't glance over every record;
  • And it may not even have records…
• What if you wanted to learn from it?
  • Understand trends
  • Classify into categories
  • Detect similarities
  • Predict the future based on the past… (No, not like Nostradamus!)
• Machine learning is quickly establishing itself as an emerging discipline.
• But there are challenges with ML in big data:
  • Thousands of features
  • Billions of records
  • The largest machine that you can get may not be large enough…
• Get the picture?
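
To make the "largest machine may not be large enough" point concrete: many statistics can be computed one partition at a time and then merged, so no single node ever has to hold all the records. The following is a minimal Python sketch of that idea, not HPCC Systems or ECL code. The names Partial, summarize, merge and mean_and_variance are hypothetical, and the data is assumed to be already split into partitions.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Partial:
    """Per-partition summary: count, sum, and sum of squares."""
    n: int = 0
    total: float = 0.0
    total_sq: float = 0.0


def summarize(partition: Iterable[float]) -> Partial:
    """Scan one partition once; this step could run on any node."""
    p = Partial()
    for x in partition:
        p.n += 1
        p.total += x
        p.total_sq += x * x
    return p


def merge(parts: List[Partial]) -> Partial:
    """Combine per-partition summaries; only three numbers per node move."""
    out = Partial()
    for p in parts:
        out.n += p.n
        out.total += p.total
        out.total_sq += p.total_sq
    return out


def mean_and_variance(p: Partial) -> Tuple[float, float]:
    """Population mean and variance from the merged summary.

    Note: the sum-of-squares formula is simple but can lose precision
    on very large or badly scaled data; it is used here for brevity.
    """
    mean = p.total / p.n
    return mean, p.total_sq / p.n - mean * mean


# Hypothetical example data, split across three "nodes".
partitions = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
merged = merge([summarize(part) for part in partitions])
mean, variance = mean_and_variance(merged)
```

The design choice worth noting is that each node returns a fixed-size summary regardless of how many records it holds, which is what lets the approach scale with the number of nodes.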

ECL-ML: HPCC SYSTEMS MACHINE LEARNING

• A fully distributed and extensible set of Machine Learning techniques for Big Data
• State of the art algorithms in each of the Machine Learning domains, including supervised and unsupervised learning:
  • Correlation
  • Classifiers
  • Clustering
  • Statistics
  • Document manipulation
  • N-gram extraction
  • Histogram computation
  • Natural Language Processing
• Distributed and parallel underlying linear algebra library
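
Two of the listed techniques, N-gram extraction and histogram computation, are easy to illustrate on a single machine. The sketch below is plain Python and does not reproduce the ECL-ML API; the function names ngrams and ngram_histogram are hypothetical, and tokenization is assumed to be simple lowercase whitespace splitting.

```python
from collections import Counter
from typing import List, Tuple


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return every run of n consecutive tokens, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_histogram(docs: List[str], n: int) -> Counter:
    """Count how often each n-gram occurs across all documents."""
    counts: Counter = Counter()
    for doc in docs:
        # Assumption: lowercase + whitespace split stands in for a tokenizer.
        counts.update(ngrams(doc.lower().split(), n))
    return counts


# Hypothetical example documents.
docs = ["Big Data is big", "big data analytics"]
histogram = ngram_histogram(docs, 2)
```

Because per-document counts are simply added together, the same computation could in principle be run per partition and the counters summed afterwards.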

TAKE AWAYS

• A fully parallel set of Machine Learning algorithms on Big Data gives you full insight
• Outliers matter, especially when those outliers are the exact reason for the discovery effort (for example, in anomaly detection)
• Dimensionality reduction can lead to information loss: why risk losing valuable information when you can have it all?
• Leveraging a fully parallel machine learning solution on Big Data will help you identify fraud, bring products to market faster, and become more competitive
• Organizations that don't leverage the big data that they have risk losing ground to their competitors
• Get on it, now!
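
The remark that outliers can be the whole reason for the analysis can be illustrated with a very small anomaly-detection sketch. The slides do not name a method; the example below assumes a median-based "modified z-score" rule, with the conventional 0.6745 scaling constant and a 3.5 cutoff, chosen only for illustration. The function name mad_outliers and the sample values are hypothetical.

```python
from statistics import median
from typing import List


def mad_outliers(values: List[float], threshold: float = 3.5) -> List[float]:
    """Return values whose modified z-score exceeds the threshold.

    Uses the median and the median absolute deviation (MAD), which are
    not distorted by the outliers they are meant to expose.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread around the median: the score is undefined, flag nothing.
        return []
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical transaction amounts with one anomalous record.
amounts = [10, 11, 9, 10, 12, 10, 11, 500]
flagged = mad_outliers(amounts)
```

If the single anomalous record were dropped by sampling or aggregation before this step, there would be nothing left to flag, which is the risk the slide describes.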