HP Confidential

Welcome/agenda

The Big Data Ecosystem
  - Columnar DBMS
  - Unstructured Data
  - Hadoop
Big Data Trends
  - Data Governance
  - Cloud
  - Mobile
How to Survive in the Big Data World
TCO Comparison
Q&A

Introduction
  - Announcement and what it means
Big Data
  - Definition
  - Market Place
  - 3rd Party Validation
Columnar DBMS
  - What is it?
  - Where is it in the market place?
DEFINITION
‘Big Data’ is a term applied to data sets whose size is beyond the ability of commonly used software tools to capture, manage, and process the data within a tolerable elapsed time. ‘Big data’ sizes are a constantly moving target currently ranging from a few dozen terabytes to many petabytes of data in a single data set.

EXAMPLES
Web logs, RFID, sensor networks, social networks, Internet text and documents, Internet search indexing, call detail records, genomics, astronomy, biological research, military surveillance, medical records, photography archives, video archives, and large-scale eCommerce.

Source: Wikipedia
What’s going on in the industry?
  - 5 billion mobile phones in use in 2010
  - 30 billion pieces of content shared every month on Facebook
  - 40% projected growth in global data generated per year
  - Budgets and IT staff relatively flat or declining
Source: McKinsey Global Institute, Big Data: The Next Frontier for Innovation, Competition and Productivity.

Figure 1: The Digital Universe 2009 - 2020, Growing by a Factor of 44
  - 2009: 0.8 ZB*
  - 2020: 35 ZB*
*Zettabyte = 1 trillion gigabytes
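As a quick check on the figures above, the sketch below recomputes the growth factor from the two Digital Universe data points and the constant yearly growth rate they would imply. Only the 0.8 ZB (2009) and 35 ZB (2020) values and the zettabyte conversion come from the slide; the function and variable names are assumptions made for illustration.

```python
# Illustrative check of the Digital Universe figures quoted on the slide.
# Only the data points (0.8 ZB in 2009, 35 ZB in 2020) come from the source;
# the helper names below are assumptions made for this sketch.

GB_PER_ZB = 10**12  # 1 zettabyte = 1 trillion gigabytes (slide footnote)


def growth_factor(start: float, end: float) -> float:
    """Return how many times larger `end` is than `start`."""
    return end / start


def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Return the constant yearly growth rate that turns `start` into `end`."""
    return (end / start) ** (1 / years) - 1


zb_2009, zb_2020 = 0.8, 35.0
factor = growth_factor(zb_2009, zb_2020)
rate = implied_annual_growth(zb_2009, zb_2020, 2020 - 2009)

print(f"Growth factor 2009-2020: {factor:.2f}x")
print(f"Implied annual growth:   {rate:.1%}")
print(f"2020 volume in GB:       {zb_2020 * GB_PER_ZB:.2e}")
```

The factor comes out just under 44, and the implied compound rate lands close to the 40% yearly growth projection quoted on the same slide, so the two figures are broadly consistent with each other.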
Where are we seeing “Big Data”?
  - FINANCIAL SERVICES
  - COMMUNICATIONS
  - CONSUMER MARKETING
  - HEALTHCARE
  - RETAIL
  - ONLINE WEB AND GAMING
What’s the value of Big Data?

There is strategic value in big data; with real-time analytics, organizations are able to maximize business value and efficiencies.

Opportunity to monetize ‘Big Data’ is everywhere
  - TECHNOLOGY: Sensors, XML, LOBs, IPv6
  - SOCIAL MEDIA
  - HEALTHCARE: Electronic Patient Record, Medical Imaging, Gene Sequencing
  - COMPLIANCE: Sarbanes-Oxley, HIPAA, Basel II
  - MOBILITY
  - GEOPHYSICAL EXPLORATION
  - ENTERPRISE: ERP, CRM, Products, Customers, Suppliers, Partners
  - FINANCIAL SERVICES: Algorithmic Trading, High-frequency Trading
  - COMMUNICATIONS: Call Detail Records
2011 Top Strategic Initiatives
  1. Cloud Computing
  2. Mobile Applications
  3. Social Collaboration
  4. Video
  5. Next Generation Analytics
  6. Social Analytics
  7. Context Aware Computing
  8. Storage Class Memory
  9. Ubiquitous Computing
  10. Fabric Based Infrastructure
We live in an analytics world
  - More data, and it comes in continuously
  - No more overnight batch loading
  - Mixed workloads and a variety of users accessing the data
  - Must retain long history of data for compliance and analysis
  - Need to customize and analyze diverse data/relations
  - New forms of data for mining (logs, social media, etc.)
Creates a great opportunity!
Power and benefits of real-time analytics
  - Create competitive differentiation via information and rich analytics
    - Optimize user experiences via real-time campaign updates/management
    - Customize interactions with constituents, clients, prospects via real-time engagement
  - Reduce operational expense while improving critical Key Performance Indicators (KPIs)
    - Drastically reduce exposure to fraud and other nefarious business activities
  - Understand brand sentiment and social trends
    - Proactively manage customer satisfaction and brand recognition
WHAT ARE COLUMNAR DBMS AND HOW DO THEY SOLVE BIG DATA CHALLENGES?
Traditional Business and Technology Gap

Business Workload
How IT is Deployed

One size does not fit all!
  - Cost: UP
  - Complexity: UP
  - Performance: DOWN
  - Scale: LIMITED
The Problem: Data, Access, Performance
  - Data volumes are growing at increasing rates
  - Users ask questions iteratively on their own
  - “Classic” DBMS are 30 years old and slowing you down
Why Next Generation Analytics?

Legacy analytic methodologies are becoming obsolete
  - Analysis based on summary data
  - Poor performance
  - Application down-time
  - Batch-style loading and querying
  - DB as a place to park the data
  - Canned SQL queries
  - 100+ control knobs to be tweaked

Next-gen business models require next-gen analytics!
Column-Store is Transformational & Shortest Time to Value
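To make the column-store idea concrete, here is a minimal sketch contrasting a row-oriented layout with a column-oriented one for the same small table. It is a toy illustration in plain Python, not a description of how Vertica or any other product is implemented; the call-record data, the column names, and the function names are all invented for this example.

```python
# Toy comparison of row-oriented and column-oriented layouts.
# All data, column names, and helpers here are hypothetical illustrations.

from typing import Dict, List, Tuple

# Row store: the fields of each record are kept together.
Row = Tuple[int, str, float]  # (call_id, region, duration_seconds)
rows: List[Row] = [
    (1, "east", 120.0),
    (2, "west", 45.5),
    (3, "east", 300.0),
    (4, "north", 15.0),
]


def to_columns(records: List[Row]) -> Dict[str, list]:
    """Pivot row records into one list of values per column."""
    call_ids, regions, durations = zip(*records)
    return {
        "call_id": list(call_ids),
        "region": list(regions),
        "duration_seconds": list(durations),
    }


def total_duration_row_store(records: List[Row]) -> float:
    """Aggregate one field; every full record is visited to reach it."""
    return sum(record[2] for record in records)


def total_duration_column_store(columns: Dict[str, list]) -> float:
    """Aggregate one field; only that column's values are read."""
    return sum(columns["duration_seconds"])


columns = to_columns(rows)
print(total_duration_row_store(rows))
print(total_duration_column_store(columns))
```

Both functions return the same total. The difference is that the columnar version touches only the one column the query needs, which is the property analytic column stores rely on when a query reads a few columns out of many.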
The Forrester Wave™ Enterprise Data Warehousing Platforms, Q1 2011

With an increasing focus on performance, scalability, optimized storage, and in-database analytics, Vertica Systems positions its EDW offerings as a robust platform for the most demanding enterprise analytics. Vertica’s customer momentum, coupled with its focus on enhancing its columnar-based EDW architecture, gives it a competitive advantage. Expect that Vertica will leverage these strengths, … to grow its share of the market among large enterprises looking for a high-performance massively parallel EDW.

Source: Forrester Research, Inc., “The Forrester Wave™ Enterprise Data Warehousing Platforms, Q1 2011,” James G. Kobielus, 10 February 2011.
Gartner Magic Quadrant for Data Warehouse Database Management Systems, 2011

Source: Gartner, “Magic Quadrant for Data Warehouse Database Management Systems,” Donald Feinberg, Mark A. Beyer, 28 January 2011.

This Magic Quadrant graphic was published by Gartner, Inc. as part of a larger research note and should be evaluated in the context of the entire report. The Gartner report is available upon request from HP.

The Magic Quadrant is copyrighted January 28, 2011, by Gartner, Inc. and is reused with permission. The Magic Quadrant is a graphical representation of a marketplace at and for a specific time period. It depicts Gartner's analysis of how certain vendors measure against criteria for that marketplace, as defined by Gartner. Gartner does not endorse any vendor, product or service depicted in the Magic Quadrant, and does not advise technology users to select only those vendors placed in the "Leaders" quadrant. The Magic Quadrant is intended solely as a research tool, and is not meant to be a specific guide to action. Gartner disclaims all warranties, express or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
Breaking traditional barriers to entry to managing Big Data

Vertica Analytics Platform
  - Founded in 2005 by Michael Stonebraker
    - ‘Purpose Built’ analytic platform
  - Low-latency “Real Time” analytics
  - Powerful UDxF framework
  - 50–1000x faster performance than traditional row-stores at ¼ the cost
  - Simple install/use with auto setup and tuning
  - Industry standard x86 hardware
  - Hybrid in-memory/on-disk architecture
  - Rich analytics: GIS, Event Series, GFI, Regression
  - Large scale, multi-use workloads

SPEED | SCALABILITY | SIMPLICITY | TCO