April 10-12, Chicago, IL PDW Architecture Gets Real: Customer Implementations Brian Walker | Microsoft Corporation PDW Center of Excellence Murshed Zaman.

Presentation transcript:

April 10-12, Chicago, IL. PDW Architecture Gets Real: Customer Implementations. Brian Walker | Microsoft Corporation, PDW Center of Excellence. Murshed Zaman | Microsoft Corporation, SQL Customer Advisory Team.

3 Agenda

5 Introducing Parallel Data Warehouse
Pre-built hardware + software appliance, co-engineered with HP and Dell.
Pre-built hardware.
Pre-installed software.
Appliance installed in 1-2 days.
Support: Microsoft provides first-call support; the hardware partner provides onsite break/fix support.
Themes: Plug and Play, Built-in Best Practices, Save Time.

6 The Power of PDW: Massively Parallel Processing (MPP) | Symmetric Multi-Processing (SMP)

7 The Basic Full Rack (SQL Server PDW 2012)
1 rack, InfiniBand and Ethernet.
128 cores on 8 compute nodes.
2 TB of RAM on compute.
Up to 168 TB of tempdb.
Up to 1 PB of user data.
Reduce hardware footprint by virtualizing the entire control server rack down to a few nodes.
1.5x lower price/TB, providing one of the lowest prices per TB in the industry.
Save up to 70% of storage with up to ~15x compression via the xVelocity columnstore.
Resilient, scalable, high-performance storage features in Windows Server 2012 replace the SAN with high-density, low-cost SAS JBODs.
70% more disk I/O bandwidth than SQL Server PDW 2008 R2.

8 Dimensional Model
Date Dim: Date Dim ID, Calendar Year, Calendar Qtr, Calendar Mo, Calendar Day.
Store Dim: Store Dim ID, Store Name, Store Mgr, Store Size.
Item Dim: Prod Dim ID, Prod Category, Prod Sub Cat, Prod Desc.
Promo Dim: Mktg Camp ID, Camp Name, Camp Mgr, Camp Start, Camp End.
Sls Fact: Date Dim ID, Store Dim ID, Prod Dim ID, Mktg Camp ID, Qty Sold, Dollars Sold.
[Figure: PDW Data Layout. Each compute node is drawn with copies of the dimension tables (D, S, I, P) and one slice of the fact table (F1 to F5).]
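The layout on this slide (dimension tables copied to every compute node, fact table split into slices) can be illustrated with a short sketch. This is not PDW code: the node count, the CRC32 hash, the sample rows, and every function name below are assumptions made for illustration only.

```python
from collections import defaultdict
import zlib

NUM_NODES = 5  # assumption: mirrors the five fact slices (F1 to F5) in the figure

# Small dimension table: every node keeps a full copy (replicated).
store_dim = {1: "Downtown", 2: "Airport"}

# Fact rows: (date_dim_id, store_dim_id, prod_dim_id, qty_sold, dollars_sold)
sales_fact = [
    (20130410, 1, 100, 2, 20.0),
    (20130410, 2, 101, 1, 5.5),
    (20130411, 1, 102, 3, 12.25),
    (20130411, 2, 100, 1, 7.0),
    (20130412, 1, 103, 4, 30.0),
    (20130412, 2, 104, 1, 4.5),
]


def node_for(key: int) -> int:
    """Map a distribution-key value to a node using a stable hash."""
    return zlib.crc32(str(key).encode()) % NUM_NODES


def distribute(rows, key_index):
    """Split fact rows across nodes by hashing one column."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[node_for(row[key_index])].append(row)
    return nodes


def local_sales_by_store(fact_slice, dim_copy):
    """Join one fact slice to the node-local dimension copy and aggregate."""
    totals = defaultdict(float)
    for _, store_id, _, _, dollars in fact_slice:
        totals[dim_copy[store_id]] += dollars
    return totals


def total_sales_by_store(nodes, dim_copy):
    """Merge the per-node partial results into one final answer."""
    merged = defaultdict(float)
    for fact_slice in nodes.values():
        for name, amount in local_sales_by_store(fact_slice, dim_copy).items():
            merged[name] += amount
    return dict(merged)


nodes = distribute(sales_fact, key_index=2)  # distribute on Prod Dim ID
totals = total_sales_by_store(nodes, store_dim)
```

Because each node holds a full copy of the dimension, the join in `local_sales_by_store` never needs rows from another node; only the small per-node totals are combined at the end.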

9 Seamlessly Add Capacity: Smallest (53 TB) to Largest (6 PB)
Start small with a warehouse of a few terabytes.
Add capacity up to 6 petabytes.
Start small and grow: scale out linearly.

10 Any Size: Next-Gen Performance
Columnstore Provides Dramatic Performance
Updateable and clustered xVelocity columnstore.
Stores data in columnar format.
Memory-optimized for next-generation performance.
Updateable to support bulk and/or trickle loading.
Up to 50x faster. Up to 15x compression. Save time and costs.
[Figure: xVelocity batch processing for fast data query processing, with columns Customer, Sales, Country, Supplier, Products.]
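To see why a columnar format tends to compress well, the following sketch pivots a few rows into columns and run-length encodes one of them. Run-length encoding is only one possible technique and is used here as an assumption; the slide does not say how xVelocity encodes data, and the sample rows and function names are hypothetical.

```python
from itertools import groupby

# Row-oriented sample data: (country, product, qty)
rows = [
    ("US", "Widget", 3),
    ("US", "Widget", 1),
    ("US", "Gadget", 2),
    ("DE", "Gadget", 5),
    ("DE", "Gadget", 4),
]


def to_columns(rows, names):
    """Pivot row tuples into one list per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def rle_encode(values):
    """Collapse runs of equal adjacent values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original list."""
    return [value for value, count in pairs for _ in range(count)]


columns = to_columns(rows, ("country", "product", "qty"))
encoded_country = rle_encode(columns["country"])

# A query that needs only one column touches only that column's data.
total_qty = sum(columns["qty"])
```

Values within a single column repeat far more often than values across a whole row, which is what gives an encoder like this something to collapse.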

12 Any Data: Hadoop Integration
PolyBase details:
External tables and full SQL query access to data stored in HDFS.
HDFS bridge for direct, fully parallelized access to data in HDFS.
Joining PDW data 'on the fly' with data from HDFS.
Parallel import of data from HDFS into PDW tables for persistent storage.
Parallel export of PDW data into HDFS, including 'round-tripping' of data.
[Figure: PDW 2012 architecture. Labels: Regular T-SQL, Results, Enhanced PDW Query Engine, External Table, HDFS Bridge, Structured data, Unstructured data, HDFS Data Nodes.]

13 Existing Excel Skillset With Big Data
Familiar tools to analyze structured and unstructured data (Hadoop data and structured data).
Native Microsoft BI integration with PDW.
Structured and unstructured data in the same spreadsheet.
Widely adopted and familiar user tools.
No IT intervention.
Themes: Familiar Tools, Analyze Big Data, Analyze All Data Types, High Adoption of Excel.

16 Upgrading to PDW Gains 100x Improvement
Benefits: “…basic queries that previously took 20 minutes only took seconds using the SQL Server 2008 R2 Parallel Data Warehouse.” - Tom Settle, Assistant VP, Data Warehousing, Hy-Vee

17 Business Objectives
Critical: Provide a broader range of critical customer purchasing data. The current system only supported 2 years of data; the business required 7 years.
Save Time: Enable self-service reporting (SSAS/SSRS/SharePoint/Excel).
Query: Enable user ad hoc reporting, leveraging Excel/SharePoint.
Load Speed: Improve performance of complex transformations. Faster delivery of data within specified SLAs.
Save Costs: Reduce IT costs. Creating self-sufficient end users frees IT to focus on delivering new data.
Scale: Provide a solution that scales to meet future data needs: expansion of history, point of sale detail, and expansion into social media.

18 Shift from ETL to ELT
Complex Transformations: Using the Power of MPP
Hy-Vee moved its complex transformations and calculations from the ETL server to SQL Server Parallel Data Warehouse.
PDW has allowed Hy-Vee to create an enterprise data warehouse centralizing data from many sources.
Point of sale source files are archived for later data extraction.
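The difference between ETL and ELT is where the transformation runs. A minimal sketch of the ELT pattern follows, using Python's built-in `sqlite3` module as a stand-in for the warehouse. SQLite is not PDW, and the table names, sample lines, and `run_elt` function are hypothetical; the point is only that raw rows are loaded first and the transformation is then expressed as SQL inside the database.

```python
import sqlite3

# Hypothetical point-of-sale lines as they might arrive from a source file.
raw_pos_lines = [
    "2013-04-10,store-1,widget,2,19.98",
    "2013-04-10,store-1,gadget,1,5.00",
    "2013-04-11,store-2,widget,3,29.97",
]


def run_elt(lines):
    conn = sqlite3.connect(":memory:")

    # Extract + Load: land the rows untransformed in a staging table.
    conn.execute(
        "CREATE TABLE stage_pos "
        "(sale_date TEXT, store TEXT, item TEXT, qty TEXT, dollars TEXT)"
    )
    conn.executemany(
        "INSERT INTO stage_pos VALUES (?, ?, ?, ?, ?)",
        [line.split(",") for line in lines],
    )

    # Transform: type casting and aggregation run inside the database engine.
    conn.execute(
        """
        CREATE TABLE daily_store_sales AS
        SELECT sale_date,
               store,
               SUM(CAST(qty AS INTEGER)) AS qty_sold,
               ROUND(SUM(CAST(dollars AS REAL)), 2) AS dollars_sold
        FROM stage_pos
        GROUP BY sale_date, store
        """
    )
    result = conn.execute(
        "SELECT sale_date, store, qty_sold, dollars_sold "
        "FROM daily_store_sales ORDER BY sale_date, store"
    ).fetchall()
    conn.close()
    return result


daily_sales = run_elt(raw_pos_lines)
```

In an ETL design the casting and aggregation would happen on a separate server before the load; here the staging table receives the raw text and the database engine does the work.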

19 Upgrade to PDW 2012
Future Option
Improves Hy-Vee's opportunity to further analyze social media data.
Query data without having to move it into a relational database.
Provides an alternative archive solution for point of sale data.

Data Archive Challenge - Financial Customer: Current Solution
The business only actively analyzes a rolling 12 months of data.
Regulations require that data is online and accessible for an extended period.
Data older than 12 months is pushed to a farm of SQL servers to meet regulatory requirements.
[Figure: Centralized EDW, Reporting Services, Archive Servers.]

Data Archive Challenge - Financial Customer: Future Solution
Replace the archive farm with a Hadoop cluster.
PDW provides a single point of access.
Allows analysts to leverage existing SQL skills.
Much lower maintenance and administration.
Meets regulatory requirements.
[Figure: Centralized EDW, Reporting Services, Archive Servers, HDFS bridge, HDFS Data Nodes, Unstructured data.]

22 AMD Boosts Performance with PDW
Benefits
AMD is also processing more reporting queries than it previously could (between 10,000 and 13,000 a day) with an average runtime of a few seconds and virtually no performance issues.
Because of user complaints about the previous system, the data warehouse team had one employee devoted full time to addressing performance-related support tickets. With Parallel Data Warehouse, AMD has reduced support work to just a few hours a week.
AMD runs an average of 1,500 loads per day, and data loads to a given table range from four-minute to four-hour intervals. AMD averages about 500,000 file loads a day.
“We used to worry about backlogs, but no more,” - Rajarao Chitturi, Database and Applications Manager at AMD

23 AMD Business Challenges
Obstacles With SMP Oracle: Only supported 6-month data retention. Issues loading concurrently with high query volume.
Load Demand: Data loading always lagged behind by days. Analysts couldn't access recent data. Continuous data loads ran throughout the day while users were querying the system.
Linux-Based Reporting: Custom reporting tools hosted on Linux use JDBC and ODBC drivers.

24 Project Overview
Wafer Quality Assurance Data - 42 TB on PDW.
Space Saving PDW Index Lite Approach - Oracle required excessive non-clustered indexes to get any performance.
Improved Loading Speed - GB/hr. throughput.
10,000 - 13,000 Analytic Queries per Day - Most are scan intensive.
Faster Backups - Complete in 1~2 hours per database, compared to a week on Oracle.
Reduced Support Costs by 90% - No more chopping up queries to fit the data warehouse.
Themes: Critical, Save Time, Query, Save Space, Load Speed, Save Costs.

25 Parallel Data Warehouse 2012

26 Other PDW Sessions
Online Advertising: Hybrid Approach to Large-Scale Data Analysis (DAV-303-M)
Data Analytics and Visualization Breakout Session (60 minutes)
Fri April 12, 2013, 2:45 PM - 3:45 PM in Sheraton 3
Anna Skobodzinski, Christian Bonilla, Dmitri Tchikatilov, Trevor Attridge

Thank you!