
1 Modernizing Business with BIG DATA. Aashish Chandra, Divisional VP, Sears Holdings; Global Head, Legacy Modernization, MetaScale

2 Big Data Fueling Enterprise Agility
Press coverage of the Sears Holdings Hadoop journey:
- Harvard Business Review cites the Sears Holdings Hadoop use case in "Big Data's Management Revolution"
- "Sears Eschews IBM/Oracle for Open Source and Self-Build"
- "Sears' Big Data Swap Lesson: Functionality Over Price?"
- "How Banks Can Benefit from Real-Time Big Data Analytics"

3 Legacy Rides the Elephant
Hadoop has changed the enterprise big data game. Are you languishing in the past, clinging to outdated approaches?

4 Journey to a World with NO Mainframes
Why migrate: high TCO, resource crunch, inert business practices.
I. Mainframe Optimization: 5%~10% MIPS reduction; quick wins with low-hanging fruit.
II. Mainframe ONLINE: tool-based conversion; convert COBOL & JCL to Java.
III. Mainframe BATCH: ETL modernization; move batch processing to Hadoop via Pig/Hadoop rewrites.
The payoff: cost savings, an open-source platform, simpler and easier code, business agility, business & IT transformation, modernized systems, and IT efficiencies.

5 Why Hadoop and Why Now?
THE ADVANTAGES: cost reduction; alleviates performance bottlenecks where ETL is too expensive and complex; moves mainframe and data warehouse processing to Hadoop.
THE CHALLENGE: traditional enterprises' lack of awareness.
THE SOLUTION: leverage the growing support system for Hadoop; make Hadoop the data hub in the enterprise; use Hadoop for processing batch and analytic jobs.

6 The Classic Enterprise Challenge
- Growing data volumes
- Shortened processing windows
- Escalating costs
- Hitting scalability ceilings
- Demanding business requirements
- ETL complexity
- Latency in data
- Tight IT budgets

7 The Sears Holdings Approach
1) Implement a Hadoop-centric reference architecture
2) Move enterprise batch processing to Hadoop
3) Make Hadoop the single point of truth
4) Massively reduce ETL by transforming within Hadoop
5) Move results and aggregates back to legacy systems for consumption
6) Retain, within Hadoop, source files at the finest granularity for re-use
Key to our approach: 1) allowing users to continue to use familiar consumption interfaces; 2) providing inherent HA; 3) enabling businesses to unlock previously unusable data.
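To make points 4 through 6 concrete, here is a minimal Pig Latin sketch of the pattern; the HDFS paths and sales schema are hypothetical, not the actual Sears implementation:

    -- Minimal sketch (hypothetical paths/schema): transform inside Hadoop,
    -- keep the raw grain for re-use, ship only aggregates back out.
    raw_sales = LOAD '/data/raw/sales' USING PigStorage('|')
                AS (store_id:int, sku:chararray, sale_date:chararray, amount:double);

    -- Transformation happens here, in Hadoop, rather than in an ETL tool.
    by_store_item = GROUP raw_sales BY (store_id, sku);
    aggregates = FOREACH by_store_item GENERATE
                     FLATTEN(group) AS (store_id, sku),
                     SUM(raw_sales.amount) AS total_sales,
                     COUNT(raw_sales)      AS line_items;

    -- Raw files stay in HDFS at the finest granularity; only the small
    -- aggregate result moves back to the legacy side for consumption.
    STORE aggregates INTO '/data/export/store_item_sales' USING PigStorage(',');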

8 The Architecture
Enterprise solutions using Hadoop must be an eco-system. Large companies have a complex environment:
- Transactional systems
- Services
- EDW and data marts
- Reporting tools and needs
We needed to build an entire solution.

9 The Sears Holdings Architecture

10 Pig/Hadoop Ecosystem (MetaScale)

11 The Learning
HADOOP:
- We can dramatically reduce batch processing times for mainframe and EDW workloads
- We can retain and analyze data at a much more granular level, with longer history
- Hadoop must be part of an overall solution and eco-system
IMPLEMENTATION:
- We can reliably meet our production deliverable time-windows by using Hadoop
- We can largely eliminate the use of traditional ETL tools
- New tools allow an improved user experience on very large data sets
UNIQUE VALUE:
- We developed tools and skills; the learning curve is not to be underestimated
- We developed experience moving workloads from expensive, proprietary mainframe and EDW platforms to Hadoop, with spectacular results
- Over two years of experience using Hadoop for enterprise legacy workloads

12 Some Example Use-Cases at Sears Holdings

13 The Challenge – Use-Case #1
Scale: sales: 8.9B line items; stores: 3,200 sites; items: 11.3M SKUs; inventory: 1.8B rows; price sync: daily; elasticity: 12.6B parameters; offers: 1.4B SKUs; timing: weekly.
- Intensive computational and large storage requirements
- Needed to calculate item price elasticity based on 8 billion rows of sales data
- Could only be run quarterly, and on a subset of data; needed more often
- Business need: react to market conditions and new product launches

14 The Result – Use-Case #1
Business problem:
- Intensive computational and large storage requirements
- Needed to calculate store-item price elasticity based on 8 billion rows of sales data
- Could only be run quarterly, and on a subset of data
- Business was missing the opportunity to react to changing market conditions and new product launches
With Hadoop:
- Price elasticity calculated weekly
- 100% of the data set, at full granularity
- New business capability enabled
- Meets all SLAs
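The deck does not show the elasticity code itself; the following is only a hedged Pig Latin sketch of one conventional way to compute a per-store-item price elasticity (the OLS slope of log quantity against log price), with hypothetical paths, schema, and field names:

    -- Hedged sketch: weekly store-item elasticity as the slope of a
    -- log-log regression over the full sales history (hypothetical schema).
    raw = LOAD '/data/raw/sales' USING PigStorage(',')
          AS (store_id:int, sku:chararray, price:double, qty:double);
    sales = FILTER raw BY price > 0 AND qty > 0;  -- LOG needs positive inputs

    -- Precompute the regression terms per observation.
    pts = FOREACH sales GENERATE store_id, sku,
              LOG(price) AS x, LOG(qty) AS y,
              LOG(price) * LOG(qty)   AS xy,
              LOG(price) * LOG(price) AS xx;

    grp = GROUP pts BY (store_id, sku);

    -- OLS slope: (n*Sxy - Sx*Sy) / (n*Sxx - Sx^2), i.e. the elasticity.
    elasticity = FOREACH grp GENERATE
        FLATTEN(group) AS (store_id, sku),
        ((double)COUNT(pts) * SUM(pts.xy) - SUM(pts.x) * SUM(pts.y)) /
        ((double)COUNT(pts) * SUM(pts.xx) - SUM(pts.x) * SUM(pts.x)) AS elasticity;

    STORE elasticity INTO '/data/out/elasticity_weekly' USING PigStorage(',');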

15 The Challenge – Use-Case #2
Scale: input records: billions; data sources: 30+; mainframe: 100 MIPS on 1% of data; unable to scale 100-fold.
- Mainframe batch business process would not scale
- Needed to process 100 times more detail to handle business-critical functionality
- Business need required processing billions of records from 30 input data sources
- Complex business logic and financial calculations
- SLA for this cyclic process was 2 hours per run

16 The Result – Use-Case #2
Business problem:
- Mainframe batch business process would not scale
- Needed to process 100 times more detail to handle rollout of high-value, business-critical functionality
- Time-sensitive business need required processing billions of records from 30 input data sources
- Complex business logic and financial calculations
- SLA for this cyclic process was 2 hours per run
With Hadoop:
- Teradata and mainframe data landed on Hadoop
- Java UDFs for financial calculations
- Implemented Pig for processing
- Scalable solution in 8 weeks
- Processing met a tighter SLA; $600K annual savings; 6,000 lines reduced to 400 lines of Pig
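The slide names the ingredients but not the code; below is a minimal hedged sketch of the Pig-plus-Java-UDF pattern. The jar name, UDF class, and schema are hypothetical, not the actual Sears implementation:

    -- Hedged sketch: complex financial logic lives in a Java UDF; Pig
    -- provides the parallel plumbing across billions of landed records.
    REGISTER 'finance-udfs.jar';                        -- hypothetical jar
    DEFINE NetMargin com.example.finance.NetMargin();   -- hypothetical UDF

    txns = LOAD '/data/landed/transactions' USING PigStorage('|')
           AS (acct:chararray, gross:double, fees:double, rate:double);

    margins = FOREACH txns GENERATE acct,
                  NetMargin(gross, fees, rate) AS margin;

    STORE margins INTO '/data/out/margins' USING PigStorage(',');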

17 The Challenge – Use-Case #3
Scale: mainframe jobs: 64; data storage: mainframe DB2 tables; price data: 500M records; processing window: 3.5 hours.
- Mainframe unable to meet SLAs on growing data volume

18 The Result – Use-Case #3
Business problem: mainframe unable to meet SLAs on growing data volume.
With Hadoop:
- Source data in Hadoop
- Job runs in 1.5 hours, down from 3.5 (over 100% faster)
- Maintenance improvement: under 50 lines of Pig code
- $100K in annual savings
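As an illustration only (the actual job logic is not in the deck), a sub-50-line Pig replacement for DB2-style batch work often follows the classic "latest record per key" idiom sketched below; paths and fields are hypothetical:

    -- Hedged sketch: once the 500M price records are landed in HDFS,
    -- keep only the latest effective price per store-item.
    prices = LOAD '/data/landed/prices' USING PigStorage(',')
             AS (sku:chararray, store_id:int, eff_date:chararray, price:double);

    by_item = GROUP prices BY (sku, store_id);
    latest  = FOREACH by_item {
        ordered = ORDER prices BY eff_date DESC;  -- newest first
        newest  = LIMIT ordered 1;
        GENERATE FLATTEN(newest);
    };

    STORE latest INTO '/data/out/current_prices' USING PigStorage(',');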

19 The Challenge – Use-Case #4
Before: transformation on Teradata (via Business Objects); history retained: no; new report development: slow; user experience: unacceptable; batch processing output: .CSV files.
- Needed to enhance the user experience and enable analytics on granular data
- Availability of data restricted by space constraints
- Needed to retain granular data
- Needed agile, Excel-style interaction with data sources of hundreds of millions of records

20 The Result – Use-Case #4
Business problem: as above; granular analytics, granular data retention, and agile, Excel-style interaction at scale.
With Hadoop:
- Sourcing data directly to Hadoop; transformation moved to Hadoop
- Redundant storage eliminated; granular history retained
- The business's single source of truth; over 50 data sources retained in Hadoop
- Pig scripts ease code maintenance; Datameer for additional analytics
- User experience expectations met
A sketch of the history-retention pattern follows.
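In this hedged sketch, each day's feed is stored at full granularity under a dated path, so history accumulates in Hadoop and downstream tools such as Datameer can read the granular store directly. The paths, schema, and parameter name are hypothetical:

    -- Hedged sketch; run as, e.g.: pig -param DATE=2013-01-15 retain_history.pig
    daily = LOAD '/data/incoming/pos/$DATE' USING PigStorage('|')
            AS (store_id:int, sku:chararray, qty:double, amount:double);

    -- No aggregation: the point is to keep the finest grain for re-use.
    STORE daily INTO '/data/history/pos/dt=$DATE' USING PigStorage('|');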

21 Summary of Benefits
Themes: cost savings; business agility; faster time to market; IT operational efficiencies; transforming I.T. skills and resources.
Proof points:
- Saved ~$2MM annually within 13 weeks through MIPS optimization efforts
- Reduced 1,000+ MIPS by moving batch processing to Hadoop
- Significant reduction in ISV costs and mainframe software license fees; open-source platform
- Reduced batch processing in COBOL/JCL from over 6 hours to under 10 minutes in Pig Latin on Hadoop
- Moved 7,000 lines of COBOL code to under 50 lines of Pig; simpler, easily maintainable code
- Massively parallel processing
- Mission-critical "Item Master" application in COBOL/JCL being converted to Java ("JOBOL") by our tool
- Modernized COBOL, JCL, DB2, VSAM, IMS, and so on
- Ancient systems no longer a bottleneck for the business
- Readily available resources and commodity skills; access to the latest technologies

22 Summary
- Hadoop can revolutionize enterprise workloads and make the business agile
- Can reduce strain on legacy platforms
- Can reduce cost
- Can bring new business opportunities
- Must be an eco-system
- Must be part of an overall data strategy
- Not to be underestimated

23 The Horizon – What Do We Need Next?
- Automation tools and techniques that ease the enterprise integration of Hadoop
- Educate traditional enterprise IT organizations about the possibilities and reasons to deploy Hadoop
- Continue development of a reusable framework for legacy workload migration

24 Legacy Modernization Made Easy!
For more information, visit: www.metascale.com
Follow us on Twitter: @LegacyModernizationMadeEasy
Join us on LinkedIn: www.linkedin.com/company/metascale-llc
Contact: Kate Kostan, National Solutions, Kate.Kostan@MetaScale.com
