The BigFrame Team: Duke University, Hong Kong Polytechnic University, and HP Labs.


Analytics System Landscape

What does this mean for Big Data Practitioners?

Gives them a lot of power! From:

Even the mighty may need a little help

Challenges for Practitioners. App Developers, Data Scientists: Which system to use for the app that I am developing? Features (e.g., graph data); performance (e.g., claims like "System A is 50x faster than B"); resource efficiency; growth and scalability; multi-tenancy.

Challenges for Practitioners. App Developers, Data Scientists: Which system to use for the app that I am developing? Different parts of my app have different requirements: compose best-of-breed systems, or use a one-size-fits-all system? System Admins: Managing many systems is hard!

Challenges for Practitioners. App Developers, Data Scientists: Which system to use for the app that I am developing? Different parts of my app have different requirements. System Admins: Managing many systems is hard! CIO: Total Cost of Ownership (TCO)?

One Approach

Useful, But …

How a user uses BigFrame. [Diagram: BigFrame Interface; Benchmark Generator; Benchmark Driver for System Under Test; systems shown: HBase, Hive, MapReduce]

bspec: Benchmark Specification. 1. Data for initial load. 2. Data refresh pattern. 3. Query streams. 4. Evaluation metrics. [Diagram labels: HBase, Hive, MapReduce]
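
As a rough illustration of the four parts of a bspec named on the slide above, the Python sketch below models them as a plain data structure. Everything in it is an assumption made for illustration: the class name BenchmarkSpec, its field names, and the example values are hypothetical and do not reflect BigFrame's actual specification format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BenchmarkSpec:
    """Hypothetical stand-in for a bspec; not BigFrame's real format."""
    # 1. Data for initial load: data set name -> size label
    initial_load: Dict[str, str]
    # 2. Data refresh pattern (free-form label in this sketch)
    refresh_pattern: str
    # 3. Query streams: each stream is an ordered list of query names
    query_streams: List[List[str]]
    # 4. Evaluation metrics to report
    metrics: List[str] = field(default_factory=list)

    def summary(self) -> str:
        """One-line description of what the spec asks for."""
        n_queries = sum(len(stream) for stream in self.query_streams)
        return (
            f"{len(self.initial_load)} data sets, "
            f"refresh={self.refresh_pattern}, "
            f"{len(self.query_streams)} streams / {n_queries} queries, "
            f"metrics={','.join(self.metrics)}"
        )


# Illustrative values only; the query names are placeholders.
example = BenchmarkSpec(
    initial_load={"Web_sales": "large", "Tweets": "large"},
    refresh_pattern="none",
    query_streams=[["q1", "q2"], ["q3"]],
    metrics=["latency"],
)
print(example.summary())
```

Keeping the four parts as separate fields mirrors the slide's numbering, so a driver for a given system under test could read only the parts it needs.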

What does the user (want to) specify? [Diagram: BigFrame Interface]

The 3 Vs

bigif: BigFrame's InputFormat. Data Variety: relational, text, array, graph. Data Volume: small, medium, large. Data Velocity: at rest, slow, fast. Query Variety: micro, macro. Query Volume: query concurrency & classes. Query Velocity: exploratory, continuous.
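
To make the data and query dimensions on the slide above concrete, here is a minimal Python sketch that checks a user-supplied specification against the values the slide lists. It is not BigFrame's actual bigif syntax. The key names, the validate function, and the pairing of values to dimensions (for example, micro/macro with query variety) are assumptions based on one reading of the slide layout, and query volume is simplified to a single concurrency number.

```python
# Allowed values per dimension, taken from the slide; key names are hypothetical.
DIMENSIONS = {
    "data_variety": {"relational", "text", "array", "graph"},
    "data_volume": {"small", "medium", "large"},
    "data_velocity": {"at rest", "slow", "fast"},
    "query_variety": {"micro", "macro"},
    "query_velocity": {"exploratory", "continuous"},
}


def validate(spec):
    """Return a list of problems found in a spec dict (empty list if valid)."""
    problems = []
    for key, allowed in DIMENSIONS.items():
        if key not in spec:
            problems.append(f"missing {key}")
            continue
        value = spec[key]
        # A dimension may hold one value or several (e.g. relational + text).
        values = value if isinstance(value, (list, set, tuple)) else [value]
        for v in values:
            if v not in allowed:
                problems.append(f"{key}: unknown value {v!r}")
    # Assumption: query volume is reduced to a count of concurrent streams.
    if int(spec.get("query_concurrency", 1)) < 1:
        problems.append("query_concurrency must be >= 1")
    return problems


good_spec = {
    "data_variety": ["relational", "text"],
    "data_volume": "large",
    "data_velocity": "at rest",
    "query_variety": "macro",
    "query_velocity": "exploratory",
    "query_concurrency": 4,
}
bad_spec = dict(good_spec, data_volume="huge")

print(validate(good_spec))
print(validate(bad_spec))
```

Returning a list of problems rather than raising on the first one lets a user see every issue in a specification at once.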

Benchmark Generation. [Diagram: Benchmark Generator]

Application Domain Modeled Currently: e-commerce sales, promotions, recommendations; social media sentiment & influence.

Application Domain Modeled Currently. Entities shown: Item, Customer, Web_sales, Promotion, Tweets, Relationships.

Application Domain Modeled Currently. Entities shown: Item, Web_sales, Promotion.

Application Domain Modeled Currently

Benchmark Generation. [Diagram: Benchmark Generator]

Use Case I: Exploratory BI. Large volumes of relational data. Mostly aggregation and few joins. Can Spark's performance match that of an MPP DB?

Use Case II: Complex BI. Large volumes of relational data. Even larger volumes of text data. Combined analytics.

Use Case III: Dashboards. Large volume and velocity of relational and text data. Continuously updated dashboards.

Use Case IV: Does One Size Fit All? A growing set of applications has to process relational, text, and graph data. Compose best-of-breed systems, or use a one-size-fits-all system?

Use Case V: Multi-tenancy and SLAs. Big data deployments are increasingly multi-tenant and need to meet SLAs.

Working with the Community. First release of BigFrame planned for August 2013. With feedback from benchmark developers (BigBench). Open-source with extensibility APIs. Benchmark Drivers for more systems. Utilities (accessed through the Benchmark Driver to drill down into system behavior during benchmarking). Instantiate the BigFrame pipeline for more app domains.

"Benchmarks shape a field (for better or worse) …" -- David Patterson, Univ. of California, Berkeley. Benchmarks meet different needs for different people: end customers, application developers, system designers, system administrators, researchers, CIOs. BigFrame helps users generate benchmarks that best meet their needs.