Performance and Energy Efficiency Evaluation of Big Data Systems Presented by Yingjie Shi Institute of Computing Technology, CAS 2013-10-31.

Presentation transcript:

Performance and Energy Efficiency Evaluation of Big Data Systems Presented by Yingjie Shi Institute of Computing Technology, CAS

BPOE 2013 | HPCChina 2013
Goals of Big Data Systems: Larger, Greener, Faster

Performance vs. Energy Efficiency
Performance (Faster & More Powerful):
– More servers
– Bigger clusters
– Powerful processors
– Sophisticated processing algorithms
– …
Energy Efficiency (Greener & Cheaper):
– Lightweight servers
– Efficient processors
– Simpler processing algorithms
– …
Tradeoff Evaluation

Evaluation of Performance & Energy Efficiency Tradeoff
– How to measure? AxPUE: Application Level Metrics for Power Usage Effectiveness in Big Data Systems
– How to get balance? The Implications from Benchmarking Three Big Data Systems

Motivation
"If you can not measure it, you can not improve it." – Lord Kelvin
PUE (Power Usage Effectiveness): a measure of how efficiently a computer data center uses its power; specifically, how much of the power is actually used by the information technology equipment.
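To make the PUE definition concrete, here is a minimal sketch. It assumes the commonly used Green Grid form, total facility energy divided by IT equipment energy; the function name and the meter readings are hypothetical and do not come from the presentation.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power usage effectiveness: total facility energy / IT equipment energy.

    Assumes both readings cover the same time window.
    """
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


if __name__ == "__main__":
    # Hypothetical monthly meter readings, for illustration only.
    facility_kwh = 1500.0
    it_kwh = 1000.0
    print("PUE:", pue(facility_kwh, it_kwh))
    # Share of facility energy that actually reaches the IT equipment.
    print("IT share:", it_kwh / facility_kwh)
```

A value closer to 1 means less energy is spent on cooling, power distribution, and other overhead. The ratio says nothing about what the IT equipment does with the energy it receives.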

PUE & Its Variants
Metric | Time | Organization | Computing Formulas
PUE | 2007 | GreenGrid |
DCiE | 2008 | GreenGrid |
DCeP | 2008 | GreenGrid |
pPUE | 2012 | GreenGrid |
PUE Scalability | 2013 | GreenGrid |

Motivation: Scenario 1
A data management researcher has an improved data classification algorithm. Does it contribute to greening the data centers?
– Run the algorithms in the data center
– Compare the PUEs
– No obvious variations!
PUE cannot measure the effectiveness of changes made on top of the data center infrastructure!

Motivation: Scenario 2
Data center administrators:
– Give a budget plan for the data center's energy consumption in the next year
– Estimate the data volume based on business development
– How to estimate the increase in energy consumption?
PUE provides little reference information for planning a data center according to data scale and application complexity.

Calculation Framework
[Figure: calculation framework of PUE and AxPUE]

Definition - ApPUE
ApPUE (Application Performance Power Usage Effectiveness): a metric that measures the power usage effectiveness of IT equipment; specifically, how much of the power entering the IT equipment is used to improve application performance.
Computation formula:
$\mathrm{ApPUE} = \dfrac{\text{Application Performance}}{\text{IT Equipment Power}}$
where Application Performance is the data processing performance of applications and IT Equipment Power is the average rate of IT equipment energy consumed.

Definition - AoPUE
AoPUE (Application Overall Power Usage Effectiveness): a metric that measures the power usage effectiveness of the overall data center system; specifically, how much of the total facility power is used to improve application performance.
Computation formula:
$\mathrm{AoPUE} = \dfrac{\text{Application Performance}}{\text{Total Facility Power}}$
where Application Performance is the data processing performance of applications and Total Facility Power is the average rate of total facility energy used.
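The two definitions can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' tooling: the `Run` record, its field names, and the sample numbers are hypothetical, and power is taken as energy divided by the run duration ("the average rate of energy consumed").

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One measured workload run (all field names are hypothetical)."""
    work_done_mb: float       # application-level work, e.g. MB of data processed
    duration_s: float         # wall-clock run time in seconds
    it_energy_j: float        # energy drawn by the IT equipment during the run
    facility_energy_j: float  # energy drawn by the whole facility during the run


def performance(run: Run) -> float:
    """Application performance: work completed per unit time."""
    return run.work_done_mb / run.duration_s


def ap_pue(run: Run) -> float:
    """ApPUE: application performance / average IT equipment power."""
    it_power_w = run.it_energy_j / run.duration_s
    return performance(run) / it_power_w


def ao_pue(run: Run) -> float:
    """AoPUE: application performance / average total facility power."""
    facility_power_w = run.facility_energy_j / run.duration_s
    return performance(run) / facility_power_w


if __name__ == "__main__":
    # Made-up measurements, for illustration only.
    run = Run(work_done_mb=10240.0, duration_s=100.0,
              it_energy_j=20000.0, facility_energy_j=30000.0)
    print("ApPUE (MB/s per W):", ap_pue(run))
    print("AoPUE (MB/s per W):", ao_pue(run))
```

If PUE is taken as total facility energy over IT equipment energy, the two metrics are related by AoPUE = ApPUE / PUE. An algorithmic improvement that leaves PUE unchanged therefore still shows up in both application-level values.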

Acquisition – Application Performance
Application Category | Examples | Metric
Service Application | Search engine, ad-hoc queries | Number of requests answered in unit time
Data Analysis Application | Data mining, reporting, decision support, log analysis | Volume of data processed in unit time
Interactive Real-time Application | E-commerce, profile data management | Number of transactions completed in unit time
High Performance Computing | Scientific computing | Number of floating-point operations in unit time

Acquisition – Benchmark
Requirements of Benchmarks
– Provide representative workloads for big data applications
– Provide a scalable data generation tool
BigDataBench
– A big data benchmark suite, recently open-sourced and publicly available
– All the requirements are well fulfilled

Experiment Overview
Testbed
– Data center of 18 racks, 362 servers
– Sample 8 servers
Workloads
Two experiments
– Different Applications
– Different Implementation Algorithms

Experiments on Different Applications
[Figure panels, labeled BigDataBench: SVM, Sort, Grep, Linpack]

Experiments on Different Algorithms
Two Implementations for Sort
– Several reducers with random sampling partitioning
– One reducer without partitioning
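The slide does not show how the two Sort implementations are written. As a rough single-process illustration of what "several reducers with random sampling partitioning" usually means, the sketch below samples the keys, derives split points, routes each key to a range bucket (one per simulated reducer), and sorts each bucket independently. All function names and parameters are hypothetical; this is not Hadoop code.

```python
import bisect
import random


def choose_split_points(keys, num_reducers, sample_size, rng):
    """Pick num_reducers - 1 split points from a sorted random sample of keys."""
    sample = sorted(rng.sample(keys, min(sample_size, len(keys))))
    if not sample:
        return []
    return [sample[(i * len(sample)) // num_reducers]
            for i in range(1, num_reducers)]


def partition(keys, split_points):
    """Route each key to the bucket (simulated reducer) covering its key range."""
    buckets = [[] for _ in range(len(split_points) + 1)]
    for key in keys:
        buckets[bisect.bisect_right(split_points, key)].append(key)
    return buckets


def sampled_sort(keys, num_reducers=4, sample_size=100, seed=0):
    """Sort by range-partitioning on sampled split points, then sorting each bucket."""
    rng = random.Random(seed)
    split_points = choose_split_points(keys, num_reducers, sample_size, rng)
    result = []
    for bucket in partition(keys, split_points):
        # Buckets cover disjoint, increasing key ranges, so the concatenation
        # of the individually sorted buckets is globally sorted.
        result.extend(sorted(bucket))
    return result


if __name__ == "__main__":
    data = [random.Random(1).randint(0, 1000) for _ in range(20)]
    print(sampled_sort(data, num_reducers=3, sample_size=10))
```

Calling `sampled_sort(data, num_reducers=1)` degenerates to the second implementation on the slide: one bucket, no partitioning.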

Conclusions
– We analyze the requirements of application-level energy effectiveness metrics (AxPUE) in data centers.
– We propose two novel application-level metrics, ApPUE and AoPUE, to measure the energy consumed to improve application performance.
– The experimental results show that AxPUE can provide meaningful guidance for data center design and optimization.

Evaluation of Performance & Energy Efficiency Tradeoff
– How to measure? AxPUE: Application Level Metrics for Power Usage Effectiveness in Data Centers
– How to get balance? The Implications from Benchmarking Three Big Data Systems

New Solutions ……

Users’ Concerns
Diverse big data systems under different applications and data volumes:
– How is the performance?
– How is the energy consumption?
– What are the differences between them?
Approach:
– Evaluating three representative big data systems using BigDataBench
– Comparing two of them in terms of performance and energy efficiency
– Analyzing the running features of the three big data systems

Experimental Platforms
Xeon (common processor), Atom (low-power processor), Tilera (many-core processor)

Basic Information
Property | Xeon | Atom | Tilera
CPU Type | Intel Xeon E5310 | Intel Atom D510 | Tilera TilePro36
CPU Core | 4, 1.6GHz | … GHz | … MHz
L1 I/D Cache | 32KB | 24KB | 16KB/8KB
L2 Cache | 4096KB | 512KB | 64KB

Brief Comparison
Processor | OoO Execution | FPU | Connection Mode | Buffer Sharing | TDP | Hyper Threading
Xeon E5310 | Yes | Yes | BUS | No | 80W | No
Atom D510 | No | Yes | BUS | No | 13W | Yes
TilePro36 | Yes | No | IMESH | Yes | 16W | No

Hadoop Cluster Information
Setting | Xeon vs. Atom | Xeon vs. Tilera
Master/Slaves | 1/7 | 1/7 and 1/1
Comparison | Same number of hardware threads | Same number of cores
Hadoop setting | Following the guidance on the official Hadoop website | Following the guidance on the official Hadoop website

Benchmark Selection
BigDataBench
– A big data benchmark suite from big data applications
– Representative applications
– An innovative data generation tool

Metrics
Performance: data processed per second (DPS)
$\mathrm{DPS} = \dfrac{\text{Data Input Size}}{\text{Run Time}}$
Energy Efficiency: Application Performance Power Usage Effectiveness, expressed as data processed per joule (DPJ)
$\mathrm{DPJ} = \dfrac{\text{Data Input Size}}{\text{Energy Consumption}}$
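Both metrics are simple ratios, as the sketch below shows. The helper that turns power-meter samples into energy is an assumption for illustration, since the slides do not say how energy consumption was obtained. All names and numbers are hypothetical.

```python
def energy_from_power_samples(samples_w, interval_s):
    """Approximate energy in joules from evenly spaced power readings in watts."""
    return sum(samples_w) * interval_s


def dps(data_input_mb, run_time_s):
    """Data processed per second: data input size / run time."""
    return data_input_mb / run_time_s


def dpj(data_input_mb, energy_j):
    """Data processed per joule: data input size / energy consumption."""
    return data_input_mb / energy_j


if __name__ == "__main__":
    # Made-up run: 1 GB of input, four power readings taken 32 s apart.
    data_input_mb = 1024.0
    samples_w = [60.0, 64.0, 68.0, 64.0]
    interval_s = 32.0

    run_time_s = len(samples_w) * interval_s
    energy_j = energy_from_power_samples(samples_w, interval_s)

    print("DPS (MB/s):", dps(data_input_mb, run_time_s))
    print("DPJ (MB/J):", dpj(data_input_mb, energy_j))
```

Because DPJ equals DPS divided by average power, a platform with lower DPS can still win on DPJ if its power draw is proportionally lower. That is the tradeoff the Xeon, Atom, and Tilera comparisons explore.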

General Observations
[Figure: DPS and DPJ on Xeon, Atom, and Tilera]

General Observations
– Data scale has a significant impact on the performance and energy efficiency of big data systems.
– The performance and energy efficiency trends of different applications are diverse.
[Figure panels: Xeon, Atom, Tilera]

Xeon vs. Atom – DPS

Xeon vs. Atom – DPJ

Xeon vs. Atom – DPS & DPJ
[Table: DPS and DPJ for Sort, Wordcount, Grep, Naïve Bayes, and SVM at data sizes of 500MB, 1GB, 10GB, 25GB, 50GB, and 100GB]
– Xeon is more powerful than Atom in processing capacity.
– Atom is more energy-saving than Xeon when dealing with applications that have simple computation logic.

Xeon vs. Atom – Speedup
Atom doesn’t show an energy advantage when dealing with complex applications.

Xeon vs. Atom – Summary
– Xeon is more powerful than Atom in processing capacity.
– Atom is more energy-efficient than Xeon when dealing with applications that have simple computation logic.
– Atom doesn’t show an energy advantage when dealing with complex applications.

Xeon vs. Tilera – DPS

Xeon vs. Tilera – DPJ

Xeon vs. Tilera – DPS & DPJ
[Table: DPS and DPJ for Sort, Wordcount, Grep, and Naïve Bayes at data sizes of 500MB, 1GB, 10GB, and 25GB]
– Xeon is more powerful than Tilera in processing capacity.
– Tilera is more energy-saving than Xeon when dealing with I/O intensive applications and applications that have simple computation logic.
– Tilera doesn’t show an energy advantage when dealing with complex applications.

Xeon vs. Tilera
[Figure panels: The DPS of Xeon, The DPS of Atom, The DPS of Tilera]

Xeon vs. Tilera
[Figure: The DPS of Tilera]
Tilera is more suitable for processing I/O intensive applications.

Xeon vs. Tilera – Summary
– Xeon is more powerful than Tilera in processing capacity.
– Tilera is more energy-efficient than Xeon when dealing with I/O intensive applications and applications that have simple computation logic.
– Tilera doesn’t show an energy advantage when dealing with complex applications.
– Tilera is more suitable for processing I/O intensive applications.

Implications
– The performance of a big data system depends not only on the hardware itself, but also on the application type and data volume of the workloads.
– Weak processors aren’t suitable for complex applications. Even though they have a lower TDP, they don’t show an energy cost advantage.

Implications (Cont.)
– Xeon generally has better processing capacity, accompanied by high energy consumption, especially for some light scale-out applications.
– Atom and Tilera show an energy consumption advantage when dealing with light scale-out applications.
– Tilera shows an energy advantage when processing I/O intensive applications.
