Forecasting with Cyber-physical Interactions in Data Centers (part 3)

Presentation transcript:

Forecasting with Cyber-physical Interactions in Data Centers (part 3) Lei Li leili@cs.cmu.edu 9/28/2011 PDL Seminar

Big Picture: Predictive AC Control and Server Management. Diagram components: server/workload management; computing energy model; sensor measurement; temperature prediction; cooling energy model; CRAC control. (c) Lei Li 2012

Outline: Overview of time series mining; Time series examples; What problems do we solve; Motivation; Experimental setup; ThermoCast: the forecasting model; Results; Other time series models and algorithms.

Experimental setup: Tested in a JHU data center with 171 1U servers, instrumented with a network of 80 sensors.

Sample measurements

Observations: The temperature-difference cycle (between the max and min temperatures on the same rack) is in anti-phase with the air-velocity cycle. The middle and bottom sections are coldest; the top is hottest. Shutting down under-utilized servers could reduce energy consumption.

What happens when servers are shut down? (Figure annotation: shut down.)

ThermoCast [Li et al., KDD 2011]. Given: intake temperatures, outtake temperatures, workload for each server, and floor air speed. Goal: forecast the temperature distribution and support thermal-aware placement of workload. Approach: a zonal forecasting model that divides the machine room into zones and each rack into sections.

Assumptions. A0: air is incompressible; A1: environmental temperature is constant; A2: supply air temperature is constant within a period; A3: server fan speed is constant; A4: vertical air flow at the outtake is negligible; A5: vertical air flow at the intake is linear in height.

Sensor measurements & air interactions

ThermoCast

ThermoCast Model: relates outlet temperature, inlet temperature, and floor air speed (equations not preserved in the transcript). Derived from fluid dynamics and thermodynamics together with the stated assumptions [Li et al., KDD 2011].

Parameter Learning: a constrained optimization (objective and "s.t." constraints not preserved in the transcript).

ThermoCast Results. Q1: How accurately can a server learn its local thermal dynamics for prediction? 2x better (ThermoCast vs. AR), using 90 minutes of data for training and predicting 5 minutes ahead. (Figure labels: AR; ThermoCast; 75%; 100%; shutdown.)
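To make the AR baseline named on this slide concrete, here is a minimal sketch of a least-squares autoregressive forecaster that trains on 90 samples and predicts 5 steps ahead. The function names, the one-sample-per-minute rate, the model order of 2, and the synthetic sinusoidal temperature trace are all assumptions for illustration; the slide does not specify how the baseline was configured, and this is not the ThermoCast model itself.

```python
import numpy as np

def fit_ar(series, order):
    """Least-squares fit of x[t] = c + sum_k a[k] * x[t-1-k]."""
    x = np.asarray(series, dtype=float)
    # Row for time t holds the `order` preceding values, most recent first.
    rows = [x[t - order:t][::-1] for t in range(order, len(x))]
    design = np.column_stack([np.ones(len(rows)), np.array(rows)])
    coeffs, *_ = np.linalg.lstsq(design, x[order:], rcond=None)
    return coeffs  # [c, a_1, ..., a_order]

def forecast_ar(series, coeffs, steps):
    """Iterate the fitted recurrence `steps` samples past the end of `series`."""
    order = len(coeffs) - 1
    history = list(np.asarray(series, dtype=float)[-order:])
    out = []
    for _ in range(steps):
        recent = history[-order:][::-1]  # most recent value first
        nxt = coeffs[0] + float(np.dot(coeffs[1:], recent))
        out.append(nxt)
        history.append(nxt)
    return np.array(out)

# Hypothetical trace: one sample per minute, 30-minute cooling cycle.
t = np.arange(95)
temps = 20.0 + 2.0 * np.sin(2 * np.pi * t / 30)
train, truth = temps[:90], temps[90:]

coeffs = fit_ar(train, order=2)
pred = forecast_ar(train, coeffs, steps=5)
print(np.abs(pred - truth).max())
```

A noise-free sinusoid is exactly representable by an order-2 recurrence, so the sketch recovers it almost perfectly; real sensor traces with workload changes and shutdowns are where a purely autoregressive model would be expected to struggle.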

ThermoCast Results. Q2: How far ahead can ThermoCast forecast thermal alarms? Slide annotation: "2x faster".

          Baseline   ThermoCast
Recall    62.8%      71.4%
FAR       45%        43.1%
MAT       2.3 min    4.2 min

FAR = false alarm rate; MAT = mean look-ahead time.
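The three metrics on this slide can be computed from a list of actual alarm events and a list of raised alarms. The sketch below is one plausible set of definitions, not necessarily the ones used in the talk: a raised alarm counts as a hit if an actual event follows it within `horizon` minutes, FAR is taken as the fraction of raised alarms with no such event, and MAT averages the earliest warning lead over detected events. The function name, the matching rule, and the sample timestamps are all hypothetical.

```python
def alarm_metrics(event_times, alarm_times, horizon):
    """Return (recall, false_alarm_rate, mean_look_ahead_time).

    Assumed matching rule: alarm `a` predicts event `e` if 0 < e - a <= horizon.
    """
    detected_leads = []
    for e in event_times:
        leads = [e - a for a in alarm_times if 0 < e - a <= horizon]
        if leads:
            detected_leads.append(max(leads))  # earliest warning for this event
    false_alarms = [
        a for a in alarm_times
        if not any(0 < e - a <= horizon for e in event_times)
    ]
    recall = len(detected_leads) / len(event_times) if event_times else 0.0
    far = len(false_alarms) / len(alarm_times) if alarm_times else 0.0
    mat = sum(detected_leads) / len(detected_leads) if detected_leads else 0.0
    return recall, far, mat

# Hypothetical timestamps in minutes.
events = [10, 30, 50]
alarms = [6, 8, 27, 40]
recall, far, mat = alarm_metrics(events, alarms, horizon=5)
print(recall, far, mat)  # 2 of 3 events detected, 1 of 4 alarms false, 3.5 min lead
```

Under these definitions a forecaster improves by raising recall and MAT while lowering FAR, which is the direction of every row in the table above.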

Implication on Capacity Gain. Preliminary results comparing workload placement strategies, with a 5-minute forecast length. With the same cooling: inlet temperature with ThermoCast: 13.75 C; inlet temperature with static profiling: 16.5 C. Assuming the servers consume 200 W on average (Dell PowerEdge 1950), we gain an extra 26% computing power with the same cooling.

Contributions and Impact. Predictability: a hybrid approach that integrates thermodynamics and sensor data. Scalable learning/training thanks to the zonal thermal model. Real data and instrumentation in a data center with a practical workload. Projected impact: can handle an extra 26% workload (e.g. PUE 1.5 to PUE 1.4).
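The PUE figures are consistent with the 26% gain under a simple reading, sketched below. PUE is total facility power divided by IT power. The assumption here, which the slide does not state, is that all non-IT overhead (cooling included) stays fixed while the IT load grows by 26%; the normalized power values are illustrative.

```python
def pue(it_power, overhead_power):
    """Power usage effectiveness: total facility power over IT power."""
    return (it_power + overhead_power) / it_power

# Normalize the original IT load to 1.0; PUE 1.5 then implies overhead of 0.5.
baseline = pue(1.0, 0.5)
# Assumed scenario: 26% more IT load served with the same overhead.
scaled = pue(1.26, 0.5)

print(baseline, round(scaled, 3))  # 1.5 and about 1.397
```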

DynaMMo: imputation/forecasting. Figure: readings from sensor 1 through sensor m over time, with a blackout period. Goal: recover the missing values. Details in [Li et al., KDD 2009].

DynaMMo result. Plot: reconstruction error versus average length of successive missing values (longer gaps are harder), on CMU Mocap #16 (mocap.cs.cmu.edu). Methods compared: Spline, MSVD [Srebro'03], Linear Interpolation, and DynaMMo; DynaMMo performs better, closer to the ideal. Why is there a drop at 100? Each point is the average of 10 repeats with randomly placed missing values, so there is variance. More results in [Li et al., KDD 2009].
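For reference, the linear interpolation baseline from this comparison can be sketched in a few lines. This is only the simple baseline, not DynaMMo; the function name and the convention of marking missing readings with NaN are assumptions for illustration.

```python
import numpy as np

def interpolate_missing(values):
    """Fill NaN gaps by linear interpolation between observed neighbors.

    Gaps at either end are filled with the nearest observed value,
    which is how np.interp treats points outside the known range.
    """
    x = np.asarray(values, dtype=float)
    missing = np.isnan(x)
    if missing.all():
        raise ValueError("cannot interpolate a series with no observations")
    idx = np.arange(len(x))
    filled = x.copy()
    filled[missing] = np.interp(idx[missing], idx[~missing], x[~missing])
    return filled

# Hypothetical sensor trace with a two-sample blackout and a missing tail.
trace = [1.0, np.nan, np.nan, 4.0, np.nan]
print(interpolate_missing(trace))  # [1. 2. 3. 4. 4.]
```

Because each sensor is filled independently from its own neighbors, this baseline cannot use correlations across sensors, which is one reason it degrades as the average gap length grows.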

PLiF and CLDS for clustering. BGP data: hierarchical clustering + PLiF features. Details in [Li et al., VLDB 2010] and [Li & Prakash, ICML 2011].

CLDS: Clustering Mocap Data. CLDS with two features: accuracy = 93.9%. PCA with top 2 components: accuracy = 51.0%. Legend: walking motion; running motion.

WindMine. Goal: find patterns and anomalies in user-click streams.

Discoveries by WindMine. Figure panels: job website; weather; kids; health.

Conclusion: Time series mining has many applications. Data centers consume a large amount of energy, and cooling accounts for much of the cost. Sensor networks are useful for data center monitoring. ThermoCast: the forecasting model. Other time series models and algorithms: DynaMMo for imputation; PLiF and CLDS for clustering; WindMine for web clicks.

References
Lei Li, et al. ThermoCast: A Cyber-Physical Forecasting Model for Data Centers. KDD 2011.
Lei Li, et al. Time Series Clustering: Complex is Simpler. ICML 2011.
Yasushi Sakurai, Lei Li, et al. WindMine: Fast and Effective Mining of Web-click Sequences. SDM 2011.
Lei Li, et al. Parsimonious Linear Fingerprinting for Time Series. VLDB 2010.
Lei Li, et al. DynaMMo: Mining and Summarization of Coevolving Sequences with Missing Values. KDD 2009.

Thanks! contact: Lei Li (leili@cs.cmu.edu) papers, software, datasets on http://www.cs.cmu.edu/~leili