Carnegie Mellon School of Computer Science Forecasting with Cyber-physical Interactions in Data Centers Lei Li PDL Seminar 9/28/2011.

Presentation transcript:

Outline
Overview of time series mining
–Time series examples
–What problems do we solve
Motivation
Experimental setup
ThermoCast: the forecasting model
Results
Other time series models and algorithms
(c) Lei Li 2012

What are co-evolving time series?
Correlated multidimensional time sequences with joint temporal dynamics

Motion Capture
Goal: generate natural human motion
–Game ($57B)
–Movie industry
Challenge:
–Missing values
–“naturalness”
[Figure: right hand and left hand, walking motion] [Li et al 2008a]

Environmental Monitoring
Problem: early detection of leakage & pollution
Challenge: noise & large data
[Figure: chlorine level in drinking water systems] [Li et al 2009]

Network Security
Challenge: anomaly detection in computer networks & online activity
[Figures: BGP # updates on backbone; Webclick for news from NTT; Webclick for TV]

Time Series Mining Problems
Forecasting
Imputation (missing values)
Compression
Segmentation, change/anomaly detection
Clustering
Similarity queries
Scalable/Parallel/Distributed algorithms
See my thesis for algorithms covering these problems

Datacenter Monitoring & Management
Temperature in datacenter
Goal: save energy in data centers
–US alone, $7.4B power consumption (2011)
Challenge:
–Huge data (1TB per day)
–Complex cyber-physical systems

Typical Data Center Energy Consumption
[Figures: LBL data center; Google data center] [Barroso 09] [LBNL/PUB-945]

Towards Thermal-Aware DC Management
Data centers are often over-provisioned, with ≈40% of energy spent on cooling (total = $7.4B).
How can we improve energy efficiency in modern multi-megawatt data centers?
[Figure: JHU data center with Genomote]

Air cycle in DC

Possible Ways of Saving Cooling and Computing Cost
Challenges:
–airflow interaction, spatial placement, SLA, …
Possible direction:
–Shut down unused machines according to workload
[Figure: example MSN workload]

Towards Data-Driven AC Control and Server Management
Reactive energy saving:
–slow down cooling fan in CRAC
–raise AC temperature set points
Proactive data center management:
–predicting temperature distribution and thermal-aware placement of workload
Conditions:
–supply air temperature < threshold
–max(active inlet air temperature) < threshold
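
The two threshold conditions on this slide can be read as a safety check that any energy-saving action has to pass. A minimal sketch, assuming temperatures in °C and a per-server on/off flag; the function name, argument names, and limit values are hypothetical and not taken from the talk:

```python
def within_thermal_limits(supply_temp_c, inlet_temps_c, active,
                          supply_limit_c, inlet_limit_c):
    """Check the two slide conditions:
    supply air temperature < threshold, and
    max(active inlet air temperature) < threshold.
    `active` flags which servers are powered on (assumed input)."""
    if not supply_temp_c < supply_limit_c:
        return False
    # Only inlets of active servers count; powered-off servers are ignored.
    active_inlets = [t for t, on in zip(inlet_temps_c, active) if on]
    if not active_inlets:
        return True  # no active server constrains the inlet temperature
    return max(active_inlets) < inlet_limit_c
```

For example, a hot inlet on a powered-off server does not violate the check, while the same reading on an active server does.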

Big Picture: Predictive AC Control and Server Management
[Diagram components: temperature prediction; sensor measuring; server/workload management; cooling energy model; computing energy model; CRAC control]

Experimental setup
Tested in JHU data center with 171 1U servers, instrumented with a network of 80 sensors

Sample measurements

Observations
Temperature difference cycle (max/min temp. on the same rack) is in anti-phase with air velocity cycle.
Middle and bottom sections are coldest; top is hottest.
Shutting down under-utilized servers could reduce energy consumption.

What happens when shutting down servers?
[Figure annotation: shut down]

ThermoCast [Li et al, KDD 2011]
Given: intake temperatures, outtake temperatures, workload for each server, and floor air speed
Goal: forecasting temperature distribution and thermal-aware placement of workload
Approach: a zonal forecasting model
–divide the machine room into zones, and each rack into sections.

Assumptions
A0: incompressible air
A1: environmental temperature is constant
A2: supply air temperature is constant within a period
A3: constant server fan speed
A4: vertical air flow at the outtake is negligible
A5: vertical air flow at the intake is linear in height

Sensor measurements & Air interactions

ThermoCast

ThermoCast Model
[Model equations; labeled terms: floor air speed, inlet temp, outlet temp]
Derived from fluid dynamics and thermodynamics together with the assumptions [Li et al, KDD 2011]

Parameter Learning
[Equation: … s.t. …]

ThermoCast Results
Q1: How accurately can a server learn its local thermal dynamics for prediction?
2x better, using 90 minutes of training data and predicting 5 minutes ahead
[Figure: AR vs. ThermoCast; 75% → 100% → shutdown]
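
The AR baseline named on this slide can be sketched as an ordinary least-squares autoregressive fit followed by iterated one-step predictions. This is only an illustration of a generic AR(p) forecaster, not the configuration used in the talk: the model order, the sampling interval, and the function names `fit_ar` and `forecast_ar` are assumptions.

```python
import numpy as np

def fit_ar(series, order):
    """Fit x[t] = c + a1*x[t-1] + ... + ap*x[t-p] by least squares.
    Returns [c, a1, ..., ap]."""
    x = np.asarray(series, dtype=float)
    # Each row holds the `order` previous values, most recent first.
    lags = np.array([x[t - order:t][::-1] for t in range(order, len(x))])
    design = np.column_stack([np.ones(len(lags)), lags])
    coef, *_ = np.linalg.lstsq(design, x[order:], rcond=None)
    return coef

def forecast_ar(series, coef, steps):
    """Iterate the fitted model `steps` samples past the end of `series`."""
    order = len(coef) - 1
    history = list(np.asarray(series, dtype=float)[-order:])
    for _ in range(steps):
        recent = history[-order:][::-1]  # most recent value first
        history.append(coef[0] + float(np.dot(coef[1:], recent)))
    return history[order:]
```

With one sample per minute (an assumed rate), `forecast_ar(train[-90:], fit_ar(train[-90:], p), 5)` would correspond to the slide's setting of 90 minutes of training and a 5-minute horizon.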

ThermoCast Results
Q2: How long ahead can ThermoCast forecast thermal alarms? 2x faster

          Baseline    ThermoCast
Recall    62.8%       71.4%
FAR       45%         43.1%
MAT       2.3 min     4.2 min

FAR = false alarm rate; MAT = mean look-ahead time
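
The three metrics in this table can be computed from lists of true and predicted alarm times. The slide does not give exact definitions, so the sketch below assumes that FAR is the fraction of raised alarms that precede no true event within a matching window, and that MAT averages the lead time of the earliest matching alarm over detected events. The function name and the `max_lead` window are hypothetical.

```python
def alarm_metrics(event_times, alarm_times, max_lead):
    """Return (recall, false_alarm_rate, mean_lookahead_time).
    Times are in minutes. An alarm matches an event if it is raised
    at most `max_lead` minutes before the event (assumed definition)."""
    leads = []
    for event in event_times:
        hits = [a for a in alarm_times if 0 <= event - a <= max_lead]
        if hits:
            leads.append(event - min(hits))  # earliest matching alarm
    false_alarms = [a for a in alarm_times
                    if not any(0 <= e - a <= max_lead for e in event_times)]
    recall = len(leads) / len(event_times) if event_times else 0.0
    far = len(false_alarms) / len(alarm_times) if alarm_times else 0.0
    mat = sum(leads) / len(leads) if leads else 0.0
    return recall, far, mat
```

Under these definitions a forecaster can trade recall against FAR by alarming more or less eagerly, which is why the table reports both alongside the look-ahead time.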

Implication on Capacity Gain
Preliminary results comparing workload placement strategies:
–5 minutes forecast length
–With the same cooling:
Inlet temp with ThermoCast: [value missing] °C
Inlet temp with static profiling: 16.5 °C
Assuming the servers consume 200 W on average (Dell PowerEdge 1950), we gain an extra 26% computing power with the same cooling

Contributions and Impact
Predictability: a hybrid approach that integrates thermodynamics and sensor data
Scalable learning/training thanks to the zonal thermal model
Real data and instrumentation in a data center with practical workload
Projected impact: can handle an extra 26% workload (e.g. PUE 1.5 → PUE 1.4)

DynaMMo: imputation/forecasting
[Figure: sensor 1, sensor 2, …, sensor m over time, with a blackout]
Goal: recover the missing values
Details in [Li et al, KDD 2009]

DynaMMo result
[Figure: reconstruction error vs. average missing length; methods: our DynaMMo, MSVD [Srebro’03], linear interpolation, spline; annotations: ideal, better, harder]
Dataset: CMU Mocap #16, mocap.cs.cmu.edu
More results in [Li et al, KDD 2009]
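
For reference, the linear interpolation baseline listed among the compared methods can be sketched in a few lines. This is the simple baseline only, not DynaMMo itself; it assumes missing samples are marked as NaN and that each sensor is filled independently, and the function name is hypothetical.

```python
import numpy as np

def interpolate_missing(values):
    """Fill NaN gaps in one sensor's sequence by linear interpolation.
    Gaps at either end are filled with the nearest observed value."""
    x = np.asarray(values, dtype=float)
    idx = np.arange(len(x))
    known = ~np.isnan(x)
    # np.interp evaluates the piecewise-linear curve through known points.
    return np.interp(idx, idx[known], x[known])
```

Because each sensor is filled on its own, this baseline ignores correlations across sensors, which is the information a co-evolving model can exploit during a long blackout.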

PLiF and CLDS for clustering
BGP data: hierarchical clustering + PLiF features
Details in [Li et al, VLDB 2010] and [Li & Prakash, ICML 2011]

CLDS Clustering Mocap Data
[Figure panels: PCA top 2 components; CLDS two features; accuracies shown: 93.9% and 51.0%; classes: walking motion, running motion]
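
The PCA comparison on this slide projects each sequence onto its top two principal components. A minimal sketch of that projection via SVD, assuming each row of the input is one sequence's feature vector; how the talk built those vectors from the mocap data is not stated, and the function name is hypothetical:

```python
import numpy as np

def pca_top2(data):
    """Project rows of `data` onto the two leading principal components."""
    x = np.asarray(data, dtype=float)
    centered = x - x.mean(axis=0)
    # Rows of vt are principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T
```

The resulting two-column array can be passed to any clustering routine to reproduce the kind of two-feature scatter plot the slide describes.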

WindMine
Goal: find patterns and anomalies from user-click streams

Discoveries by WindMine
[Figure panels: job website; job website; weather; kids; health]

Conclusion
Time series mining has many applications
Energy consumption in DCs is high, and cooling accounts for a large share of it
Sensor networks find use in data center monitoring
ThermoCast: the forecasting model
Other time series models and algorithms
–DynaMMo for imputation
–PLiF & CLDS for clustering
–WindMine for web clicks

References
Lei Li, et al. ThermoCast: A Cyber-Physical Forecasting Model for Data Centers. KDD 2011.
Lei Li, et al. Time Series Clustering: Complex is Simpler. ICML 2011.
Yasushi Sakurai, Lei Li, et al. WindMine: Fast and Effective Mining of Web-click Sequences. SDM 2011.
Lei Li, et al. Parsimonious Linear Fingerprinting for Time Series. VLDB 2010.
Lei Li, et al. DynaMMo: Mining and Summarization of Coevolving Sequences with Missing Values. ACM KDD 2009.

Thanks! contact: Lei Li papers, software, datasets on