FAWN: A Fast Array of Wimpy Nodes. Authors: David G. Andersen et al. Offence: Jaime Espinosa, Chunjing Xiao.



Why Not FAWN
FAWN's motivations:
–the increasing CPU–I/O gap,
–CPU power consumption grows super-linearly with speed,
–dynamic power scaling on traditional systems is surprisingly inefficient,
–a lot of research on parallel I/O.
But they focus on workloads that are I/O-intensive, not computation-intensive.
Electric cars consume less power too, but why don't you buy one?
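The super-linear power claim can be made concrete with the standard dynamic-power relation P = C·V²·f; since supply voltage must rise roughly in proportion to frequency, power grows roughly as f³. This is a back-of-envelope sketch with made-up constants (`c_nf`, `v_per_ghz`), not measurements from the paper.

```python
# Sketch of super-linear CPU power scaling: P = C * V^2 * f,
# with voltage assumed to scale linearly with frequency.
# All constants are illustrative placeholders.

def dynamic_power(freq_ghz, c_nf=1.0, v_per_ghz=0.4):
    """Return relative dynamic power at a given clock frequency."""
    voltage = v_per_ghz * freq_ghz          # voltage ~ linear in f (assumption)
    return c_nf * voltage ** 2 * freq_ghz   # P = C * V^2 * f

# Doubling the clock multiplies power by ~2^3 = 8 under this model,
# which is the case for many slow "wimpy" cores over one fast core.
ratio = dynamic_power(2.0) / dynamic_power(1.0)
print(f"{ratio:.1f}")  # -> 8.0
```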

Poor Scaling Characteristics
The system includes a number of relatively high-powered front-end systems.
Analysis has shown that for data-intensive workloads, large wimpy-node clusters suffer from poor scale-up,
–because they are more affected by diminishing-returns scale-up than a smaller traditional cluster.*
*Wimpy Node Clusters: What About Non-Wimpy Workloads? (Section 3.5.4, Discussion)

Limitations (1)
FAWN focuses only on read-mostly, simple key-value workloads.
It cannot handle complex processing workloads, and it performs poorly on write-heavy workloads.
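The "simple key-value workload" FAWN targets boils down to a get/put interface. This toy in-memory stand-in (not FAWN's actual API) makes the limitation visible: there is no multi-key atomicity, range scan, or secondary index, so anything beyond single-key lookups falls outside the model.

```python
# Toy key-value store illustrating the interface class FAWN targets.
# This is a hypothetical stand-in, not FAWN-KV's real implementation.

class ToyKVStore:
    def __init__(self):
        self._data = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: bytes):
        # Single-key reads only: no scans, no transactions, no joins.
        return self._data.get(key)

store = ToyKVStore()
store.put(b"user:42", b"alice")
print(store.get(b"user:42"))  # -> b'alice'
```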

Limitations (2)
Works only for small data items and small CPU workloads.
The authors' own conclusions: it is not going to replace the current data center, and it does not work for real-time applications (e.g., gaming).
Does not provide the ACID properties (Atomicity, Consistency, Isolation, Durability) desired in databases.

Reliability Problems
More nodes and hardware components lead to more failures:
–less memory per node than traditional systems,
–so more nodes are required for the same capacity.
Communication, link, and switch failures are not considered.

Flash Problems (Cost)
Why did they only examine a 3-year total cost of ownership (TCO) in Section 5?
Flash storage has a short lifetime:
–Flash is many times more expensive per byte than HDD.*
–The smaller flash cells are less reliable and less durable.**
*, ** Rethinking Flash in the Data Center
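A TCO comparison of the kind Section 5 performs is just capital cost plus electricity over an assumed lifetime, so the chosen lifetime drives the result. This sketch uses entirely hypothetical prices and wattages (not the paper's figures) to show the shape of the calculation.

```python
# Back-of-envelope TCO: capex + energy cost over an assumed lifetime.
# All dollar amounts, wattages, and the $/kWh rate are made up.

KWH_PRICE = 0.10  # $/kWh, assumed

def tco(capex_dollars, watts, years=3):
    """Capital cost plus electricity over the given lifetime."""
    energy_kwh = watts / 1000.0 * years * 365 * 24
    return capex_dollars + energy_kwh * KWH_PRICE

# Hypothetical: one beefy server vs. ten wimpy flash nodes.
traditional = tco(capex_dollars=3000, watts=250)
fawn_array  = tco(capex_dollars=10 * 250, watts=10 * 5)
print(round(traditional), round(fawn_array))  # -> 3657 2631
```

The offence's point is that stretching the horizon past 3 years is where flash's limited write endurance would force replacement costs back into the FAWN column.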

Flash Problems (Size)
The amount of physical space per megabyte is a problem:
–Thermodynamically, it requires more energy (it takes longer to heat a large room than a small one).
–The environmental footprint is proportional to the floor area needed.*
* Rethinking Flash in the Data Center

Flash Problems (Translation Layer)
Through heroic engineering and daunting complexity, the flash translation layer masks these problems, but its performance impact can be significant:
–Intel's Extreme SSDs have a read latency of 85 µs, but the flash chips the drive uses internally have a read latency of just 25 to 35 µs.*
–The flash translation layer is part of the flash controller, embedded in flash chips and drives.
* Rethinking Flash in the Data Center
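Taking the slide's latency figures at face value, the translation layer's cost is simply drive-level latency minus raw chip latency, which works out to more than half of each read:

```python
# FTL overhead implied by the slide's numbers (microseconds).
drive_read_us = 85
chip_read_us = (25, 35)  # raw flash chip read latency range

overhead_us = tuple(drive_read_us - c for c in chip_read_us)
overhead_pct = tuple(round(100 * o / drive_read_us) for o in overhead_us)
print(overhead_us, overhead_pct)  # -> (60, 50) (71, 59)
```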

Race Conditions
Another study* from CMU found that the system leads to race conditions.
*dBug: Systematic Evaluation of Distributed Systems

Conclusion
FAWN is a great system for quickly finding tiny amounts of data, provided you have a lot of real estate and don't mind the high probability of failure.

Thank You