1 S06: Open-Source Stack for Cloud Computing Milind Bhandarkar Yahoo! Michael Ryan Intel Michael Kozuch Intel Richard Gass Intel.

1 S06: Open-Source Stack for Cloud Computing
Milind Bhandarkar (Yahoo!), Michael Ryan (Intel), Michael Kozuch (Intel), Richard Gass (Intel)

2 Agenda
Sessions:
(A) Introduction 8:30-9:00
(B) Hadoop 9:00-10:00
Break 10:00-10:30
Hadoop/Pig 10:30-12:00
Lunch 12:00-1:30
(C) Pig 1:30-2:00
(D) Tashi 2:00-3:00
Break 3:00-3:30
(E) PRS 3:30-4:45
Wrapup 4:45-5:00
I. Speaker intros
II. Motivation
III. Open Cirrus
IV. Open Cirrus software stack
V. Getting involved
Zoni

3 Session A: Introduction

4 Michael Kozuch (Intro)
Michael Kozuch is a Principal Engineer with Intel Labs Pittsburgh and manager of the ILP Systems Research and Engineering group
– Manages the Intel Open Cirrus cluster and is the PI for the Tashi research project
Michael is a 12-year veteran of Intel and contributed to the development of Intel’s VT and TXT technologies
He has published 25+ scientific papers and 20+ patents

5 Milind Bhandarkar (Hadoop)
Lead Yahoo! Grid Solutions Team since June 2005
Contributor to Hadoop since January 2006
Trained 1000+ Hadoop users at Yahoo! & elsewhere
20+ years of experience in Parallel Programming

6 Michael Ryan (Tashi)
Michael is a research engineer with Intel Labs Pittsburgh
Lead developer for Tashi
Serves as sysadmin for the Intel Open Cirrus site
Coordinates the Global Monitoring service for Open Cirrus

7 Richard Gass (Zoni)
Richard is currently a research engineer with Intel Labs Pittsburgh
– Lead developer for Zoni
– Serves as sysadmin for the Intel Open Cirrus site
Richard has published 9+ scientific papers and is also an (imminent) PhD candidate with University Pierre and Marie Curie LIP6 in Paris

8 Motivation

9 Why Open and Cloud makes sense
Cloud Computing is a new, critical technology
– Efficiency: Admin costs aggregated
– Scalability: From 1 to 1000 servers in 10 sec. flat
– Empowerment: Anyone can buy a cluster
Open Communities enable rapid innovation
– Exchange of ideas: Knowledge grows
– Constructive Darwinism: Best tools survive/evolve
– Empowerment: Anyone can build a LAMP stack
Rapidly developing and deploying innovative computing technologies

10 Research Interest: Big Data
Interesting applications are data hungry
The data grows over time
The data is immobile
– 100 TB @ 1 Gbps ~= 10 days
Compute comes to the data
Big Data clusters are the new libraries (Data-Rich Computing theme proposal. J. Campbell, et al., 2007)
The value of a cluster is its data
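The "100 TB @ 1 Gbps ~= 10 days" figure can be checked with a few lines of arithmetic. This is a minimal sketch, assuming decimal units (1 TB = 10^12 bytes, 1 Gbps = 10^9 bits per second) and a fully utilized link with no protocol overhead; the slide states neither assumption, and the function name is illustrative.

```python
def transfer_days(size_bytes: float, link_bits_per_sec: float) -> float:
    """Days needed to move size_bytes over a link, ignoring all overhead."""
    seconds = size_bytes * 8 / link_bits_per_sec  # bytes -> bits, then divide by rate
    return seconds / 86_400                       # 86,400 seconds per day


TB = 10**12    # decimal terabyte (assumption: the slide does not say TB vs. TiB)
GBPS = 10**9   # decimal gigabit per second

days = transfer_days(100 * TB, 1 * GBPS)
print(f"{days:.1f} days")  # prints "9.3 days"
```

Under these idealized assumptions the transfer takes a little over nine days, so the slide's "about 10 days" is consistent; any real-world overhead only lengthens it.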

11 Open Cirrus

12 Open Cirrus* Cloud Computing Testbed
[Map labels: MIMOS*, ETRI*, ISPRAS*, KIT*, UIUC*, IDA*]
Sponsored by HP, Intel, and Yahoo! (with additional support from NSF)
9 sites currently, target of around 20 in the next two years
Collaboration between industry and academia, sharing
– hardware infrastructure
– software infrastructure
– research
– applications and data sets

13 Open Cirrus*
Objectives
– Foster systems research around cloud computing
– Vendor-neutral open-source stacks and APIs for the cloud
– Expose research community to enterprise level requirements
– Provide realistic traces of cloud workloads
How are we unique
– Support for systems research and applications research
– Federation of heterogeneous datacenters
– Collection of interesting data sets
Independently-managed sites… providing a cooperative research testbed

14 User Access to Open Cirrus*
User access is organized around Research Projects
– Led by Principal Investigator (PI)
Project PIs apply to each site separately
– Identifying additional team members
Contact information for applications to each site is available on the Open Cirrus Web site (http://opencirrus.org)
Each Open Cirrus site decides which users and projects get access to its site.

15 Open Cirrus* Research Projects
Example research areas of interest
– Datacenter federation
– Datacenter management
– Web services
– Data-intensive systems
Projects typically not of interest
– Traditional HPC app development
– Production apps looking for “free” cycles
– Closed-source system development

16 Software Stack

17 Open Cirrus* Software Components
[Diagram labels: Compute Node Services, Global Services, Site Services]
[Diagram labels: Single Sign-On, Global Monitoring, Global User Directories, Data Location, Resource Telemetry, Billing/Accounting]

18 Physical Machine Allocation: Zoni
[Diagram labels: Open service research; Tashi development; Proprietary service research; Apps running in a VM mgmt infrastructure (e.g., Tashi, Eucalyptus); Open workload monitoring and trace collection; Production storage service]
Provides each project with a mini-datacenter
Isolation of experiments
Zoni dynamically divides compute nodes into isolated subdomains
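The idea of dividing a pool of compute nodes into isolated per-project subdomains can be illustrated with a toy allocator. This is a hypothetical sketch, not Zoni's API: the function `allocate_domains`, the node names, and the project names are all invented for illustration, and the mechanism that actually enforces isolation is not modeled. The only property shown is that no node ends up in two subdomains.

```python
from typing import Dict, List


def allocate_domains(free_nodes: List[str],
                     requests: Dict[str, int]) -> Dict[str, List[str]]:
    """Give each project a disjoint slice of the free node pool.

    Hypothetical helper for illustration only; raises if the pool is too small.
    """
    if sum(requests.values()) > len(free_nodes):
        raise ValueError("not enough free nodes for all requests")
    domains: Dict[str, List[str]] = {}
    start = 0
    for project, count in requests.items():
        domains[project] = free_nodes[start:start + count]  # next unused slice
        start += count
    return domains


pool = [f"node{i:02d}" for i in range(8)]              # invented node names
domains = allocate_domains(pool, {"tashi-dev": 3, "storage": 2})
print(domains["tashi-dev"])  # ['node00', 'node01', 'node02']
print(domains["storage"])    # ['node03', 'node04']
```

Because each project receives a non-overlapping slice, an experiment in one subdomain never shares a physical machine with another, which is the "mini-datacenter" property the slide describes.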

19 Cluster Storage: HDFS
Storage system aggregating standard devices
– High-performance, parallel access
– High data reliability through replication
Exposing location information enables intelligent placement of computation
[Diagram labels: Node, Storage Service, Node]
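To make "exposing location information enables intelligent placement of computation" concrete, here is a minimal sketch of a locality-aware placement decision. It does not use the HDFS API: the replica map, block IDs, node names, and the `pick_node` function are assumptions made for illustration. The only idea shown is preferring a free node that already stores a replica of the block to be processed.

```python
from typing import Dict, List, Optional, Set


def pick_node(block_id: str,
              replicas: Dict[str, Set[str]],
              free_nodes: List[str]) -> Optional[str]:
    """Prefer a free node that already holds a replica of the block."""
    holders = replicas.get(block_id, set())
    for node in free_nodes:
        if node in holders:
            return node                            # data-local: no network copy needed
    return free_nodes[0] if free_nodes else None   # otherwise fall back to a remote read


# Invented replica map: each block is stored on three nodes.
replicas = {
    "blk_1": {"node-a", "node-b", "node-c"},
    "blk_2": {"node-b", "node-d", "node-e"},
}

print(pick_node("blk_2", replicas, ["node-a", "node-d"]))  # node-d
print(pick_node("blk_1", replicas, ["node-e"]))            # node-e
```

In the first call the task lands on node-d because it holds a replica of blk_2; in the second no free node holds blk_1, so the task runs on node-e and would have to read the data over the network.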

20 Virtual Machine Allocation: Tashi
An open source Apache Software Foundation incubator project
– Infrastructure for cloud computing on Big Data
– http://incubator.apache.org/projects/tashi
– Support for AWS* interface
– OS, FS, and VMM agnostic
Research focus:
– Location-aware co-scheduling of compute, storage, and power
– Seamless physical/virtual migration

21 Application Service: Hadoop
An open-source Apache Software Foundation project sponsored by Yahoo!
– http://hadoop.apache.org
Provides a scalable, parallel programming model (MapReduce) and the associated runtime
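The MapReduce programming model mentioned above can be sketched in a single process with a word count. This is plain Python that mimics the map, shuffle, and reduce phases; it is not Hadoop's API, and the function names are illustrative assumptions. In Hadoop, the runtime distributes these phases across the cluster.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())


counts = word_count(["the cloud", "the data in the cloud"])
print(counts)  # {'the': 3, 'cloud': 2, 'data': 1, 'in': 1}
```

Because each map call sees only one line and each reduce call sees only one key, both phases can run in parallel on different machines, which is what makes the model scalable.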

22 Getting Involved

23 Summary
Open Communities can shape the development of Cloud Computing
Open Cirrus* is a multi-partner test bed for research in Cloud Computing
The Open Cirrus software stack provides a good starting point for open-source cloud computing software development

24 Getting Involved
Contact Open Cirrus* with research proposals
Contribute to the Open Cirrus software stack
– Zoni, Tashi, Hadoop
– Apache Software Foundation*
http://opencirrus.org

25 The Rest of the Day

26 Ground Rules
Questions?
– Please ask, we’d love an interactive day
– But, if the answer is not of general interest, we may defer until the break
Need to step out?
– That’s OK, but please take your belongings
– Including the lunch
Please be considerate
– And keep conversations focused on the topic

