Introduction to Grid and Cloud Computing
Term Project Specification:

• The Term Project demands in-depth research and investigative reporting. All reported content, figures, and tables must be original.
• Ten topics are offered for students to choose from. Disjoint groups of students work on different topics.
• You have only 1 month to report the work, first through a proposal and then through a complete written report at the end of the semester, which you will also present.
• The proposal, which will become your report at the end, should follow the IEEE conference paper format and run about 10 pages, including original figures and tables plus a reference list of papers. (Template link: ates.html)

How to Write a Good Technical Paper in 10 Pages

1. Title (fewer than 8 words): must hit the hot topic and be short, clear, and eye-catching. Authors and affiliations follow in 1-2 lines after the title.
2. Abstract (roughly 50 to 100 words): must state the research objectives, summarize the findings, and highlight the innovative contributions.
3. Introduction (1 page, including the title and abstract): must motivate readers to read the rest of the paper and give them the necessary background.
4. Problem Statement and Formulation (2 pages): state the problem being solved and the basic assumptions, and formulate the problem with technical specifications.
5. Architecture, algorithms, solution methods, protocols, analytical results, illustrated examples, etc. (2 pages).
6. Experimental Setting (1 page): computer simulators, benchmarks, and datasets used.
7. Experimental Results (2 pages): plotted figures or tables, plus their interpretation and performance analysis.
8. Related Work and Conclusions (1 page).
9. References (1 page): a list of 15 relevant papers.

Candidate Project Topics:

Topic  Project Title
1      Use of XEN to create virtual machines, conduct some VM experiments and report performance measured
2      Exploring Amazon EC2, S3, or MapReduce, or virtual cluster, or private cloud for HPC scientific applications
3      Parallelization of a novel application idea using MPI or OpenMP, analyze the performance improvements
4      Using Hadoop or node.js for a distributed Web Application
5      Integration of Globus Online by using CLI or REST API for an application that needs data transfer capabilities
6      Stork – Globus Online Comparison through different metrics
7      Application of a scientific problem with a workflow in Condor scheduler
8      Development of a client/server application that does performance improvements on a high-performance data transfer protocol (GridFTP, UDT)
9      MPI-Hadoop Comparison
10     A survey on Parallel File System Comparison

Topic 1: Use of XEN for virtual machine (VM) creation and resource management through some VM application experiments

• You are asked to install the XEN hypervisor on a local computer or on your own notebook.
• Create Domain 0 (the control VM) and some User Domains (VM applications) for some selected benchmarks.
• Collect the performance results. Discuss the lessons learned from the XEN application experiments.

Prof. Kai Hwang

Suggested References for Topic 1:

1. M. Rosenblum, “Recent Advances in Virtual Machines and Operating Systems”, Keynote Address, ACM ASPLOS.
2. J. Smith and R. Nair, Virtual Machines: Versatile Platforms for Systems and Processes, Morgan Kaufmann.
3. B. Sotomayor, R. Montero, and I. Foster, “Virtual Infrastructure Management in Private and Hybrid Clouds”, IEEE Internet Computing, Sept.
4. P. Barham, et al., “XEN and the Art of Virtualization”, Proc. of the 19th ACM Symp. on OS Principles (SOSP19), ACM Press.
5. A. Menon, et al., “Diagnosing Performance Overheads in the XEN Virtual Machine Environment”, Proc. of the 1st Int’l Conf. on Virtual Execution Environments, 2005.

Topic 2: Exploring the use of Amazon EC2, S3, MapReduce, or virtual cluster, or private cloud in HPC scientific applications

• This project requires the use of available AWS virtual clusters (EC2, S3 instances), the MapReduce cluster, or the private cloud offered on the AWS platform. A cluster of 64 to 120 nodes is desired.
• You need to perform some benchmark experiments on these VM clusters, measure the performance, analyze the performance attributes, and identify performance bottlenecks.
• Select some well-known high-performance scientific benchmark programs to carry out your experiments, or write your own test program, for example for large-scale matrix multiplication.
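As an illustration of a self-written test program, here is a minimal single-node sketch of a matrix multiplication benchmark in pure Python. It is a starting point under simple assumptions, not a recommended HPC benchmark: the function names `matmul` and `benchmark` are made up for this example, and the rate is estimated from the usual count of about 2n^3 floating-point operations.

```python
import random
import time

def matmul(a, b):
    """Naive O(n^3) product of two matrices stored as lists of rows."""
    bt = list(zip(*b))  # columns of b, so the inner loop pairs a row with a column
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def benchmark(n, seed=0):
    """Time one n x n multiplication; return (seconds, estimated MFLOP/s)."""
    rng = random.Random(seed)
    a = [[rng.random() for _ in range(n)] for _ in range(n)]
    b = [[rng.random() for _ in range(n)] for _ in range(n)]
    start = time.perf_counter()
    matmul(a, b)
    elapsed = time.perf_counter() - start
    flops = 2 * n ** 3  # about n^3 multiplications and n^3 additions
    return elapsed, flops / elapsed / 1e6

if __name__ == "__main__":
    for n in (50, 100, 200):
        seconds, mflops = benchmark(n)
        print(f"n={n:4d}  time={seconds:.4f}s  rate={mflops:.1f} MFLOP/s")
```

Running the same script on different instance types and problem sizes gives a first set of data points. A real study would replace the naive kernel with an optimized or distributed one.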

Key References for Topic 2:

1. K. Hwang, G. Fox and J. Dongarra, Distributed and Cloud Computing, Chapters 2, 4, 6, Morgan Kaufmann, Oct.
2. K. Hwang and Z. Xu, Scalable Parallel Computing, McGraw-Hill, Chapters 2 and 12.
3. E. Walker, “Benchmarking Amazon EC2 for High-Performance Scientific Computing,” ;login:, vol. 33, no. 5, pp. 18–23.
4. D. Kirk and W. Hwu, Programming Massively Parallel Processors: A Hands-on Approach, Morgan Kaufmann, 2010.

Topic 3: Parallelization of a novel application idea using MPI or OpenMP, analyze the performance improvements

• You are asked to find a computationally intensive application and parallelize it using MPI or OpenMP.
• Conduct a thorough performance analysis using multiple machines (or a multi-core computer if multiple machines are not available).
• Test your code by running it in an SMP environment (a single computer with multiple cores) and a DSM environment (multiple computers connected via LAN).
• Prepare different test cases by varying the machine architecture, problem size, etc.
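The performance analysis is usually summarized with two standard quantities: speedup, S(p) = T(1) / T(p), and parallel efficiency, E(p) = S(p) / p, where T(p) is the wall-clock time on p workers. The sketch below computes both from measured timings. The helper names are illustrative, and the numbers in the usage example are placeholders, not measurements.

```python
def speedup(t_serial, t_parallel):
    """Speedup S = T(1) / T(p)."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, workers):
    """Parallel efficiency E = S / p; 1.0 means perfect scaling."""
    return speedup(t_serial, t_parallel) / workers

def scaling_table(t_serial, timings):
    """timings maps worker count -> wall time in seconds.

    Returns rows of (workers, speedup, efficiency) sorted by worker count.
    """
    return [
        (p, speedup(t_serial, t), efficiency(t_serial, t, p))
        for p, t in sorted(timings.items())
    ]

if __name__ == "__main__":
    # Placeholder timings for illustration only.
    for p, s, e in scaling_table(10.0, {2: 5.5, 4: 3.0, 8: 2.5}):
        print(f"p={p}  speedup={s:.2f}  efficiency={e:.2f}")
```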

Topic 4: Using Hadoop or node.js for a distributed Web Application

• Design a web application that serves thousands of users.
• Each user asks for a computationally intensive service.
• Distribute the load of the service across multiple back-end machines using technologies like Hadoop or node.js.
• Analyze the performance of your application as the number of users increases.
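One way to approach the last step is a small load generator that issues a growing number of concurrent requests and records throughput and mean latency. The sketch below is only a local stand-in: `service` is a hypothetical placeholder for a request to your application, and threads are used for simplicity. A real test would call the deployed web service over HTTP instead.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def service(n=20000):
    """Hypothetical stand-in for one computationally intensive request."""
    return sum(i * i for i in range(n))

def timed_call(_):
    """Run one request and return its latency in seconds."""
    start = time.perf_counter()
    service()
    return time.perf_counter() - start

def load_test(users):
    """Issue `users` concurrent requests.

    Returns (throughput in requests/s, mean latency in seconds).
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=users) as pool:
        latencies = list(pool.map(timed_call, range(users)))
    wall = time.perf_counter() - start
    return users / wall, sum(latencies) / len(latencies)

if __name__ == "__main__":
    for users in (1, 4, 16, 64):
        throughput, latency = load_test(users)
        print(f"users={users:3d}  throughput={throughput:8.1f} req/s  "
              f"mean latency={latency * 1000:.2f} ms")
```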

Topic 5: Integration of Globus Online by using CLI or REST API for an application that needs data transfer capabilities

• You are asked to design an application, or use an existing one, that needs data transfer capabilities.
• Your application will integrate Globus Online for data transfer and will also provide monitoring of the transfer jobs.
• The CLI could be used in a complex job that needs data transfers between nodes before execution starts.
• The REST API could be used for any type of application.

Topic 6: Stork – Globus Online Comparison through different metrics

• You are asked to install two GridFTP servers on two machines and integrate them with Globus Online.
• Then install the Stork scheduler on one of the machines.
• Design data transfer test cases and make a full comparison of the two tools.
• Some of the performance metrics could be dataset characteristics, ease of use (Stork does not have an interface, so compare it with the GO CLI), individual transfer speed, and throughput.
• Use Stork features like concurrent transfers and optimization.
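Individual transfer speed and overall throughput are different metrics once transfers run concurrently, so it helps to define both before collecting data. The sketch below assumes each transfer is logged as a `(size_bytes, start_s, end_s)` record. That record layout and the function names are assumptions made for this example, not the output format of Stork or Globus Online.

```python
def transfer_speed_mbps(size_bytes, start_s, end_s):
    """Speed of a single transfer in megabits per second."""
    return size_bytes * 8 / (end_s - start_s) / 1e6

def aggregate_throughput_mbps(transfers):
    """Overall throughput of a set of possibly overlapping transfers.

    transfers: list of (size_bytes, start_s, end_s) records.
    Total bits moved are divided by the window from the earliest
    start to the latest end.
    """
    total_bits = 8 * sum(size for size, _, _ in transfers)
    first_start = min(start for _, start, _ in transfers)
    last_end = max(end for _, _, end in transfers)
    return total_bits / (last_end - first_start) / 1e6

if __name__ == "__main__":
    # Two concurrent 1 MB transfers, each taking 2 s (illustrative values).
    log = [(1_000_000, 0.0, 2.0), (1_000_000, 0.0, 2.0)]
    print(transfer_speed_mbps(*log[0]))    # per-transfer speed
    print(aggregate_throughput_mbps(log))  # combined throughput
```

With concurrency, each transfer may be slower individually while the aggregate still improves, which is why the two numbers should be reported separately.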

Topic 7: Application of a scientific problem with a workflow in Condor scheduler

• Find a scientific problem with complex computational and data transfer needs.
• Design a workflow that solves the problem.
• Run the workflow using the Condor scheduler.
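A workflow of this kind is a directed acyclic graph of steps, where a step can run only after the steps it depends on have finished. The sketch below models that idea with Python's standard `graphlib` module (Python 3.9 or later). It is not Condor itself: the step names are hypothetical, and `run_workflow` runs steps locally just to show the dependency ordering a scheduler would enforce.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each step maps to the set of steps it depends on.
WORKFLOW = {
    "stage_in": set(),
    "preprocess": {"stage_in"},
    "simulate_a": {"preprocess"},
    "simulate_b": {"preprocess"},
    "merge": {"simulate_a", "simulate_b"},
    "stage_out": {"merge"},
}

def run_workflow(graph, run_step):
    """Run steps in dependency order.

    Returns the list of batches; steps within one batch have no
    dependencies on each other and could run in parallel.
    """
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # steps whose dependencies are done
        for step in ready:
            run_step(step)
            sorter.done(step)
        batches.append(ready)
    return batches

if __name__ == "__main__":
    for batch in run_workflow(WORKFLOW, lambda step: None):
        print(batch)
```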

Topic 8: Development of a client/server application that does performance improvements on a high-performance data transfer protocol (GridFTP, UDT)

• Using the GridFTP or UDT APIs, design a client/server model that optimizes data transfers.
• Example for UDT: use the same connection for multiple file transfers, or apply a threaded server/client model to do concurrent file transfers over multiple sockets.
• Example for GridFTP: use the Java or C APIs to dynamically change the number of parallel streams or the concurrency level for a directory transfer.
• Test your implementation to see whether it brings any improvement.
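The threaded, concurrent-transfer idea can be prototyped before touching the real protocol APIs. The sketch below uses a thread pool with a tunable `concurrency` level and a local file copy as a placeholder for the network transfer. It does not use the GridFTP or UDT APIs, and the function name `transfer_all` is made up for this example.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def transfer_all(sources, dest_dir, concurrency=4):
    """Copy every file in `sources` into `dest_dir` using a thread pool.

    `concurrency` is the number of files in flight at once.
    Returns the destination paths in the same order as `sources`.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    def transfer(src):
        target = dest / Path(src).name
        shutil.copyfile(src, target)  # placeholder for a real network transfer
        return target

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(transfer, sources))
```

Timing `transfer_all` for several `concurrency` values on the same directory gives the comparison the last bullet asks for, once the copy is replaced by an actual protocol call.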

Topic 9: MPI-Hadoop Comparison

• Find an application or algorithm that can be parallelized but does not need any communication between the parallel processes.
• Implement it using both Hadoop and MPI.
• Compare their performance.
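A problem with no communication between processes splits into independent tasks whose partial results are combined once at the end. The sketch below shows that shape with prime counting over disjoint ranges, chosen here only as an example problem. It runs sequentially and uses neither MPI nor Hadoop. Each call to `map_task` corresponds to the work one MPI rank or one Hadoop mapper would do, and the final `sum` plays the role of the reduce step.

```python
def is_prime(n):
    """Trial-division primality test."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def split(lo, hi, parts):
    """Split [lo, hi) into at most `parts` contiguous, disjoint ranges."""
    step = max(1, -(-(hi - lo) // parts))  # ceiling division, at least 1
    return [(s, min(s + step, hi)) for s in range(lo, hi, step)]

def map_task(chunk):
    """Independent task: count primes in one range, no shared state."""
    lo, hi = chunk
    return sum(1 for n in range(lo, hi) if is_prime(n))

def count_primes(lo, hi, parts=4):
    """Combine the partial counts; this is the only 'reduce' step."""
    return sum(map(map_task, split(lo, hi, parts)))

if __name__ == "__main__":
    print(count_primes(0, 100))
```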

Topic 10: A survey Report on Parallel and Distributed File Systems

• You are asked to write an extensive report on popular, currently available parallel and distributed file systems (GPFS, Lustre, HDFS, PVFS, WheelFS, GFS, AFS).
• Research performance comparison metrics for these file systems.
• Open source file systems could be installed and, using performance benchmarking tools, tested with cases that measure read/write speeds.
• Write a paper presenting a multidimensional comparison study and provide test case results for selected sample file systems.
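For the read/write measurements, a small script can serve as a sanity check alongside established benchmarking tools. The sketch below writes and then reads back one file in a given directory and reports MB/s. The function name and default sizes are assumptions for this example. Note that the read pass may be served from the operating system's page cache, so it can overstate the speed of the underlying file system.

```python
import os
import tempfile
import time

def measure_io(directory, size_mb=16, block_kb=1024):
    """Write then read one temporary file in `directory`.

    Returns (write MB/s, read MB/s).
    """
    block_bytes = block_kb * 1024
    block = os.urandom(block_bytes)
    blocks = size_mb * 1024 // block_kb
    total_mb = blocks * block_kb / 1024  # amount actually written
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # include the time to push data to storage
        write_s = time.perf_counter() - start

        start = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(block_bytes):
                pass
        read_s = time.perf_counter() - start
    finally:
        os.remove(path)
    return total_mb / write_s, total_mb / read_s

if __name__ == "__main__":
    write_rate, read_rate = measure_io(tempfile.gettempdir())
    print(f"write: {write_rate:.1f} MB/s  read: {read_rate:.1f} MB/s")
```

Pointing `directory` at a mount of each file system under test gives comparable numbers, provided the file size and block size are kept the same across runs.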