NCBI Grid Presentation. NCBI Grid Structure: NetCache, NetSchedule, Load Balancer (LBSM), Worker Nodes, CGI Gateway.

Presentation transcript:

NCBI Grid Presentation

NCBI Grid Structure
- NetCache
- NetSchedule
- Load Balancer (LBSM)
- Worker Nodes
- CGI Gateway

NetCache
Problems:
1. HTTP/CGI is a stateless protocol
   - Every CGI call is a new run, with no memory of previous calls
   - We need session state storage compatible with our Load Balancer
2. Storing information in files does not always work
   - File system overflow
   - Not protected against failures
   - Hard to load balance
   - No access log
   - Network access can be problematic (maintenance issues)

NetCache
Design objectives:
1. BLOB ID can be used in Web apps: URLs, HTML, cookies
2. Universal temporary BLOB storage: can store session info, graphics, sequences, ASN.1, XML
3. Automatic removal of old, unused objects (garbage collection)
4. Compatible with NCBI Load Balancing
5. Can work on off-the-shelf hardware
6. High availability, automatic recovery after failures
7. Easy to scale economically by adding components
   * No RDBMS license
   * Any Linux, Unix, Windows box can be a NetCache host
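
The objectives above (a short BLOB ID, temporary storage, automatic removal of unused objects) can be illustrated with a minimal in-memory sketch. This is not the NetCache API: `BlobStore`, `Put`, `Get`, `Purge` and the `blob_N` ID format are hypothetical names chosen for illustration, and a real service would run over the network behind the load balancer.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical in-memory stand-in for a temporary BLOB store.
// Names and behavior are assumptions, not the NetCache interface.
class BlobStore {
public:
    using Clock = std::chrono::steady_clock;

    explicit BlobStore(std::chrono::seconds ttl) : ttl_(ttl), next_id_(1) {}

    // Store data and return a short ID that fits in a URL or a cookie.
    std::string Put(const std::string& data,
                    Clock::time_point now = Clock::now()) {
        std::string id = "blob_" + std::to_string(next_id_++);
        blobs_[id] = Entry{data, now};
        return id;
    }

    // Fetch by ID. A successful read refreshes the last-access time,
    // so BLOBs that are still in use are not collected.
    bool Get(const std::string& id, std::string* out,
             Clock::time_point now = Clock::now()) {
        auto it = blobs_.find(id);
        if (it == blobs_.end()) return false;
        it->second.last_access = now;
        *out = it->second.data;
        return true;
    }

    // Garbage collection: drop every BLOB not accessed within the TTL.
    // Returns the number of BLOBs removed.
    std::size_t Purge(Clock::time_point now = Clock::now()) {
        std::size_t removed = 0;
        for (auto it = blobs_.begin(); it != blobs_.end();) {
            if (now - it->second.last_access > ttl_) {
                it = blobs_.erase(it);
                ++removed;
            } else {
                ++it;
            }
        }
        return removed;
    }

private:
    struct Entry {
        std::string data;
        Clock::time_point last_access;
    };

    std::chrono::seconds ttl_;
    std::uint64_t next_id_;
    std::unordered_map<std::string, Entry> blobs_;
};
```

In this sketch one CGI call would `Put` its session state and embed the returned ID in the generated page; a later, otherwise stateless, CGI call would `Get` the state back by that ID.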

NetCache (diagram sequence)
A CGI reaches NetCache through the Load Balancer, stores a BLOB, and receives a BLOB ID.
A later CGI call presents the BLOB ID to NetCache and gets the BLOB back.

NetCache
Typical use cases:
1. Store session info
2. Graphics generated by CGIs
3. Caching results of computational algorithms
4. Cache results of expensive DBMS or search system queries
5. Data exchange between programs

NetSchedule
Typical CGI web call scenario (diagram sequence): the CGI runs a long computation under a 30 sec timeout.

    #include
    for (int i = 0; i < 10000; ++i) { …. }

The timeout expires before the computation finishes: Expired!
Reproduced with permission from Oleg O. Moiseyenko

Why do timeouts happen?
- Peak load hours: during peak hours, resource-hungry tasks demand more CPU time than is available
- CGI is used as a platform to implement complex, computationally intensive algorithms
- Execution time depends on web user input
  - user can specify complex criteria
  - user can upload a lot of data

NetSchedule (diagram sequence)
The CGI submits the job to NetSchedule and receives a NetSchedule JOB ID.
A Worker Node takes the job from NetSchedule and runs the long computation, sending a Progress Report as it goes.

    #include
    for (int i = 0; i < 10000; ++i) { …. }
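
The hand-off in these slides can be sketched as follows: the CGI only registers the job and returns a job ID, while the long loop runs on the worker side and reports progress. `JobBoard`, `JobInfo`, `RunWorker` and the percentage-based progress are hypothetical names and choices made for this sketch, not the NetSchedule interface.

```cpp
#include <cassert>
#include <map>
#include <string>

enum class JobStatus { Pending, Running, Done };

struct JobInfo {
    JobStatus status;
    int progress_percent;
    std::string output;
};

// Hypothetical stand-in for the scheduler's job bookkeeping.
class JobBoard {
public:
    JobBoard() : next_id_(1) {}

    // Called by the CGI: register the job and return at once with a job ID.
    int Submit(const std::string& input) {
        int id = next_id_++;
        inputs_[id] = input;
        jobs_[id] = JobInfo{JobStatus::Pending, 0, ""};
        return id;
    }

    // Called by the worker node while it runs the long computation.
    void ReportProgress(int id, int percent) {
        JobInfo& job = jobs_.at(id);
        job.status = JobStatus::Running;
        job.progress_percent = percent;
    }

    // Called by the worker node when the job is finished.
    void Complete(int id, const std::string& output) {
        JobInfo& job = jobs_.at(id);
        job.status = JobStatus::Done;
        job.progress_percent = 100;
        job.output = output;
    }

    // Called by the CGI on each later request that carries the job ID.
    JobInfo Check(int id) const { return jobs_.at(id); }

private:
    int next_id_;
    std::map<int, std::string> inputs_;
    std::map<int, JobInfo> jobs_;
};

// Worker side: the loop from the slide, now outside the CGI's 30 sec limit.
void RunWorker(JobBoard& board, int id) {
    for (int i = 0; i < 10000; ++i) {
        // ... the expensive computation would go here ...
        if ((i + 1) % 1000 == 0) {
            board.ReportProgress(id, (i + 1) / 100);
        }
    }
    board.Complete(id, "result for job " + std::to_string(id));
}
```

Each CGI request then finishes quickly: the first returns the job ID to the browser, and later ones call `Check` to show progress or the final output.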

NetSchedule Push-Pull Model
NetSchedule server maintains several FIFO queues (Queue 1, Queue 2, ...), each holding jobs (Job 1, Job 2, Job 3, ...).
- Push Job: CGIs
- Pull Job: Worker Nodes
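
A minimal sketch of the push-pull model, assuming named queues that hold job payloads as strings. `QueueServer`, `Push`, `Pull` and `Pending` are hypothetical names for illustration; the real server is a network service.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <map>
#include <string>

// Hypothetical stand-in for a server that keeps several FIFO queues.
class QueueServer {
public:
    // Submitter side (for example a CGI): append a job to the named queue.
    void Push(const std::string& queue, const std::string& job) {
        queues_[queue].push_back(job);
    }

    // Worker side: take the oldest job from the named queue.
    // Returns false when the queue is unknown or empty.
    bool Pull(const std::string& queue, std::string* job) {
        auto it = queues_.find(queue);
        if (it == queues_.end() || it->second.empty()) return false;
        *job = it->second.front();
        it->second.pop_front();
        return true;
    }

    // Number of jobs still waiting in the named queue.
    std::size_t Pending(const std::string& queue) const {
        auto it = queues_.find(queue);
        return it == queues_.end() ? 0 : it->second.size();
    }

private:
    std::map<std::string, std::deque<std::string>> queues_;
};
```

Because workers pull, the server never needs to know how many workers exist or how busy they are; an idle worker simply asks for the next job in its queue.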

NCBI Grid Structure (diagram sequence)
- NetCache: stores JOB input/output
- NetSchedule: general purpose queue management
- Load Balancer (LBSM)
- Worker Node API: distribution, logging, remote management
- CGI front end, migration toolkit, HTML templates

High availability
- All central components (queue and data storage) are duplicated
- All components are controlled by the NCBI load balancer
- Protection against back-end (remote CGI) failures, by timeout or via job re-scheduling
- Remote administration and statistics
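
One way to picture protection by timeout and re-scheduling: a pulled job carries a deadline, and a job whose worker has not reported completion by then goes back into the queue. The sketch below assumes integer clock ticks and the hypothetical names `LeasingQueue`, `Pull`, `Done` and `RequeueExpired`; the slides do not describe how NetSchedule implements this.

```cpp
#include <cassert>
#include <deque>
#include <map>

// Hypothetical queue in which every pulled job has a completion deadline.
class LeasingQueue {
public:
    explicit LeasingQueue(int lease_ticks) : lease_ticks_(lease_ticks) {}

    void Push(int job_id) { pending_.push_back(job_id); }

    // A worker takes the oldest job; the queue records its deadline.
    bool Pull(int now, int* job_id) {
        if (pending_.empty()) return false;
        *job_id = pending_.front();
        pending_.pop_front();
        running_[*job_id] = now + lease_ticks_;
        return true;
    }

    // The worker reports success: the job is no longer tracked.
    void Done(int job_id) { running_.erase(job_id); }

    // Jobs whose deadline has passed are assumed lost (the worker failed)
    // and are put back at the end of the queue. Returns how many moved.
    int RequeueExpired(int now) {
        int moved = 0;
        for (auto it = running_.begin(); it != running_.end();) {
            if (it->second <= now) {
                pending_.push_back(it->first);
                it = running_.erase(it);
                ++moved;
            } else {
                ++it;
            }
        }
        return moved;
    }

private:
    int lease_ticks_;
    std::deque<int> pending_;
    std::map<int, int> running_;  // job ID -> deadline tick
};
```

A job that is re-queued this way may run twice if the first worker was only slow, so jobs in such a scheme should be safe to repeat.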

Worker node API
- High level design
  - standard C++ streams
  - ASN.1, XML serialization
- Support of SMP
  - thread based parallel jobs
- Remote administrative access to worker nodes
  - shutdown
  - availability checking
  - statistics
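
To illustrate what a stream-based job interface could look like, the sketch below passes job input and output as standard C++ streams and exposes the three administrative operations listed above. `WorkerNodeSketch`, `JobHandler` and `SumJob` are hypothetical names, not the NCBI Grid API; the sketch is single-threaded and leaves out serialization and thread-based parallel jobs.

```cpp
#include <cassert>
#include <functional>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// A job reads its input from a stream, writes its output to a stream,
// and returns an exit code.
using JobHandler = std::function<int(std::istream&, std::ostream&)>;

// Hypothetical worker node wrapper.
class WorkerNodeSketch {
public:
    explicit WorkerNodeSketch(JobHandler handler)
        : handler_(std::move(handler)), jobs_done_(0), shutdown_(false) {}

    // Run one job. Returns false if the node has been shut down.
    bool RunJob(const std::string& input, std::string* output, int* exit_code) {
        if (shutdown_) return false;
        std::istringstream in(input);
        std::ostringstream out;
        *exit_code = handler_(in, out);
        *output = out.str();
        ++jobs_done_;
        return true;
    }

    // Administrative operations: shutdown, availability check, statistics.
    void Shutdown() { shutdown_ = true; }
    bool IsAvailable() const { return !shutdown_; }
    int JobsDone() const { return jobs_done_; }

private:
    JobHandler handler_;
    int jobs_done_;
    bool shutdown_;
};

// Example handler: sums whitespace-separated integers from the input.
int SumJob(std::istream& in, std::ostream& out) {
    long total = 0;
    long value = 0;
    while (in >> value) total += value;
    out << total;
    return 0;
}
```

Writing the handler against `std::istream` and `std::ostream` keeps it independent of where the data actually lives, so the same function can be tested with string streams as here.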

Acknowledgements
C++ Group (Development)
- Denis Vakatov - coordination, design
- Anton Lavrentiev - communication libraries, load balancer
- Aaron Ucko - threaded server
- Anatoliy Kuznetsov - NetCache, NetSchedule
- Maxim Didenko - Grid API, CGI migration framework
Other NCBI Groups
- BLAST Group, Tom Madden, George Coulouris, Yuri Merezhuk, Yan Raytselis, Ron Edgar, Mike DiCuccio, Yuri Kapustin, Boris Fedorov
- Mark Johnson - presentation rehearsal