Presentation on theme: "Cluster / Grid with Web and Semantic Services"— Presentation transcript:
1 Cluster / Grid with Web and Semantic Services. Dr G Sudha Sadasivam, Professor, CSE, PSG College of Technology, Coimbatore
2 Agenda
• Web Services
• SOA
• Semantics
• Grid Architecture
• 3rd Generation Grid Architecture
• Semantic Grid
• Cluster Architecture - Hadoop
• Amazon Web Services
• Work at Grid and Cloud Computing Lab - PSGCT
ORGANISING A BIRTHDAY PARTY?
3 PRODUCTS AND SERVICES – A TRADITIONAL WAY OF DISCOVERING AND ACCESSING
5 1. Web Service
• A service is a set of actions that form a coherent whole from the point of view of service providers and service requesters, e.g. arranging a birthday party.
• Web services provide a standard means of interoperating between different software applications, running on a variety of platforms and/or frameworks, in a transparent and loosely coupled manner.
• A Web service is a software system that:
– is designed to support interoperable machine-to-machine interaction
– has an interface described in a machine-processable format (WSDL)
– communicates using standard SOAP messages, typically over HTTP, with an XML serialization in conjunction with other Web-related standards
– can be published in a UDDI registry
– is identified by a URI
• A Web service is an entity that can be described (using WSDL), published, discovered, and invoked by a client.
• W3C technology standardization process
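To make the "SOAP messages with an XML serialization" point concrete, here is a minimal sketch that builds and parses a SOAP 1.1 envelope using only the Python standard library. The operation name `BookParty`, its parameters, and the `http://example.org/party` namespace are hypothetical and chosen only to match the birthday-party example; a real service would take them from its WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.org/party"  # hypothetical service namespace (assumption)


def build_request(operation, params):
    """Serialize one operation call as a SOAP envelope (returned as text)."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_request(xml_text):
    """Recover the operation name and parameters from a SOAP envelope."""
    root = ET.fromstring(xml_text)
    body = root.find(f"{{{SOAP_NS}}}Body")
    call = list(body)[0]
    operation = call.tag.split("}")[1]
    params = {child.tag.split("}")[1]: child.text for child in call}
    return operation, params
```

In practice the envelope text would be sent as the body of an HTTP POST; that transport step is left out so the sketch stays self-contained.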
7 COMPONENTS
• A Web service is an abstract notion that is implemented by a concrete agent.
• Elements:
– The provider entity is the person or organization that provides an appropriate agent to implement a particular service.
– A requester entity is a person or organization that wishes to make use of a provider entity's Web service.
– Registry: used to register the services.
• Web Service Discovery: before message exchange, the requester entity and the provider entity must first agree on both the semantics and the mechanics of the message exchange.
• The Web service description (WSD), covering message formats, datatypes, transport protocols, and transport serialization formats, represents a contract governing the mechanics of interacting with a particular service.
• The semantics represents a contract governing the meaning (consequence and purpose) of that interaction.
8 2. SOA
• Aim: alignment of business needs with IT.
• An architectural style of building enterprise solutions based on services.
• SOA is a blueprint that governs creation, deployment, execution and management of reusable business services.
• WSA (Web Services Architecture) is an instance of SOA; the architecture itself is independent of technology.
• Services provide independent, loosely coupled, transparent, composable invocation of tasks in a standard way.
• SOA separates functions into distinct units (services), which can be distributed over a network and can be combined and reused to create business applications. These services communicate with each other by passing data from one service to another, or by coordinating an activity between two or more services.
• Guiding principles: reusability, open standards.
• Services: identification, categorization, monitoring and tracking.
11 Service Oriented Architecture
• Services created using an SOA and provided by an organisation's IT should directly support the services that the organisation provides to its customers (BP – IT).
• Service delivery: human-mediated service, self-service, system-to-system delivery service.
• Diagram labels: Service Oriented Architecture, contract, legacy system, new system, composite system.
• SOA is a blueprint that governs creation, deployment, execution and management of reusable business services.
• It aligns business and technology.
• A service-oriented (SO) business delivers services to its customers.
12 SOA roles
• Business role: SOA is viewed as a set of services that a business wants to expose to customers and clients.
• Architectural role: SOA is an architectural style which requires a service provider, a requestor and a service description. It provides services that foster modularity, encapsulation, loose coupling, separation of concerns, reuse, composability and single implementation.
• Implementation role: SOA is a complete programming model (process) with standards, tools, methods, techniques and technologies.
13 SOA suite
• Model and capture business processes and policies
• Integrate the services using an ESB and orchestrate the services into business processes (BP)
• Develop, connect and bind services to build composite applications
• Deploy composite applications and perform service level management
• Apply runtime policies to services and govern them
• Monitor activity to gain real-time information on business processes
14 Service
• A service is a manager entity that consists of a collection of components that work together to deliver a business function (currency conversion / airline reservations).
• A service maps to a business function, but a component maps to business entities and the business rules that operate on them.
• Example: bank teller application
– Components: loan component, savings bank component (with withdrawal / deposit), account manager (to create new accounts).
– Services: the interfaces of all components (as a group) can be composed and exposed as services: creation of new accounts, withdrawal and deposit services, and loan service.
16 UI, Business Processes, Service Layer, Component Layer, Object Layer
• Presentation – portal for aggregation of contents to users
• Business process layer – automation logic; orchestration of services
• Service layer – collection of units of work (interfaces); processing logic
• Component layer – operations that are units of work; SLA
• Object layer / legacy (operational) – messages for communication
17 Terms in SOA
• Services
• Service provider
• Service consumer (or service requestor)
• Service locator or service registry
• Service broker – passes service requests to one or more service providers
18 SOA LIFE CYCLE
• Incremental and iterative, guided by business drivers
• Expose: creation of services from existing / new components
• Compose: combine existing services
• Consume: use services
• Consumer view: service identification, service categorisation, service exposure, choreography, QoS
• Provider view: component identification, component specification, service realisation, service management, standards implementation
19 Advantages
• Standardisation
• Faster time to market
• Operational efficiency and adaptability
• Agility to collaborate
• Continuous improvement
• Aligns business to IT
• Ease of introducing new technologies
• Return on Investment (ROI)
• Vendor diversity
• Services: encapsulation, loose coupling, contract, reuse, composability, autonomy, dynamism, higher granularity
20 SERVICE ORIENTED ARCHITECTURE
• Transport layer
• Service Communication Protocol (ESB)
• Service Description
• Service
• Business Process
• Service Registry
• Policy
• Security
• Transaction
• Management
21 Problems in Web services (Point – Point)
• Service consumers need to be modified whenever the service provider interface changes (dynamic).
• Every consumer should have a suitable protocol adapter for each provider it is connected to (interoperability).
ESB
• The ESB acts as a mediator that transforms, routes, notifies and augments information.
• It provides virtualization of the enterprise resources.
• The Enterprise Service Bus is an enterprise-class messaging bus.
• It has the following facilities:
– messaging infrastructure
– message transformation facility between consumer and provider
– content-based routing between service consumers and providers
– capability to convert transport protocols between consumer and provider
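The mediation role described above can be illustrated with a toy, in-process bus. This is only a sketch of content-based routing plus message transformation; the `ServiceBus` class and its `register` / `send` methods are hypothetical names for this example and do not correspond to any particular ESB product.

```python
import json


class ServiceBus:
    """Toy mediator: picks a provider by message content and adapts the format."""

    def __init__(self):
        self.routes = []  # list of (predicate, transform, provider)

    def register(self, predicate, provider, transform=lambda message: message):
        """Attach a provider; `transform` converts the consumer's message format."""
        self.routes.append((predicate, transform, provider))

    def send(self, message):
        """Content-based routing: the first matching route handles the message."""
        for predicate, transform, provider in self.routes:
            if predicate(message):
                return provider(transform(message))
        raise LookupError("no provider registered for this message")


def loan_provider(message):
    # This provider accepts the consumer's dict format unchanged.
    return "loan:" + str(message["amount"])


def deposit_provider(text):
    # This provider expects JSON text, so the bus must transform the message.
    return "deposit:" + str(json.loads(text)["amount"])
```

The consumer only ever calls `bus.send(...)` with one message shape; if a provider changes its expected format, only the registered `transform` changes, which is the point-to-point problem the slide describes.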
22 SOA based Web services
• Transport layer (HTTP, JMS, SMTP)
• Service Communication Protocol (SOAP)
• Service Description (XML, WSDL)
• Service
• Business Process (BPEL)
• Service Registry (UDDI)
• Policy (WS-Policy)
• Security (WS-Security)
• Transaction (WS-Transaction)
• Management (WS-Manageability)
23 [Figure: SAHANA layered architecture. Channel access: wired Internet, office systems, laptop/PDA/cell, web client, mobile, SMS alerts. Presentation / UI for responders. Business processes: search procedures, DDoS and load balancing. Business services offered by server grid: missing person, organisation registry, camps registry, request management, shelter.]
24
• Missing person's registry with efficient search
• Organisation registry with efficient match and volunteer coordination
• Camps registry
• Request management registry with inventory management and optimisation – search
• Shelter registry
• Messaging alerts
• Damages registry
• Grid management module to manage coordination efforts among districts and relief organisations
• Bulletin board – user area
25 SOA – screen shots
1. Organisation Registry
• New organization registration with the system
• Maintaining details about each organization with a unique ID
• Updating the organization's services
26 DESCRIPTION
• When an organization wants to provide a service, it must provide the organization name, city and branch to the system.
• By default, every organization that registers for the first time has to provide a single service.
• On successful registration, an automatically generated Organization ID is displayed to the organization authority.
• To update the services provided, both the Organization ID and password are validated.
• The various services are displayed in the form, from which the service provider selects its additional services.
27 [Data-flow diagram] NEW ORGANIZATION REGISTRATION: the SERVICE PROVIDER submits ORG NAME, CITY, BRANCH and SERVICE to the REGISTRATION SYSTEM; the ORGANIZATION DB is updated and the ORGANIZATION ID is returned.
28 [Data-flow diagram] ORGANIZATION'S SERVICE UPDATION, between SERVICE PROVIDER and REGISTRATION SYSTEM: ORG ID AND PASSWORD; RECORD RETRIEVAL AND VALIDATION; VALIDATION RESULT; SERVICES LIST; SELECTED SERVICE; SERVICE UPDATION; SERVICE INFORMATION; UPDATED FORM.
29 BUSINESS PROCESSES
• Service provider registers with the system
• Service provider login validation
• Services updating
FORMS (3 XForms)
• Login XForm
• Organization details XForm
• Service updated XForm
36
• P2P: mainly for file sharing; geographically dispersed peers; autonomous nodes; decentralised
• Clusters: resource sharing; nodes close to each other, usually homogeneous; centralised control, cooperative working
• Shared Memory Computing: parallel systems, multicore; divide and conquer; synchronization; tightly coupled
• High Throughput Computing: distributed computing, loosely coupled; disparate, autonomous, heterogeneous systems; computation intensive – sharing, single administration
• High Performance Computing: tightly coupled, fine-grain parallelism; homogeneous systems; high computing power for a short period; low latency communication
• GRID: heterogeneous systems, HTC; VO – trust groups, dynamic, cross-organisational; geographically dispersed resource sharing; scientific, distribution of work among all resources
• CLOUD: heterogeneous systems, HPC; on-demand resource provisioning over the Internet; data centric with grid backbone, utility value; elastic, business, full utilization of resources
• Web Services: application integration; separation of concerns; data integration, interoperability
• Virtualisation: system integration; viewing a single system as multiple resources
• Multi-tenancy: sharing a resource among multiple clients
37 Some Characteristics of Grids
• Numerous resources
• Owned by multiple organizations and individuals
• Connected by heterogeneous, multi-level networks
• Different security requirements and policies
• Different resource management policies
• Geographically separated
• Unreliable resources and environments
• Resources are heterogeneous
38 Stages to using the Grid – Classical View
• Write (code) to solve problem
• "Compile" against middleware
• Stage data
• Submit to Grid
• Middleware: advertise, select resources, deploy to resources
• Steering and visualisation
• Security
• Accounting
39 Technical capabilities
• Resource modeling
• Monitoring and notification
• Allocation
• Provisioning, life-cycle management, and decommissioning
• Accounting and auditing
• Security
40 G2
• Fabric layer: provides the resources for shared access.
• Connectivity layer: core communication and authentication protocols.
• Resource layer: protocols for secure negotiation, initiation, monitoring, control and accounting on individual resources.
• Collective layer: protocols and services to capture interactions among a collection of resources.
• Application layer: user applications that operate within the VO environment.
41 G3 - Services - OGSA
• Service-based infrastructure for the grid.
• The Grid aims to integrate, virtualize, and manage resources and services within distributed, heterogeneous, dynamic "virtual organizations".
• Standardization is critical to create interoperable, portable, secure, robust, scalable and reusable components and systems.
• The goal is to standardize grid services by specifying a set of standard interfaces.
• It aims to develop a common, standard and open architecture for grid-based applications.
• A service-oriented architecture based on the Open Grid Services Architecture (OGSA) addresses this need for standardization by defining a set of core capabilities and behaviors that address key concerns in Grid systems.
• OGSA is based on the Grid service (an extension of the Web service).
42
• OGSA realizes the logical middle layer in terms of services, the interfaces these services expose, the individual and collective state of resources belonging to these services, and the interaction between these services within a service-oriented architecture (SOA).
• The architecture is not layered.
• Services are loosely coupled peers that work either singly or as part of an interacting group of services.
45 OGSI
• Requirements not met in Web services were implemented as Grid services conforming to the OGSI specification.
• The OGSI specification defines:
– how grid service instances are named and referenced
– which interfaces and behaviors are common to all Grid services
– how to specify additional interfaces, behaviors and extensions
• GWSDL (Grid WSDL) introduces:
– Service Data Elements (SDEs)
– portType inheritance
• Grid Service Handle (GSH)
• Grid Service Reference (GSR)
• Factory
• Handle resolver
• Notification
• Service groups (light-weight registries)
48 Grid vs Web services
• Web Services
– Message exchange
– Documents
– No notion of "pointer"
– Service orientation?
• Grid Services
– The architecture encourages everything to be exposed through an interface rather than being sent as a document
– GSH is the "pointer"
– Object orientation? (CORBA?)
• 2-level naming scheme – GSH and GSR
• Discovery: static in Web services vs dynamic with SDEs
• Instantiation and life cycle management – factory
51 G4 - Grid WSRF
• OGSA services defined and implemented as Web Services
52 3. Semantic Web
• Information management approaches: keywords, statistical, natural language, Semantic Web
• Semantic Web architecture:
– automated conversion and storage of unstructured text in a machine-processable format
– automatically extracts and processes the concepts and context in the database, using intelligent techniques
– uses metadata to capture the meaning of the information
53 To capture knowledge
• Metadata
• Ontology – formal specification of information
– A network of concepts, relationships, and constraints that provide context for data and information as well as processes.
– Classes (concepts) and relationships (hierarchy) in the domain. It provides a shared understanding of the domain.
– Ontology languages: XML, RDF, OWL
• Logic – formal languages for representing knowledge with semantics
– Reasoners to infer conclusions
• Agents
– Pieces of software that work autonomously and proactively, e.g. search personalisation
– Use metadata, ontologies and logic
55 Architecture
• Unicode
– International encoding standard
– Any language can be used on the web using one standardized form
• Uniform Resource Identifier (URI)
– Uniquely identifies resources (e.g. a document)
– Covers both URLs and URNs
• XML
– Language to write structured web documents with a user-defined vocabulary
– Used to send documents across the Web
• RDF
– Data model (representation) of web objects
– XML-based syntax
56
• RDFS
– Has modeling primitives to organise web objects into hierarchies (taxonomies): class, subclass, properties, domain and range restrictions
– Based on RDF
– Used to write ontologies
• Logic layer
– Application-specific declarative knowledge: RIF and SWRL
• Proof layer
– Deductive process
– SPARQL can be used for querying ontologies and knowledge bases; it is SQL-like
• Trust layer
– Users' trust in using Web services
57 RDF triples
• Subject-predicate-object in RDF
• Example: "Joe Smith has homepage"
– (subject) is intended to identify Joe Smith
– (predicate)
– (object) is Joe's homepage
58 "Joe has family name Smith": RDF graph describing Joe Smith
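The subject-predicate-object model above can be sketched with plain Python tuples. The URIs below are placeholders under `example.org`, since the transcript does not preserve the ones used on the slides; this is a data-structure illustration, not an RDF library.

```python
# Hypothetical URIs: the slide's own identifiers are not in the transcript.
JOE = "http://example.org/people#joe"
HAS_HOMEPAGE = "http://example.org/vocab#homepage"
HAS_FAMILY_NAME = "http://example.org/vocab#familyName"

# An RDF graph is a set of (subject, predicate, object) triples.
GRAPH = {
    (JOE, HAS_HOMEPAGE, "http://example.org/~joe/"),
    (JOE, HAS_FAMILY_NAME, "Smith"),
}


def match(graph, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        triple
        for triple in graph
        if (subject is None or triple[0] == subject)
        and (predicate is None or triple[1] == predicate)
        and (obj is None or triple[2] == obj)
    )
```

Calling `match(GRAPH, subject=JOE)` returns every statement about Joe, which is the same pattern-matching idea that SPARQL generalises.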
59 RDFS for the company (resource) identified by URI
• Name is Webify Solutions, address is and phone number is WEBIFY.
60 OWL
• Classes: named classes, intersection classes, union classes, complement classes, restrictions, and enumerated classes
• Properties
– Object type
– Data type
• Property types
– Functional
– Inverse functional
– Symmetric
– Transitive
• Individuals: instances of classes; properties relate them
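To show what the symmetric and transitive property types buy a reasoner, here is a small sketch that derives new facts from stated ones. It is a naive fixed-point loop over tuples, not an OWL reasoner, and the property names `partOf` and `marriedTo` are made up for the example.

```python
def infer(facts, transitive=(), symmetric=()):
    """Close a set of (subject, property, object) facts under the given property types."""
    known = set(facts)
    while True:
        new = set()
        for s, p, o in known:
            if p in symmetric:
                # Symmetric: p(s, o) implies p(o, s).
                new.add((o, p, s))
            if p in transitive:
                # Transitive: p(s, o) and p(o, o2) imply p(s, o2).
                for s2, p2, o2 in known:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= known:  # nothing new was derived, so we reached a fixed point
            return known
        known |= new
```

Because the set of possible triples over a finite vocabulary is finite, the loop always terminates.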
62 Need for ontology in IT
• A bank
– offers a number of services which can use the same data, but with redundancy
– can add new services, which should reuse existing data / functionality
• An ontology-driven approach
– can capture and represent the bank's total product knowledge in a language-neutral form
– deploys the knowledge in a central (shared) repository
– gives a single, unified view of data across its applications
– allows precise retrieval of information and seamless enterprise integration
– lets business processes and various data sources map to each other through a common meta-model
• A shared ontology
– eliminates point-to-point integration
– simplifies application integration
– reduces data redundancy
– provides the same semantic meaning across applications
– eases the bank's maintenance and upgrades
63 Need for semantic web
• The WWW has a vast amount of heterogeneous information.
• Searching is based on contents.
• Semantic meaning attached to content items describes the information precisely.
• Relevancy of information extraction can be improved.
• Provided services can be tagged with meaning.
• Web-based software agents can dynamically find these services on the fly and use them to your benefit or in collaboration with other services.
64 Need for semantics in SOA
• In SOA, representations of the available services must be maintained:
– metadata to discover and organize services
– metadata to model and assemble services
– metadata to encapsulate business logic for dynamic binding
– metadata is itself managed with metadata
• Ontologies provide a very powerful and flexible way to aggregate, visualize, and normalize the service metadata layer.
• Ontologies enhance service discovery, modeling, assembly, mediation, and semantic interoperability.
• Semantic technologies provide an abstraction layer above existing IT technologies, one that enables the bridging and interconnection of data, content, and processes across business and IT silos.
65 Semantics for Business
• A business ontology is a formal specification of business concepts and their interrelationships that facilitates machine reasoning and inference.
• A business ontology ties systems together using metadata, much as a database ties together discrete pieces of data.
• With it, organizations can provide a single, unified view of data across their applications. It:
– allows for precise retrieval of information
– simplifies enterprise and SOA integration
– reduces data redundancy
– provides uniform semantic meaning across applications
– eases development, maintenance, and upgrades across the enterprise
66 Grid semantics
• The Grid's vision of sharing diverse resources in a flexible, coordinated and secure manner, through dynamic formation and disbanding of virtual communities, strongly depends on metadata. Ad hoc expression and use of metadata causes chronic dependency on human intervention.
• The Semantic Grid is an extension of the Grid in which rich resource metadata is exposed and handled explicitly, and shared and managed via Grid protocols.
• It exposes semantically rich information associated with grid resources to build more intelligent grid services.
• The layering of an explicit semantic infrastructure over the Grid infrastructure leads to increased interoperability and greater flexibility.
• S-OGSA is a reference architecture that extends OGSA (standardisation) to support the explicit handling of semantics, and defines the associated knowledge services to support a spectrum of service capabilities.
• S-OGSA defines a model (abstraction), the capabilities (what) and the mechanisms (how) for the Semantic Grid.
67
• Metadata is used to label grid resources and entities with concepts (e.g. a data file according to the application domain).
• Rules and classification-based reasoning can be used to generate new metadata from existing metadata (e.g. VO membership).
• S-OGSA has:
– Model (elements and relationships)
– Capabilities (services for the components)
– Mechanisms (elements to deliver the service)
68 S-OGSA entities and relationships
• Grid entities (have an identity in the grid)
• Knowledge entities (K-entities) – Grid entities that operate on knowledge
• Semantic bindings – associations between grid entities and knowledge entities
• Semantic grid entities – Grid entities that are the subject of a semantic binding, are themselves a semantic binding, or are a knowledge entity
71
• Fabric layer – resources are virtualised through Web services.
• Grid middleware – OGSA services interact with one another. The middleware deploys web services with port types through which resources are accessed.
• OGSA is extended with light-weight semantics and knowledge services to support a spectrum of service capabilities.
• Top – application layer.
• Semantics of the middleware and fabric layers are considered.
72 Services
• Semantic provisioning services
– Knowledge provisioning services
– Semantic binding provisioning services
• Semantic aware grid services
– Consume semantic bindings and take actions based on knowledge and metadata
73 Semantic aware authorisation service
• Subject – John Doe; object – resource
• Semantic bindings based on match
• Ontology service provides knowledge to understand semantic bindings
74 Hadoop
What is Hadoop?
• It is a framework for running applications that produce and process huge amounts of data on large clusters of commodity hardware
• Apache Software Foundation project
• Open source
• Amazon's EC2, Google
• Alpha (0.21) release available for download
Hadoop includes
• HDFS – a distributed filesystem
• Map/Reduce – Hadoop implements this programming model. It is an offline computing engine
Concept
• Moving computation is more efficient than moving large data
75 Data-intensive applications with petabytes of data
• Web pages: billion web pages x 20KB = 400+ terabytes
• One computer can read MB/sec from disk: ~four months to read the web
• Same problem with 1000 machines: about 3 hours (four months of reading divided across 1000 machines)
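The arithmetic behind this slide can be checked with a short sketch. It assumes decimal units (1 TB = 10^12 bytes), takes "four months" as 120 days, and assumes perfect parallel speed-up with no coordination cost; those choices are assumptions of this example, not figures from the slide.

```python
TB = 10**12  # decimal terabyte, in bytes (assumption)
SECONDS_PER_DAY = 86_400


def implied_read_rate_mb_per_s(total_bytes, days):
    """Sustained single-disk read rate needed to scan total_bytes in `days`."""
    return total_bytes / (days * SECONDS_PER_DAY) / 10**6


def parallel_read_hours(single_machine_days, machines):
    """Scan time in hours if the data is spread evenly over `machines` readers."""
    return single_machine_days * 24 / machines


if __name__ == "__main__":
    rate = implied_read_rate_mb_per_s(400 * TB, days=120)
    print(f"implied single-machine rate: {rate:.1f} MB/s")
    print(f"1000 machines: {parallel_read_hours(120, 1000):.2f} hours")
```

Under these assumptions, reading 400 TB in about four months implies a disk rate in the high 30s of MB/s, and 1000 machines working in parallel bring the scan down to roughly 2.9 hours.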
76 FACTS
• Single-thread performance doesn't matter
– We have large problems, and total throughput/price is more important than peak performance
• Stuff breaks – more reliability is needed
– If you have one server, it may stay up three years (1,000 days)
– If you have 10,000 servers, expect to lose ten a day
• "Ultra-reliable" hardware doesn't really help
– At large scales, super-fancy reliable hardware still fails, albeit less often
– Software still needs to be fault-tolerant
– Commodity machines without fancy hardware give a better price-performance ratio
• DECISION: COMMODITY HARDWARE
• DFS: HADOOP – REASONS?
• WHAT SOFTWARE MODEL?
78 HDFS – Why? Seek vs Transfer
• B-Tree (relational DBs) – operates at seek rate, log(N) seeks/access; memory / stream based
• Sort/merge of flat files (MapReduce) – operates at transfer rate, log(N) transfers/sort; batch based
79 Characteristics
• Fault tolerant, scalable, efficient, reliable distributed storage system
• Moving computation to the place of data
• Single cluster with computation and data
• Process huge amounts of data
• Scalable: store and process petabytes of data
• Economical
80 • Data Model
– Data is organized into files and directories
– Files are divided into uniform-sized blocks and distributed across cluster nodes
– Blocks are replicated to handle hardware failure
– Checksums of data for corruption detection and recovery
– Block placement is exposed so that computation can be migrated to data
– Large streaming reads and small random reads
81 • Files are broken into large blocks
– Typically 128 MB block size
– Blocks are replicated for reliability
– One replica on the local node, another replica on a remote rack, a third replica on the local rack; additional replicas are randomly placed
• Understands rack locality
– Data placement is exposed so that computation can be migrated to data
• Client talks to both NameNode and DataNodes
– Data is not sent through the NameNode; clients access data directly from DataNodes
– Throughput of the file system scales nearly linearly with the number of nodes
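A rough sketch of the block splitting and replica placement described on this slide follows. It mirrors the slide's rule (local node, remote rack, local rack) with a deterministic choice of nodes; it is not Hadoop's actual placement code, whose policy has varied between releases, and the rack/node names in the usage note are hypothetical.

```python
MB = 1024 * 1024
BLOCK_SIZE = 128 * MB  # block size quoted on the slide


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return (offset, length) for each block; only the last block may be short."""
    return [
        (offset, min(block_size, file_size - offset))
        for offset in range(0, file_size, block_size)
    ]


def place_replicas(writer, racks):
    """Pick three nodes: the writer, a node on another rack, a second node on the writer's rack.

    `racks` maps a rack name to its list of node names (illustrative structure).
    """
    local_rack = next(rack for rack, nodes in racks.items() if writer in nodes)
    remote_node = next(
        node for rack, nodes in racks.items() if rack != local_rack for node in nodes
    )
    local_peer = next(node for node in racks[local_rack] if node != writer)
    return [writer, remote_node, local_peer]
```

For example, a 300 MB file yields blocks of 128 MB, 128 MB and 44 MB, and with racks `{"r1": ["n1", "n2"], "r2": ["n3", "n4"]}` a write from `n1` places replicas on `n1`, `n3` and `n2`.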
84 Components
DFS Master "NameNode"
• Manages the file system namespace
• Controls read/write access to files
• Manages block replication
• Checkpoints the namespace and journals namespace changes for reliability
Metadata of the NameNode in memory
– The entire metadata is in main memory
– No demand paging of FS metadata
Types of metadata
– List of files, file and chunk namespaces; list of blocks, location of replicas; file attributes etc.
85 DFS SLAVES or DATA NODES
• Serve read/write requests from clients
• Perform replication tasks upon instruction by the NameNode
DataNodes act as:
1) A block server
– Stores data in the local file system
– Stores metadata of a block (e.g. CRC)
– Serves data and metadata to clients
2) Block report: periodically sends a report of all existing blocks to the NameNode
3) Periodically sends a heartbeat to the NameNode (to detect node failures)
4) Facilitates pipelining of data (to other specified DataNodes)
86 Map/Reduce Master "JobTracker"
• Accepts MR jobs submitted by users
• Assigns Map and Reduce tasks to TaskTrackers
• Monitors task and TaskTracker status, re-executes tasks upon failure
Map/Reduce Slaves "TaskTrackers"
• Run Map and Reduce tasks upon instruction from the JobTracker
• Manage storage and transmission of intermediate output
87 SECONDARY NAME NODE
• Copies FsImage and Transaction Log from the NameNode to a temporary directory
• Merges FsImage and Transaction Log into a new FsImage in the temporary directory
• Uploads the new FsImage to the NameNode
– Transaction Log on the NameNode is purged
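The checkpoint cycle above amounts to replaying a log of edits onto a snapshot. The sketch below models the FsImage as a dict from path to block list and the transaction log as a list of tuples; these in-memory structures and the three edit operations are simplifications assumed for illustration, not the on-disk formats Hadoop uses.

```python
def apply_edit(image, edit):
    """Apply one logged namespace change to the image, in place."""
    operation, path, *args = edit
    if operation == "create":
        image[path] = []
    elif operation == "add_block":
        image[path].append(args[0])
    elif operation == "delete":
        image.pop(path, None)
    else:
        raise ValueError(f"unknown edit operation: {operation}")


def checkpoint(fsimage, transaction_log):
    """Merge the log into a copy of the image; return (new image, purged log)."""
    # Work on a copy, as the secondary NameNode works in a temporary directory.
    new_image = {path: list(blocks) for path, blocks in fsimage.items()}
    for edit in transaction_log:
        apply_edit(new_image, edit)
    # After the new image is uploaded, the NameNode's log can start empty.
    return new_image, []
```

Keeping the merge off the NameNode's critical path is the design point: the NameNode keeps appending to its log while the (potentially slow) merge happens elsewhere.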
88 HDFS Architecture
• NameNode: maps (filename, offset) > block id, and block > DataNode
• DataNode: maps block > local disk
• Secondary NameNode: periodically merges edit logs
• A block is also called a chunk
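The two NameNode mappings listed above can be sketched as a pair of dictionaries. The class and method names here are invented for the example and are not the Hadoop API; the point is only that a read resolves (filename, offset) to a block id, then the block id to the DataNodes holding it.

```python
class NameNodeMetadata:
    """Toy in-memory metadata: filename -> block ids, block id -> DataNodes."""

    def __init__(self, block_size):
        self.block_size = block_size
        self.file_blocks = {}      # filename -> ordered list of block ids
        self.block_locations = {}  # block id -> list of DataNode names

    def add_block(self, filename, block_id, datanodes):
        """Record a new block at the end of a file and where its replicas live."""
        self.file_blocks.setdefault(filename, []).append(block_id)
        self.block_locations[block_id] = list(datanodes)

    def locate(self, filename, offset):
        """Resolve a byte offset to (block id, DataNodes holding that block)."""
        index = offset // self.block_size
        block_id = self.file_blocks[filename][index]
        return block_id, self.block_locations[block_id]
```

The client then contacts one of the returned DataNodes directly, so file data never flows through the NameNode.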
90 Software Model - ?
• Parallel programming improves performance and efficiency.
• In a parallel program, the processing is broken up into parts, each of which can be executed concurrently.
• Identify whether the problem can be parallelised (e.g. Fibonacci, where each term depends on the previous ones, cannot).
• Matrix operations on independent elements can.
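91 CALCULATING PI
• A circle of radius r is inscribed in a square of side 2r.
• The area of the square, denoted As, is (2r)^2 = 4r^2.
• The area of the circle, denoted Ac, is pi * r^2.
• Hence pi = 4 * Ac / As, estimated as 4 * (number of random points inside the circle) / (number of points in the square).
• MAP: count the number of generated points that are both in the circle and in the square.
• REDUCE: PI = 4 * ratio of the two counts.
• MapReduce is a restricted parallel programming model meant for large clusters.
• The user implements Map() and Reduce().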
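The map and reduce steps for estimating pi can be sketched in plain Python. Here each "map task" is just a function call with its own seed, run sequentially; on a real cluster the tasks would run in parallel, and the sample counts below are arbitrary choices for the example.

```python
import random


def map_count_inside(seed, samples):
    """Map task: throw `samples` random points at the square, count hits in the circle."""
    rng = random.Random(seed)  # independent, reproducible stream per task
    inside = 0
    for _ in range(samples):
        x, y = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:  # unit circle inscribed in the 2 x 2 square
            inside += 1
    return inside, samples


def reduce_pi(partials):
    """Reduce task: combine (inside, total) pairs into pi = 4 * inside / total."""
    inside = sum(count for count, _ in partials)
    total = sum(samples for _, samples in partials)
    return 4.0 * inside / total


def estimate_pi(num_maps=8, samples_per_map=50_000):
    """Run all map tasks, then a single reduce."""
    partials = [map_count_inside(seed, samples_per_map) for seed in range(num_maps)]
    return reduce_pi(partials)
```

The map tasks share no state, which is what makes the problem easy to spread across machines; only the small (inside, total) pairs travel to the reducer.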
93 File
• Hello World Bye World
• Hello Hadoop Goodbye Hadoop
Map
• For the given sample input the first map emits: <Hello, 1> <World, 1> <Bye, 1> <World, 1>
• The second map emits: <Hello, 1> <Hadoop, 1> <Goodbye, 1> <Hadoop, 1>
94
• The output of the first combine: <Bye, 1> <Hello, 1> <World, 2>
• The output of the second combine: <Goodbye, 1> <Hadoop, 2> <Hello, 1>
• Thus the output of the job (reduce) is: <Bye, 1> <Goodbye, 1> <Hadoop, 2> <Hello, 2> <World, 2>
95 Map() and Reduce()
• Map(): input is <filename, file text>; parses the file and emits <word, count> pairs, e.g. <"hello", 1>
• Reduce(): sums all values for the same key and emits <word, TotalCount>, e.g. <"hello", ( )> => <"hello", 17>
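The word-count walk-through on these slides can be reproduced with a few lines of plain Python. This is a single-process sketch of the map, combine and reduce stages, not Hadoop code; the function names are chosen for the example, and the two input lines are the sample file with "Goodbye" spelled as in the slide's output.

```python
from collections import Counter

LINES = ["Hello World Bye World", "Hello Hadoop Goodbye Hadoop"]


def map_words(line):
    """Map: emit a <word, 1> pair for every word in one input split."""
    return [(word, 1) for word in line.split()]


def combine(pairs):
    """Combine: pre-sum the pairs produced by a single map task."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return sorted(totals.items())


def reduce_counts(combined_outputs):
    """Reduce: sum the counts for each word across all map tasks."""
    totals = Counter()
    for pairs in combined_outputs:
        for word, count in pairs:
            totals[word] += count
    return dict(sorted(totals.items()))


def word_count(lines):
    return reduce_counts([combine(map_words(line)) for line in lines])
```

The combiner is optional for correctness; it only reduces how many pairs have to be shipped from each map task to the reducer.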
97 MR model
• Map(): process a key/value pair to generate intermediate key/value pairs
• Reduce(): merge all intermediate values associated with the same key
• Users implement an interface of two primary methods:
1. Map: (key1, val1) → (key2, val2)
2. Reduce: (key2, [val2]) → [val3]
• Map corresponds to the group-by clause (for the key) of an aggregate query in SQL
• Reduce corresponds to the aggregate function (e.g. average) that is computed over all the rows with the same group-by attribute (key)
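The two signatures and the SQL analogy can be made concrete with a tiny generic driver. This is an in-memory sketch of the programming model only (no distribution or fault tolerance); the salary records and department names in the usage note are invented for the example.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run mapper over (key1, val1) records, group by key2, then reduce each group."""
    groups = defaultdict(list)
    for key1, val1 in records:
        for key2, val2 in mapper(key1, val1):  # Map: (key1, val1) -> [(key2, val2)]
            groups[key2].append(val2)          # shuffle: collect values per key2
    # Reduce: (key2, [val2]) -> val3
    return {key2: reducer(key2, values) for key2, values in sorted(groups.items())}


def department_mapper(employee_id, row):
    """Plays the role of GROUP BY department: emit (department, salary)."""
    department, salary = row
    return [(department, salary)]


def average_reducer(department, salaries):
    """Plays the role of AVG(salary) over the rows sharing one department."""
    return sum(salaries) / len(salaries)
```

With records such as `[(1, ("sales", 100)), (2, ("sales", 300)), (3, ("ops", 50))]`, the call `map_reduce(records, department_mapper, average_reducer)` behaves like `SELECT department, AVG(salary) ... GROUP BY department`.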
98 Cloud need
• "Era of tera": ever-growing datasets
• Changing demands/loads: unpredictable traffic patterns and the demand for faster response times
• Elasticity: use and relinquish resources as per demand
• Software applications should be Internet accessible
• Large-scale applications: the cloud provides a large number of machines when needed, distributes work among them, provisions new machines on failure, auto-scales, and relinquishes machines when not needed
99 Advantages
• Almost zero upfront infrastructure investment
• Just-in-time infrastructure
• More efficient resource utilization
• Usage-based costing
• Potential for shrinking the processing time
• Less time for development
• Basis: automated elasticity, i.e. the on-demand and elastic nature of the cloud
• Example: e-ticketing application
101 AWS
• The Amazon Web Services (AWS) cloud provides a highly reliable and scalable infrastructure for deploying web-scale solutions, with minimal support and administration costs, and good flexibility.
102
• Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable compute capacity in the cloud.
– Operating system, application software and associated configuration settings can be bundled in an Amazon Machine Image (AMI).
– Scaling up / down is done by provisioning / decommissioning multiple instances using simple web service calls.
– On-Demand Instances / Reserved Instances / Spot Instances
• Amazon S3 is used to retrieve / store input / output datasets.
– Stores / retrieves large amounts of data as objects in buckets (containers) on the web using standard HTTP
– Copies can be made in 14 locations using CloudFront
• Amazon Simple Queue Service (Amazon SQS) is a reliable, highly scalable, distributed queue for storing messages as they travel between computers and application components.
103
• Amazon SimpleDB is a web service for real-time lookup and simple querying of structured data.
• Amazon Relational Database Service (Amazon RDS) provides an easy way to set up, operate and scale a relational database in the cloud.
• Amazon Elastic MapReduce provides a hosted Hadoop framework running on the web-scale infrastructure of Amazon Elastic Compute Cloud: an on-demand Hadoop cluster with distributed processing, automatic parallelization, and job scheduling.
• Amazon Virtual Private Cloud (Amazon VPC) extends the corporate network into a private cloud contained within AWS.
104
• Availability Zones are distinct locations engineered to be insulated from failures in other Availability Zones and to provide inexpensive, low-latency network connectivity to other Availability Zones in the same Region.
• An Elastic IP address is a static IP address that can be allocated and programmatically assigned to an instance.
• CloudWatch can monitor an Amazon EC2 instance for resource utilization, operational performance, and overall demand patterns.
• The Auto Scaling feature is used to create an Auto Scaling group.
• Incoming traffic can be distributed using the Elastic Load Balancing service.
• Amazon Elastic Block Store (EBS) volumes provide network-attached persistent storage to Amazon EC2 instances.
• AWS offers payment and billing services.
• Amazon CloudFront provides a high-performance, globally distributed content delivery system.
106 Cloud Services best practices
• Design for failure and nothing will fail: design, implement and deploy for automated recovery from failure.
• In AWS:
– Fail over gracefully using Elastic IPs
– Utilize multiple Availability Zones
– Maintain an Amazon Machine Image
– Utilize Amazon CloudWatch
• Decouple the components: following the SOA design principle, the more loosely coupled the components, the better the system scales.
• Message queues: if one component fails, the system will buffer the messages and get them processed when the component comes back up.
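The buffering behaviour of a message queue between two components can be sketched in-process. This is a stand-in for the idea only: `MessageQueue` and `Worker` are hypothetical classes for this example and do not use or imitate the Amazon SQS API.

```python
from collections import deque


class MessageQueue:
    """In-memory FIFO buffer standing in for a durable queue service."""

    def __init__(self):
        self._messages = deque()

    def send(self, body):
        self._messages.append(body)

    def receive(self):
        """Return the oldest message, or None if the queue is empty."""
        return self._messages.popleft() if self._messages else None

    def __len__(self):
        return len(self._messages)


class Worker:
    """Consumer that may be down; messages wait in the queue until it is back."""

    def __init__(self):
        self.up = True
        self.processed = []

    def poll(self, queue):
        """Drain the queue if the worker is up; return how many messages were handled."""
        if not self.up:
            return 0
        handled = 0
        while len(queue) > 0:
            self.processed.append(queue.receive())
            handled += 1
        return handled
```

The producer only ever calls `send`, so it neither knows nor cares whether the worker is currently running; that independence is what "decoupling" means here.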
107
• SQS for decoupling and buffering
• Service interfaces for components
• AMI created
• Stateless applications
108
• Implement elasticity
• Think parallel
• The beauty of the cloud shines when you combine elasticity and parallelization
• Keep dynamic data closer to the compute and static data closer to the end-user
109 PSG-Yahoo Grid and Cloud Computing Lab (2008 till date)
• 54 rack servers – SC145 & PowerEdge 2950
• 40 end connectors
• 10 client nodes
• RHEL
• Hadoop
• Globus
• OpenVZ
• Xen
111
• An Efficient Approach to Task Scheduling in Computational Grids
• Data Discovery in Grid using Content Based Searching Technique
• P2P Information Retrieval Framework for Digital Library System using Hadoop DFS
• Integration of Xen and Hadoop framework
• DNA sequencing using Hadoop data grids
• DNA sequencing in public clouds
• Virtualisation – using Xen and OpenVZ – a comparison of performance
• Grid Security – a tree based dynamic approach
• Study of some existing scheduling algorithms
• Grid Task Scheduling using PPSO
• Content based Image Retrieval
• Modification of fairshare scheduling in Hadoop
• Two level scheduler for clouds
• Hybrid Search using content based and semantic approaches