Published by Gervase Evans. Modified over 9 years ago.
1
Data Management in the
Celal ÇIĞIR
24.05.2012
2
Understanding Cloud Computing
3
Origin of the term “Cloud Computing”
“Comes from the early days of the Internet where we drew the network as a cloud… we didn’t care where the messages went… the cloud hid it from us” – Kevin Marks, Google
First cloud: around networking (TCP/IP abstraction)
Second cloud: around documents (WWW data abstraction)
The emerging cloud abstracts the infrastructure complexities of servers, applications, data, and heterogeneous platforms
– (“muck”, as Amazon’s CEO Jeff Bezos calls it)
4
Evaluation of Computing
5
A Working Definition of Cloud Computing
Cloud computing is
– a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.
This cloud model promotes availability and is composed of five essential characteristics, three service models, and four deployment models.
6
5 Essential Cloud Characteristics
1. On-demand self-service
– computing capabilities are provisioned automatically
2. Broad network access
– available over the network and accessed by a variety of clients
3. Resource pooling
– location independence
7
5 Essential Cloud Characteristics
4. Rapid elasticity
– capabilities can be rapidly and elastically provisioned
– capabilities appear unlimited to the consumer and can be purchased in any quantity at any time
5. Measured service
– resource usage is controlled and optimized automatically
– resource usage can be monitored
8
3 Cloud Service Models
Cloud Software as a Service (SaaS)
– use the provider’s applications over a network (e.g. Salesforce)
Cloud Platform as a Service (PaaS)
– deploy customer-created applications to a cloud (e.g. Microsoft Azure, Google App Engine)
Cloud Infrastructure as a Service (IaaS)
– rent processing, storage, network capacity, and other fundamental computing resources (e.g. Amazon EC2)
To be considered “cloud”, these services must be deployed on top of cloud infrastructure that has the key characteristics.
9
Service Model Architectures
10
4 Cloud Deployment Models
Private cloud
– a cloud that is used exclusively by one organization
Community cloud
– shared infrastructure for a specific community
Public cloud
– sold to the public, mega-scale infrastructure
Hybrid cloud
– composition of two or more clouds (private, community or public)
11
The NIST Cloud Definition Framework
Deployment Models: Private Cloud, Community Cloud, Public Cloud, Hybrid Clouds
Service Models: Software as a Service (SaaS), Platform as a Service (PaaS), Infrastructure as a Service (IaaS)
Essential Characteristics: On-Demand Self-Service, Broad Network Access, Resource Pooling, Rapid Elasticity, Measured Service
Common Characteristics: Massive Scale, Resilient Computing, Homogeneity, Geographic Distribution, Virtualization, Service Orientation, Low Cost Software, Advanced Security
Based upon original chart created by Alex Dowbor - http://ornot.wordpress.com
12
Compared Technologies
Virtualization
– a technology that abstracts away the details of physical hardware and provides virtualized resources for high-level applications
– examples of virtual machine software: VMware, Windows Virtual PC, Oracle VM VirtualBox
– forms the foundation of cloud computing, since it provides the capability of pooling computing resources
13
Compared Technologies
Grid Computing
– a distributed computing paradigm that coordinates networked resources to achieve a common computational objective
– cloud computing is similar to grid computing in that it also employs distributed resources to achieve application-level objectives
– cloud computing goes one step further by leveraging virtualization technologies at multiple levels (hardware and application platform)
14
Compared Technologies
Utility Computing
– a model of provisioning computing resources as metered services
– allows resources to be provided on demand and customers to be charged based on usage
– cloud computing can be perceived as a realization of utility computing
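The metering idea above can be illustrated with a small billing calculation. This is only a sketch: the meter names and unit prices are hypothetical and are not taken from any real provider's price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    resource: str    # hypothetical meter name, e.g. "vm-hours"
    quantity: float  # amount consumed in the billing period


# Hypothetical unit prices, chosen only for illustration.
UNIT_PRICE = {"vm-hours": 0.10, "gb-stored": 0.02}


def bill(records):
    """Charge strictly by metered usage: sum of quantity * unit price."""
    return sum(r.quantity * UNIT_PRICE[r.resource] for r in records)


usage = [UsageRecord("vm-hours", 10), UsageRecord("gb-stored", 50)]
total = bill(usage)  # 10 * 0.10 + 50 * 0.02
```

A customer who uses nothing in a period pays nothing, which is the main contrast with buying hardware up front.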
15
Compared Technologies
Autonomic Computing
– aims at building computing systems capable of self-management without human intervention
– its goal is to overcome the management complexity of today’s computer systems, especially network management
– cloud computing shares certain autonomic features, such as automatic resource provisioning, but uses them to lower resource cost rather than to reduce complexity
16
The Architecture of Cloud
17
Cloud Infrastructure
18
Cloud Computing
From the data management point of view, cloud computing provides full availability:
– users can read and write data at any time without ever being blocked
– response times are (virtually) constant and do not depend on the number of concurrent users, the size of the database, or any other system parameter
19
Goals of Cloud Computing
Availability
– always accessible, even when there is a network failure or a whole data center crashes
Scalability
– support very large databases with very high request rates at very low latency
Elasticity
– satisfy changing application requirements in both directions (scaling up or down)
Performance
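Elasticity in both directions can be sketched as a sizing rule that is re-evaluated as the load changes. The function name, the capacity figure and the minimum of one server are assumptions made for this illustration; real systems add smoothing and cool-down periods that are not shown here.

```python
import math


def servers_needed(request_rate, per_server_capacity, min_servers=1):
    """Return how many servers are needed to serve request_rate.

    Hypothetical sizing rule: enough servers so that no server exceeds
    per_server_capacity requests per second, and never fewer than min_servers.
    """
    return max(min_servers, math.ceil(request_rate / per_server_capacity))


peak = servers_needed(2500, 1000)   # scale up under heavy load
quiet = servers_needed(100, 1000)   # scale back down when load drops
```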
20
Goals of Cloud Computing
Multitenancy
– support many applications (tenants) on the same hardware and software infrastructure
Load balancing
– automatically move load between servers in order to utilize hardware resources effectively
Fault tolerance
– recover from a failure without losing any data
– successfully commit transactions
Ability to run in a heterogeneous environment
21
Challenges of Cloud Computing
Availability of service
– network links can potentially disappear
Data security and confidentiality
– even if the data is encrypted, it may be accessed by a third party without the customer’s knowledge
Data lock-in
– customers cannot easily extract their data
Data transfer bottlenecks
– data traffic at every level of the system directly affects the cost
22
Challenges of Cloud Computing
Application parallelization
– computing power is elastic, but only if the workload is parallelizable
Shared-nothing architecture
Performance unpredictability
– the systems are complex, so performance varies
Application debugging
– it is hard to find and remove errors in these very large-scale distributed systems
23
Cloud Data Management
Google: Bigtable
– Bigtable is a distributed storage system for managing very large structured data (petabytes) across thousands of commodity servers
– Bigtable maps two arbitrary string values (row key and column key) and a timestamp (hence a three-dimensional mapping) to an associated arbitrary byte array
– does not support a full relational data model
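The three-dimensional mapping described above can be sketched as an in-memory dictionary. This is a toy model under stated assumptions, not Bigtable's API: the class name TinyTable, its put/get methods and the sample row and column keys are all hypothetical.

```python
class TinyTable:
    """Toy model of a (row key, column key, timestamp) -> bytes mapping."""

    def __init__(self):
        # (row, column) -> {timestamp: value}
        self._cells = {}

    def put(self, row, column, timestamp, value):
        """Store one version of a cell."""
        self._cells.setdefault((row, column), {})[timestamp] = bytes(value)

    def get(self, row, column, timestamp=None):
        """Return the newest version at or before timestamp (newest overall
        if timestamp is None), or None if no such version exists."""
        versions = self._cells.get((row, column), {})
        eligible = [t for t in versions if timestamp is None or t <= timestamp]
        if not eligible:
            return None
        return versions[max(eligible)]


table = TinyTable()
table.put("com.example.www", "contents", 1, b"<html>v1</html>")
table.put("com.example.www", "contents", 2, b"<html>v2</html>")
latest = table.get("com.example.www", "contents")
older = table.get("com.example.www", "contents", timestamp=1)
```

Keeping every timestamped version under the same (row, column) pair is what makes the map three-dimensional rather than a plain key/value store.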
24
Cloud Data Management
Google: Bigtable
The contents of web pages are stored in a single column, which holds multiple versions of each page, indexed by timestamp.
Row ranges are dynamically partitioned; each row range is called a tablet.
The tablet is the unit of distribution and load balancing.
25
Cloud Data Management
Google: Bigtable
– Bigtable uses the Google File System (GFS) to store log and data files
– the SSTable file format is used to store Bigtable data; it provides a persistent, ordered, immutable map from keys to values
– relies on a distributed lock service, Chubby, which
  – ensures that there is at most one active master at any time
  – stores the bootstrap location of Bigtable data
  – stores the Bigtable schema information
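The "ordered, immutable map" behaviour of an SSTable can be sketched in memory as follows. This illustrates only the lookup and range-scan semantics: persistence, the on-disk block layout and the class name SortedStringTable are outside what the slides describe and are assumptions here.

```python
import bisect


class SortedStringTable:
    """In-memory stand-in for an ordered, immutable key -> value map."""

    def __init__(self, items):
        pairs = sorted(dict(items).items())
        # Tuples are used so the contents cannot be modified after creation.
        self._keys = tuple(k for k, _ in pairs)
        self._values = tuple(v for _, v in pairs)

    def get(self, key):
        """Binary-search for key; return its value or None."""
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return None

    def scan(self, start, end):
        """Return (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return list(zip(self._keys[lo:hi], self._values[lo:hi]))


sst = SortedStringTable({"b": b"2", "a": b"1", "c": b"3"})
value = sst.get("b")
window = sst.scan("a", "c")
```

Because the keys are sorted and never change, both point lookups and range scans reduce to binary searches.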
26
Cloud Data Management
Yahoo!: PNUTS / Sherpa
– Platform for Nimble Universal Table Storage
– massive-scale hosted database system designed to support Yahoo!’s web applications
– simple relational model in which data is organized into tables of records with attributes
– data tables are horizontally partitioned into groups of records called tablets
– the router component is responsible for determining which storage unit needs to be accessed for a given record
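The router's job can be sketched as a lookup over tablet boundaries. The sketch assumes tablets are defined by ordered key ranges; the class name, the split keys and the storage unit names are hypothetical and are not taken from PNUTS itself.

```python
import bisect


class Router:
    """Map a record key to the storage unit that holds its tablet."""

    def __init__(self, split_keys, storage_units):
        # split_keys are the sorted lower bounds of every tablet but the first,
        # so there is exactly one more storage unit entry than split keys.
        if len(storage_units) != len(split_keys) + 1:
            raise ValueError("need one storage unit per tablet")
        self._split_keys = list(split_keys)
        self._units = list(storage_units)

    def lookup(self, key):
        """Return the storage unit responsible for key."""
        return self._units[bisect.bisect_right(self._split_keys, key)]


router = Router(["g", "p"], ["su1", "su2", "su3"])
first = router.lookup("apple")   # key < "g"
middle = router.lookup("g")      # "g" <= key < "p"
last = router.lookup("zebra")    # key >= "p"
```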
27
Cloud Data Management
Amazon: Dynamo
– a highly available and scalable distributed key/value datastore
– the partitioning scheme relies on a variant of consistent hashing to distribute load across multiple storage hosts
– a network link failure does not affect all data sets, because the “ring” hashing scheme spreads them across hosts
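A minimal sketch of consistent hashing on a ring is shown below. It is not Dynamo's implementation: the class name, the number of virtual nodes per host and the node labels are assumptions chosen for illustration, and replication is left out.

```python
import bisect
import hashlib


def _position(key):
    """Place a string on the ring using MD5 (a 128-bit integer)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes, vnodes=8):
        # Each node owns several points ("virtual nodes") on the ring.
        self._ring = sorted(
            (_position(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key):
        """Walk clockwise from the key's position to the next node point."""
        i = bisect.bisect_right(self._points, _position(key)) % len(self._ring)
        return self._ring[i][1]


ring = HashRing(["A", "B", "C"])
smaller = HashRing(["A", "B"])  # the same ring after node C has left
keys = [f"key-{i}" for i in range(200)]
moved = [k for k in keys if ring.node_for(k) != smaller.node_for(k)]
```

Only keys that were owned by the departed node change owner; every other key stays where it was, which is why a single failure touches only part of the data.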
28
Cloud Data Management
Amazon: S3 / SimpleDB / RDS
– S3: Simple Storage Service
  – online public storage web service
  – infinite store for objects of variable size
  – an object is a byte container identified by a URI
  – get(uri): returns an object
  – put(uri, bytestream): writes a new version of an object
– Buckets: similar to folders
– Objects: similar to files (up to 5 GB)
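The get/put interface listed above can be mimicked with an in-memory store. This is a sketch of the abstract interface from the slide only, not the real S3 API or any AWS SDK call; the class name, the version numbering and the sample URI are hypothetical.

```python
class ObjectStore:
    """Toy object store: each URI names a byte container with versions."""

    def __init__(self):
        self._versions = {}  # uri -> list of byte strings, oldest first

    def put(self, uri, bytestream):
        """Write a new version of the object and return its version number."""
        history = self._versions.setdefault(uri, [])
        history.append(bytes(bytestream))
        return len(history)

    def get(self, uri):
        """Return the latest version of the object, or raise KeyError."""
        return self._versions[uri][-1]


store = ObjectStore()
v1 = store.put("bucket/report.txt", b"draft")
v2 = store.put("bucket/report.txt", b"final")
current = store.get("bucket/report.txt")
```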
29
Cloud Data Management
Amazon: S3 / SimpleDB / RDS
– SimpleDB: designed for running queries on structured data
  – the main focus is fast reads
  – data is organized into domains (tables), which consist of items (records) described by attribute name/value pairs
– Relational Database Service (RDS)
  – gives the full capabilities of a MySQL database
30
Cloud Data Management
Microsoft: DRYAD / SQL Azure
– Computational vertices
– Communication channels
– Graph based
31
Cloud Data Management
Microsoft: DRYAD / SQL Azure
– Dryad has its own high-level language, DryadLINQ
  – combines ideas from SQL and MapReduce
  – data is represented as strongly typed .NET objects
  – operations on objects are written in a traditional high-level programming language using LINQ (Language Integrated Query)
– SQL Azure Database is a cloud-based relational database service built on Microsoft SQL Server
32
Microsoft Azure Services
Source: Microsoft Presentation, A Lap Around Windows Azure, Manuvir Das
33
Cloud Data Management
Cassandra
– open sourced by Facebook in 2008
– highly scalable, eventually consistent, distributed, structured key-value store
– column-family-based data model, like Bigtable
– a column is the smallest increment of data: a tuple (triplet) that contains a name, a value, and a timestamp
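The column triplet can be sketched as a small record type, together with the timestamp comparison that lets replicas converge on the newest write. The type and function names are hypothetical, and the tie-breaking rule (keep the first argument on equal timestamps) is a simplification made for this sketch.

```python
from typing import NamedTuple


class Column(NamedTuple):
    name: str
    value: bytes
    timestamp: int


def reconcile(a, b):
    """Pick the surviving copy of a column held by two replicas.

    The copy with the newer timestamp wins (last write wins).
    """
    if a.name != b.name:
        raise ValueError("can only reconcile versions of the same column")
    return a if a.timestamp >= b.timestamp else b


replica_1 = Column("email", b"old@example.com", timestamp=100)
replica_2 = Column("email", b"new@example.com", timestamp=250)
winner = reconcile(replica_1, replica_2)
```

Carrying the timestamp inside every column is what allows an eventually consistent store to decide, without coordination, which replica holds the latest value.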
34
Cloud Data Management
CAP theorem
– Consistency (all records are the same in all replicas)
– Availability (all replicas can accept updates or inserts)
– Partition tolerance (the system still functions when distributed replicas cannot communicate)
A distributed data store can guarantee at most two of these three properties at the same time.
35
Cloud Data Management
36
Advantages
Reduced time-to-market, by removing or simplifying the time-consuming hardware provisioning, purchasing, and deployment processes.
Reduced cost, by following a pay-as-you-go business model.
Reduced operational cost and pain, by automating IT tasks such as security patches and fail-over.
(Virtually) unlimited throughput, by adding servers as the workload increases.
Less in-house IT staff and lower associated costs.
37
Disadvantages
Lack of standardization
Requires complex systems
– error prone
– hard to maintain and manage
Immature technology
Security considerations
Hard to migrate from one vendor to another
38
Security is the Major Issue
39
Conclusion
Cloud computing is an emerging technology with its own advantages and disadvantages.
It lowers the cost of building the initial infrastructure.
It provides available, scalable, pay-for-use services.
Security is still the big issue.
40
Questions
To Cloud or not To Cloud, That is the Question!