©2011 Hewlett-Packard Company and Vertica Confidential. Cloud Storage Challenges, Dr. Dinkar Sitaram


1 Cloud Storage Challenges
Dr. Dinkar Sitaram
dinkar.sitaram@hp.com

2 Overview
–Types of cloud storage
–Building cloud-scale storage
–Challenges: theoretical considerations
–Dealing with the challenges
Based on Moving to the Cloud by Dinkar Sitaram & Geetha Manjunath, to be published by Elsevier

3 Types of cloud storage

4 File-based cloud storage
–Allow storage of files in cloud
–Amazon S3, Windows Azure, …
–Built on top of HTTP
–Amazon S3 Overview
  Create bucket, objects
  GET http://dinkar.s3.amazon.aws.com/project/file.c
  No directories: "project/file.c" is simply the file name
  Need AWS Access Key and AWS Secret Key
–Region: geographical
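
The "no directories" point on the slide above can be illustrated with a small in-memory sketch in Python. This is not the S3 API: `FlatBucket`, `put`, `get` and `list_prefix` are hypothetical names chosen for illustration, and a real client would also sign each HTTP request with the access key and secret key.

```python
class FlatBucket:
    """Toy model of a bucket: a flat map from full object name to bytes."""

    def __init__(self, name):
        self.name = name
        self._objects = {}

    def put(self, key, data):
        # The slash in "project/file.c" is just another character in the name.
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]

    def list_prefix(self, prefix):
        # A "directory listing" is emulated by filtering names on a prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))


bucket = FlatBucket("dinkar")
bucket.put("project/file.c", b"int main(void) { return 0; }")
bucket.put("project/util.c", b"/* helpers */")
bucket.put("notes.txt", b"todo")
listing = bucket.list_prefix("project/")
```

Here `listing` contains only the two names that start with "project/", even though no directory object was ever created.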

5 Database-oriented cloud storage
–Offers a database service
–Examples: Amazon RDS (MySQL), Windows Azure SQL
–RDS examples
  Can administer (e.g., create, replicate) database using Amazon RDS APIs
    Db.createDBInstanceAsync(parms) creates a database
  Use JDBC APIs to build applications
    ResultSet rs = stmt.executeQuery("SELECT * FROM Employee");

6 Key-value stores
–Database consists of <key, value> pairs
  No schema as in relational databases
  Typically data need not be normalized
  More flexible than RDBMS, scales due to fewer restrictions
  More work in application (e.g., checking valid values) to guarantee traditional RDBMS qualities
–Examples: Amazon SimpleDB, Google BigTable, Hadoop HBase
–Programming example (SDB)
  Google SimpleJDBC
  String insert = "INSERT INTO employees (name, title) VALUES ('Dinkar', 'Architect')";
  int val = st.executeUpdate(insert);
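
As a rough illustration of "more work in application", the sketch below uses a plain Python dict as a stand-in for a key-value store. The store itself accepts any attributes, so the check on valid values lives in application code. The names `put_employee` and `ALLOWED_TITLES`, and the sample records, are assumptions made for this example and are not part of SimpleDB or any other product.

```python
store = {}  # key -> attribute dict; the store enforces no schema

ALLOWED_TITLES = {"Architect", "Engineer", "Manager"}  # hypothetical business rule


def put_employee(key, attrs):
    # The store would accept anything, so the application validates values itself.
    title = attrs.get("title")
    if title not in ALLOWED_TITLES:
        raise ValueError(f"invalid title: {title!r}")
    store[key] = dict(attrs)


put_employee("emp1", {"name": "Dinkar", "title": "Architect"})
# Items need not share the same set of attributes.
put_employee("emp2", {"name": "Asha", "title": "Engineer", "location": "Pune"})

try:
    put_employee("emp3", {"name": "Ravi", "title": "Wizard"})
    rejected = False
except ValueError:
    rejected = True
```

An RDBMS could express the same rule once as a column constraint; here every writer has to go through `put_employee` for the rule to hold.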

7 XML databases
–Store XML documents
–Examples: MongoDB
  Stores JSON documents
    { Name: Dinkar, Attributes: {Sex: M, Title: Architect} }
  Documents can have pointers to other documents
  Index on any attribute (including embedded): db.orders.ensureIndex()
  Searching: db.orders.find()
–XML DBs midway between key-value stores and RDBMS
  Explicitly create indices
  More complex structures
  Some XML DBs, e.g., CouchDB, offer transactions
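
To make "index on any attribute (including embedded)" concrete, here is a minimal Python sketch that builds an index over a nested attribute of JSON-like documents. It only models the idea; it does not use MongoDB, and `get_path`, `build_index` and the extra sample documents are hypothetical.

```python
from collections import defaultdict

documents = [
    {"Name": "Dinkar", "Attributes": {"Sex": "M", "Title": "Architect"}},
    {"Name": "Asha", "Attributes": {"Sex": "F", "Title": "Engineer"}},
    {"Name": "Ravi", "Attributes": {"Sex": "M", "Title": "Architect"}},
]


def get_path(doc, path):
    """Follow a dotted path such as 'Attributes.Title' into nested dicts."""
    value = doc
    for part in path.split("."):
        value = value[part]
    return value


def build_index(docs, path):
    """Map each value found at `path` to the positions of matching documents."""
    index = defaultdict(list)
    for pos, doc in enumerate(docs):
        index[get_path(doc, path)].append(pos)
    return index


title_index = build_index(documents, "Attributes.Title")
architects = [documents[i]["Name"] for i in title_index["Architect"]]
```

A search on the embedded Title attribute then becomes a dictionary lookup instead of a scan over every document.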

8 Building cloud-scale storage

9 Cloud storage requirements
–Scaling to cloud-scale: partitioning
–Availability: replication

10 Partitioning strategies
–Similar to methods for partitioning databases
–Round-robin on partitioning attributes
  Loses associativity (a record cannot be located from its attribute value)
–Hash partitioning
–Range-based
–Directory-based
  Memcached
  Can provide, e.g., geographical partitioning
–Reference: Parallel Database Systems: The Future of High Performance Database Systems, by D. DeWitt and J. Gray, Communications of the ACM, Volume 35, Issue 6, June 1992.
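
The four strategies listed on the slide above can be sketched as small Python functions that map a record to a partition number. The partition count, the range split points and the directory contents are illustrative assumptions, not values from the talk.

```python
import bisect
import zlib

NUM_PARTITIONS = 4


def round_robin(seq_no):
    # The i-th inserted record goes to partition i mod n. The key is ignored,
    # so a later lookup by key has to search every partition.
    return seq_no % NUM_PARTITIONS


def hash_partition(key):
    # crc32 is used because Python's built-in hash() of a str varies between runs.
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


RANGE_BOUNDS = ["g", "n", "t"]  # hypothetical split points on the first letter


def range_partition(key):
    # Keys starting before "g" -> 0, "g".."m" -> 1, "n".."s" -> 2, "t" onward -> 3.
    return bisect.bisect_right(RANGE_BOUNDS, key[0].lower())


DIRECTORY = {"IN": 0, "US": 1, "DE": 2}  # hypothetical lookup table, e.g. by country


def directory_partition(country_code):
    # An explicit table decides placement, which allows geographical partitioning.
    return DIRECTORY[country_code]
```

Hash and range partitioning let a reader compute the partition from the key alone, while the directory approach trades an extra lookup for full control over placement.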

11 Amazon availability
–Multiple availability zones per region
  Zone failures are isolated from each other
–Data replicated across 3 availability zones by default

12 Challenges: Theoretical considerations

13 CAP theorem
–Fundamental limitation of distributed systems
–No distributed system can satisfy all three properties below simultaneously
  Conjectured in [Brewer00], proved in [LynGil02] by considering a two-node cluster
  Consistency: all operations appear to be serialized on a non-distributed object
  Availability: every operation returns a result
  Partition-tolerance: the system keeps working even when an arbitrary number of messages between service nodes are lost
–References
  1. [Brewer00] Towards Robust Distributed Systems, by Eric A. Brewer, ACM Symposium on Principles of Distributed Computing, July 16-19, 2000, Portland, Oregon
  2. [LynGil02] Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services, by Nancy Lynch and Seth Gilbert, ACM SIGACT News, Volume 33, Issue 2 (2002), pp. 51-59

14 2-node example
1. Servers replicated for availability
2. If the network partitions, either
3. Allow servers to operate independently (inconsistent), OR
4. Bring servers down (no availability)
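
The two-node trade-off above can be simulated in a few lines of Python. This is a toy model under simplifying assumptions (synchronous replication, a single value, a hypothetical `prefer_availability` switch), not a description of any real storage system.

```python
class Replica:
    def __init__(self):
        self.value = None
        self.up = True


class TwoNodeCluster:
    def __init__(self, prefer_availability):
        self.nodes = [Replica(), Replica()]
        self.partitioned = False
        self.prefer_availability = prefer_availability

    def partition(self):
        self.partitioned = True
        if not self.prefer_availability:
            # Option 4 on the slide: stop serving rather than let replicas diverge.
            for node in self.nodes:
                node.up = False

    def write(self, node_id, value):
        node = self.nodes[node_id]
        if not node.up:
            raise RuntimeError("unavailable")
        node.value = value
        if not self.partitioned:
            # Replication to the peer only works while the link is intact.
            self.nodes[1 - node_id].value = value

    def read(self, node_id):
        node = self.nodes[node_id]
        if not node.up:
            raise RuntimeError("unavailable")
        return node.value


# Option 3: stay available, accept inconsistency.
ap = TwoNodeCluster(prefer_availability=True)
ap.write(0, "v1")
ap.partition()
ap.write(0, "v2")
ap_reads = (ap.read(0), ap.read(1))

# Option 4: stay consistent, give up availability.
cp = TwoNodeCluster(prefer_availability=False)
cp.write(0, "v1")
cp.partition()
try:
    cp.read(1)
    cp_available = True
except RuntimeError:
    cp_available = False
```

In the first cluster the two nodes return different values after the partition; in the second, no stale value is ever returned because nothing is returned at all.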

15 Practical example: Netflix
–Netflix: video on demand over the Internet
–Runs on Amazon cloud
–Consider the following scenario
  User at TV updates list of favorites
  Load balancer sends update to server 1
  Set-top box requests favorites list
  Load balancer sends request to server 2
  Is the returned result consistent? Depends on whether the update has reached server 2!
–Comparing NoSQL Availability Models by Adrian Cockcroft, http://perfcap.blogspot.com/2010/10/comparing-nosql-availability-models.html

16 Dealing with inconsistency predicted by CAP theorem

17 Relaxed consistency
–Consistency can be relaxed
  Weak consistency: system does not guarantee to return consistent results
  Eventual consistency: if there are no further updates, the system will eventually become consistent. If updates are infrequent, a client can wait for some time to get a consistent value
  Read-your-writes consistency: a client performing a read after a write will always see its own updates
  Session consistency: read-your-writes consistency within a session
–Amazon S3
  US Standard Region: eventual consistency
  US West, EU, Asia Pacific Regions: read-your-writes consistency for new object creation, eventual consistency for overwrites and deletes
–Reference: Eventually Consistent, by Werner Vogels, Communications of the ACM, January 2009
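
One way to picture read-your-writes on top of an eventually consistent store is the Python sketch below. It assumes a primary copy plus one lagging replica and per-key version numbers; the class names and the `sync` method (standing in for background replication) are hypothetical.

```python
class EventuallyConsistentStore:
    def __init__(self):
        self.primary = {}  # key -> (version, value)
        self.replica = {}  # lags behind until sync() is called

    def write(self, key, value):
        version = self.primary.get(key, (0, None))[0] + 1
        self.primary[key] = (version, value)
        return version

    def read_replica(self, key):
        return self.replica.get(key, (0, None))

    def read_primary(self, key):
        return self.primary.get(key, (0, None))

    def sync(self):
        # Stands in for background replication catching up.
        self.replica = dict(self.primary)


class Session:
    """Client-side read-your-writes: remember the last version written per key."""

    def __init__(self, store):
        self.store = store
        self.last_written = {}

    def write(self, key, value):
        self.last_written[key] = self.store.write(key, value)

    def read(self, key):
        version, value = self.store.read_replica(key)
        if version < self.last_written.get(key, 0):
            # The replica is older than this session's own write: use the primary.
            version, value = self.store.read_primary(key)
        return value


store = EventuallyConsistentStore()
tv = Session(store)
set_top_box = Session(store)

tv.write("favorites", ["Movie A"])
own_read = tv.read("favorites")             # the writer sees its own update
other_before = set_top_box.read("favorites")  # another session may see stale data
store.sync()
other_after = set_top_box.read("favorites")   # consistent once replication catches up
```

The writing session always sees its update, while the other session sees it only after replication, which mirrors the favorites-list scenario from the Netflix slide.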

18 Example: Handling inconsistency
–BASE: an alternative to ACID [Brewer00]
  Basically Available
  Soft-state
  Eventually consistent
–Example: online shopping portal
  User table: transactions by user
  Transaction table: transactions used for billing
  How do we update both tables after a purchase?
–Traditional database method
  Begin transaction
  Update User table
  Update Transaction table
  End transaction
–A common cloud method
  Queue update to User table
  Queue update to Transaction table
–Databases could be inconsistent
–Will become eventually consistent
–Reference: BASE, an ACID Alternative, by D. Pritchett, ACM Queue, June 2008
[Figure: Application, User table, Transaction table]
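
The queued-update method on the slide above can be sketched in Python with an in-memory queue. The table layouts, the `purchase` and `apply_one` helpers and the sample purchase are assumptions for illustration; a production system would use a durable message queue and would likely need the updates to be safe to apply more than once.

```python
from collections import deque

user_table = {}         # user -> list of purchased items
transaction_table = []  # rows used for billing
pending = deque()       # queued updates, applied later by a worker


def purchase(user, item, amount):
    # Instead of one ACID transaction, enqueue two independent updates.
    pending.append(("user", user, item, amount))
    pending.append(("transaction", user, item, amount))


def apply_one():
    table, user, item, amount = pending.popleft()
    if table == "user":
        user_table.setdefault(user, []).append(item)
    else:
        transaction_table.append((user, item, amount))


purchase("dinkar", "book", 25)

apply_one()  # only the User table update has been applied so far
midway = (len(user_table.get("dinkar", [])), len(transaction_table))

while pending:  # the worker drains the queue
    apply_one()
final = (len(user_table["dinkar"]), len(transaction_table))
```

Between the two `apply_one` calls the tables disagree (`midway`), and they agree again once the queue is empty (`final`), which is the "eventually consistent" behaviour the slide describes.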

19 Conclusions

20 Conclusions
–Many alternatives for building cloud storage exist
–A careful trade-off between consistency and availability is needed

