Data Bases in Cloud Environments


1 Data Bases in Cloud Environments
Based on work by Md. Ashfakul Islam, Department of Computer Science, The University of Alabama.

2 Data Today Data sizes are increasing exponentially every day.
Key difficulties in processing large-scale data: acquiring the required amount of on-demand resources; scaling up and down automatically with dynamic workloads; distributing and coordinating a large-scale job across several servers; replication and update consistency maintenance.
A cloud platform can address most of these.

3 Large Scale Data Management
Large-scale data management is attracting attention. Many organizations produce data at the petabyte level, and managing such an amount of data requires huge resources. The ubiquity of huge data sets inspires researchers to think in new ways. This is particularly challenging for transactional databases.

4 Issues to Consider Distributed or Centralized application?
How can ACID guarantees be maintained? ACID stands for Atomicity, Consistency, Isolation, Durability:
Atomicity – either all operations execute or none
Consistency – the database must remain consistent after each write operation
Isolation – no interference from other transactions
Durability – committed changes are permanent

5 ACID challenges Data is replicated over a wide area to increase availability and reliability.
Consistency maintenance in a replicated database is very costly in terms of performance.
Consistency therefore becomes the bottleneck of data management deployment in the cloud.

6 CAP The CAP theorem: Consistency, Availability, Partition tolerance.
Three desirable, and expected, properties of real-world services.
Brewer states that it is impossible to guarantee all three simultaneously.

7 CAP: Consistency - atomic
Data should maintain atomic consistency: there must exist a total order on all operations such that each operation looks as if it were completed at a single instant.
This is not the same as the Atomicity requirement in ACID.

8 CAP: Available Data Objects
Every request received by a non-failing node in the system must result in a response.
There is no time requirement.
This is difficult because, even during severe network failures, every request must terminate.
Brewer originally required only that almost all requests get a response; this has been simplified to all.

9 CAP: Partition Tolerance
When the network is partitioned, all messages sent from nodes in one partition to nodes in another partition are lost.
This causes the difficulty: every response must be atomic even though arbitrary messages might not be delivered, and every node must respond even though arbitrary messages may be lost.
No failure other than total network failure is allowed to cause incorrect responses.

10 CAP: Consistent & Partition Tolerant
Trivial solution: ignore all requests (never respond).
Alternate solution: each data object is hosted on a single node, and all actions involving that object are forwarded to the node hosting it.

11 CAP: Consistent & Available
If no partitions occur, it is clearly possible to provide atomic (consistent), available data.
Systems that run on intranets and LANs are an example of this class of algorithms.

12 CAP: Available & Partition Tolerant
The service can return the initial value for all requests.
The system can provide weakened consistency; this is similar to web caches.

13 CAP: Weaker Consistency Conditions
By allowing stale data to be returned when messages are lost, it is possible to maintain a weaker consistency.
Delayed-t consistency – there is an atomic order for operations only if there was an interval between the operations in which all messages were delivered.

14 CAP Only 2 out of 3 of these properties can be achieved at once.
In most cloud databases, availability and reliability (even under network partition) are achieved by compromising consistency.
Traditional consistency techniques become obsolete.

15 Evaluation Criteria for Data Management
Elasticity – scalable, distribute new resources, offload unused resources, parallelizable, low coupling.
Security – untrusted host, moving off premises, new rules/regulations.
Replication – available, durable, fault tolerant, replication across the globe.

16 Evaluation of Analytical DB
An analytical DB handles historical data with little or no updates, so ACID properties are not required.
Elasticity – since ACID is not required, elasticity is easier (e.g., no updates means locking is not needed); a number of commercial products support it.
Security – sensitive and detailed data must be stored with a third-party vendor.
Replication – a recent snapshot of the DB serves the purpose; strong consistency isn't required.

17 Analytical DBs - Data Warehousing
Data warehousing (DW) is a popular application of Hadoop.
Typically a DW is relational (OLAP), but it can also hold semi-structured and unstructured data.
DWs can also run on parallel DBs (e.g., Teradata), which are column oriented but expensive: around $10K per TB of data.
Hadoop for DW: Facebook abandoned Oracle for Hadoop (Hive); Pig is also used for semi-structured data.

18 Evaluation of Transactional DM
Elasticity – data is partitioned over sites; locking and commit protocols become complex and time consuming; there is a huge distributed data processing overhead.
Security – same as for analytical DBs.

19 Evaluation of Transactional DM
Replication – data is replicated in the cloud.
CAP theorem: of Consistency, Availability, and Partition tolerance, only two can be achieved; between consistency and availability, one must be chosen.
Availability is the main goal of the cloud, so consistency is sacrificed and the database's ACID guarantees are violated – what to do?

20 Transactional Data Management

21 Transactional Data Management
Needed because transactional data management is the heart of the database industry: almost all financial transactions are conducted through it and rely on ACID guarantees.
ACID properties are the main challenge in deploying transactional DM in the cloud.

22 Transactional DM A transaction is a sequence of read & write operations.
Guarantee the ACID properties of transactions:
Atomicity – either all operations execute or none.
Consistency – the DB remains consistent after each transaction execution.
Isolation – the impact of a transaction can't be altered by another one.
Durability – the impact of a committed transaction is guaranteed to persist.

23 Existing Transactions for Web Applications in the Cloud
Two important properties of Web applications: all transactions are short-lived, and a data request can be responded to with a small set of well-identified data items.
Scalable database services like Amazon SimpleDB and Google BigTable allow data to be queried only by primary key.
Eventual data consistency is maintained in these database services.

24 Related Research Different types of consistency
Strong consistency – subsequent accesses by transactions will return the updated value.
Weak consistency – no guarantee that subsequent accesses return the updated value.
Inconsistency window – the period between an update and the time when it is guaranteed to be seen.
Eventual consistency – a form of weak consistency: if no new updates occur, eventually all accesses return the last updated value. The size of the inconsistency window is determined by communication delays, system load, and the number of replicas. The domain name system (DNS) implements eventual consistency.

25 Commercial Cloud Databases
Amazon Dynamo – 100% available, read sensitive.
Amazon Relational Database Service – built on MySQL, uses MySQL replica management; all replicas are in the same location.
Microsoft Azure SQL – a primary with two redundancy servers; quorum approach.
Xeround MySQL [2012] – a selected coordinator processes read & write requests.
Google introduced Spanner – an extremely scalable, distributed, multiversion DB; internal use only.

26 Tree Based Consistency (TBC)
Our proposed approach:
Minimize interdependency and maximize throughput.
All updates are propagated through a tree.
Different performance factors are considered.
The number of children per node is limited.
The tree is dynamic.
A new type of consistency, 'apparent consistency', is introduced.

27 System Description of TBC
Two components: the controller and the replica servers.
Controller – tree creation, failure recovery, keeping logs.
Replica server – database operations, communication with other servers.

28 Performance Factors Identified performance factors
Time required for a disk update
Workload of the server
Reliability of the server
Time to relay a message
Reliability of the network
Network bandwidth
Network load

29 PEM Causes of enormous performance degradation:
disk update time, workload, or reliability of the server;
reliability, bandwidth, or traffic load of the network.
Performance Evaluation Metric: PEM = sum over i = 1..n of (pf_i * wf_i), where pf_i is the i-th performance factor and wf_i is the i-th weight factor.
wf_i can be positive or negative; a bigger PEM means a better node.
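A minimal illustrative sketch of the PEM calculation above (not the author's code); the factor names, values, and weights are hypothetical examples.

```python
# Illustrative sketch of PEM = sum(pf_i * wf_i).
# Factor names and weight values below are made up for the example.

def pem(performance_factors, weight_factors):
    """Weighted sum of performance factors; a larger PEM means a better node."""
    assert len(performance_factors) == len(weight_factors)
    return sum(pf * wf for pf, wf in zip(performance_factors, weight_factors))

# Reliability is rewarded (positive weight); delay and load are penalized
# (negative weights), since wf_i may be positive or negative.
weights = [1.0, -0.02, -0.01]        # [wf_reliability, wf_delay, wf_load]
server_a = [0.98, 25.0, 40.0]        # [reliability, delay in ms, load in %]
server_b = [0.91, 15.0, 20.0]

print(pem(server_a, weights), pem(server_b, weights))
```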

30 Building Consistency Tree
Prepare the connection graph G(V, E).
Calculate the PEM for all nodes.
Select the root of the tree.
Run Dijkstra's algorithm with some modifications; the algorithm maintains the predefined fan-out of the tree and returns the consistency tree (a rough sketch follows).
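The following is a simplified, hypothetical sketch of a tree construction with a limited fan-out, in the spirit of the modified-Dijkstra step described above; the graph, edge costs, PEM scores, and fan-out value are made up, and the real algorithm may differ.

```python
# Hypothetical sketch: attach nodes to the tree along cheap paths while
# respecting a maximum fan-out per node. Not the author's implementation.
import heapq

def build_tree(graph, pem_scores, fanout):
    """graph: {node: {neighbor: edge_cost}}; returns {child: parent}."""
    root = max(pem_scores, key=pem_scores.get)        # best PEM becomes the root
    parent = {root: None}
    children = {n: [] for n in graph}
    # Priority queue of candidate attachments: (cost from tree, tree node, new node)
    frontier = [(cost, root, nbr) for nbr, cost in graph[root].items()]
    heapq.heapify(frontier)
    while frontier:
        cost, via, node = heapq.heappop(frontier)
        if node in parent or len(children[via]) >= fanout:
            continue                                   # already attached, or parent full
        parent[node] = via
        children[via].append(node)
        for nbr, c in graph[node].items():
            if nbr not in parent:
                heapq.heappush(frontier, (cost + c, node, nbr))
    return parent

graph = {"A": {"B": 1, "C": 2}, "B": {"A": 1, "C": 1, "D": 3},
         "C": {"A": 2, "B": 1, "D": 1}, "D": {"B": 3, "C": 1}}
print(build_tree(graph, {"A": 5.0, "B": 3.0, "C": 4.0, "D": 2.0}, fanout=2))
```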

31 Example A connection graph annotated with server reliability (pf1), server delay (pf2), path reliability, and path delay for each server and path, with weight factors wf1 = 1 and wf2 = -.02. (Figure not reproduced in this transcript.)

32 Update Operation An update operation is done in four steps
An update request is sent to all children of the root.
The root continues to process the update request on its own replica.
The root waits to receive confirmation of successful updates from all of its immediate children.
A notification of a successful update is sent from the root to the client.
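A minimal, hypothetical sketch of these four steps; the Replica class and its methods are illustrative stand-ins, not the author's API, and further propagation below the root's children is omitted.

```python
# Sketch only: the root forwards the update to its immediate children,
# applies it locally, waits for the children's confirmations, and then
# the client is notified of success.

class Replica:
    def __init__(self, name, children=()):
        self.name, self.children, self.data = name, list(children), {}

    def apply(self, key, value):
        self.data[key] = value                       # simulated local disk update
        return True

def update(root, key, value):
    for child in root.children:                      # step 1: send to immediate children
        child.apply(key, value)                      # (children propagate further later)
    root.apply(key, value)                           # step 2: root updates its own replica
    acks = all(child.data.get(key) == value          # step 3: confirmations from children
               for child in root.children)
    return acks                                      # step 4: notify the client

root = Replica("root", children=[Replica("c1"), Replica("c2")])
print(update(root, "x", 42))                         # True: the client sees success
```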

33 Consistency Flags Two types of consistency flags are used:
the partial consistent flag and the fully consistent flag.
Partial consistent flag – set top-down: the last updated operation sequence number is stored as the flag, and each node informs its immediate children.
Fully consistent flag – set bottom-up: a leaf (a node with an empty descendants list) sets its fully consistent flag to the operation sequence number and informs its immediate ancestor; an ancestor sets its fully consistent flag after getting confirmation from all of its descendants.
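A hypothetical sketch of the two flags: the partial flag is set top-down as an update with a given sequence number reaches a node, and the full flag is set bottom-up once every descendant has confirmed it. Class and attribute names are illustrative only.

```python
# Sketch of partial (top-down) and full (bottom-up) consistency flags,
# keyed by operation sequence number. Not the author's implementation.

class Node:
    def __init__(self, name, children=()):
        self.name, self.children = name, list(children)
        self.partial_flag = self.full_flag = 0    # last seen / last fully applied seq no.

    def propagate(self, seq_no):                  # top-down: partial consistency
        self.partial_flag = seq_no
        for child in self.children:
            child.propagate(seq_no)
        return self.confirm(seq_no)               # bottom-up: full consistency

    def confirm(self, seq_no):
        if all(c.full_flag == seq_no for c in self.children):  # leaves pass trivially
            self.full_flag = seq_no
        return self.full_flag == seq_no

root = Node("root", [Node("a", [Node("a1")]), Node("b")])
print(root.propagate(7), root.full_flag)          # True 7 once all descendants confirm
```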

34 Consistency Assurance
All update requests from user are sent to root Root waits for its immediate descendants during update requests Read requests are handled by immediate descendants of root

35 Maximum Number of Allowable Children
A larger number of children means higher interdependency and possible performance degradation.
A smaller number of children means less reliability and a higher chance of data loss.
Three categories of trees are used in the experiments: sparse, medium, and dense.
Response time is modeled as t = op + wl, where t is the response time, op is the disk update time, and wl is the OS workload delay.

36 Maximum Number of Allowable Children
The maximum number should be set by trading off reliability against performance.

37 Inconsistency Window The amount of time a distributed system is inconsistent.
Cause: time-consuming update operations.
To accelerate update operations, the system starts processing the next operation in the queue after getting confirmation from a certain number of nodes, not waiting for all replicas to reply.

38 MTBC (Modified TBC)
In MTBC, the root sends the update request to all replicas but waits for only its children to reply.
Intermediate nodes make sure that their children are updated.
MTBC reduces the inconsistency window but increases complexity at the children's end.

39 Effect of Inconsistency Window
The inconsistency window has no effect on performance.
Data loss is possible only if the root and all of its children go down at the same time.

40 Failure Recovery Primary server failure
The controller finds the most up-to-date servers with the help of the consistency flags, selects the most reliable server among them, rebuilds the consistency tree, and initiates synchronization.

41 Failure Recovery Primary server failure
The controller identifies the most reliable server among them, rebuilds the consistency tree, and initiates synchronization (a toy selection step is sketched below).
Other server or communication path down: the controller checks whether the server or the communication path is down, rebuilds the tree without the down server, or finds an alternate path and reconfigures the tree.
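A hypothetical sketch of the selection step only: among surviving replicas, keep those with the highest consistency-flag value (most up to date) and promote the most reliable of them; tree rebuilding and synchronization are omitted, and the field names are illustrative.

```python
# Sketch of choosing a new root after the primary fails. Not the author's code.

def choose_new_root(replicas):
    """replicas: list of dicts like {"name": ..., "flag": ..., "reliability": ...}."""
    newest = max(r["flag"] for r in replicas)                 # highest sequence number seen
    candidates = [r for r in replicas if r["flag"] == newest]
    return max(candidates, key=lambda r: r["reliability"])    # most reliable among them

replicas = [{"name": "s1", "flag": 41, "reliability": 0.93},
            {"name": "s2", "flag": 42, "reliability": 0.91},
            {"name": "s3", "flag": 42, "reliability": 0.98}]
print(choose_new_root(replicas)["name"])   # s3: both up to date and most reliable
```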

42 Apparent Consistency All write requests are handled by the root.
All read requests are handled by the root's children.
The root and its children are always consistent.
Other nodes don't interact with the user, so the user finds the system consistent at all times.
We call this 'apparent consistency' – from the user's point of view it ensures strong consistency.

43 Different Strategies Three possible strategies for consistency: Classic, Quorum, and TBC.
Classic – each update requires the participation of all nodes; better for databases that are updated rarely.
Quorum – based on quorum majority voting; better for databases that are updated frequently.
TBC – our tree-based approach (a rough comparison of write acknowledgments is sketched below).
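To make the contrast concrete, here is an illustrative comparison, not taken from the source, of how many replicas must acknowledge a write before the client is answered under each strategy; the replica count n and the TBC fan-out are made-up parameters.

```python
# Illustrative only: acknowledgments needed per write under each strategy.

def write_acks(strategy, n, fanout=3):
    if strategy == "classic":
        return n                    # every replica participates in every update
    if strategy == "quorum":
        return n // 2 + 1           # a majority quorum
    if strategy == "tbc":
        return 1 + fanout           # the root plus its immediate children only
    raise ValueError(strategy)

for s in ("classic", "quorum", "tbc"):
    print(s, write_acks(s, n=9))
```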

44 Differences in read and write operations

45 Classic Read and Write Technique

46 Quorum Read and Write Technique

47 TBC Read and Write Technique

48 Experiments Compare the 3 strategies

49 Experiment Design All requests pass through a gateway to the system.
Classic, Quorum & TBC are implemented on each server.
A stream of read & write requests is sent to the system.
A transport layer with acknowledgement-based packet transmission is implemented in the system.
Thousands of ping responses are observed to determine the transmission delay and packet loss pattern.

50 Experimental Environment
Experiments are performed on Sage, a green prototype cluster at UA:
Intel D201GLY2 mainboard, 1.2 GHz Celeron CPU, 1 GB 533 MHz RAM, 80 GB SATA 3 hard drive, built-in 10/100 Mbps LAN.
Ubuntu Linux 11.0 server edition OS, Java 1.6 platform, MySQL Server.

51 Workload Parameters

52 Effect of Database Request Rate
Experiment: compare the response times of the three approaches (Classic, Quorum, and TBC) as the request rate λ increases from 0.01 to 0.80.
Examine response time for read, write, and combined workloads.

53 Effect of Database Request Rate
Quorum has a large read response time, but Quorum write is better.
Classic write has the highest response time.
Classic read performs slightly better than TBC read.

54 Effect of Database Request Rate
Classic beats Quorum at this point: Quorum has a higher response time at higher arrival rates.
Classic's performance is as expected.
TBC performs best at any arrival rate.

55 Effect of Read-Write Ratio
Experiment: compare the response times of the three approaches (Classic, Quorum, and TBC) as the read/write ratio decreases from 90/10 to 50/50.
Examine response time for read, write, and combined workloads.

56 Effect of Read-Write Ratio
Quorum's response time is less affected by a higher write ratio because its read and write sets aren't separate.
Classic & TBC are affected by a higher write ratio.
The ratio has no effect on read response time.

57 Effect of Read-Write Ratio
Classic is affected more than TBC by a higher write ratio.
A higher write ratio has less effect on Quorum.

58 Effect of Network Quality
Experiment: compare the response times of the three approaches (Classic, Quorum, and TBC) for three different levels of network quality: dedicated, regular, and unreliable.
Examine response time for read, write, and combined workloads.

59 Effect of Network Quality
Classic read always performs better, but Classic write is affected by network quality.

60 Effect of Network Quality
TBC always performs better; network quality affects Classic the most.

61 Effect of Heterogeneity
Experiment: compare the response times of Classic and TBC for three different heterogeneous infrastructures: low, medium, and high heterogeneity.

62 Effect of Heterogeneity
Infrastructure heterogeneity affects Classic.

63 Findings TBC performs better for different arrival rates – higher frequency of requests.
TBC performs better for some read-write ratios.
TBC performs better for different network qualities (packet loss, network congestion).
TBC performs better in a heterogeneous environment.
The next step is to include transaction management in TBC.

64 Next Challenge - Transactions
We present how TBC can be used for transaction management, serializability maintenance, and concurrency control in cloud databases.
We also analyze the performance of the transaction manager.
Auto-scaling and partitioning will be addressed as well.

65 Transactions A transaction is a logical unit of database processing.
A transaction consists of one or more read and/or write operations.
Transactions can be divided into:
Read-write transactions – execute update, insert, or delete operations on data items; can have read operations too.
Read-only transactions – have only read operations.

66 Serializability A set of concurrent transactions must be serializable.
Conflict – two transactions perform operations on the same data item and at least one is a write.
The order of conflicting operations in the interleaved schedule must be the same as in some serial schedule, so the final states are the same.
Potential techniques to be implemented: timestamp ordering and commit ordering.
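As a small illustration (not the author's code), the conflict test just described can be expressed directly; the tuple format for operations is an assumption made for the example.

```python
# Sketch of the conflict test behind conflict serializability: two operations
# conflict when they belong to different transactions, touch the same data
# item, and at least one of them is a write.

def conflicts(op1, op2):
    """Each op is a tuple (transaction_id, 'r' or 'w', data_item)."""
    t1, kind1, item1 = op1
    t2, kind2, item2 = op2
    return t1 != t2 and item1 == item2 and "w" in (kind1, kind2)

schedule = [("T1", "r", "x"), ("T2", "w", "x"), ("T1", "w", "y")]
pairs = [(a, b) for i, a in enumerate(schedule)
         for b in schedule[i + 1:] if conflicts(a, b)]
print(pairs)   # only the r(x)/w(x) pair from T1 and T2 conflicts
```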

67 Serializability A schedule of transactions is serializable if it is equivalent to some serial execution order.
Conflict serializability – all conflicting operations appear in the same order in both schedules.

68 Concurrency Control A mechanism to maintain ACID properties under concurrent access to DBs.
Four major techniques: locking, timestamp, multiversion, and optimistic.
We use locking – lock the data items before accessing them – and timestamps – execute conflicting operations in the order of their timestamps.

69 Read-Write Transaction in TBC
(1) Transaction initiation request; (2) transaction sequence number; (3) transaction request to the root of the TBC-generated tree; (4) lock request to the lock manager; (5) the lock manager accepts or rejects the lock request; (6) transaction request to the immediate children of the root; (7) children reply to the root; (8) the transaction commit process sends a successful reply to the user; (9) updates are synchronized with the parent periodically.
(Figure shows the user program, controller, root, lock manager, version manager, intermediate children, and other children.)

70 Read-Only Transaction in TBC
(1) Transaction initiation request; (2) transaction sequence number; (3) transaction request to the intermediate children of the root; (4) a child executes the transaction and requests version information; (5) the version manager ensures the latest version of the data; (6) results are sent back to the user.
(Figure shows the user program, controller, root, lock manager, version manager, intermediate children, and other children.)

71 Lock Manager Two types of lock fields: a primary lock database and a secondary lock database.
(Figure shows the lock hierarchy over tables, key columns, and rows.)

72 Concurrency Control in TBC
Distributed concurrency control involves the controller and the root.
It combines locking & timestamp mechanisms: the lock manager is implemented at the root, and the controller manages an incremental sequence number.
Deadlock scenarios are resolved by the wait-die strategy.
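A hypothetical sketch of the wait-die rule mentioned above, using sequence numbers as timestamps: an older requester (smaller timestamp) is allowed to wait for the lock, while a younger requester is aborted ("dies") and restarted later. The function name and parameters are illustrative.

```python
# Sketch of the wait-die deadlock-prevention rule. Not the author's code.

def wait_die(requester_ts, holder_ts):
    """Return 'wait' or 'die' for the transaction requesting a held lock."""
    return "wait" if requester_ts < holder_ts else "die"

print(wait_die(requester_ts=5, holder_ts=9))   # wait: requester is older
print(wait_die(requester_ts=9, holder_ts=5))   # die: requester is younger and restarts
```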

73 Serializability in TBC
A dedicated thread is launched at the root node to execute the operations of a read-write transaction one by one.
The thread executes both read and write operations within a transaction serially; multiple threads execute multiple transactions concurrently.
Threads in the read-set servers associated with read-only transactions execute the read operations within a transaction serially; multiple read-only transactions read concurrently.
Read-only transactions read values written by committed transactions, so there is no conflict with currently executing transactions.

74 Serializability in TBC
Proved:
A read-write transaction is serializable.
A read-only transaction is serializable.
The combined execution of read-write and read-only transactions is serializable.
Therefore, TBC ensures serializability.

75 Common Isolation Problems
Dirty read, unrepeatable read, lost update. TBC solves these:
Dirty read – only values from committed transactions are read.
Unrepeatable read – data items are locked until the transaction commits.
Lost update – two transactions never hold a lock on the same data item at the same time.

76 ACID Maintenance in TBC
Atomicity – committed transactions are written to the DB; failed transactions are discarded from the list.
Consistency – the TBC approach maintains apparent consistency, which guarantees strong consistency.
Isolation – the lock manager ensures two transactions never access the same data items at the same time.
Durability – multiple copies of committed transactions are kept.

77 Different Strategies for Experiments
Three possible strategies for consistency: Classic, Quorum, and TBC.
Classic – each update requires the participation of all nodes; better for databases that are updated rarely.
Quorum – based on quorum majority voting; used by MS Azure SQL and the Xeround cloud DB.

78 Classic Approach (figures: read-write and read-only transaction flows)

79 Quorum Approach (figures: read-write and read-only transaction flows)

80 TBC Approach (figures: read-write and read-only transaction flows)

81 Experiment Design Classic, Quorum & TBC implemented on each server
A stream of read-write & read-only transaction requests is sent to the system.
A transport layer with acknowledgement-based packet transmission is implemented in the system.
Thousands of ping responses are observed at different times to determine the transmission delay and packet loss pattern.
The same random seed is used for every approach.

82 Default Parameters

83 Effect of Transaction Request Rate
Experiment: compare the response times of the three approaches (Classic, Quorum, and TBC) as the transaction request rate λ increases from 0.05 to 0.25.
Examine response time for read-only, read-write, and combined workloads, and the transaction restart percentage.

84 Effect of Transaction Request Rate
Interdependency is the main reason behind a higher response time for Quorum.
Classic and TBC have lower response times because they have no such dependency; TBC performs slightly better than Classic.
Where Classic has higher interdependency and a higher response time, Quorum has a lower response time than Classic.
TBC has the best and most slowly growing response time.

85 Effect of Transaction Request Rate
Quorum performs worse because of its higher read-only response time.
Classic is also affected by a higher arrival rate, while TBC's response time grows very slowly with the arrival rate.
Classic shows exponential growth in restart percentage; Quorum's restart percentage also grows exponentially, but is better than Classic's; TBC's restart percentage grows linearly.

86 Effect of Read-Write and Read-Only Ratio
Experiment: compare the response times of the three approaches (Classic, Quorum, and TBC) as the read-write/read-only ratio decreases from 90/10 to 50/50.
Examine response time for read-only, read-write, and combined workloads, and the transaction restart percentage.

87 Effect of Read-Write and Read-Only Ratio
Quorum's response time decreases with a lower read-only ratio.
Classic and TBC response times increase with a lower read-only ratio: more writes mean higher loads on the read-write set servers. TBC still performs slightly better.
Quorum's response time grows exponentially at a higher read-write ratio, and Classic's response time shows similar exponential growth.
TBC is less affected by a higher read-write ratio; its growth is almost linear.

88 Effect of Read-Write and Read-Only Ratio
Both Classic and Quorum response times grow exponentially, with Classic performing better than Quorum.
TBC's response time grows slowly, almost linearly.
All approaches show exponential growth in restart percentage, but TBC has fewer restarted transactions than the others at a higher read-write ratio.

89 Effect of Number of Tables and Columns
Experiment: compare the response times of the three approaches (Classic, Quorum, and TBC) as the number of tables and columns increases from 5 tables with 5 columns to 45 tables with 25 columns.
Examine response time for read-only, read-write, and combined workloads, and the transaction restart percentage.

90 Effect of Number of Tables and Columns
The response time of read-only transactions remains the same as the number of tables and columns increases.
As expected, Quorum has the highest and TBC the lowest response time, with Classic slightly higher than TBC.
A slightly decreasing response time is observed for read-write transactions as the number of tables and columns grows.
TBC, as expected, performs better than the others.

91 Effect of Number of Tables and Columns
As expected, a slightly decreasing response time with respect to a higher number of tables & columns is found for every approach, and TBC performs noticeably better than the others.
A clear decreasing pattern is found in the restart percentage for every approach.
TBC has a higher restart percentage than the others, but the higher restart percentage does not affect TBC's response time.

92 Auto Scaling Auto scaling means automatically adding or removing resources when necessary.
It is one of the key features promised by a cloud platform.
Scale up means adding resources when the workload exceeds a defined threshold; scale down means removing resources when the workload drops below a defined threshold.
Thresholds are pre-defined according to SLAs (a toy threshold check is sketched below).
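The following is an illustrative sketch only of the kind of threshold check described above; the threshold values are made up and stand in for SLA-derived ones.

```python
# Sketch of a scale-up / scale-down decision based on workload thresholds.

def scaling_decision(load, scale_up_at=0.80, scale_down_at=0.30):
    if load > scale_up_at:
        return "scale up"      # workload exceeds the defined upper threshold
    if load < scale_down_at:
        return "scale down"    # workload has dropped below the lower threshold
    return "no change"

for load in (0.95, 0.50, 0.10):
    print(load, scaling_decision(load))
```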

93 Types of Scaling Two types: vertical scaling and horizontal scaling.
Vertical scaling – scaling is done by moving to a more powerful resource, or to a less powerful one.
Horizontal scaling – scaling is done by adding resources, or by removing them.

94 Partitioning A way to share the workload with additional servers to support elasticity.
Partitioning is the process of splitting a logical database into smaller ones, so an increased workload can be served by several sites.
If a relation r is partitioned, it is divided into fragments r1, r2, ..., rn, and the fragments contain enough information to reconstruct r.
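A hypothetical sketch of one common way to fragment a relation (horizontal, hash-based partitioning); the fragment count, routing rule, and rows are made-up examples, not the partitioning scheme used by the authors.

```python
# Sketch: split relation r into fragments whose union reconstructs r.

def partition(relation, n_fragments):
    fragments = [[] for _ in range(n_fragments)]
    for row in relation:
        fragments[hash(row[0]) % n_fragments].append(row)   # route by key
    return fragments

r = [("k1", "alice"), ("k2", "bob"), ("k3", "carol"), ("k4", "dave")]
fragments = partition(r, 2)
reconstructed = [row for frag in fragments for row in frag]
print(sorted(reconstructed) == sorted(r))   # True: the fragments suffice to rebuild r
```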

95 Auto Scaling for Read-Only Transaction
Scaling up: overloaded servers are identified, an additional server is requested, it connects to the root, and the database is synchronized.
Scaling down: server underutilization is identified, the connection to the root is removed, and the server is disengaged from the system.
No database partitioning is required.

96 Auto Scaling for Read-Only Transaction
(Figure shows the user, controller, root, children, and additional servers handling read-only transaction requests.)

97 Auto Scaling for Read-Write Transaction
Scaling up: server overloading is identified, additional servers are requested to form a write set, a new consistency tree is prepared, database partitioning is initiated, and the controller updates the partition table.
Scaling down: server underutilization is identified, database reconstruction from the partitions is initiated, and servers are disengaged from the system.
Each write set is associated with a partition.

98 Auto Scaling for Read-Write Transaction
(Figure shows the user, controller, root, write-set servers, additional servers, and children handling read-write transaction requests.)

99 Database Partitioning in TBC

100 Future Direction Auto scaling is very important in terms of performance:
identify stable workload changes, predict future workload patterns to prepare in advance, and apply machine learning.
An efficient database partition: less conflict, avoid distributed query processing, and find a smart algorithm for partitioning.

101 Conclusions For databases in a cloud, consistency remains the bottleneck.
TBC is proposed to meet these challenges.
Experimental results show that TBC reduces interdependency and is able to take advantage of network parameters.

102 Conclusions TBC's transaction management system:
guarantees ACID properties, maintains serializability, and prevents common isolation-level mistakes.
A hierarchical lock manager allows more concurrency.
Experimental results show it is possible to maintain concurrency without sacrificing consistency or response time in clouds.
Quorum may not always perform better in a cloud.

103 Conclusions Tree-Based Consistency
maintains strong consistency in database transaction execution,
provides customers with better performance than existing approaches,
and is a viable solution for ACID transactional database management in a cloud.

104 End of TBC

105 Relational Joins Hadoop is not a DB
There is a debate between parallel DBs and MapReduce (MR) for OLAP.
DeWitt and Stonebraker call MR a "step backwards".
Parallel DBs are faster because they can create indexes.

106 Relational Joins - Example
Given 2 data sets S and T:
(k1, (s1, S1)) – k1 is the join attribute, s1 is the tuple ID, S1 is the rest of the attributes
(k2, (s2, S2))
(k1, (t1, T1)) – the corresponding information for T
(k2, (t2, T2))
S could be user profiles: k is the primary key, the tuple holds info about age, gender, etc.
T could be logs of online activity: the tuple is a particular URL, k is a foreign key.

107 Reduce-Side Join 1:1 Map over both datasets and emit (join key, tuple).
All tuples are then grouped by the join key, which is exactly what is needed for the join. Which type of join is this? A parallel sort-merge join.
For a one-to-one join, at most one tuple from S and one from T match.
If a group has two values, one must be from S and the other from T (we don't know which, since there is no order), so we join them.
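A minimal Python sketch (not Hadoop code) of the reduce-side 1:1 join just described: the "map" phase emits (join key, tagged tuple) for both datasets, tuples are grouped by key, and the "reduce" phase pairs the S tuple with the T tuple in each group. The sample data is made up.

```python
# Illustrative reduce-side 1:1 join, simulated in plain Python.
from collections import defaultdict

S = [("k1", ("s1", {"age": 30})), ("k2", ("s2", {"age": 25}))]
T = [("k1", ("t1", "http://a.example")), ("k2", ("t2", "http://b.example"))]

groups = defaultdict(list)
for tag, dataset in (("S", S), ("T", T)):        # "map" phase: emit (key, (tag, tuple))
    for key, value in dataset:
        groups[key].append((tag, value))

joined = {}
for key, values in groups.items():               # "reduce" phase: one S and one T per key
    s_val = next(v for tag, v in values if tag == "S")
    t_val = next(v for tag, v in values if tag == "T")
    joined[key] = (s_val, t_val)

print(joined["k1"])
```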

108 Consistency in Clouds

109 Current DB Market Status
MS SQL doesn’t support auto scaling and load. MySQL recommended for “lower traffic” New products: advertise replace MySQL with us Oracle recently released on-demand resource allocation IBM DB2 can auto scale with dynamic workload. Azure Relational DB – great performance

