Presentation is loading. Please wait.

Presentation is loading. Please wait.

© Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.1 Isolation Levels in PostgreSQL Kevin Grittner | 2016.02.05 | PgConf.Russia 2016.

Similar presentations


Presentation on theme: "© Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.1 Isolation Levels in PostgreSQL Kevin Grittner | 2016.02.05 | PgConf.Russia 2016."— Presentation transcript:

1 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.1 Isolation Levels in PostgreSQL Kevin Grittner | 2016.02.05 | PgConf.Russia 2016

2 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.2 The Problem: Race Conditions The Simple Solution: The “I” in ACID Solving it the Hard Way: Blocking Future Improvements for Serializable Transactions Overview

3 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.3 The Problem: Race Conditions Update In Place versus Multi-Version Concurrency Control

4 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.4 With Update In Place Any concurrent access (read or write) to same scope while a connection is writing With MVCC Any concurrent write access to same scope while a connection is writing Potential Race Conditions

5 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.5 With Update In Place Any concurrent access (read or write) to same scope while a connection is writing Examples: Firebird, VoltDB, some configurations of MS SQL Server With MVCC Any concurrent write access to same scope while a connection is writing Examples: PostgreSQL, Oracle, some configurations of MS SQL Server Potential Race Conditions

6 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.6 With Update In Place Any concurrent access (read or write) to same scope while a connection is writing Examples: Firebird, VoltDB, some configurations of MS SQL Server Traditionally protected with blocking locks using S2PL or weaker variations on it With MVCC Any concurrent write access to same scope while a connection is writing Examples: PostgreSQL, Oracle, some configurations of MS SQL Server Traditionally protected with OCC, or SI, supplemented with explicitly requested blocking locks; PostgreSQL now has SSI as alternative Potential Race Conditions

7 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.7 Different transactions or statements can see different versions of the database state, which all exist within the database at once Access uses a “snapshot” to control the version it sees Due to atomicity, the work of a transaction does not become visible to other transactions until it commits When two transactions are both running, neither can see the “in flight” work of the other Blocking is not needed to prevent seeing partial work MVCC: Why Is It Different?

8 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.8 Let's say there is a bank that will cover a shortage in one account with money from another of that person's accounts, as an “overdraft protection” service -– (simplified) setup create table account ( name text not null, type text not null, balance money not null default '0.00'::money, primary key (name, type) ); insert into account values ('kevin','saving', 500), ('kevin','checking', 500); Overdraft Protection Example: Setup

9 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.9 –- session 1 begin; select type, balance from account where name = 'kevin'; update account set balance = balance - 900::money where name = 'kevin' and type = 'saving'; commit; –- session 2 begin; select type, balance from account where name = 'kevin'; update account set balance = balance - 900::money where name = 'kevin' and type = 'checking'; commit; Overdraft Protection Example: Exploit

10 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.10 The Simple Solution: The “I” in ACID Isolation can and should prevent race conditions automatically.

11 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.11 ACID: Defined Atomicity: A set of changes grouped into a database transaction must appear atomically, both to concurrent transactions and in terms of crash recovery. Consistency: Each transaction must move the database from one consistent state (with regard to business rules) to another. Isolation: For programming efficiency, each transaction must be able to be coded independently of what other transactions may happen to be running at the same time. Durability: In the event of a crash, all modifications made by transactions for which the application was notified of successful completion, and all modifications which had become visible to other transactions, must still be completed upon crash recovery.

12 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.12 SQL-92 (and later): “The isolation level of a SQL- transaction is SERIALIZABLE by default. The level can be explicitly set by the.” “The execution of concurrent SQL-transactions at isolation level SERIALIZABLE is guaranteed to be serializable. A serializable execution is defined to be an execution of the operations of concurrently executing SQL-transactions that produces the same effect as some serial execution of those same SQL- transactions. A serial execution is one in which each SQL-transaction executes to completion before the next SQL-transaction begins.” ACID Implies Serializable Isolation; The SQL Standard Agrees

13 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.13 SELECT … FOR UPDATE SELECT … FOR SHARE LOCK TABLE Advisory locks To worry about what else is running at the same time With Serializable Isolation You Don't Need

14 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.14 default_transaction_isolation = 'serializable' Retry transactions that throw a serialization failure With Serializable Isolation You Do Need

15 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.15 default_transaction_isolation = 'serializable' Retry transactions that throw a serialization failure Optimizations that can help  Declare transactions as READ ONLY when possible  Increase max_pred_locks_per_transaction  Use a connection pooler  Decrease random_page_cost and/or increase cpu_tuple_cost  Pay attention to transaction scope  Avoid “idle in transaction” connections for long periods With Serializable Isolation You Do Need

16 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.16 Begin loop Begin transaction isolation serializable Try Catch sqlstate in ('40001', '40P01') Rollback transaction and discard any results Continue loop Commit transaction End loop Retrying Serialization Failures

17 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.17 Solving it the Hard Way: Blocking You can force actual serialization of operations in application code.

18 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.18 Two-phase locking techniques can be used under MVCC, although by doing so you give up the MVCC concurrency benefits to some degree or other To solve the problem this way, you must either essentially implement strict two-phase locking (S2PL) yourself in application code, or analyze what transactions you have and where they could create anomalies so that you can address those specific points There are three main ways to cover race conditions under weaker isolation levels:  Serially schedule conflicting tasks  Promote conflicts (e.g., SELECT … FOR UPDATE)  Materialize conflicts (e.g., add summary columns; always updated) Using Blocking With MVCC

19 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.19 A queue of desired changes to the account table could be created, and the requests serviced one-at-a-time. As long as there is no overlap between transactions updating the table, the race condition is avoided. Transactions making changes like this could issue a LOCK TABLE statement which would prevent concurrent data modification at the front of every transaction which will need to validate against other accounts like this. As long as the lock is acquired before any statement needs a snapshot, a valid test can be made. Overdraft Protection Example: Protection through Serializing Tasks

20 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.20 –- session 1 begin; select type, balance from account where name = 'kevin' for update; update account set balance = balance - 900::money where name = 'kevin' and type = 'saving'; commit; –- session 2 begin; select type, balance from account where name = 'kevin' for update; update account set balance = balance - 900::money where name = 'kevin' and type = 'checking'; commit; Overdraft Protection Example: Protection through Conflict Promotion

21 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.21 Add a customer table; ensure referential integrity from the account table. Create a trigger function, and triggers, to ensure that a total balance for the customer is maintained in the customer table whenever there is an INSERT, UPDATE, or DELETE against the account table. Take special care for updates of the primary key column(s). Determine how the redundant data will be periodically checked for accuracy. Overdraft Protection Example: Protection through Conflict Materialization

22 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.22 Future Improvements for Serializable Transactions Sadly, the feature is not perfect – yet.

23 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.23 Currently, you might want to check for the presence of a row by a primary key, unique constraint, or unique index and add the row if missing. If a concurrent transaction adds the same row, it will not be clear that this is a serialization failure, and the retry loop shown on a previous slide will not work without modification. The new UPSERT capability might be useful to work around this. A patch has been submitted (for a later major release) to improve this. Duplicate Key on Unique Index: Serialization Failure, or Not?

24 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.24 Internally, reads and writes by transactions need to be tracked and compared to detect read-write conflicts; heap tuples are locked to track tuples which were read, and index “gaps” are locked to track what tuples were not read (and thus have a read-write dependency on later inserts). There is a flag in the AM definition to say whether the AM understands this and implements support. When this is false, a read- write dependency is created between any read from the index and any write which adds a tuple to the index. The btree AM is currently the only one with support. Index AMs besides btree

25 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.25 It would reduce false positives if we could enhance the granularity of btree SIRead locks Related to this, we should have some explicit flag to indicate whether an SIRead lock is covering a tuple or a gap, as this could optimize heap access a little Btree Only Goes to Page Granularity

26 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.26 One university study of a SERIALIZABLE technique they were hoping would perform better than what PostgreSQL uses found that with the very high number of connections creating many read-write dependencies PostgreSQL used about 50% of the CPU time on the machine recording and testing read-write dependencies, due to a O(N^2) technique for tracking these. We were faster than the alternative anyway. It would still be very good to fix this. O(N^2) Bottleneck at High Concurrency

27 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.27 Because even a read-only transaction can contribute to a serialization anomaly, and a hot standby does not have access to the internal state related to tracking this, serializable transactions are not allowed on a hot standby. It would be possible to pass minimal data from the primary to standbys in the WAL stream to indicate at what points a transaction could get a snapshot which would behave in a serializable fashion. A request for a serializable transaction could then use a safe snapshot. Serializable Transactions on Hot Standby

28 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.28 The techniques used for serializable transactions in PostgreSQL are somewhat optimistic – assuming low direct contention, and perform poorly when that optimism is misplaced. The classic example is when using a table as a queue, where there is high contention on each end of the queue. This is an exception to the “don't need explicit” locks case, at least for good performance. Good news: there is work underway to make serializable transactions the best performing option when queue access does not need to be strictly ordered (and therefore serialized). High Contention Points: Queue Tables

29 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.29 In spite of the room for improvement, the technique used for serializable transactions in PostgreSQL has benchmarked faster than the alternatives for most workloads, requires less programming effort to use, and is less error-prone. See the paper referenced in the next slide for benchmarks performed at MIT, some of which were confirmed during peer review for the presentation of that paper at the VLDB conference. This technique has been in production use at many sites for years. Keep Calm and Try It!

30 © Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.30 PostgreSQL Transaction Isolation Documentation: http://www.postgresql.org/docs/current/static/transaction-iso.html Examples of serializable transactions helping: https://wiki.postgresql.org/wiki/SSI Internals, development, and benchmark: http://drkp.net/papers/ssi-vldb12.pdf Additional Resources...


Download ppt "© Copyright EnterpriseDB Corporation, 2015. All Rights Reserved.1 Isolation Levels in PostgreSQL Kevin Grittner | 2016.02.05 | PgConf.Russia 2016."

Similar presentations


Ads by Google