Arrested by the CAP Handling Data in Distributed Systems


1 Arrested by the CAP Handling Data in Distributed Systems
Aviran Mordo, VP of Engineering, Wix.com linkedin/aviran aviransplace.com

2 Service A Service B System A, two systems

3 What is this arrow? Service A Service B
The arrow represents a distributed system

4 Microservices = Distributed System
eCom Catalog sub-system

5 Over 800 Microservices (unique) in Production

6 Hello Aviran Mordo, VP of Engineering, Wix.com @aviranm

7

8 Wix.com in Numbers
• 130M website builders (+2M monthly)
• 600M monthly visitors
• Multiple clouds & data centers (Google, Amazon)
• Over 800 microservices
• 2000 employees (~50% R&D)
• #5 among the best software companies to work for worldwide (according to Glassdoor)

9 AGENDA
• Avoiding database transactions
• Handling database schema changes
• Read consistency in a distributed system
• Dealing with multiple datacenters

10 01 Avoid DB Transactions

11 Create an Invoice

12 Create an Invoice Header Multiple line items Master – details tables

13 Create an Invoice Header Save as Transaction Multiple line items

14 Create a Site Multiple Pages
Just like an invoice with multiple line items, we save a site with multiple pages

15 How do we save multiple pages in a transaction (without DB transaction)?

16 Replace DB Transaction with Logical Transaction

17 Saving a Wix Site’s Data
Diagram: Browser -> Editor Server -> Site Pages DB (save page(s)) and Site Header DB (save header, the list of page IDs)
• Save each page as an atomic operation
• Finalize the transaction by sending the site header (pointers to pages)
• Can generate orphaned pages, not a problem in practice
Logical DB transaction
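The flow on this slide can be sketched roughly as follows. This is a minimal illustration, not Wix's implementation: the two dictionaries stand in for the Site Pages DB and the Site Header DB, and the names `save_page`, `save_site`, `load_site` and the use of random UUIDs as page IDs are assumptions made for the example.

```python
import uuid

pages_db = {}    # stand-in for the Site Pages DB (hypothetical)
headers_db = {}  # stand-in for the Site Header DB (hypothetical)


def save_page(content):
    """Save one page as a single atomic write and return its ID."""
    page_id = uuid.uuid4().hex  # assumption: any unique ID works for this sketch
    pages_db[page_id] = content
    return page_id


def save_site(site_id, pages):
    # Step 1: each page is its own atomic write. If this fails midway,
    # the pages already written become orphans, but the existing header
    # still points at the previous, complete set of pages.
    page_ids = [save_page(page) for page in pages]
    # Step 2: writing the header (the list of page IDs) is the single
    # write that "commits" the logical transaction.
    headers_db[site_id] = page_ids


def load_site(site_id):
    """Readers only follow the header, so orphaned pages are never seen."""
    return [pages_db[page_id] for page_id in headers_db[site_id]]
```

Because readers reach pages only through the header, a half-finished save is invisible; the cost is storage for orphaned pages.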

18 Master-Master Replication across DCs
MySQL Active-Active: Pages MySQL in DC-1 and Pages MySQL in DC-2 replicate to each other. Replicating data across DCs can cause conflicts.

19 Write Traffic may Flow to Both Datacenters
Diagram: one browser saves a page to Pages MySQL in DC-1 while another browser saves a page to Pages MySQL in DC-2.

20 Replication Conflict
Wix users change millions of pages every day.
MySQL strategy: stop replication, or ignore the conflict (drop incoming).

21 Avoiding Replication Conflicts
Page ID is a content-based hash:
• Immutable data
• Idempotent operation
DB conflicts can be safely ignored as the content is identical.
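A small sketch of why a content-based ID makes the conflict harmless. The slide does not say which hash is used; SHA-256 is an assumption here, and the dictionaries and the `replicate` helper are hypothetical stand-ins for the two Pages MySQL instances and their replication stream.

```python
import hashlib


def page_id(content: bytes) -> str:
    """Derive the page ID from the content itself (SHA-256 is an assumption)."""
    return hashlib.sha256(content).hexdigest()


def save_page(db: dict, content: bytes) -> str:
    pid = page_id(content)
    # Pages are immutable: writing the same ID again stores the same bytes,
    # so the operation is idempotent.
    db.setdefault(pid, content)
    return pid


def replicate(source: dict, target: dict) -> int:
    """Copy rows to the other DC, dropping incoming rows on a key conflict."""
    dropped = 0
    for pid, content in source.items():
        if pid in target:
            dropped += 1  # conflict: same ID implies identical content
        else:
            target[pid] = content
    return dropped
```

If the same page is saved in both DCs, both compute the same ID, and the dropped incoming row carries exactly the bytes that are already stored.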

22 02 Database & Schema Changes

23 No Downtime

24 Database Changes
1. Add fields
2. Remove fields
3. Complete schema / database change
Altering very large tables may take a very long time and cause downtime.

25 Database Changes
1. Add fields
1.1. For adding metadata (non-indexed fields): use a blob field for schema flexibility (JSON works really well).
2. Remove fields
3. Complete schema / database change

26 Database Changes
1. Add fields
1.1. For adding metadata (non-indexed fields): use a blob field for schema flexibility (JSON works really well).
1.2. If the fields are searchable (indexed fields): use another table and join by primary key.
2. Remove fields
3. Complete schema / database change

27 Database Changes
1. Add fields
1.1. For adding metadata (non-indexed fields): use a blob field for schema flexibility (JSON works really well).
1.2. If the fields are searchable (indexed fields): use another table and join by primary key.
2. Remove fields: stop using them in the code. Do not make any DB schema changes.
3. Complete schema / database change

28 Database Changes
1. Add fields
1.1. For adding metadata (non-indexed fields): use a blob field for schema flexibility (JSON works really well).
1.2. If the fields are searchable (indexed fields): use another table and join by primary key.
2. Remove fields: stop using them in the code. Do not make any DB schema changes.
3. Complete schema / database change: lazy migration.
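As an illustration of points 1.1 and 1.2, the sketch below adds a new metadata field without altering the table, and keeps a searchable field in a separate table joined by primary key. SQLite is used only so the example is self-contained (the talk is about MySQL), and the `products` / `product_sku` tables and helper names are invented for the example.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Main table: fixed columns plus a JSON blob for non-indexed metadata.
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, metadata TEXT)")
# Searchable field lives in its own table, sharing the primary key.
conn.execute("CREATE TABLE product_sku (product_id INTEGER PRIMARY KEY, sku TEXT)")
conn.execute("CREATE INDEX idx_product_sku ON product_sku (sku)")


def save_product(product_id, name, sku, **metadata):
    conn.execute("INSERT OR REPLACE INTO products VALUES (?, ?, ?)",
                 (product_id, name, json.dumps(metadata)))
    conn.execute("INSERT OR REPLACE INTO product_sku VALUES (?, ?)", (product_id, sku))


def find_by_sku(sku):
    row = conn.execute(
        "SELECT p.name, p.metadata FROM product_sku s "
        "JOIN products p ON p.id = s.product_id WHERE s.sku = ?", (sku,)).fetchone()
    if row is None:
        return None
    name, blob = row
    return {"name": name, **json.loads(blob)}


save_product(1, "Aviator", "AV-1", color="black")
# A new field ("polarized") appears with no ALTER TABLE on the large table.
save_product(2, "Wayfarer", "WF-2", color="brown", polarized=True)
```

Old rows simply lack the new key, so readers should treat missing keys as "not set".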

29 Feature Toggles

30 Feature Toggle = Code branch
Diagram: if the FT is open, run the new code; otherwise run the old code.
Mitigate risk by gradually exposing a feature.

31 Feature Toggle = Code branch
Not just a Boolean, can also be a state. Can have criteria:
• Company employees
• Specific users / group
• Percentage of traffic
• By GEO
• By language
• By user-agent
• User profile based
• Any other context…
Mitigate risk by gradually exposing a feature.
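A feature toggle with a few of the criteria listed above might look like the sketch below. The `FeatureToggle` class, its field names and the hashing scheme for the traffic percentage are all assumptions for illustration; the slides do not describe how Wix implements toggles.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class FeatureToggle:
    name: str
    open_for_employees: bool = False
    open_for_users: frozenset = frozenset()
    traffic_percent: int = 0  # share of remaining traffic, 0..100

    def _bucket(self, user_id: str) -> int:
        # Stable bucket in 0..99, so a given user always gets the same answer.
        digest = hashlib.sha256(f"{self.name}:{user_id}".encode()).digest()
        return int.from_bytes(digest[:4], "big") % 100

    def is_open(self, user_id: str, is_employee: bool = False) -> bool:
        if self.open_for_employees and is_employee:
            return True
        if user_id in self.open_for_users:
            return True
        return self._bucket(user_id) < self.traffic_percent


def render_editor(toggle: FeatureToggle, user_id: str, is_employee: bool = False) -> str:
    # The toggle is just a code branch between the new and the old path.
    if toggle.is_open(user_id, is_employee):
        return "new code"
    return "old code"
```

Raising `traffic_percent` step by step exposes the new code path gradually, and setting it back to 0 closes the toggle again without a deploy.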

32 New DB Schema with Data Migration
• Plan a lazy migration path controlled by a feature toggle
• Deploy the new schema/DB

33 Distributed Transaction
#1 Write to old / Read from old
#2 Write to both (first old, then new) / Read from old
#3 Write to both / Read from new, fall back to old
#4 Write only to new / Read from new, fall back to old
#5 Eagerly migrate data in the background
#6 Write to and read from new; remove the migration code
Warning! Writing to both is a distributed transaction: fail on a failed write to old, “ignore" a failure on new.
Backward compatibility is a must!
Point of no return: once writes go only to new, your old DB is read-only and will not change.
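The six steps can be pictured as one store wrapper whose behavior depends on the current phase, with the phase driven by a feature toggle. This is a sketch under assumptions: the `MigratingStore` class is invented for the example, plain dictionaries stand in for the old and new databases, and step #1 is taken to be "write to old / read from old".

```python
class MigratingStore:
    """Routes reads and writes between an old and a new store by phase (1..6)."""

    def __init__(self, old, new, phase=1):
        self.old, self.new, self.phase = old, new, phase

    def write(self, key, value):
        if self.phase <= 3:
            # Phases 1-3: old is the source of truth; a failure here fails the write.
            self.old[key] = value
        if self.phase >= 2:
            try:
                self.new[key] = value
            except Exception:
                if self.phase >= 4:
                    raise  # phases 4+: new is the source of truth
                # Phases 2-3: "ignore" the failure, old still holds the data.

    def read(self, key):
        if self.phase <= 2:
            return self.old[key]
        if key in self.new:
            return self.new[key]
        if self.phase < 6:
            return self.old[key]  # lazy fallback for rows not migrated yet
        raise KeyError(key)

    def migrate_in_background(self):
        # Phase 5: copy whatever was never touched; never overwrite newer data.
        for key, value in self.old.items():
            self.new.setdefault(key, value)
```

Each phase change is a toggle flip, so phases 2 and 3 can be rolled back freely; from phase 4 on the old store no longer receives writes.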

34 Remove old DB

35 03 Consistent Read

36 Glasses.com Store owner Customer
In this use case we have 2 actors that need different consistency levels

37 Store owner updates a product’s details
Diagram: UpdateProduct(…) -> Product Service -> save data -> Master DB, which replicates to the Slave DB

38 Customer wants to view a product
Diagram: GetProduct(…) -> Product Service -> read data; the Master DB replicates to the Slave DB

39 Store owner wants to view a product for update
Usually not an issue...
Diagram: GetProduct(…) -> Product Service -> read data; the Master DB replicates to the Slave DB

40 Store owner wants to view a product for update
...unless there’s a replication lag.
Diagram: GetProduct(…) -> Product Service -> read data; the Master DB replicates to the Slave DB

41 Store owner wants to view a product for update
Separate API for consistent reads: GetConsistentProduct(…). Good for read after write.
Diagram: GetConsistentProduct(…) -> Product Service -> read data; the Master DB replicates to the Slave DB
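The difference between the two read APIs can be sketched as below. It assumes, as the replication-lag slides suggest, that the regular read is served from the replica and the consistent read from the master; the `ProductService` class and the explicit `replicate` step are simplifications for the example.

```python
class ProductService:
    def __init__(self):
        self.master = {}   # stand-in for the Master DB
        self.replica = {}  # stand-in for the Slave DB

    def update_product(self, product_id, details):
        self.master[product_id] = details

    def replicate(self):
        # In a real system this happens asynchronously, with some lag.
        self.replica.update(self.master)

    def get_product(self, product_id):
        """Fast read, may be stale while replication lags."""
        return self.replica.get(product_id)

    def get_consistent_product(self, product_id):
        """Read after write: always sees the latest committed update."""
        return self.master.get(product_id)
```

A customer browsing the store can tolerate the stale read; the store owner who just saved a change cannot, which is why only that flow pays for the master read.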

42 04 Multiple Datacenters

43 Multiple Data Centers
Diagram: DC-1 and DC-2 each run a Product Service serving GetConsistentProduct(…), with a Master DB and a Slave DB per DC; data replicates within each DC and across DCs.

44 Cross DC Replication Lag
Diagram: same layout as the previous slide; replication lag between DC-1 and DC-2 leads to inconsistent data.

45 Cross DC Flows
Diagram: same layout, with a Load Balancer in front of the Product Service in each DC.

46 Option 1 Pin APIs to Active DC

47 GetConsistentProduct(…)
Configure Master DC in the LB. Configure API-level stickiness.
Diagram: one DC is marked as the Master DC; the Load Balancers in DC-1 and DC-2 route GetConsistentProduct(…) to the Product Service in the Master DC. Each DC has a Master DB and a Slave DB, with replication within and across DCs.

48 GetConsistentProduct(…)
Configure Master DC in the LB. Configure API-level stickiness.
Pros:
• Fine-grained control over the API
• No changes to the service
Cons:
• Complicated LB configuration
• Multiple connection strings (one for the master DB and one for the replica DB)
Diagram: same as the previous slide.
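API-level stickiness amounts to a routing rule keyed on the API name. A real load balancer is configured rather than written in Python; the sketch below only shows the decision, and `MASTER_DC`, `STICKY_APIS` (including the choice to pin `UpdateProduct`) and `route` are hypothetical names.

```python
MASTER_DC = "DC-1"  # assumption: which DC is currently the master
STICKY_APIS = {"GetConsistentProduct", "UpdateProduct"}  # APIs pinned to the master DC


def route(api: str, local_dc: str) -> str:
    """Return the DC whose Product Service should handle this call."""
    return MASTER_DC if api in STICKY_APIS else local_dc
```

Every pinned API is one more entry in the LB configuration, which is where the "complicated LB configuration" cost comes from.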

49 Option #2 Separate read/write Services

50 Configure Master DC in the LB. Configure Service-level Stickiness
Diagram: each DC runs a Product Write Service and a Product Read Service behind a Load Balancer; GetConsistentProduct(…) is routed to the Master DC. Each DC has a Master DB and a Slave DB, with replication within and across DCs.
Separating the services also helps scaling. There is no need for two DB connection strings (one for the master and one for the replica) as in the previous example.

51 Configure Master DC in the LB. Configure Service-level Stickiness
Pros:
• No multiple DB connection strings
• Simpler LB configuration
• Fits microservices architecture best practice
• Better for scaling read services
Cons:
• More complicated system (adding another microservice)
• Additional service for the client to talk with
Diagram: same as the previous slide.

52 Option #3 Pin DB to Service using SQLProxy

53 Configure Master DC in the SQL Proxy
Diagram: in each DC a Load Balancer fronts the Product Service, which reaches the databases through a SQL Proxy; GetConsistentProduct(…) is served from the Master DC. Each DC has a Master DB and a Slave DB, with replication within and across DCs.

54 Configure Master DC in the SQL Proxy
Pros:
• Simple microservice DB configuration
• DB replication lag monitoring
• Adds DB maintenance flexibility
Cons:
• Adds DB access latency
• Takes control away from the developers
Diagram: same as the previous slide.

55 Option #4 Redirect Client

56 Client Routing
Diagram: the browser calls GetProduct(…) on the Product Service in DC-1 or DC-2; each DC has a Master DB and a Slave DB, with replication within and across DCs.

57 Client Routing
Diagram: as above, and the browser sends GetConsistentProduct(…) directly to the Product Service in the Master DC.

58 Client Routing
Pros:
• Fine-grained control over the API
• Simpler DC configuration
Cons:
• Complicated client configuration
• Traffic changes require updating all clients with the new config
Diagram: same as the previous slide.

59 RECAP
Option 1: API-level cross DC
Option 2: Separate Service
Option 3: ProxySQL (pin to DC)
Option 4: Client routing

60 WHAT WE DO AT WIX
Option 1: API-level cross DC
Option 2: Separate Service
Option 3: ProxySQL (pin to DC)
Option 4: Client routing

61 Informing the users of eventual consistency processes
Your changes are being applied, it may take a few minutes to show up on the site…

62 Client API should remain simple
Client API should not be aware of consistency concerns.
Diagram: the client calls the API server, which checks "Is Store Owner". Yes: GetConsistentProduct(…) on the Product Write Service (Master). No: GetProduct(…) on the Product Read Service (Slave DB). The Master replicates to the Slave DB.
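One reading of this slide is that the client always makes the same call and the API server picks the consistency level from the caller's role. The sketch below follows that reading; the function names and the two dictionaries standing in for the Master and the Slave DB are assumptions.

```python
master_db = {"p1": {"price": 12}}  # latest write
slave_db = {"p1": {"price": 10}}   # replica that is lagging behind


def get_consistent_product(product_id):
    """Served by the Product Write Service, backed by the Master."""
    return master_db[product_id]


def get_product_from_replica(product_id):
    """Served by the Product Read Service, backed by the Slave DB."""
    return slave_db[product_id]


def api_get_product(product_id, is_store_owner):
    # The client never chooses a consistency level; the API server does.
    if is_store_owner:
        return get_consistent_product(product_id)
    return get_product_from_replica(product_id)
```

Keeping the decision on the server means the routing rule can change without shipping new client code.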

63 Arrow -> Distributed System
• Avoiding database transactions
• Handling database schema changes
• Read consistency in a distributed system
• Dealing with multiple datacenters

64 Thank You Download presentation at: http://wix.to/sUCZAGY
linkedin/aviran aviransplace.com

