Presentation is loading. Please wait.

Presentation is loading. Please wait.

Multi-Tenant Databases for SaaS (Software as a Service)

Similar presentations


Presentation on theme: "Multi-Tenant Databases for SaaS (Software as a Service)"— Presentation transcript:

1 Multi-Tenant Databases for SaaS (Software as a Service)
11/17/08 (Some of the Slides are from Alfons Kemper)

2 Outline Motivations Multi-tenancy database (MTD)
MTD database schema layouts Existing: basic, private, extension, universal, pivot New: chunk table & chunk folding Querying chunk tables Summary

3 SaaS & Multi-Tenancy SaaS (Software-as-a-Service)
Software managed by SaaS vendor, run on SaaS server Tenancy = client organization ~10 users, small/medium business Multi-instance separate instance for each tenant Multi-tenancy single instance of software serving multiple tenants

4 Why Multi-Tenancy Economy of scale Examples? large number of tenants
sharing the cost of single software instance cost saving for each individual user Examples?

5 Multi-Tenancy Example

6 Multi-Tenancy in Practice
Big iron # tenants per database 10000 1000 100 Size of Machine 10000 1000 100 10 10000 1000 100 10 1 Blade Proj Mgmt CRM ERP Banking Small Large Complexity of Application Economy of scale decreases with application complexity At the sweet spot, compare TCO of 1 versus 100 databases

7 A Typical Blade Server IBM HS20 (Wikipedia) IBM BladeCenter HS12:
CPU (single, dual, quad-core xeon 2~3 GHz) Memory (max): 24 GB Internal storage (max): 293 GB 24G / (10k tenant) = 2.4M/tenant

8 Multi-Tenancy in Practice
Big iron # tenants per database 10000 1000 100 Size of Machine 10000 1000 100 10 10000 1000 100 10 1 Blade Proj Mgmt CRM ERP Banking Small Large Complexity of Application Economy of scale decreases with application complexity At the sweet spot, compare TCO of 1 versus 100 databases

9 Alternative: Multi-Instance + Virtualization

10 Multi-Tenant Databases (MTD)
Consolidating multiple businesses onto same operational system Consolidation factor dependent on size of the application and the host machine Support for schema extensibility Essential for ERP applications Support atop of the database layer Non-intrusive implementation Query transformation engine maps logical tenant-specific tables to physical tables Various problems, for example: Various table utilization (“hot spots”) Metadata management when handling lots of tables

11 Outline Motivations Multi-tenancy database (MTD)
MTD database schema layouts Existing: basic, private, extension, universal, pivot New: chunk table & chunk folding Querying chunk tables Summary

12 Classic Web Application (Basic Layout)
Pack multiple tenants into the same tables by adding a tenant id column Great consolidation but no extensibility Account TenId AcctId Name ... 17 1 Acme 17 2 Gump 35 1 Ball 42 1 Big

13 Private Table Each tenant gets his/her own private schema
No sharing SQL transformation: Renaming only High meta-data/data ratio Lots of tables: linear in # of tenants, each with own meta-data Low buffer utilization (~8k for index/data for each table) Account TenId AcctId Name ... 17 1 Acme 17 2 Gump 35 1 Ball 42 1 Big

14 CRM Schema

15 Handling Lots of Tables
Simplifying assumption: No extensibility Experiment setup: CRM schema with 10 tables 10,000 tenants are packed onto one DBMS (DB2, 1G Memory) Data set size remains constant Parameter: Schema Variability Number of tenants per schema instance 0 (least variable): all tenants share one instance (= 10 tables) 1 (most variable): each tenant has separate instance (= 100k tables)

16 Handling Lots of Tables – Results
10 fully shared Tables private Tables

17 Extension Table Split off the extensions into separate tables
Additional join at runtime Row column for reconstructing the row (discussion: consider Acct17) Account17(A, N, H, B) = select Ae.A, Ae.N, Ha.H, Ha.B from Account_ext as Ae, Healthcare_account as Ha where Ae.Tenant = Ha.Tenant & Ae.Row = Ha.Row and Ae.Tenant = 17

18 Extension Table (Cont’d)
Good: Better consolidation than Private Table layout common attributes go to same table Bad: Number of tables still grows in proportion to number of tenants tenants in same domain may often have varied schema so large # of *extension* tables (such as Healthcare_account)

19 Universal Table Generic structure with VARCHAR value columns
n-th column of a logical table is mapped to ColN in an universal table Extensibility: # of columns may expand as needed Disadvantages Very wide rows  Many NULL values Not type-safe  Casting necessary No index support (note column index not very meaningful) logical table (private after renaming)

20 Pivot Table Each field of a row in logical table is given its own row.
Multiple pivot tables for each type (int, string, e.g.) Int String

21 Reconstruct Tenant Table from Pivot Table
Row: 0 Account17(H, B) = select Ps.Str, Pi.int from Pivot_str as Ps, Pivot_int as Pi where Ps.Tenant = Pi.Tenant & Ps.Table = Pi.Table & Ps.Row = Pi.Row & Ps.Col = 2 & Ps.Col = 3 & Ps.Tenant = 17 & Ps.Table = 0 Int align (Only Hospital & Beds) String

22 Reconstruct Tenant Table (cont’d)
align: same tenant, table, row Row: 0 Account17(A, N, H, B) = select Pi’.int, Ps’.str, Ps.Str, Pi.int from Pivot_int as Pi’, Pivot_str as Ps’, Pivot_str as Ps, Pivot_int as Pi where Pi’.Tenant = Ps’.Tenant & Ps’.Tenant = Ps.Tenant & Ps.Tenant = Pi.Tenant & Pi’.Table = Ps’.Table & Ps’.Table = Ps.Table & Ps.Table = Pi.Table & Pi’.Row = Ps’.Row & Ps’.Row = Ps.Row & Ps.Row = Pi.Row & Pi’.Col = 0 & Ps’.Col = 1 & Ps.Col = 2 & Ps.Col = 3 & Ps.Tenant = 17 & Ps.Table = 0 4-way join Int String

23 Pivot Table: Performance
Row: 0 Generic type-safe structure Eliminates handling many NULL values Performance Depends on the column selectivity of the query (number of reconstructing joins) E.g., query on A17(H,B) is more selective than A17(A,N,H,B) See previous example Int String

24 Pivot Table  Chunk Table
Row: 0 Row: 0 Chunk 0 Chunk 1 Field 0 Field 1 Field 2 Field 3 one row for each field one row for each chunk Chunk table Pivot table

25 How to Chunk or Fragment Original Rows
Many possible fragmentations One idea: group fields by their “popularity”

26 Chunk Table 3. reconstruct original row Account17(A, N, H, B) =
select Cis.Int1, Cis.Str1, Cis’.Str1, Cis’.Int1 from Chunk Cis, Chunk Cis’ where Cis.Chunk = 0 & Cis’.Chunk =1 & Cis.Row = Cis’.Row & Cis.Tenant = Cis’.Tenant & Cis.Table = Cis’.Table Row: 0 Chunk 0 Chunk 1 2-way join 2. merge chunk 1. align rows

27 Chunk Table vs. Universal Table
# of rows (for each original row) = # of chunks Universal as extreme chunking: only one chunk per row Chunk 0 only one row for each original row

28 Chunk Table Performance
Fewer joins for reconstruction Indexable Reduced meta-data/data ratio dependant on chunk size Row: 0 Chunk 0 Chunk 1

29 Chunk Folding: Alternative Chunking
Mixes Extension and Chunk Tables Optimal row fragmentation depends on, e.g. Workload Data distribution Data popularity base table chunk the extensions

30 Querying Chunk Tables Query Transformation Compilation Scheme:
Row reconstruction needs many self- and equi-joins Can be automatically translated Compilation Scheme: Collect all table names and their corresponding columns from the logical source query Obtain table definitions: for each table obtain the Chunk Tables and the meta-data identifiers representing the used columns generate a query that filters the correct columns (based on the meta-data identifiers) and aligns the different chunk relations on their ROW column. Extend each table reference in the logical source query by corresponding table definition (obtained in step 2)

31 Query Example: Step 1 Step 1 result: tables = {Account17}
columns = {Account17.Beds, Account17.Hospital}

32 Query Example: Step 2 (Pivot Table)
String Int

33 Query Example: Step 3 (Chunk Table)

34 Join Overhead Costs Join Overhead No aligning joins

35 Summary Multi-tenancy database critical to scale SaaS solution
Varied schema layout schemes different degrees of consolidation & extensibility optimal layout depends on particular data set, work load, etc. Novel Chunk Table layout wider chunks tend to achieve better performance improvement slows down at some point (e.g., |chunk| = 15) response time using chunking may approach conventional tables Need more work on finding good layout w/ multiple factors


Download ppt "Multi-Tenant Databases for SaaS (Software as a Service)"

Similar presentations


Ads by Google