Presentation is loading. Please wait.

Presentation is loading. Please wait.

ITCS6010/8010 1 Multi-Tenant Databases for SaaS (Software as a Service) 11/17/08 (Some of the Slides are from Alfons Kemper)

Similar presentations


Presentation on theme: "ITCS6010/8010 1 Multi-Tenant Databases for SaaS (Software as a Service) 11/17/08 (Some of the Slides are from Alfons Kemper)"— Presentation transcript:

1 ITCS6010/ Multi-Tenant Databases for SaaS (Software as a Service) 11/17/08 (Some of the Slides are from Alfons Kemper)

2 ITCS6010/ Outline Motivations Multi-tenancy database (MTD) MTD database schema layouts –Existing: basic, private, extension, universal, pivot –New: chunk table & chunk folding Querying chunk tables Summary

3 ITCS6010/ SaaS & Multi-Tenancy SaaS (Software-as-a-Service) –Software managed by SaaS vendor, run on SaaS server Tenancy = client organization –~10 users, small/medium business Multi-instance –separate instance for each tenant Multi-tenancy –single instance of software serving multiple tenants

4 ITCS6010/ Why Multi-Tenancy Economy of scale –large number of tenants –sharing the cost of single software instance cost saving for each individual user Examples?

5 ITCS6010/ Multi-Tenancy Example

6 ITCS6010/ Multi-Tenancy in Practice Complexity of Application Size of Machine Blade Big iron Small Large CRM 10 ERP 1000 Proj Mgmt 1 Banking # tenants per database Economy of scale decreases with application complexity At the sweet spot, compare TCO of 1 versus 100 databases

7 ITCS6010/ A Typical Blade Server IBM HS20 (Wikipedia) IBM BladeCenter HS12: CPU (single, dual, quad-core xeon 2~3 GHz) Memory (max): 24 GB Internal storage (max): 293 GB 24G / (10k tenant) = 2.4M/tenant

8 ITCS6010/ Multi-Tenancy in Practice Complexity of Application Size of Machine Blade Big iron Small Large CRM 10 ERP 1000 Proj Mgmt 1 Banking # tenants per database Economy of scale decreases with application complexity At the sweet spot, compare TCO of 1 versus 100 databases

9 ITCS6010/ Alternative: Multi-Instance + Virtualization

10 ITCS6010/ Multi-Tenant Databases (MTD) Consolidating multiple businesses onto same operational system –Consolidation factor dependent on size of the application and the host machine Support for schema extensibility –Essential for ERP applications Support atop of the database layer –Non-intrusive implementation –Query transformation engine maps logical tenant-specific tables to physical tables –Various problems, for example: –Various table utilization (hot spots) –Metadata management when handling lots of tables

11 ITCS6010/ Outline Motivations Multi-tenancy database (MTD) MTD database schema layouts –Existing: basic, private, extension, universal, pivot –New: chunk table & chunk folding Querying chunk tables Summary

12 ITCS6010/ Classic Web Application (Basic Layout) Pack multiple tenants into the same tables by adding a tenant id column Great consolidation but no extensibility Account AcctId 1 2 Name Acme Gump...TenId Ball 421Big

13 ITCS6010/ Private Table Each tenant gets his/her own private schema –No sharing –SQL transformation: Renaming only High meta-data/data ratio –Lots of tables: linear in # of tenants, each with own meta-data –Low buffer utilization (~8k for index/data for each table) Account AcctId 1 2 Name Acme Gump...TenId Ball 421Big

14 ITCS6010/ CRM Schema

15 ITCS6010/ Handling Lots of Tables Simplifying assumption: No extensibility Experiment setup: –CRM schema with 10 tables –10,000 tenants are packed onto one DBMS (DB2, 1G Memory) –Data set size remains constant Parameter: Schema Variability –Number of tenants per schema instance –0 (least variable): all tenants share one instance (= 10 tables) –1 (most variable): each tenant has separate instance (= 100k tables)

16 ITCS6010/ Handling Lots of Tables – Results 10 fully shared Tables private Tables

17 ITCS6010/ Extension Table Split off the extensions into separate tables –Additional join at runtime –Row column for reconstructing the row (discussion: consider Acct17) Account17(A, N, H, B) = select Ae.A, Ae.N, Ha.H, Ha.B from Account_ext as Ae, Healthcare_account as Ha where Ae.Tenant = Ha.Tenant & Ae.Row = Ha.Row and Ae.Tenant = 17

18 ITCS6010/ Extension Table (Contd) Good: Better consolidation than Private Table layout –common attributes go to same table Bad: Number of tables still grows in proportion to number of tenants –tenants in same domain may often have varied schema –so large # of *extension* tables (such as Healthcare_account)

19 ITCS6010/ Universal Table Generic structure with VARCHAR value columns –n-th column of a logical table is mapped to ColN in an universal table –Extensibility: # of columns may expand as needed Disadvantages –Very wide rows Many NULL values –Not type-safe Casting necessary –No index support (note column index not very meaningful) logical table (private after renaming)

20 ITCS6010/ Pivot Table 1. Each field of a row in logical table is given its own row. 2. Multiple pivot tables for each type (int, string, e.g.) Row: 0 Int String

21 ITCS6010/ Reconstruct Tenant Table from Pivot Table Row: 0 Int String Account17(H, B) = select Ps.Str, Pi.int from Pivot_str as Ps, Pivot_int as Pi where Ps.Tenant = Pi.Tenant & Ps.Table = Pi.Table & Ps.Row = Pi.Row & Ps.Col = 2 & Ps.Col = 3 & Ps.Tenant = 17 & Ps.Table = 0 align (Only Hospital & Beds)

22 ITCS6010/ Reconstruct Tenant Table (contd) Row: 0 Int String Account17(A, N, H, B) = select Pi.int, Ps.str, Ps.Str, Pi.int from Pivot_int as Pi, Pivot_str as Ps, Pivot_str as Ps, Pivot_int as Pi where Pi.Tenant = Ps.Tenant & Ps.Tenant = Ps.Tenant & Ps.Tenant = Pi.Tenant & Pi.Table = Ps.Table & Ps.Table = Ps.Table & Ps.Table = Pi.Table & Pi.Row = Ps.Row & Ps.Row = Ps.Row & Ps.Row = Pi.Row & Pi.Col = 0 & Ps.Col = 1 & Ps.Col = 2 & Ps.Col = 3 & Ps.Tenant = 17 & Ps.Table = 0 align: same tenant, table, row 4-way join

23 ITCS6010/ Pivot Table: Performance Generic type-safe structure –Eliminates handling many NULL values Performance –Depends on the column selectivity of the query (number of reconstructing joins) –E.g., query on A17(H,B) is more selective than A17(A,N,H,B) –See previous example Row: 0 Int String

24 ITCS6010/ Pivot Table Chunk Table Row: 0 Chunk 0Chunk 1 Field 0Field 1Field 2Field 3 Pivot table Chunk table one row for each field one row for each chunk

25 ITCS6010/ How to Chunk or Fragment Original Rows Many possible fragmentations One idea: group fields by their popularity

26 ITCS6010/ Chunk Table Row: 0 Chunk 0Chunk 1 Account17(A, N, H, B) = select Cis.Int1, Cis.Str1, Cis.Str1, Cis.Int1 from Chunk Cis, Chunk Cis whereCis.Chunk = 0 & Cis.Chunk =1 & Cis.Row = Cis.Row & Cis.Tenant = Cis.Tenant & Cis.Table = Cis.Table 2. merge chunk1. align rows 3. reconstruct original row 2-way join

27 ITCS6010/ Chunk Table vs. Universal Table Chunk 0Chunk 1 # of rows (for each original row) = # of chunks Chunk 0 only one row for each original row Universal as extreme chunking: only one chunk per row

28 ITCS6010/ Chunk Table Performance Fewer joins for reconstruction Indexable Reduced meta-data/data ratio dependant on chunk size Row: 0 Chunk 0Chunk 1

29 ITCS6010/ Chunk Folding: Alternative Chunking Mixes Extension and Chunk Tables Optimal row fragmentation depends on, e.g. –Workload –Data distribution –Data popularity base table chunk the extensions

30 ITCS6010/ Querying Chunk Tables Query Transformation –Row reconstruction needs many self- and equi-joins –Can be automatically translated Compilation Scheme: 1.Collect all table names and their corresponding columns from the logical source query 2.Obtain table definitions: for each table a.obtain the Chunk Tables and the meta-data identifiers representing the used columns b.generate a query that filters the correct columns (based on the meta-data identifiers) and aligns the different chunk relations on their ROW column. 3.Extend each table reference in the logical source query by corresponding table definition (obtained in step 2)

31 ITCS6010/ Query Example: Step 1 Step 1 result: tables = {Account17} columns = {Account17.Beds, Account17.Hospital}

32 ITCS6010/ Query Example: Step 2 (Pivot Table) Int String

33 ITCS6010/ Query Example: Step 3 (Chunk Table) Chunk 0Chunk 1

34 ITCS6010/ Join Overhead Costs No aligning joins Join Overhead

35 ITCS6010/ Summary Multi-tenancy database critical to scale SaaS solution Varied schema layout schemes –different degrees of consolidation & extensibility –optimal layout depends on particular data set, work load, etc. Novel Chunk Table layout –wider chunks tend to achieve better performance –improvement slows down at some point (e.g., |chunk| = 15) –response time using chunking may approach conventional tables Need more work on finding good layout w/ multiple factors


Download ppt "ITCS6010/8010 1 Multi-Tenant Databases for SaaS (Software as a Service) 11/17/08 (Some of the Slides are from Alfons Kemper)"

Similar presentations


Ads by Google