The ScaleDB Storage Engine Enabling high performance and scalability, using a Multi-Table Index, and a Shared-Disk Clustering Architecture Moshe Shadmon.


1 The ScaleDB Storage Engine
Enabling high performance and scalability using a Multi-Table Index and a Shared-Disk Clustering Architecture
Moshe Shadmon

2 Agenda
Overview
ScaleDB's Clustering Architecture
o Shared-Disk vs. Shared-Nothing
o MySQL and a Shared-Disk Storage Engine
o ScaleDB Installation
o Demo
ScaleDB's Indexing Technology
o Multi-Table Index
o Enabling Multi-Table Index in MySQL
o Demo
Summary
ScaleDB Status & Product Availability

3 Overview
Plug-in Storage Engine for MySQL
Main Features:
o Shared-Disk Architecture
o Innovative Multi-Table Indexing
o Transactional
o Row-Level Locking
o ACID Compliant
  o Atomicity: Either all tasks of a transaction are performed or none of them are.
  o Consistency: The database is in a consistent state before and after the transaction.
  o Isolation: Data in an intermediate state is not visible to other transactions.
  o Durability: When a transaction completes, the transaction's data persists.
o Disk-Based Storage Engine
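The atomicity bullet can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in, not ScaleDB or MySQL, and the `account` table and `transfer` helper are hypothetical names introduced only for this example.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between two rows; either both updates persist or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes the block
            conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
            conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
            row = conn.execute("SELECT balance FROM account WHERE id = ?", (src,)).fetchone()
            if row[0] < 0:
                # Abort mid-transaction: both updates above are undone together.
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False

def balances(conn):
    """Return {id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

ok = transfer(conn, 1, 2, 30)       # succeeds: both updates are committed
failed = transfer(conn, 1, 2, 500)  # fails: both updates are rolled back
final = balances(conn)
```

After the failed transfer the balances are the same as before it started, which is the "all or none" behaviour the slide describes.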

4 Shared-Disk vs. Shared-Nothing
Manageability
Adaptability
Availability/Fault-Tolerance
Scalability
Performance
Total Cost of Ownership (TCO)

5 Shared-Nothing: Vertical Partitioning
[Diagram: Database Instance 1 holding Table A, Table B, and Table C; then Database Instance 1, Database Instance 2, and Database Instance 3 with Table A, Table B, and Table C divided among them.]

6 Shared-Nothing: Partitioning Your Data… How?
Predict usage patterns, application evolution, data growth patterns… all are moving targets
Avoid data skew: bottlenecks caused by frequently accessed data on just a few nodes
Avoid data shipping between nodes
Avoid delays from distributed 2-phase commit
Searches outside the partition column require participation by all nodes
Scaling becomes an exercise in firefighting

7 Shared-Nothing: Horizontal Partitioning

Logical View
name    age  salary
Bob     20   10K
Shideh  18   35K
Ted     50   60K
Kevin   62   120K
Angela  55   140K
Mike    45   90K

Physical View (Partitioned by Salary; Horizontal Partitioning – Salary % 3)

Salary % 3 = 0
name    age  salary
Ted     50   60K
Kevin   62   120K
Mike    46   90K

Salary % 3 = 1
name    age  salary
Bob     20   10K

Salary % 3 = 2
name    age  salary
Shideh  18   35K
Angela  55   140K
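The Salary % 3 placement on this slide can be reproduced with a short sketch. Treating the salaries as whole thousands (10K becomes 10) and numbering the nodes by remainder are assumptions made for illustration; `partition_by_salary` is a hypothetical helper, not part of ScaleDB or MySQL.

```python
from collections import defaultdict

# (name, age, salary in thousands), taken from the slide's logical view.
EMPLOYEES = [
    ("Bob", 20, 10), ("Shideh", 18, 35), ("Ted", 50, 60),
    ("Kevin", 62, 120), ("Angela", 55, 140), ("Mike", 45, 90),
]
NUM_NODES = 3

def partition_by_salary(rows, num_nodes=NUM_NODES):
    """Assign each row to node `salary % num_nodes` and return {node: [names]}."""
    nodes = defaultdict(list)
    for name, _age, salary in rows:
        nodes[salary % num_nodes].append(name)
    return dict(nodes)

placement = partition_by_salary(EMPLOYEES)
for node in sorted(placement):
    print(f"node {node}: {placement[node]}")
```

With this data, half of the rows land on one node, a small instance of the data skew mentioned on the previous slide.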

8 Shared-Nothing: Horizontal Partitioning Pitfalls
Selections with equality predicates referencing the partitioning attribute are directed to a single node:
o Retrieve Emp where salary = 60K
  SELECT * FROM Emp WHERE salary = 60K
Equality predicates referencing a non-partitioning attribute, and range predicates, are directed to all nodes:
o Retrieve Emp where age = 20
o Retrieve Emp where salary < 20K
  SELECT * FROM Emp WHERE salary < 20K
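The routing rule described on this slide can be sketched as a small function. The three-node layout and the salary % 3 scheme come from the preceding slide; `nodes_for_predicate` and its arguments are hypothetical names, and salaries are again assumed to be whole thousands.

```python
NUM_NODES = 3
PARTITION_COLUMN = "salary"

def nodes_for_predicate(column, op, value, num_nodes=NUM_NODES):
    """Return the node ids that must take part in evaluating `column op value`."""
    if column == PARTITION_COLUMN and op == "=":
        # Equality on the partitioning attribute: exactly one node can hold matches.
        return [value % num_nodes]
    # Any other column, or a range predicate: matches may sit on every node.
    return list(range(num_nodes))

single = nodes_for_predicate("salary", "=", 60)    # salary = 60K
by_age = nodes_for_predicate("age", "=", 20)       # age = 20
by_range = nodes_for_predicate("salary", "<", 20)  # salary < 20K
```

Only the first query is answered by a single node; the other two fan out to the whole cluster, which is the pitfall the slide points at.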

9 Shared-Disk: No Partitioning, Full Access to Data
[Diagram: Database Instance 1 holding Table A, Table B, and Table C; then DB Cluster Node 1, DB Cluster Node 2, and DB Cluster Node 3 connected through a High-Speed Interconnect to a Shared Disk Subsystem holding Table A, Table B, and Table C.]

10 Scalability & Availability: Shared-Nothing
[Diagram: Node A, Node B, and Node C with Slave A, Slave B, and Slave C.]

11 Scalability & Availability: Shared-Disk
[Diagram: MySQL Servers with ScaleDB Engine (Node A, Node B, Node C, Node D, Node E) sharing one Data store.]

12 Shared-Disk: Summarizing Shared-Disk Benefits
Grow by simply adding nodes to the cluster
o Servers can be added and removed dynamically according to your needs
o No interruption to your application
High availability with dynamic failover
o Existing nodes automatically take over
Significantly reduced maintenance costs
o Can be built on low-cost commodity hardware
o No data partitioning
o No need for slaves
Low Total Cost of Ownership (TCO)

13 Shared-Disk: Making it work with MySQL
[Diagram: Node 1 runs Server Instance A with ScaleDB Engine Instance A; Node 2 runs Server Instance B with ScaleDB Engine Instance B. Each engine instance is shown with a Cluster Manager, a Buffer Manager, and a Comm. Layer. The nodes are joined by a Cluster Interconnect and attached to a Shared Disk Sub-system.]

14 Shared-Disk: Insert New Row
[Diagram: same two-node layout as slide 13.]

15 Shared-Disk: Select
[Diagram: same two-node layout as slide 13.]

16 Shared-Disk: Create Table
[Diagram: same two-node layout as slide 13, with Table A Meta-Data shown twice.]

17 ScaleDB Installation
Define cluster = true in the ScaleDB config file.
ScaleDB.cnf is in the same directory as my.cnf.
Cluster params:
o cluster = true
o nodes_in_cluster = 2
o node_id = 1
o this_machine_port = 100
o next_machine_ip_address =
o next_machine_port = 100
o log_directory = /share/logs/

18 Demo - Sysbench
ScaleDB cluster – one node – show throughput
ScaleDB cluster – 2nd node – show throughput

19 ScaleDB: Multi-Table Indexing
B-tree: Only indexes the data in tables
[Diagram: tables #1 to #5, each with its own index (Index #1 to Index #5).]
ScaleDB: Indexes the data and relationships
[Diagram: one ScaleDB Index covering tables #1 to #5.]
Advantages:
o Faster
o Smaller
o Referential integrity

20 Example Scenario
Select information that is spread across 3 tables: Colleges, Students and Enrollment
Relationships: Students are enrolled in courses within departments of colleges

SELECT c1.CollName, s.StudName, c2.CourseName, e.Grade
FROM College AS c1
JOIN Student AS s
JOIN Enrollment AS e
JOIN Course AS c2
  ON ( c1.CollNo = s.CollNo
   AND s.CollNo = e.CollNo
   AND s.StudentNo = e.StudentNo
   AND e.CollNo = c2.CollNo
   AND e.DeptNo = c2.DeptNo
   AND e.CourseNum = c2.CourseNum )
WHERE c1.CollNo = X AND s.StudentNo = Y ;
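To see the example query run end to end, here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL. The column lists are inferred from the query itself, and every data row, along with the values chosen for X and Y, is hypothetical. An ORDER BY is added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE College    (CollNo INTEGER PRIMARY KEY, CollName TEXT);
CREATE TABLE Student    (CollNo INTEGER, StudentNo INTEGER, StudName TEXT);
CREATE TABLE Course     (CollNo INTEGER, DeptNo INTEGER, CourseNum TEXT, CourseName TEXT);
CREATE TABLE Enrollment (CollNo INTEGER, StudentNo INTEGER, DeptNo INTEGER,
                         CourseNum TEXT, Grade TEXT);
-- Hypothetical sample rows.
INSERT INTO College VALUES (1, 'Engineering'), (2, 'Law');
INSERT INTO Student VALUES (1, 7, 'Student Seven'), (1, 8, 'Student Eight'),
                           (2, 7, 'Other Seven');
INSERT INTO Course VALUES (1, 10, 'C101', 'Mathematics'), (1, 10, 'C102', 'History');
INSERT INTO Enrollment VALUES (1, 7, 10, 'C101', 'B'), (1, 7, 10, 'C102', 'A'),
                              (1, 8, 10, 'C101', 'C');
""")

# The slide's query, with X and Y passed as bound parameters.
QUERY = """
SELECT c1.CollName, s.StudName, c2.CourseName, e.Grade
FROM College AS c1
JOIN Student AS s
JOIN Enrollment AS e
JOIN Course AS c2
  ON ( c1.CollNo = s.CollNo
   AND s.CollNo = e.CollNo
   AND s.StudentNo = e.StudentNo
   AND e.CollNo = c2.CollNo
   AND e.DeptNo = c2.DeptNo
   AND e.CourseNum = c2.CourseNum )
WHERE c1.CollNo = ? AND s.StudentNo = ?
ORDER BY c2.CourseNum
"""

rows = conn.execute(QUERY, (1, 7)).fetchall()
for row in rows:
    print(row)
```

Each result row needs a lookup in all four tables, which is the cost the following options try to reduce.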

21 Option #1: Conventional Joins
[Diagram: three tables with sample rows. College Table (ID, College, Students), with rows such as Institute of Technology, High Tech Institute, Golden State College, Kaplan College, and California College. Students Table (ID, Student Name, SS#, Phone). Enrollment Table (College, ID, Course Name, Student, Grade), with courses such as Mathematics, History, and Photography. The numeric cell values are truncated in the transcript.]
Steps:
1. Search enrollment by College & Student
2. Get Student information
3. Get College information

22 Option #2: Materialized View
ID   College                  Students  ID    Course Name    ID    Student Name
234  Institute of Technology  1,334     C134  Mathematics    1145  John Cheechoo…
234  Institute of Technology  1,334     C134  Mathematics    1837  Ryane Clowe…
234  Institute of Technology  1,334     C134  Mathematics    2256  Patrick Marleau…
234  Institute of Technology  1,334     C134  Mathematics    2277  Jamie McGinn…
234  Institute of Technology  1,334     C134  Mathematics    4113  Torrey Mitchell…
234  Institute of Technology  1,334     C134  Mathematics    1145  …
385  Golden State College     2,224     G85   World History  7783  Joe Pavelski…
385  Golden State College     2,224     G85   World History  2234  Jeremy Roenick…
385  Golden State College     2,224     G85   World History  1177  Devin Setoguchi…
385  Golden State College     2,224     G85   World History  4113  Torrey Mitchell…
...

23 Option #3: Multi-Table Index
[Diagram: the ScaleDB Multi-Table Index linking College, Departments, Courses, Students, and Enrollment.]

Colleges
Coll_ID#  Coll_Name     Coll_Budget  Coll_Description
001       Agriculture   $1,234,567   Nice place to visit
002       Arts          $5,432,567   Sports not so good
003       Business      $9,999,666   Cool logo
004       Education     $3,234,567   Ugh Worcester
005       Engineering   $8,238,568   Serious work
006       Law           $7,237,767   Jumpy students
007       Liberal Arts  $9,898,777   Pretty campus
008       Medicine      $5,987,004   In Texas

Students (columns: Student_ID#, College_ID#, Student_Name, Student_Desc; ID values are missing from the transcript)
Student_Name   Student_Desc
Mike Hogan     Caucasian
Moshe Smith    Caucasian
Sally Shadmon  Native American
Billy Fleegle  African American
Saul Goode     African American
Tim Collins    Polynesian
Sam Gee        Asian
Rod Paulino    Asian

Enrollment (columns: College_ID#, Dept_ID#, Student_ID#, Grade; ID values are missing from the transcript)
Grade: B, C, B, A, B, C, F, D

24 Mapping Foreign Keys to Data Views
Create Students Table
o Foreign key – College
Create Enrollment Table
o Foreign key – Students
Create Course Table
o Foreign key – Department
Create Department Table
o Foreign key – College
Create College Table
[Diagram labels: Students, Enrollment, Course, Department, College.]
The Parent-Child tables are created in MySQL such that MySQL is able to operate over the new tables
The data of the Parent-Child tables is assembled on the fly from the source tables

25 Mapping Foreign Keys to Data Views
[Diagram: table groupings Students, Enrollment; Course, Department, College; Department, College; Students, College.]
Physical files:
1. College
2. Department
3. Student
4. Course
5. Enrollment
ScaleDB Meta-Data Tables:
1. College
2. College-Dept
3. College-Dept-Course
4. College-Students
5. College-Students-Enrollment
6. Department
7. Students
8. Course
9. Enrollment
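One way to read the nine meta-data tables is as every root-to-table chain in the foreign-key tree plus each non-root table on its own. The sketch below reproduces the list under that reading. How ScaleDB actually derives these names is not stated in the slides, so the `PARENT` map (taken from the foreign keys on the previous slide) and the `chain` helper are assumptions for illustration.

```python
# child table -> parent table (None marks the root), following the slide's foreign keys.
# "Dept" is used as the short name for Department, as in the chained names on the slide.
PARENT = {
    "College": None,
    "Dept": "College",
    "Course": "Dept",
    "Students": "College",
    "Enrollment": "Students",
}

def chain(table):
    """Return the path from the root down to `table`, e.g. ['College', 'Dept', 'Course']."""
    path = []
    while table is not None:
        path.append(table)
        table = PARENT[table]
    return path[::-1]

# Chained names: one per table, built from its root-to-table path.
chained = sorted("-".join(chain(t)) for t in PARENT)

# Stand-alone entries: every table that has a parent also appears by itself.
standalone = sorted(t for t in PARENT if PARENT[t] is not None)

for name in chained + standalone:
    print(name)
```

Five chained names and four stand-alone ones give the nine entries listed on the slide.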

26 Enabling the MySQL optimizer to use a Multi-Table Index

SELECT c1.CollName, s.StudName, c2.CourseName, e.Grade
FROM College AS c1
JOIN Student AS s
JOIN Enrollment AS e
JOIN Course AS c2
  ON ( c1.CollNo = s.CollNo
   AND s.CollNo = e.CollNo
   AND s.StudentNo = e.StudentNo
   AND e.CollNo = c2.CollNo
   AND e.DeptNo = c2.DeptNo
   AND e.CourseNum = c2.CourseNum )
WHERE c1.CollNo = X AND s.StudentNo = Y ;

CREATE TABLE sdb_view_college_course_student (
    L1_CollNo INT NOT NULL,
    L1_CollName CHAR(32) NOT NULL,
    L1_CollBudget INT NOT NULL,
    L1_CollDescription CHAR(60) NOT NULL,
    …                                      (Table College columns)
    L2_StudNo INT NOT NULL,
    L2_StudName CHAR(48) NOT NULL,
    …                                      (Table Student columns)
    L3_CourseNum CHAR(9) NOT NULL,
    L3_Grade CHAR(2) NOT NULL,
    …                                      (Table Enrollment columns)
    PRIMARY KEY ( L1_CollNo, L2_StudNo, L3_CourseNum )
) ENGINE = SCALEDB;

SELECT L1_CollName, L2_StudName, L3_CourseName, L3_Grade
FROM sdb_view_college_course_student
WHERE L1_CollNo = X AND L2_StudNo = Y ;
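The shape of the rewritten query can be tried out with a rough emulation. In the sketch below an ordinary SQLite view stands in for the ScaleDB virtual table, so it still executes a join underneath and shows only how the flat query looks to the application, not how ScaleDB assembles rows from its index. The base-table columns are inferred from the slide's queries, and all data rows and parameter values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE College    (CollNo INTEGER PRIMARY KEY, CollName TEXT);
CREATE TABLE Student    (CollNo INTEGER, StudentNo INTEGER, StudName TEXT);
CREATE TABLE Course     (CollNo INTEGER, DeptNo INTEGER, CourseNum TEXT, CourseName TEXT);
CREATE TABLE Enrollment (CollNo INTEGER, StudentNo INTEGER, DeptNo INTEGER,
                         CourseNum TEXT, Grade TEXT);
-- Hypothetical sample rows.
INSERT INTO College VALUES (1, 'Engineering');
INSERT INTO Student VALUES (1, 7, 'Student Seven'), (1, 8, 'Student Eight');
INSERT INTO Course VALUES (1, 10, 'C101', 'Mathematics'), (1, 10, 'C102', 'History');
INSERT INTO Enrollment VALUES (1, 7, 10, 'C101', 'B'), (1, 7, 10, 'C102', 'A'),
                              (1, 8, 10, 'C101', 'C');

-- Stand-in for the virtual table: same name and L1_/L2_/L3_ column prefixes as the slide.
CREATE VIEW sdb_view_college_course_student AS
SELECT c1.CollNo     AS L1_CollNo,
       c1.CollName   AS L1_CollName,
       s.StudentNo   AS L2_StudNo,
       s.StudName    AS L2_StudName,
       e.CourseNum   AS L3_CourseNum,
       c2.CourseName AS L3_CourseName,
       e.Grade       AS L3_Grade
FROM College AS c1
JOIN Student AS s    ON s.CollNo = c1.CollNo
JOIN Enrollment AS e ON e.CollNo = s.CollNo AND e.StudentNo = s.StudentNo
JOIN Course AS c2    ON c2.CollNo = e.CollNo AND c2.DeptNo = e.DeptNo
                    AND c2.CourseNum = e.CourseNum;
""")

# The application-side query is flat: one table, no join conditions.
FLAT_QUERY = """
SELECT L1_CollName, L2_StudName, L3_CourseName, L3_Grade
FROM sdb_view_college_course_student
WHERE L1_CollNo = ? AND L2_StudNo = ?
ORDER BY L3_CourseNum
"""

flat_rows = conn.execute(FLAT_QUERY, (1, 7)).fetchall()
for row in flat_rows:
    print(row)
```

The flat query returns the same rows as the four-table join while the application addresses a single table name.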

27 The Multi-Table Index
The Multi-Table Index appears to MySQL as a data table
ScaleDB does not maintain a data file associated with the Multi-Table Index
For a query using the virtual table, ScaleDB assembles the rows on the fly using the Multi-Table Index
ScaleDB indexes are different from B-tree indexes
ScaleDB indexes provide the same functionality as B-tree, plus…
o They maintain referential integrity with minimal overhead
o They allow you to search for the data and relationships
o They are much smaller in size

28 Demo
Query with join
Query with Multi-Table Index
2nd node virtual table

29 Benchmarking ScaleDB Index

30 Summary
ScaleDB Cluster
o Multiple ScaleDB instances share the same physical data.
o Connecting to the cluster is similar to connecting to a single node.
o For the application, the cluster appears as a single node.
o Transparent application failover
o Transparent scalability
ScaleDB Indexes
o Provide the B-tree functionality
o High performance
o Map relationships
o Maintain referential integrity
o Smaller footprint
o Independent of the key size

31 ScaleDB Status and Product Availability
Started Beta Process
o We are looking for beta companies
Product launch is scheduled for the June timeframe
Please talk to us if you are a developer interested in working with ScaleDB

