SQL Server Storage Architecture
[Diagram: storage layers, from hardware up]
- Physical I/O subsystem: disks
- Logical disk system: Windows drives (C:, D:, E:)
- SQL Server storage: FileGroupA (FileA1) and FileGroupB (FileB1, FileB2)
- Tables: Table1, Table2
Solution to what?
- Load Speed (ETL)
- Query Speed
- Data Management
  - Backup / Restore
  - DBCC CHECKDB, remove fragmentation
Solutions
- Use multiple FileGroups/Files
- Spread data to maximize resource use
- Sliding Window if there is a time dimension
- Partitioned Tables and/or Views
- ETL: insert into empty, unindexed tables
- Use READ_ONLY FileGroups to minimize maintenance needs
I/O Performance
- Little has changed in 50 years
- Watch out for bottlenecks in the I/O path
- Memory reduces the need for I/O
- Disks can only do so many I/O operations per second
- The more disk heads you have, the higher the I/O throughput
At 3 PM on the 1st of the month: where do you want your data to be?
Sliding Window
[Diagram: an "Always There Data" block and five "Temporal Data" blocks]
Read_Only FileGroups
- Require only one backup
- Don't require page or row locks
- Don't require maintenance
- Before SQL Server 2008, the ALTER requires exclusive access to the database
    ALTER DATABASE … MODIFY FILEGROUP … READ_ONLY
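A minimal sketch of the statement in use. The database name SalesDW and the filegroup name FG_2007 are hypothetical; they do not come from the slides.

```sql
-- Hypothetical names: SalesDW (database), FG_2007 (filegroup holding closed-out data).
-- Mark the filegroup read-only once its data will no longer change.
ALTER DATABASE SalesDW MODIFY FILEGROUP FG_2007 READ_ONLY;

-- Check the result: is_read_only is 1 for read-only filegroups.
USE SalesDW;
SELECT name, is_read_only
FROM sys.filegroups
WHERE name = N'FG_2007';

-- To load corrections later, switch it back first.
ALTER DATABASE SalesDW MODIFY FILEGROUP FG_2007 READ_WRITE;
```

Back up the filegroup once after it becomes read-only; later partial backups can then skip it.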
Concern - Load Performance (ETL)
- 4-hour maximum window for any load
- Loading into large indexed tables takes unacceptably long
- Example: a 2 million row insert into a 400 million row table with 10 indexes took 12 hours
Concern – Query Performance
- Users have little patience
- Data warehouse queries
  - Frequent small to medium queries to support the UI
  - Less frequent large queries on fact tables may access 10s of GB
Fact Table Queries
- Concentrated time period
  - Most recent
  - Year ago
- May go against the full table to get year-against-year
Dimension Table Queries
- Smaller than fact table queries
- Sometimes involve millions of rows
- Frequent: they support the UI
Partitioned Views
- Available in SQL Server Standard
- Available in SQL Server 2000
- Created like any view
- Check constraints tell SQL Server which data is in which table
    CREATE VIEW Fact AS
        SELECT * FROM Fact_…
        UNION ALL
        SELECT * FROM Fact_…
    ALTER TABLE Fact_…
        ADD CONSTRAINT CK_FACT_…_Date
        CHECK (FactDate >= … AND FactDate < …)
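A filled-in sketch of the pattern, assuming one member table per year. The table names Fact_2007 and Fact_2008, the Amount column, and the date boundaries are hypothetical; the slides elide them.

```sql
-- Hypothetical member tables, one per year. Each CHECK constraint
-- states which FactDate range that table is allowed to hold.
CREATE TABLE dbo.Fact_2007 (
    FactDate smalldatetime NOT NULL,
    CustID   int           NOT NULL,
    Amount   money         NOT NULL,
    CONSTRAINT CK_Fact_2007_Date
        CHECK (FactDate >= '20070101' AND FactDate < '20080101')
);

CREATE TABLE dbo.Fact_2008 (
    FactDate smalldatetime NOT NULL,
    CustID   int           NOT NULL,
    Amount   money         NOT NULL,
    CONSTRAINT CK_Fact_2008_Date
        CHECK (FactDate >= '20080101' AND FactDate < '20090101')
);
GO

-- The view is an ordinary UNION ALL over the member tables.
-- CREATE VIEW must be the first statement in its batch, hence the GO separators.
CREATE VIEW dbo.Fact AS
    SELECT FactDate, CustID, Amount FROM dbo.Fact_2007
    UNION ALL
    SELECT FactDate, CustID, Amount FROM dbo.Fact_2008;
GO
```

The ranges must not overlap, and the constraints must be trusted (created or re-validated WITH CHECK) for the optimizer to rely on them.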
Partitioned View - 2
- Looks to a query like any table or view
- Can take advantage of parallel execution
- Limited to 256 tables
- Can cross servers (performance warning)
    SELECT FactDate, …
    FROM Fact
    WHERE CustID = … AND FactDate = …
Partitioned View
[Diagram: view Fact over member tables Fact_… (three shown), with filegroups FGF1 to FGF4 holding one file each (F1 to F4), drawn on top of the storage layers from "SQL Server Storage Architecture"]
Partition Elimination
- The query compiler can eliminate partitions from consideration in the plan
- Partition elimination happens at query compile time
- Values compared against the partitioning column must be constants to allow partition elimination
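One way to see the difference is to compare the plans for a literal and a variable predicate. This sketch assumes a partitioned view dbo.Fact with FactDate and CustID columns and CHECK constraints on FactDate in each member table; the customer id and the date are made up.

```sql
-- Assumption: dbo.Fact is a partitioned view over tables constrained on FactDate.

-- 1. Literal value: the optimizer can test the constant against each
--    CHECK constraint at compile time, so the plan should touch only
--    the member table whose range contains 2008-06-15.
SELECT FactDate, CustID
FROM dbo.Fact
WHERE CustID = 42
  AND FactDate = '20080615';

-- 2. Variable: the value is not known when the plan is compiled, so the
--    plan may still reference every member table.
DECLARE @d smalldatetime;
SET @d = '20080615';

SELECT FactDate, CustID
FROM dbo.Fact
WHERE CustID = 42
  AND FactDate = @d;
```

Viewing the estimated execution plans for the two statements side by side shows which member tables each one references.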
Demo 1 – Partitioned Views
Partitioned Tables
- SQL Server Enterprise
- SQL Server 2005 and above
- Require a non-null partitioning column
- Check constraints tell SQL Server what data is in each partition
- All tables are partitioned!
Partitioned Tables 2
- Partition Function: defines how to split data
- Partition Scheme: defines where to store each range of data
    CREATE PARTITION FUNCTION Fact_PF(smalldatetime)
        AS RANGE RIGHT FOR VALUES (…, …)
    CREATE PARTITION SCHEME Fact_PF
        AS PARTITION Fact_pf TO ([PRIMARY], FG_…, FG_…)
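A runnable sketch of the function, scheme, and table together. All object names (Demo_Fact_PF, Demo_Fact_PS, Demo_Fact), the Amount column, and the boundary dates are hypothetical. To avoid depending on filegroups that may not exist, every partition is mapped to PRIMARY here; a real layout would list one filegroup per partition.

```sql
-- Two boundary values with RANGE RIGHT give three partitions:
--   1: FactDate <  2007-01-01
--   2: 2007-01-01 <= FactDate < 2008-01-01
--   3: FactDate >= 2008-01-01
CREATE PARTITION FUNCTION Demo_Fact_PF (smalldatetime)
    AS RANGE RIGHT FOR VALUES ('20070101', '20080101');

-- Map every partition to PRIMARY (assumption made only to keep the sketch self-contained).
CREATE PARTITION SCHEME Demo_Fact_PS
    AS PARTITION Demo_Fact_PF ALL TO ([PRIMARY]);

-- The table is created on the scheme, keyed by the partitioning column.
CREATE TABLE dbo.Demo_Fact (
    FactDate smalldatetime NOT NULL,
    CustID   int           NOT NULL,
    Amount   money         NOT NULL
) ON Demo_Fact_PS (FactDate);

INSERT INTO dbo.Demo_Fact (FactDate, CustID, Amount) VALUES ('20061231', 1, 10.00);
INSERT INTO dbo.Demo_Fact (FactDate, CustID, Amount) VALUES ('20070101', 1, 20.00);
INSERT INTO dbo.Demo_Fact (FactDate, CustID, Amount) VALUES ('20080315', 2, 30.00);

-- $PARTITION reports which partition each row landed in.
SELECT $PARTITION.Demo_Fact_PF(FactDate) AS PartitionNumber,
       COUNT(*)                          AS RowsInPartition
FROM dbo.Demo_Fact
GROUP BY $PARTITION.Demo_Fact_PF(FactDate)
ORDER BY PartitionNumber;
```

With RANGE RIGHT each boundary value belongs to the partition on its right, so the 2007-01-01 row falls in partition 2.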
Partitioned Table
[Diagram: table Fact split into Fact.$Partition=1 through Fact.$Partition=4, with filegroups FGF1 to FGF4 holding one file each (F1 to F4), drawn on top of the storage layers from "SQL Server Storage Architecture"]
Demo 2 – Partitioned Tables
Partitioning Goals
- Adequate import speed
- Maximize query performance
- Make use of all available resources
- Data management
  - Migrate data to cheaper resources
  - Delete old data easily
Achieving Load Speed
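One common way to combine "insert into empty, unindexed tables" with a partitioned fact table is to load a staging table and then switch it in as a partition. This is a sketch under assumptions, not necessarily the approach demonstrated in the talk; every object name, column, and date below is hypothetical.

```sql
-- Target: fact table partitioned by month (all partitions on PRIMARY for simplicity).
CREATE PARTITION FUNCTION Load_PF (smalldatetime)
    AS RANGE RIGHT FOR VALUES ('20080801', '20080901');
CREATE PARTITION SCHEME Load_PS
    AS PARTITION Load_PF ALL TO ([PRIMARY]);

CREATE TABLE dbo.Load_Fact (
    FactDate smalldatetime NOT NULL,
    CustID   int           NOT NULL,
    Amount   money         NOT NULL
) ON Load_PS (FactDate);
CREATE CLUSTERED INDEX CIX_Load_Fact
    ON dbo.Load_Fact (FactDate) ON Load_PS (FactDate);

-- Staging: same columns, same filegroup as the target partition, no indexes yet.
CREATE TABLE dbo.Load_Fact_Stage (
    FactDate smalldatetime NOT NULL,
    CustID   int           NOT NULL,
    Amount   money         NOT NULL
) ON [PRIMARY];

-- Load into the empty heap (a bulk load in practice; two rows here).
INSERT INTO dbo.Load_Fact_Stage (FactDate, CustID, Amount) VALUES ('20080902', 1, 10.00);
INSERT INTO dbo.Load_Fact_Stage (FactDate, CustID, Amount) VALUES ('20080915', 2, 20.00);

-- After the load: build the matching index and prove the date range.
CREATE CLUSTERED INDEX CIX_Load_Fact_Stage
    ON dbo.Load_Fact_Stage (FactDate);
ALTER TABLE dbo.Load_Fact_Stage WITH CHECK
    ADD CONSTRAINT CK_Load_Fact_Stage_Date CHECK (FactDate >= '20080901');

-- Partition 3 holds FactDate >= 2008-09-01 and must be empty before the switch.
ALTER TABLE dbo.Load_Fact_Stage SWITCH TO dbo.Load_Fact PARTITION 3;
```

The switch changes only metadata, so its duration does not depend on the number of rows loaded; the expensive work (load and index build) happens against the small staging table.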
Achieving Query Speed
- Eliminate access to partitions during query compile
- All disk resources should be used
  - Parallel access
- All available memory should be used
- All available CPUs should be used
  - Parallel query
Solution
- Partition at a sufficiently high grain
- Spread dimension data to all usable disks
- Separate Data and Index FileGroups
- Multiple files per FileGroup
- Spread Fact data by partition key to all usable disks
- Rotate file locations to maximize dispersion
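A sketch of separate data and index filegroups with files on different drives. The database name SalesDW, the filegroup and file names, the drive paths, and the sizes are all hypothetical and would need to match a real server layout.

```sql
-- Hypothetical database, filegroups, and paths; adjust to the actual drives.
ALTER DATABASE SalesDW ADD FILEGROUP FG_Fact_Data;
ALTER DATABASE SalesDW ADD FILEGROUP FG_Fact_Index;

-- Two files on two drives in the data filegroup: SQL Server fills files
-- in a filegroup proportionally, which spreads I/O across both disks.
ALTER DATABASE SalesDW ADD FILE
    (NAME = N'Fact_Data_1', FILENAME = N'D:\SQLData\Fact_Data_1.ndf', SIZE = 1GB),
    (NAME = N'Fact_Data_2', FILENAME = N'E:\SQLData\Fact_Data_2.ndf', SIZE = 1GB)
TO FILEGROUP FG_Fact_Data;

-- Index filegroup on a third drive.
ALTER DATABASE SalesDW ADD FILE
    (NAME = N'Fact_Index_1', FILENAME = N'F:\SQLData\Fact_Index_1.ndf', SIZE = 1GB)
TO FILEGROUP FG_Fact_Index;
```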
Concern – Data Management (Backup)
- Let's say you have a 10 TB database.
- Now back that up.
Backup Calculation
- 10 TB = 10,000 GB
- Typical backup speed
  - Low end: 1 GB per minute
  - High end: 10 GB per minute
- At 10 GB/minute: who's got 1,000 minutes?
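The arithmetic, worked through as a query. It uses the slide's round figure of 10,000 GB for 10 TB and the two quoted speeds; the variable name is just for illustration.

```sql
-- 10 TB taken as 10,000 GB, matching the 1,000-minute figure on the slide.
DECLARE @size_gb int;
SET @size_gb = 10000;

SELECT 'Low end (1 GB/min)'   AS Scenario,
       @size_gb / 1           AS Minutes,
       @size_gb / 1 / 60.0    AS Hours
UNION ALL
SELECT 'High end (10 GB/min)',
       @size_gb / 10,
       @size_gb / 10 / 60.0;
```

Even the high end works out to 1,000 minutes, roughly 16.7 hours; the low end is 10,000 minutes, close to a week.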
Achieving Backup Performance
- Back up less!
- Maintain data in a READ_ONLY state
- Compress backups
Partial Backup
- Partial Base: backs up read_write filegroups
- Partial Differential: differential backup of read_write filegroups
    BACKUP DATABASE … READ_WRITE_FILEGROUPS WITH DIFFERENTIAL …
    BACKUP DATABASE … READ_WRITE_FILEGROUPS …
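Spelled out with placeholder values, the two statements might look like this. The database name SalesDW and the backup paths are hypothetical.

```sql
-- Hypothetical database name and backup paths.

-- Partial base: backs up PRIMARY plus every read-write filegroup,
-- skipping filegroups marked READ_ONLY.
BACKUP DATABASE SalesDW READ_WRITE_FILEGROUPS
    TO DISK = N'E:\Backup\SalesDW_partial.bak'
    WITH INIT;

-- Partial differential: only the extents changed since the partial base.
BACKUP DATABASE SalesDW READ_WRITE_FILEGROUPS
    TO DISK = N'E:\Backup\SalesDW_partial_diff.bak'
    WITH DIFFERENTIAL, INIT;
```

Read-only filegroups still need one backup of their own, taken after they were made read-only, for a full restore to be possible.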
Maintenance Operations
- Maintain only READ_WRITE data
  - DBCC CHECKFILEGROUP
  - ALTER INDEX
    - REBUILD PARTITION = …
    - REORGANIZE PARTITION = …
- Avoid SHRINK
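A sketch of these commands limited to the current, read-write slice of the data. The database SalesDW, filegroup FG_Current, index CIX_Fact, table dbo.Fact, and partition number 4 are all hypothetical.

```sql
-- Hypothetical names throughout; partition 4 stands for the newest, read-write partition.
USE SalesDW;

-- Check only the filegroup that still changes instead of running DBCC CHECKDB on everything.
DBCC CHECKFILEGROUP (N'FG_Current') WITH NO_INFOMSGS;

-- Rebuild a single partition of the index (heavy fragmentation).
ALTER INDEX CIX_Fact ON dbo.Fact REBUILD PARTITION = 4;

-- Or reorganize that partition (light fragmentation, always an online operation).
ALTER INDEX CIX_Fact ON dbo.Fact REORGANIZE PARTITION = 4;
```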
SQL Server 2008 – What's New
- Row, page, and backup compression
- Filtered indexes
- Optimization for star joins
- MERGE T-SQL DML
- Resource Governor
- Fewer operations require exclusive access to the database
New England Visual Basic Pro
- Focused on VB.Net development
- MS Waltham – MPR C
- 1st Thursday, 6:15 to 8:30 PM
- Sept 4 – Jim O'Neil – ASP.Net Dynamic Data
- Sept 25 – Chris Hammond – DotNetNuke
- Oct 2 – Kathleen Dollard – XML Literals in VB 9
- Nov 6 – Joe Stagner – Stupid Hacker Tricks and How 2 Defend
- Feb 5 09 – Joe Hill – Novell – Mono/VB/etc.