Very Large Databases Administration. @murilocmiranda http://www.sql.pt/ murilo.miranda@gmail.com

1 Very Large Databases Administration
@murilocmiranda http://www.sql.pt/

2 AGENDA

3 AGENDA
What is a VLDB?
Typical Troubles
OS Config
Instance Config
DB Config
Maintenance

4 VLDB??

8 VLDB??
There’s no official definition.
Typically occupies storage in the TB range.
Billions of rows.
Typically OLAP, or OLTP with a large number of users.

9 VLDB??
Wikipedia: “A very large database, or VLDB, is a database that contains an extremely high number of tuples (database rows), or occupies an extremely large physical filesystem storage space. The most common definition of VLDB is a database that occupies more than 1 terabyte or contains several billion rows, although naturally this definition changes over time.”

10 SQL vs. VLDB

12 SQL vs. VLDB
Maximum database size: 524,272 TB

13 SQL vs. VLDB
Maximum data file size: 16 TB
Maximum log file size: 2 TB
There is a limit on the number of files, which can be distributed between filegroups.

14 Typical Troubles

20 Typical Troubles
Performance
Maintenance
Backups
Indexes
Statistics
Disaster Recovery

21 OS CONFIG

22 OS CONFIG
Perform Volume Maintenance

23 OS CONFIG
Turning on Instant File Initialization speeds up data file growth and restores.

24 OS CONFIG
Storage Layout

27 OS CONFIG
Plan an efficient storage layout.
Normally, the more spread out the files are, the more effective the layout.
Suggested volumes: SQL BIN, SQL DATA, SQL IDX, SQL LOGS, SQL TMP.

28 OS CONFIG
Mountpoints

30 OS CONFIG
Mountpoints could be a good strategy.
Mountpoints are persistent directories that point to disk volumes.

31 OS CONFIG
Pros:
Scalable.
Saves drive letters (limited to 26).
Easy to add.
No need to restart SQL Server.

32 OS CONFIG
Cons:
A mountpoint looks like a simple folder.
Needs a different approach to monitoring.

33 OS CONFIG
So, if you don’t know the server….

34 OS CONFIG
Partition Alignment
Track = group of sectors and clusters.
Cluster = file allocation unit. A cluster has N sectors; for a 64 KB allocation unit: 512 bytes (sector size) * 128 = 64 KB.
Sector = smallest accessible unit on a physical disk (usually 512 bytes).
Stripe Unit Size = defines the size of the chunks in which data is distributed across the disks of a RAID-0, RAID-5 or RAID-10 group.

37 OS CONFIG
Setting the partition offset properly can improve performance by up to 30%.
Partition alignment increases throughput (bytes/sec) and reduces disk queues.
A track-misaligned partition will occasionally need two I/O operations instead of one.

38 OS CONFIG
Unless alignment is performed at partition creation time, the default alignment offset (31.5 KB) results in unaligned partitions on versions of Windows up to and including Windows Server 2003.

41 OS CONFIG
This offset is associated with hidden sectors, which basically store partition information.
Considering that:
Each disk sector has 512 bytes.
Windows has 63 hidden sectors.
512 * 63 = 32,256 bytes = 31.5 KB

43 OS CONFIG
Example of optimal values:
Stripe Unit Size: 64 KB*
Allocation Unit Size: 64 KB
[Diagram: Data (Alloc. Unit Size), Stripe Size]
* Defined by storage team.

44 OS CONFIG
Optimal solution:
[Diagram: Data (Alloc. Unit Size), Stripe Size]

45 OS CONFIG
Best Practice:
Set an offset of 1024 KB. This value works for most disks out there.
Allocation Unit Size = Stripe Unit Size.
The rule: Offset / Allocation Unit Size = INTEGER. E.g.: 1024 / 64 = 16.
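
The arithmetic on the last few slides can be checked with a small sketch. It is only an illustration in Python: the function name `is_aligned` is made up here, and the numbers are the ones quoted on the slides (512-byte sectors, 63 hidden sectors, 64 KB allocation unit, 1024 KB offset).

```python
SECTOR_BYTES = 512          # sector size quoted on the slides
LEGACY_HIDDEN_SECTORS = 63  # hidden sectors on Windows Server 2003 and earlier


def is_aligned(offset_bytes: int, unit_bytes: int) -> bool:
    """True when the partition offset is a whole multiple of the unit size."""
    return offset_bytes % unit_bytes == 0


legacy_offset = SECTOR_BYTES * LEGACY_HIDDEN_SECTORS  # 32,256 bytes
recommended_offset = 1024 * 1024                      # 1024 KB
allocation_unit = 64 * 1024                           # 64 KB

print(legacy_offset / 1024)                             # 31.5 (KB)
print(is_aligned(legacy_offset, allocation_unit))       # False
print(is_aligned(recommended_offset, allocation_unit))  # True
print(recommended_offset // allocation_unit)            # 16
```

The same check can be repeated with the stripe unit size in place of the allocation unit, since the slides recommend keeping the two equal.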

46 WARNING
Some I/O subsystem vendors intercept what Windows is trying to do and still create partitions with the incorrect offset. Even for Windows, ALWAYS check!

47 OS CONFIG
Anti-virus on servers… is it really needed?

48 OS CONFIG
Costs money to license.
Maintenance costs.
Can cause problems in Prod.
Can’t protect against zero-day exploits.

49 OS CONFIG
What can we do instead?

50 OS CONFIG
Keep the servers patched.
Configure the firewall properly.
Restrict access to the servers.
You can install AV… on workstations!

51 OS CONFIG
What’s the big problem for SQL Server?

52 OS CONFIG
One more app fighting for resources.
SQL Server files can be locked.

53 OS CONFIG
How can AV and SQL Server live together?

54 OS CONFIG
Add exceptions!

55 OS CONFIG
Basically, the AV should ignore:
SQL Server data and log files (.mdf, .ndf and .ldf).
Backup files (.bak and .trn).
Full-text catalog files.
Trace files (.trc).
ERRORLOG files.
SQL Server binaries folder.
FILESTREAM folder.
More on:

56 Instance Config

61 Instance Config
Memory
This is a very open subject.
There are lots of discussions about that…
There’s no perfect formula, because the correct answer is… it depends!
It depends on:
HBAs or FusionIO storage, both of which require a lot of free memory for the drivers.
Concurrent applications on the server.
Number of instances or SQL Server services.

62 Instance Config
Memory
An efficient general rule for the memory to reserve…
Baseline: 1 GB for the OS.
Up to 16 GB available: 1 GB for each 4 GB.
More than 16 GB: 1 GB for every 8 GB.
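
As a rough illustration, the rule can be written as a small Python function. This is one possible reading of the slide, not a definitive formula: it assumes the tiers are cumulative (the first 16 GB counted at 1 GB per 4 GB, anything above at 1 GB per 8 GB) and that the result is memory left outside SQL Server. The function names are made up for this sketch.

```python
def reserved_memory_gb(total_ram_gb: int) -> int:
    """Memory to leave outside SQL Server, under the assumed reading of the rule."""
    baseline = 1                                     # 1 GB for the OS
    first_tier = min(total_ram_gb, 16) // 4          # 1 GB per 4 GB, up to 16 GB
    second_tier = max(total_ram_gb - 16, 0) // 8     # 1 GB per 8 GB above 16 GB
    return baseline + first_tier + second_tier


def suggested_sql_memory_gb(total_ram_gb: int) -> int:
    """What would remain for SQL Server after the reservation."""
    return total_ram_gb - reserved_memory_gb(total_ram_gb)


for ram in (8, 16, 64, 128):
    print(ram, reserved_memory_gb(ram), suggested_sql_memory_gb(ram))
```

Treat the output as a starting point only: as the previous slide says, drivers, other applications and additional instances on the server all change the answer.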

63 Instance Config
Memory
This is for 64-bit servers…
For 32-bit, here is a good article to follow:
32-bit applications are natively restricted to a 2 GB VAS (virtual address space).
The /3GB switch allows a 32-bit process to increase its VAS to 3 GB.
To address more than 4 GB of RAM on 32-bit Windows, the OS needs the /PAE switch added to the boot.ini file.

64 Instance Config
TempDB

66 Instance Config
TempDB
Two common behaviors:
Ignore.
Overvalue.

67 Instance Config
As per Brent Ozar: “TempDb is the SQL’s public toilet”

68 Instance Config
TempDB
And this is true!

72 Instance Config
TempDB
There’s a myth: tempdb should always have one data file per processor core.
Again…. It depends!

73 Instance Config
TempDB
Large operations, such as a sort or storing a huge temporary table, may be slowed down because of the round-robin allocation across the files.
The more files, the more costly.

74 Instance Config
TempDB
Common wait types on TempDB:
PAGELATCH_*: contention on in-memory allocation bitmaps.
PAGEIOLATCH_*: contention at the I/O subsystem level.

76 Instance Config
TempDB
How many tempdb data files should we have?
A recommended approach is:
Up to 8 cores: number of files = number of cores.
More than 8 cores: start with 8 files. Monitor PAGELATCH_*. Add 4 more files at a time, if necessary.
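
The recommendation can be sketched as two small Python helpers. The function names and the boolean contention flag are made up for this illustration; how PAGELATCH_* contention is actually detected is left to your own monitoring.

```python
def initial_tempdb_data_files(cores: int) -> int:
    """Starting number of tempdb data files for a given core count."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return min(cores, 8)  # one file per core, capped at 8 to start with


def next_tempdb_data_files(current_files: int, pagelatch_contention: bool) -> int:
    """Add 4 files at a time, but only while PAGELATCH_* contention persists."""
    return current_files + 4 if pagelatch_contention else current_files


print(initial_tempdb_data_files(4))     # 4
print(initial_tempdb_data_files(32))    # 8
print(next_tempdb_data_files(8, True))  # 12
```

Growing in steps and re-measuring fits the warning on the earlier slide that more files also make large operations more costly.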

77 Instance Config
TempDB
Other TempDB best practices:
Isolate TempDB on a different storage system.
Depending on the load, you might need to separate the LDF from the MDF/NDF files.
Use a fast drive (SSD :).
Set the same initial size for all the files.
Set the auto-growth accordingly.
If a heavy operation constantly uses TempDB, consider creating a staging table in your own database.

79 Instance Config
TempDB
From SQL Server 2012, TempDB can be on a local disk in a SQL Server cluster.
More flexibility.
Use the PCIe bus instead of an HBA, and get more throughput.
Data and log are on the SAN, TempDB locally: avoids congestion or contention on a shared storage network or array.

80 DB CONFIG

83 DB CONFIG
Don’t rely on auto-grow.
Managing file growth yourself lets you control the free disk space and avoid performance problems.
Have page checksums turned on.
To detect damaged pages.
Make sure auto-stats update is turned on.
For OLTP, consider turning auto-stats update off only for heavily updated tables, and schedule a job that periodically updates the statistics for those tables.
Notes on transaction log growth:
Chunks up to 64 MB = 4 VLFs.
Chunks larger than 64 MB and up to 1 GB = 8 VLFs.
Chunks larger than 1 GB = 16 VLFs.
Huge t-logs cause huge VLFs.
Huge VLFs are hard to clear (performance issue).
SQL Server can only clear (back up) inactive VLFs, so a huge VLF can take time to be freed and will take time to be backed up.
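
To see why the growth increment matters, here is a minimal Python sketch of the VLF thresholds quoted in the notes. The function names are made up for the illustration, the VLFs created with the initial log size are ignored, and the thresholds are simply the ones given on the slide, so other SQL Server versions may behave differently.

```python
MB = 1024 * 1024
GB = 1024 * MB


def vlfs_per_growth(chunk_bytes: int) -> int:
    """VLFs created by one log growth of the given size (thresholds from the notes)."""
    if chunk_bytes <= 64 * MB:
        return 4
    if chunk_bytes <= 1 * GB:
        return 8
    return 16


def vlfs_added(total_growth_bytes: int, chunk_bytes: int) -> int:
    """VLFs added when the log grows by total_growth_bytes in fixed-size chunks."""
    growths = -(-total_growth_bytes // chunk_bytes)  # ceiling division
    return growths * vlfs_per_growth(chunk_bytes)


print(vlfs_added(64 * GB, 64 * MB))  # 4096 VLFs of 16 MB each
print(vlfs_added(64 * GB, 8 * GB))   # 128 VLFs of 512 MB each
```

The two results show both extremes mentioned in the notes: small increments produce thousands of tiny VLFs, while very large increments produce few but huge VLFs that are slow to clear.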

85 DB CONFIG
Make sure you’re managing the transaction log correctly:
Full recovery requires log backups.
There is no advantage in having multiple log files.
Control the file growth, or it could cause VLF fragmentation: performance issues and slow backup times.
Don’t set the log file growth size to a multiple of 4 GB in older SQL Server versions.

86 MAINTENANCE

87 MAINTENANCE
A few questions…

88 MAINTENANCE
How do you meet your SLAs when dealing with a TB-sized database?
Is data loss acceptable?
What about the recovery time?
Are you able to UPDATE STATS, do INDEX MAINTENANCE and run an INTEGRITY CHECK in time and WITHOUT PROBLEMS?

89 MAINTENANCE
DISASTER RECOVERY

90 MAINTENANCE
First of all, think about a Disaster Recovery plan!
SQL Server is not Oracle; we have “free” included options:
Log Shipping (HA and DR)
Database Mirroring (HA and DR): DB Snapshot advantage
Replication (HA, DR and LB)
AlwaysOn (HA, DR and LB)
We can still be safe with storage-level replication.

91 MAINTENANCE
Partition, Compress, Clean

93 MAINTENANCE
Partition, Compress and Clean
Using the partitioning feature you can divide up the maintenance.
You can use the DBCC CHECKFILEGROUP command.
DBCC CHECKFILEGROUP and DBCC CHECKDB are similar commands. The main difference is that DBCC CHECKFILEGROUP is limited to the single specified filegroup and required tables.

94 MAINTENANCE
Partition, Compress and Clean
Devising a filegroup architecture allows piecemeal restores with a low TTR.
Online piecemeal restore: after the PRIMARY FG is restored, the DB can be online. The tables become available as each FG is restored.
Design the database accordingly:
Keep what is necessary in the PRIMARY FG: configuration tables, indispensable data, etc.
Think about consistency: keep related tables in the same FG.

95 MAINTENANCE
Partition, Compress and Clean
Compress backups vs. compress data
Backup compression:
More CPU usage to backup/restore (avg ~20%).
Less time to backup/restore (avg ~40%).
Good compression ratio: SELECT backup_size/compressed_backup_size FROM msdb..backupset;
A media set cannot contain both compressed and uncompressed backups.
No advantage with TDE enabled.

96 MAINTENANCE
Partition, Compress and Clean
Compress backups vs. compress data
Data compression (ROW and PAGE):
One-time operation.
Reduces the physical database size.
Reduces the I/O required for a workload.
Allows more data to be stored in the buffer cache.
More CPU usage.
Usually good for DW systems.
OLTP may also benefit.
FILESTREAM data is not compressed.

97 MAINTENANCE
Partition, Compress and Clean
Compress backups vs. compress data
Data compression (ROW and PAGE):
TDE and data compression play together!
Backup and data compression can coexist!

98 MAINTENANCE
Partition, Compress and Clean
Compress backups vs. compress data
Data compression (ROW and PAGE):
ROW or PAGE compression?
You can use “SQL Server Compression Estimator”.

99 MAINTENANCE
Partition, Compress and Clean
Purge and archive the data
Purging data: if the data is no longer needed…
Saves storage.
Faster backups.
Improves performance.

100 MAINTENANCE
Partition, Compress and Clean
Purge and archive the data
Archiving data: if the data is still needed…
Isolate it in a different FG.
Set it as read-only: avoids locking.
For faster scans: 100% fill factor.
Update statistics with FULLSCAN.
You can adapt the backup strategy using partial backups. This allows you to exclude read-only filegroups.

101 MAINTENANCE
More about DBCC CHECKDB
CHECKDB takes time and uses resources.
Run DBCC CHECKDB using the WITH PHYSICAL_ONLY option.
It limits the checking to the integrity of the physical structure of the page and record headers and the allocation consistency of the database.
Faster, but a full CHECKDB is still required periodically.

102 MAINTENANCE
More about DBCC CHECKDB
We can divide up the consistency checking over several days. Paul Randal’s prescription is:
Divide the tables into 7 buckets (the bigger ones and the rest).
On Sunday: run DBCC CHECKALLOC, DBCC CHECKCATALOG, and DBCC CHECKTABLE on each table in the 1st bucket.
On Monday, Tuesday, Wednesday: run DBCC CHECKTABLE on each table in the 2nd, 3rd and 4th buckets, respectively.
On Thursday: run DBCC CHECKTABLE on each table in the 5th bucket.
On Friday and Saturday: run DBCC CHECKTABLE on each table in the 6th and 7th buckets, respectively.
More on:
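
One way to build such a weekly plan is sketched below in Python. This is an assumption-laden illustration, not Paul Randal’s own script: the greedy "largest table into the lightest bucket" split is just one simple way to balance the buckets, and the table names and page counts are hypothetical.

```python
import heapq

DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]


def split_into_buckets(table_pages, bucket_count=7):
    """Greedy split: biggest table first, always into the lightest bucket."""
    heap = [(0, i) for i in range(bucket_count)]  # (pages so far, bucket index)
    buckets = [[] for _ in range(bucket_count)]
    for name, pages in sorted(table_pages.items(), key=lambda kv: kv[1], reverse=True):
        total, index = heapq.heappop(heap)
        buckets[index].append(name)
        heapq.heappush(heap, (total + pages, index))
    return buckets


def weekly_schedule(table_pages):
    """Map each day to the DBCC commands it would run."""
    schedule = {}
    for day, bucket in zip(DAYS, split_into_buckets(table_pages)):
        commands = [f"DBCC CHECKTABLE ('{name}')" for name in bucket]
        if day == "Sunday":
            commands = ["DBCC CHECKALLOC", "DBCC CHECKCATALOG"] + commands
        schedule[day] = commands
    return schedule


# Hypothetical tables and page counts, for illustration only.
example_tables = {
    "dbo.Sales": 900_000, "dbo.SalesDetail": 700_000, "dbo.Audit": 400_000,
    "dbo.Customers": 150_000, "dbo.Products": 60_000, "dbo.Stock": 50_000,
    "dbo.Regions": 1_000, "dbo.Config": 100,
}

for day, commands in weekly_schedule(example_tables).items():
    print(day, commands)
```

In practice the page counts would come from the database itself, and a single very large table may still dominate its day no matter how the rest is balanced.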

103 MAINTENANCE
More about BACKUPS
Besides doing PARTIAL BACKUPS we have more options…
A MULTISTREAM BACKUP is an option to run faster:
[Diagram: DB backed up to File 1 on E:, File 2 on F:, File 3 on G:]

104 MAINTENANCE
More about BACKUPS
To make sure it will be well stored, we can use a MIRROR.
[Diagram: DB backed up to File 1 on E:, File 2 on F:, File 3 on G:, with each file mirrored]

105 MAINTENANCE
More about BACKUPS
If storing to the network: use a separate network card to avoid network congestion.
Don’t forget about T-LOG backups!
Create a good backup strategy.
Verify the backups periodically.

106 MAINTENANCE
INDEXES MAINTENANCE
Only rebuild/defrag indexes that are really fragmented (avoid unnecessary work in short maintenance windows).
If you defrag instead of rebuild, make sure you manually update stats.
Be wary of doing large index maintenance jobs if you use log shipping or DBM (database mirroring):
They contribute to large log backups.
Index rebuilds are always fully logged when DBM is present.

107 QUESTIONS?

108 THANK YOU! @murilocmiranda

