1/14/2005Yan Huang - CSCI5330 Database Implementation – Storage and File Structure Storage and File Structure.

Slides:



Advertisements
Similar presentations
RAID (Redundant Arrays of Independent Disks). Disk organization technique that manages a large number of disks, providing a view of a single disk of High.
Advertisements

- Dr. Kalpakis CMSC Dr. Kalpakis 1 Outline In implementing DBMS we need to answer How should the system store and manage very large amounts of data?
Storage and File Structure By Dr.S.Sridhar, Ph.D.(JNUD), RACI(Paris, NICE), RMR(USA), RZFM(Germany) DIRECTOR ARUNAI ENGINEERING COLLEGE TIRUVANNAMALAI.
Recap of Feb 20: Database Design Goals, Normalization, Normal Forms Goals for designing a database: a schema with: –simple, easy to phrase queries –avoids.
Storage and File Structure
©Silberschatz, Korth and Sudarshan11.1Database System Concepts Chapter 11: Storage and File Structure Goals:  go beyond conceptual or logical level 
Recap of Feb 25: Physical Storage Media Issues are speed, cost, reliability Media types: –Primary storage (volatile): Cache, Main Memory –Secondary or.
1 Classification of Physical storage Media Speed with which data can be accessed Cost per unit of data Reliability  data loss on power failure or system.
SECTIONS 13.1 – 13.3 Sanuja Dabade & Eilbroun Benjamin CS 257 – Dr. TY Lin SECONDARY STORAGE MANAGEMENT.
CMSC424: Database Design Data Storage. Storage Hierarchy.
Database System Concepts, 5th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 11: Storage and.
©Silberschatz, Korth and Sudarshan11.1Database System Concepts Chapter 11: Storage and File Structure Overview of Physical Storage Media Magnetic Disks.
SECTIONS 13.1 – 13.3 Sanuja Dabade & Eilbroun Benjamin CS 257 – Dr. TY Lin SECONDARY STORAGE MANAGEMENT.
Data Storage Technology
Dr. Kalpakis CMSC 461, Database Management Systems URL: Storage and File Structure.
Storing Data: Disks & Files
ICOM 6005 – Database Management Systems Design Dr. Manuel Rodríguez-Martínez Electrical and Computer Engineering Department Lecture 5 – Storage Organization.
BY Eleazar Chidke Okereke ITEC546Storage and File Structure1.
Storage and File Structure Chapter 10
1 Database Systems Storage Media Asma Ahmad 21 st Apr, 11.
Lecture 11: DMBS Internals
Physical Storage and File Organization COMSATS INSTITUTE OF INFORMATION TECHNOLOGY, VEHARI.
Lecture 8 of Advanced Databases Storage and File Structure Instructor: Mr.Ahmed Al Astal.
I/O – Chapter 8 Introduction Disk Storage and Dependability – 8.2 Buses and other connectors – 8.4 I/O performance measures – 8.6.
Lecture 9 of Advanced Databases Storage and File Structure (Part II) Instructor: Mr.Ahmed Al Astal.
Chapter 10 Storage & File Structure. n Overview of Physical Storage Media n Magnetic Disks n Tertiary Storage n Storage Access n File Organization n Organization.
Database System Concepts, 5th Ed. Bin Mu at Tongji University Chapter 11: Storage and File Structure.
1 Secondary Storage Management Submitted by: Sathya Anandan(ID:123)
1 Storage and File Structure. 2 Classification of Physical Storage Media Speed with which data can be accessed Cost per unit of data Reliability  data.
Overview of Physical Storage Media
File Processing : Storage Media 2015, Spring Pusan National University Ki-Joune Li.
©Silberschatz, Korth and Sudarshan18.1Database System Concepts Lecture 9: Storage and File Structure Overview of Physical Storage Media Magnetic Disks.
Chapter 11: Storage and File Structure Chapter 11: Storage and File Structure Overview of Physical Storage Media Magnetic Disks RAID File Organization.
Chapter 11: Storage and File Structure Overview of Physical Storage Media Magnetic Disks RAID Tertiary Storage Storage Access File Organization Organization.
11.1Database System Concepts. 11.2Database System Concepts Now Something Different 1st part of the course: Application Oriented 2nd part of the course:
DMBS Internals I. What Should a DBMS Do? Store large amounts of data Process queries efficiently Allow multiple users to access the database concurrently.
CS4432: Database Systems II Data Storage 1. Storage in DBMSs DBMSs manage large amounts of data How does a DBMS store and manage large amounts of data?
©Silberschatz, Korth and Sudarshan11.1Database System Concepts Chapter 11: Storage and File Structure Overview of Physical Storage Media Magnetic Disks.
Database System Concepts, 6 th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 10: Storage and.
Em Spatiotemporal Database Laboratory Pusan National University File Processing : Storage Media 2004, Spring Pusan National University Ki-Joune Li.
CS 101 – Sept. 28 Main vs. secondary memory Examples of secondary storage –Disk (direct access) Various types Disk geometry –Flash memory (random access)
Section 13.1 – Secondary storage management (Former Student’s Note)
DMBS Internals I February 24 th, What Should a DBMS Do? Store large amounts of data Process queries efficiently Allow multiple users to access the.
DMBS Internals I. What Should a DBMS Do? Store large amounts of data Process queries efficiently Allow multiple users to access the database concurrently.
Storage and File Structure Malavika Srinivasan Prof. Franya Franek.
1 Lecture 16: Data Storage Wednesday, November 6, 2006.
Storage & File Structure Meghan Nagpal. Storage Media  Cache: Small, fastest form of storage; managed by the hardware; no effects about managing cache.
Data Storage and Querying in Various Storage Devices.
11.1 Chapter 11: Storage and File Structure 11.1 Overview of physical storage media 11.2 Magnetic disks 11.3 RAID 11.4 Tertiary access 11.5 Storage access.
Database System Concepts, 6 th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 10: Storage and.
Storage Overview of Physical Storage Media Magnetic Disks RAID
Chapter 11: Storage and File Structure
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
RAID Redundant Arrays of Independent Disks
Storage and Disks.
Lecture 16: Data Storage Wednesday, November 6, 2006.
Chapter 10: Storage and File Structure
Storage and File Structure
Performance Measures of Disks
File Processing : Storage Media
Lecture 11: DMBS Internals
Chapter 10: Storage and File Structure
Disk Storage, Basic File Structures, and Buffer Management
File Processing : Storage Media
Module 10: Physical Storage Systems
Chapter 11: Storage and File Structure
Storage and File Structure
Section 13.1 – Secondary storage management (Former Student’s Note)
Chapter 12: Physical Storage Systems
Presentation transcript:

1/14/2005Yan Huang - CSCI5330 Database Implementation – Storage and File Structure Storage and File Structure

1/14/2005Yan Huang - CSCI5330 Database Implementation – Storage and File Structure DBMS and Storage/File Structure Why do we need to know about storage/file structure  Many database technologies are developed to utilize the storage architecture/hierarchy  Data in the database needs to be organized and stored/retrieved efficiently

1/14/2005Yan Huang - CSCI5330 Database Implementation – Storage and File Structure Storage Hierarchy Magnetic Tape Optical Disk Magnetic Disk Flash Memory Cache unit price Memory Volatile primary storage Non-volatilespeed Secondary storage Tertiary storage

1/14/2005Yan Huang - CSCI5330 Database Implementation – Storage and File Structure Primary Storage (Volatile) Cache  Speed: 7 to 20 ns ( 1 nanosecond = 10 –9 seconds )  Capacity: A typical PC level 2 cache 64KB-2 MB. Within processors, level 1 cache usually ranges in size from 8 KB to 64 KB. Main memory:  Speed: 10s to 100s of nanoseconds;  Capacity: Up to a few Gigabytes widely used currently per-byte costs have decreased roughly factor of 2 every 2 to 3 years)

1/14/2005Yan Huang - CSCI5330 Database Implementation – Storage and File Structure Secondary Storage (Non-volatile) Flash memory  Speed: read speed similar to main memory, write is much slower  Capacity: 32M to 512M currently  Forms: SmartMedia, memory stick, secure digital, BIOS  Cost: roughly same as main memory Magnetic-disk  Capacities: up to roughly 100 GB currently Growing constantly and rapidly with technology improvements (factor of 2 to 3 every 2 years)

1/14/2005Yan Huang - CSCI5330 Database Implementation – Storage and File Structure Tertiary Storage (Non-volatile) Optical storage  CD-ROM (640 MB) and DVD (4.7 to 17 GB) most popular forms  CD-RW, DVD-RW, and DVD-RAM  Reads and writes are slower than with magnetic disk  Juke-box systems, with large numbers of removable disks, a few drives, and a mechanism for automatic loading/unloading of disks available for storing large volumes of data

1/14/2005Yan Huang - CSCI5330 Database Implementation – Storage and File Structure Tertiary Storage (Non-volatile) Tape storage  non-volatile, used primarily for backup (to recover from disk failure), and for archival data  sequential-access – much slower than disk  very high capacity (40 to 300 GB tapes available)  Tape jukeboxes available for storing massive amounts of data hundreds of terabytes (1 terabyte = 10 9 bytes) to even a petabyte (1 petabyte = bytes)

1/14/2005Yan Huang - CSCI5330 Database Implementation – Storage and File Structure Between Memory and Disk The permanent residency of database is mostly on disk In database, cost is usually measured by the number of disk I/O But disks are too slow and we need memory to be the buffers … but memory is volatile  this introduced a number of issues

1/14/2005Yan Huang - CSCI5330 Database Implementation – Storage and File Structure Lets talk about disk

1/14/2005Yan Huang - CSCI5330 Database Implementation – Storage and File Structure Disk Subsystem Disk interface standards families ATA (AT adaptor) range of standards SCSI (Small Computer System Interconnect) range of standards Several variants of each standard (different speeds and capabilities)

1/14/2005Yan Huang - CSCI5330 Database Implementation – Storage and File Structure Disk Speed Seek time (milliseconds) Rotation time/latency milliseconds Data-transfer rate (4-8MB/sec) Typical numbers: H 16,000 tracks per platter H sectors per track: 200 – 400 H 512 bytes per sector H 4-16KB per block H 5, ,000 r p m Access time = seek time + latency Discuss ways to improve disk reading speed

1/14/2005Yan Huang - CSCI5330 Database Implementation – Storage and File Structure RAID RAID: Redundant Arrays of Independent Disks  high capacity and high speed by using multiple disks in parallel, and  high reliability by storing data redundantly

1/14/2005Yan Huang - CSCI5330 Database Implementation – Storage and File Structure Mean time to failure (MTTF) Average time the disk is expected to run continuously without any failure.  Typically 3 to 5 years (1 year = 8,760 hours)  MTTF = 30,000 to 1,200,000 hours for a new disk an MTTF of 1,200,000 hours for a new disk means that given 1000 relatively new disks, on an average one will fail every 1200 hours When number of disks increase, the chance of some disk failure increase proportionally

1/14/2005Yan Huang - CSCI5330 Database Implementation – Storage and File Structure Parallelism Two main goals of parallelism in a disk system: 1.Load balance multiple small accesses to increase throughput 2.Parallelize large accesses to reduce response time. Basic strategy: Stripping  Compare and contrast bit stripping and byte stripping

1/14/2005Yan Huang - CSCI5330 Database Implementation – Storage and File Structure Redundancy store extra information that can be used to rebuild information lost in a disk failure Basic strategy: mirroring, parity Mean time to data loss depends on mean time to failure, and mean time to repair  E.g. MTTF of 100,000 hours, mean time to repair of 10 hours gives mean time to data loss of 500*10 6 hours (or 57,000 years) for a mirrored pair of disks (ignoring dependent failure modes) Data Parity

1/14/2005Yan Huang - CSCI5330 Database Implementation – Storage and File Structure RAID levels

1/14/2005Yan Huang - CSCI5330 Database Implementation – Storage and File Structure Choice of RAID Levels Level 1 provides much better write performance than level 5  Level 5 requires at least 2 block reads and 2 block writes to write a single block, whereas Level 1 only requires 2 block writes  Level 1 preferred for high update environments such as log disks Level 1 had higher storage cost than level 5  disk drive capacities increasing rapidly (50%/year) whereas disk access times have decreased much less (x 3 in 10 years)  I/O requirements have increased greatly, e.g. for Web servers  When enough disks have been bought to satisfy required rate of I/O, they often have spare storage capacity so there is often no extra monetary cost for Level 1! Level 5 is preferred for applications with low update rate, and large amounts of data Level 1 is preferred for all other applications

1/14/2005Yan Huang - CSCI5330 Database Implementation – Storage and File Structure Buffer Management Database can not fit entirely in memory, needs memory as a buffer for speed reasons LRU is used in many OS  Spatial and temporal locality due to loops Database has a more predictable behavior  Example: join

1/14/2005Yan Huang - CSCI5330 Database Implementation – Storage and File Structure DBMS Buffer Management Strategies Pinned block – memory block that is not allowed to be written back to disk. Toss-immediate strategy – frees the space occupied by a block as soon as the final tuple of that block has been processed Most recently used (MRU) strategy – system must pin the block currently being processed. After the final tuple of that block has been processed, the block is unpinned, and it becomes the most recently used block. Statistical information based  E.g., the data dictionary is frequently accessed. Heuristic: keep data-dictionary blocks in main memory buffer Buffer managers also support forced output of blocks for the purpose of recovery

1/14/2005Yan Huang - CSCI5330 Database Implementation – Storage and File Structure Exercises