Storing Data Dina Said 1 1.

Slides:



Advertisements
Similar presentations
CPS216: Data-Intensive Computing Systems Data Access from Disks Shivnath Babu.
Advertisements

Storing Data: Disk Organization and I/O
I/O Management and Disk Scheduling
- Dr. Kalpakis CMSC Dr. Kalpakis 1 Outline In implementing DBMS we need to answer How should the system store and manage very large amounts of data?
B-tree. Why B-Trees When the data is too big, we will have to use disk storage instead of putting all the data in main memory In such case, we have to.
Storing Data: Disks and Files: Chapter 9
Problems in IO & File System CS 1550 Recitation November 4 th /6 th, 2002 The questions in this slide are from Andrew S. Tanenbaum's textbook page 375,
Tutorial 8 CSI 2132 Database I. Exercise 1 Both disks and main memory support direct access to any desired location (page). On average, main memory accesses.
CS 277 – Spring 2002Notes 21 CS 277: Database System Implementation Notes 02: Hardware Arthur Keller.
Storage. The Memory Hierarchy fastest, but small under a microsecond, random access, perhaps 2Gb Typically magnetic disks, magneto­ optical (erasable),
CS4432: Database Systems II Data Storage - Lecture 2 (Sections 13.1 – 13.3) Elke A. Rundensteiner.
ICS 624 Spring 2011 Multi-Dimensional Clustering Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 1/24/20111Lipyeow.
Disk Access Model. Using Secondary Storage Effectively In most studies of algorithms, one assumes the “RAM model”: –Data is in main memory, –Access to.
13.2 Disks Mechanics of Disks Presented by Chao-Hsin Shih Feb 21, 2011.
1 Storing Data: Disks and Files Yanlei Diao UMass Amherst Feb 15, 2007 Slides Courtesy of R. Ramakrishnan and J. Gehrke.
1 Storage Hierarchy Cache Main Memory Virtual Memory File System Tertiary Storage Programs DBMS Capacity & Cost Secondary Storage.
1 CS143: Disks and Files. 2 System Architecture CPU Main Memory Disk Controller... Disk Word (1B – 64B) ~ x GB/sec Block (512B – 50KB) ~ x MB/sec System.
CS4432: Database Systems II Lecture 2 Timothy Sutherland.
1 Hard Drive Storage. 2 Introduction zThis sections discusses: yHow a hard drive works yHow to estimate storage size yHow to estimate time.
CPSC-608 Database Systems Fall 2009 Instructor: Jianer Chen Office: HRBB 309B Phone: Notes #5.
CS 342 – Operating Systems Spring 2003 © Ibrahim Korpeoglu Bilkent University1 Input/Output – 5 Disks CS 342 – Operating Systems Ibrahim Korpeoglu Bilkent.
CENG 351 Fall Secondary Storage Devices: Magnetic Disks.
1 Disk Storage, Basic File Structures, and Hashing.
CPSC 231 Secondary storage (D.H.)1 Learning Objectives Understanding disk organization. Sectors, clusters and extents. Fragmentation. Disk access time.
Secondary Storage Management Hank Levy. 8/7/20152 Secondary Storage • Secondary Storage is usually: –anything outside of “primary memory” –storage that.
1 Introduction to Computers Day 4. 2 Storage device A functional unit into which data can be –placed –retained(stored) –retrieved(accessed)
Introduction to Database Systems 1 The Storage Hierarchy and Magnetic Disks Storage Technology: Topic 1.
Hard Drive / Hard Disk Functions of hard disk
CS4432: Database Systems II Data Storage (Better Block Organization) 1.
Disk Memory Topics Disk Memory Structure Disk Capacity class10.ppt.
Physical Storage Organization. Advanced DatabasesPhysical Storage Organization2 Outline Where and How are data stored? –physical level –logical level.
1 6 Further System Fundamentals (HL) 6.2 Magnetic Disk Storage.
1 IT420: Database Management and Organization Storage and Indexing 14 April 2006 Adina Crăiniceanu
Introduction to Database Systems 1 Storing Data: Disks and Files Chapter 3 “Yea, from the table of my memory I’ll wipe away all trivial fond records.”
Copyright ©: Nahrstedt, Angrave, Abdelzaher, Caccamo1 Disk & disk scheduling.
ICS 321 Fall 2011 Overview of Storage & Indexing (i) Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 11/9/20111Lipyeow.
External Storage Primary Storage : Main Memory (RAM). Secondary Storage: Peripheral Devices –Disk Drives –Tape Drives Secondary storage is CHEAP. Secondary.
IDA / ADIT Databasteknik Databaser och bioinformatik Data structures and Indexing (I) Fang Wei-Kleiner.
CS 405G: Introduction to Database Systems 25 Exercise Chen Qian University of Kentucky.
Chapter 8 External Storage. Primary vs. Secondary Storage Primary storage: Main memory (RAM) Secondary Storage: Peripheral devices  Disk drives  Tape.
CPSC 404, Laks V.S. Lakshmanan1 External Sorting Chapter 13: Ramakrishnan & Gherke and Chapter 2.3: Garcia-Molina et al.
Lecture 40: Review Session #2 Reminders –Final exam, Thursday 3:10pm Sloan 150 –Course evaluation (Blue Course Evaluation) Access through.
Database Systems Disk Management Concepts. WHY DO DISKS NEED MANAGING? logical information  physical representation bigger databases, larger records,
Disk Basics CS Introduction to Operating Systems.
CPSC 404 Assignment #1, Winter 2008 Term 2. Due: Wednesday, Feb 4, by 5 pm. Laks V.S. Lakshmanan.
Section 13.2 – Secondary storage management (Former Student’s Note)
DBMS 2001Notes 2: Hardware1 Principles of Database Management Systems Pekka Kilpeläinen (after Stanford CS245 slide originals by Hector Garcia-Molina,
Device Management Mark Stanovich Operating Systems COP 4610.
Disk storage systems Question#1 (True/False) A track is divided into multiple units called sectors.
Disk Average Seek Time. Multi-platter Disk platter Disk read/write arm read/write head.
Magnetic Disk Rotational latency Example Find the average rotational latency if the disk rotates at 20,000 rpm.
CPSC 231 Secondary storage (D.H.)1 Learning Objectives Understanding disk organization. Sectors, clusters and extents. Fragmentation. Disk access time.
Programmer’s View of Files Logical view of files: –An a array of bytes. –A file pointer marks the current position. Three fundamental operations: –Read.
COSC 6340: Disks 1 Disks and Files DBMS stores information on (“hard”) disks. This has major implications for DBMS design! » READ: transfer data from disk.
CPS216: Advanced Database Systems Notes 03: Data Access from Disks Shivnath Babu.
1 Lecture 16: Data Storage Wednesday, November 6, 2006.
Lecture 3 Secondary Storage and System Software I
1 Components of the Virtual Memory System  Arrows indicate what happens on a lw virtual address data physical address TLB page table memory cache disk.
CS 405G: Introduction to Database Systems 13b Exercise Chen Qian University of Kentucky.
CS422 Principles of Database Systems Disk Access Chengyu Sun California State University, Los Angeles.
File organization Secondary Storage Devices Lec#7 Presenter: Dr Emad Nabil.
© Janice Regan, CMPT 300, May CMPT 300 Introduction to Operating Systems DISK I/0.
Storing Data Dina Said.
CS522 Advanced database Systems
Database Management Systems (CS 564)
Hard Drives.
File Organizations What an OS provides Copyright © Curt Hill.
Disks and Files DBMS stores information on (“hard”) disks.
Parameters of Disks The most important disk parameter is the time required to locate an arbitrary disk block, given its block address, and then to transfer.
CPS216: Advanced Database Systems Notes 04: Data Access from Disks
Presentation transcript:

Storing Data Dina Said 1 1

Question 9.4 If you have a large file that is frequently scanned sequentially, explain how you would store the pages in the file on a disk. 2

Question 9.4 If you have a large file that is frequently scanned sequentially, explain how you would store the pages in the file on a disk. The pages in the file should be stored ‘sequentially’ on a disk. We should put two ‘logically’ adjacent pages as close as possible. In decreasing order of closeness, they could be on the same track, the same cylinder, or an adjacent cylinder. 3

Question 9.5 Consider a disk with a sector size of 512 bytes, 2000 tracks per surface, 50 sectors per track, five double- sided platters, and average seek time of 10 msec. What is the capacity of a track in bytes? What is the capacity of each surface? What is the capacity of the disk? 4

Question 9.5 Consider a disk with a sector size of 512 bytes, 2000 tracks per surface, 50 sectors per track, five double- sided platters, and average seek time of 10 msec. What is the capacity of a track in bytes? What is the capacity of each surface? What is the capacity of the disk? Image from: http://pushypanda.blogspot.ca/2011/01/keeping-up-with-hard-times.html 5

Question 9.5 Consider a disk with a sector size of 512 bytes, 2000 tracks per surface, 50 sectors per track, five double- sided platters, and average seek time of 10 msec. What is the capacity of a track in bytes? bytes/track = bytes/sector × sectors/track = 512 × 50 = 25K What is the capacity of each surface? bytes/surface = bytes/track × tracks/surface = 25K × 2000 = 50, 000K What is the capacity of the disk? bytes/disk = bytes/surface× surfaces/disk = 50, 000K × 5 × 2 = 500, 000K 6

Question 9.5 Consider a disk with a sector size of 512 bytes, 2000 tracks per surface, 50 sectors per track, five double- sided platters, and average seek time of 10 msec. How many cylinders does the disk have? 7

Question 9.5 Consider a disk with a sector size of 512 bytes, 2000 tracks per surface, 50 sectors per track, five double- sided platters, and average seek time of 10 msec. How many cylinders does the disk have? The number of cylinders is the same as the number of tracks on each platter, which is 2000. 8

Question 9.5 Consider a disk with a sector size of 512 bytes, 2000 tracks per surface, 50 sectors per track, five double- sided platters, and average seek time of 10 msec. Give examples of valid block sizes. Is 256 bytes a valid block size? 2048? 51200? 9

Question 9.5 Consider a disk with a sector size of 512 bytes, 2000 tracks per surface, 50 sectors per track, five double- sided platters, and average seek time of 10 msec. Give examples of valid block sizes. Is 256 bytes a valid block size? 2048? 51200? The block size should be a multiple of the sector size. We can see that 256 is not a valid block size while 2048 is. 51200 is not a valid block size in this case because block size cannot exceed the size of a track, which is 25600 bytes. 10

Question 9.5 Consider a disk with a sector size of 512 bytes, 2000 tracks per surface, 50 sectors per track, five double- sided platters, and average seek time of 10 msec. If the disk platters rotate at 5400 rpm (revolutions per minute), what is the maximum rotational delay? 11

Question 9.5 Consider a disk with a sector size of 512 bytes, 2000 tracks per surface, 50 sectors per track, five double- sided platters, and average seek time of 10 msec. If the disk platters rotate at 5400 rpm (revolutions per minute), what is the maximum rotational delay? If the disk platters rotate at 5400rpm, the time required for one complete rotation, which is the maximum rotational delay, is 1/5400 × 60 = 0.011seconds 12

Question 9.5 Consider a disk with a sector size of 512 bytes, 2000 tracks per surface, 50 sectors per track, five double- sided platters, and average seek time of 10 msec. If the disk platters rotate at 5400 rpm (revolutions per minute), what is the maximum rotational delay? If the disk platters rotate at 5400rpm, the time required for one complete rotation, which is the maximum rotational delay, is 1/5400 × 60 = 0.011seconds The average rotational delay is half of the rotation time, 0.006 seconds. 13

Question 9.5 Consider a disk with a sector size of 512 bytes, 2000 tracks per surface, 50 sectors per track, five double- sided platters, and average seek time of 10 msec. /*Reward*/ If one track of data can be transferred per revolution, what is the transfer rate? 14

Question 9.5 Consider a disk with a sector size of 512 bytes, 2000 tracks per surface, 50 sectors per track, five double- sided platters, and average seek time of 10 msec. /*Reward*/ If one track of data can be transferred per revolution, what is the transfer rate? The capacity of a track is 25K bytes. Since one track of data can be transferred per revolution, the data transfer rate is 25K/ 0.011= 2, 250Kbytes/second 15

Question 9.6 Consider again the disk specifications from Exercise 9.5, and suppose that a block size of 1024 bytes is chosen. Suppose that a file containing 100,000 records of 100 bytes each is to be stored on such a disk and that no record is allowed to span two blocks. 1. How many records fit onto a block? 16

Question 9.6 Consider again the disk specifications from Exercise 9.5, and suppose that a block size of 1024 bytes is chosen. Suppose that a file containing 100,000 records of 100 bytes each is to be stored on such a disk and that no record is allowed to span two blocks. 1. How many records fit onto a block? 1024/100 = 10. We can have at most 10 records in a block. 17

Question 9.6 Consider again the disk specifications from Exercise 9.5, and suppose that a block size of 1024 bytes is chosen. Suppose that a file containing 100,000 records of 100 bytes each is to be stored on such a disk and that no record is allowed to span two blocks. How many blocks are required to store the entire file? 18

Question 9.6 Consider again the disk specifications from Exercise 9.5, and suppose that a block size of 1024 bytes is chosen. Suppose that a file containing 100,000 records of 100 bytes each is to be stored on such a disk and that no record is allowed to span two blocks. How many blocks are required to store the entire file? we need 10,000 blocks to store the file. 19

Question 9.6 Consider again the disk specifications from Exercise 9.5, and suppose that a block size of 1024 bytes is chosen. Suppose that a file containing 100,000 records of 100 bytes each is to be stored on such a disk and that no record is allowed to span two blocks. If the file is arranged sequentially on the disk, how many surfaces are needed? 20

Question 9.5 Consider a disk with a sector size of 512 bytes, 2000 tracks per surface, 50 sectors per track, five double- sided platters, and average seek time of 10 msec. What is the capacity of a track in bytes? bytes/track = bytes/sector × sectors/track = 512 × 50 = 25K What is the capacity of each surface? bytes/surface = bytes/track × tracks/surface = 25K × 2000 = 50, 000K What is the capacity of the disk? bytes/disk = bytes/surface× surfaces/disk = 50, 000K × 5 × 2 = 500, 000K 21

Question 9.6 Consider again the disk specifications from Exercise 9.5, and suppose that a block size of 1024 bytes is chosen. Suppose that a file containing 100,000 records of 100 bytes each is to be stored on such a disk and that no record is allowed to span two blocks. If the file is arranged sequentially on the disk, how many surfaces are needed? One track has 25 blocks, one surface has 25*2000=100,000 blocks. we need 10,000 blocks to store this file. So we need less than one surface to store this file. Is it correct? 22

Question 9.5 Consider a disk with a sector size of 512 bytes, 2000 tracks per surface, 50 sectors per track, five double- sided platters, and average seek time of 10 msec. Image from: http://gerardnico.com/wiki/data_storage/disk 23

Question 9.6 Consider again the disk specifications from Exercise 9.5, and suppose that a block size of 1024 bytes is chosen. Suppose that a file containing 100,000 records of 100 bytes each is to be stored on such a disk and that no record is allowed to span two blocks. If the file is arranged sequentially on the disk, how many surfaces are needed? One track has 25 blocks, One cylindar has 25*5*2=250 blocks. we need 10,000 blocks to store this file. So we need 10,000/250 = 40 cylindars, i.e. We will need the 10 surfaces to store the file. 24

Question 9.6 Consider again the disk specifications from Exercise 9.5, and suppose that a block size of 1024 bytes is chosen. Suppose that a file containing 100,000 records of 100 bytes each is to be stored on such a disk and that no record is allowed to span two blocks. How many records of 100 bytes each can be stored using this disk?

Question 9.6 Consider again the disk specifications from Exercise 9.5, and suppose that a block size of 1024 bytes is chosen. Suppose that a file containing 100,000 records of 100 bytes each is to be stored on such a disk and that no record is allowed to span two blocks. How many records of 100 bytes each can be stored using this disk? The capacity of the disk is 500,000K, which has 500,000 blocks. Each block has 10 records. Therefore, the disk can store no more than 5,000,000 records.

Question 9.6 Consider again the disk specifications from Exercise 9.5, and suppose that a block size of 1024 bytes is chosen. Suppose that a file containing 100,000 records of 100 bytes each is to be stored on such a disk and that no record is allowed to span two blocks. How many records of 100 bytes each can be stored using this disk? The capacity of the disk is 500,000K, which has 500,000 blocks. Each block has 10 records. Therefore, the disk can store no more than 5,000,000 records.

Question 9.6 Consider again the disk specifications from Exercise 9.5, and suppose that a block size of 1024 bytes is chosen. Suppose that a file containing 100,000 records of 100 bytes each is to be stored on such a disk and that no record is allowed to span two blocks. If pages are stored sequentially on disk, with page 1 on block 1 of track 1, what page is stored on block 1 of track 1 on the next disk surface? Remember that there are 25 blocks per track

Question 9.6 Consider again the disk specifications from Exercise 9.5, and suppose that a block size of 1024 bytes is chosen. Suppose that a file containing 100,000 records of 100 bytes each is to be stored on such a disk and that no record is allowed to span two blocks. If pages are stored sequentially on disk, with page 1 on block 1 of track 1, what page is stored on block 1 of track 1 on the next disk surface? There are 25 blocks in each track. It is block 26 on block 1 of track 1 on the next disk surface.

Question 9.6 Consider again the disk specifications from Exercise 9.5, and suppose that a block size of 1024 bytes is chosen. Suppose that a file containing 100,000 records of 100 bytes each is to be stored on such a disk and that no record is allowed to span two blocks. How would your answer change if the disk were capable of reading and writing from all heads in parallel?

Question 9.6 Consider again the disk specifications from Exercise 9.5, and suppose that a block size of 1024 bytes is chosen. Suppose that a file containing 100,000 records of 100 bytes each is to be stored on such a disk and that no record is allowed to span two blocks. How would your answer change if the disk were capable of reading and writing from all heads in parallel? If the disk were capable of reading/writing from all heads in parallel, we can put the first 10 pages on the block 1 of track 1 of all 10 surfaces. Therefore, it is block 2 on block 1 of track 1 on the next disk surface.

Review - Time to access a block (access time) - Seek time: Time taken to move the disk heads to the track on which this block is located - Rotational Delay: Waiting time for the block to rotate under the disk head (use the average time which is half of the maximum rotation) - Transfer time: time to actually read or write the data in the block (time for the disk to rotate over the block) Access time = seek time + Rotational Delay + Transfer Time

Review - Time to access a block (access time) - Seek time: Time taken to move the disk heads to the track on which this block is located - Rotational Delay: Waiting time for the block to rotate under the disk head (use the average time which is half of the maximum rotational delay) - Transfer time: time to actually read or write the data in the block (time for the disk to rotate over the block) Access time = seek time + Rotational Delay + Transfer Time

Question 9.5 Consider a disk with a sector size of 512 bytes, 2000 tracks per surface, 50 sectors per track, five double- sided platters, and average seek time of 10 msec. If the disk platters rotate at 5400 rpm (revolutions per minute), what is the maximum rotational delay? If the disk platters rotate at 5400rpm, the time required for one complete rotation, which is the maximum rotational delay, is 1/5400 × 60 = 0.011seconds The average rotational delay is half of the rotation time, 0.006 seconds. 34

Question 9.5 Consider a disk with a sector size of 512 bytes, 2000 tracks per surface, 50 sectors per track, five double- sided platters, and average seek time of 10 msec. If one track of data can be transferred per revolution, what is the transfer rate? The capacity of a track is 25K bytes. Since one track of data can be transferred per revolution, the data transfer rate is 25K/ 0.011= 2, 250Kbytes/second 35

Question 9.6 Consider again the disk specifications from Exercise 9.5, and suppose that a block size of 1024 bytes is chosen. Suppose that a file containing 100,000 records of 100 bytes each is to be stored on such a disk and that no record is allowed to span two blocks. What time is required to read a file containing 100,000 records of 100 bytes each sequentially? Remember that the seek time is 0.01 mseconds

Question 9.6 What time is required to read a file containing 100,000 records of 100 bytes each sequentially? Access time = seek time + Rotational Delay + Transfer Time A track can be read/written in one rotation → Transfer Time = no of tracks * time of one rotation Since we have 400 tracks, Transfer Time is 400 × 0.011 = 4.4 seconds Sequential Access → We don't need rotational delay

Question 9.6 What time is required to read a file containing 100,000 records of 100 bytes each sequentially? – Seek time is the time to locate the tracks → Seek time of the file = no Of Cylinders * seek time of track. - The seek time is 40 × 0.01 = 0.4seconds. - Therefore, total access time is 4.4+ 0.4 = 4.8 seconds.

Question 9.6 Consider again the disk specifications from Exercise 9.5, and suppose that a block size of 1024 bytes is chosen. Suppose that a file containing 100,000 records of 100 bytes each is to be stored on such a disk and that no record is allowed to span two blocks. Again, how would your answer change if the disk were capable of reading/writing from all heads in parallel (and the data was arranged optimally)?

Question 9.6 Again, how would your answer change if the disk were capable of reading/writing from all heads in parallel (and the data was arranged optimally)? If the disk were capable of reading/writing from all heads in parallel, the disk can read 10 tracks at a time. The transfer time is 10 times less, which is 0.44 seconds. Thus total access time is 0.44 + 0.4 = 0.84seconds

Question 9.6 Consider again the disk specifications from Exercise 9.5, and suppose that a block size of 1024 bytes is chosen. Suppose that a file containing 100,000 records of 100 bytes each is to be stored on such a disk and that no record is allowed to span two blocks. What is the time required to read a file containing 100,000 records of 100 bytes each in a random order? To read a record, the block containing the record has to be fetched from disk. Assume that each block request incurs the average seek time and rotational delay.

Question 9.6 What is the time required to read a file containing 100,000 records of 100 bytes each in a random order? To read a record, the block containing the record has to be fetched from disk. Assume that each block request incurs the average seek time and rotational delay. Access time = Seek time + Rotational Delay + Transfer Time. The average access time for a block of data would be 16.44 msec. For a file containing 10,000 blocks, the total access time would be 164.4 seconds.