
1 Chapter 6: Storage

2 Magnetic media
Magnetic disks:
- are suitable for dynamic data that requires frequent changes
- provide good access time and a high transfer rate
- are used for data that must be kept online during data capture and processing; the data may be archived onto other magnetic or optical storage afterwards
- are suitable for video-on-demand applications, where large amounts of time-dependent information must be transferred at a high bit rate

3 RAID
- Redundant Arrays of Inexpensive Disks (developed at UC Berkeley in 1987).
- Later redefined as "...Independent..." (referring to the disks' failure modes).
- Uses parallelism between multiple disks to improve aggregate I/O performance.

4 RAID
- Data is distributed across several physical disks.
- An alternative to the single large expensive disk (SLED) of traditional mainframe systems.
- There are several levels of RAID, each seeking a different balance among:
  - performance
  - (data) availability
  - cost

5 RAID
Advantages:
- high data transfer rate for large data accesses (multiple disks can serve the same request at the same time)
- short queueing time on small data accesses (multiple requests can be served at the same time)
- uniform load balancing across all of the disks
Disadvantage:
- Large disk arrays are highly vulnerable to disk failures. E.g., single-disk MTTF (mean time to failure) = 200,000 hrs → 100-disk array MTTF = 2000 hrs, or about 3 months.
- → redundancy must be added for better availability → write overhead!
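
The MTTF figure above follows from a simple model. As a minimal sketch, assuming disks fail independently at a constant rate and the array has no redundancy (so it fails as soon as any one disk fails), the array MTTF is the single-disk MTTF divided by the number of disks. The function name is illustrative.

```python
def array_mttf_hours(disk_mttf_hours: float, num_disks: int) -> float:
    """MTTF of a non-redundant array under the independence assumption."""
    return disk_mttf_hours / num_disks


mttf = array_mttf_hours(200_000, 100)
# Convert to days to compare with the "about 3 months" on the slide.
print(mttf, "hours =", round(mttf / 24), "days")
```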

6 Data striping
[Figure: a file striped across multiple disks]

7 Data striping
- Distribute data transparently over multiple disks to make them appear as a single, fast, large disk.
- Multiple I/Os can be served in parallel:
  - parallel independent requests → shorter queueing time
  - parallel accesses of a single request → higher transfer rate
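
One common way to realize striping is a round-robin mapping from logical block numbers to disks. The sketch below assumes a stripe unit of one block and a hypothetical helper name; real arrays may use larger stripe units.

```python
def locate_block(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block to (disk index, block offset on that disk)."""
    disk = logical_block % num_disks      # consecutive blocks go to consecutive disks
    offset = logical_block // num_disks   # one block per disk per stripe
    return disk, offset


# Blocks 0..7 on a 4-disk array: a large request touches every disk,
# while two small requests for blocks 0 and 1 land on different disks.
for block in range(8):
    print(block, locate_block(block, 4))
```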

8 Redundancy
- Many disks → low reliability.
- Redundancy is needed to tolerate disk failures.

9 Redundancy
Two important factors in RAID design:
- Granularity of data interleaving:
  - fine: distribute the data so that all of the array's disks cooperate in servicing every request → high I/O transfer rate, but requests suffer a longer queueing time and thus a higher access delay
  - coarse: small requests → access few disks, short queueing time; large requests → access all disks, good transfer rate
- Data and parity distribution:
  - How is the redundant information computed?
  - Where should the redundant information reside?

10 RAID: Level 0 (non-redundant)
- data striping only
- no redundancy, so no data protection
- no need to write redundant information → best write performance (among all RAID levels)
- any disk failure → data loss
- used in supercomputing environments where performance and capacity, rather than reliability, are the primary concerns

11 RAID: Level 1 (mirrored)
- uses twice as many disks as level 0
- data is duplicated (redundant data); this is called mirroring, or shadowing
- reads are faster, but writes are slightly slower (why?)
- If a disk fails, its mirror copy can still serve requests.
- used in database applications where availability and transaction rate are more important than storage efficiency

12 RAID 1
Q: Compare the following 3 RAID-1 configurations.
[Figure: three mirrored layouts, labelled Simple Shadowing, Declustering, and Chained Declustering]

13 RAID: Level 2 (error correction)
- uses a Hamming code
- For n disks, about log2(n) of them store redundant data. (More space-efficient than mirroring.)
- If a disk fails, multiple redundant disks need to be read to identify the bad one. However, only one redundant disk needs to be read to recover the lost data.
- RAID-2 assumes disk logic can't tell which disk is returning faulty readings. No practical use.

14 RAID: Level 4 (block-interleaved parity)
- Current disk controller logic can generally detect and identify disk failures.
- Note that a single parity disk is enough to recover data lost to a single disk failure.
- Performance:
  - small read → access one data disk
  - large read → access many data disks
  - small write → 4 I/Os (read the data disk and the parity disk, compute the difference between the old and new images, update the data disk, update the parity disk)
  - Q: what about large writes?

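15 Parity disk
[Figure: a file striped in 2-bit units across four data disks, with a parity disk]
  stripe   data disks     parity disk
  1        00 01 11 01    11
  2        11 00 11 01    01
Parity bit: 1 = odd number of 1's, 0 = even number of 1's.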
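
The parity rule (1 for an odd number of 1's, 0 for an even number) is a bitwise XOR across the data disks, and the same XOR recovers a lost unit. The sketch below is illustrative only: it assumes four data disks with 2-bit stripe units, using bit values read off the slide's figure, and the function name is hypothetical.

```python
from functools import reduce


def xor_all(units: list[int]) -> int:
    """Bitwise XOR of all units: 1 where an odd number of inputs have a 1."""
    return reduce(lambda a, b: a ^ b, units, 0)


stripe = [0b00, 0b01, 0b11, 0b01]   # assumed contents of one stripe
parity = xor_all(stripe)

# Simulate losing disk 2: XOR the surviving data units with the parity unit.
survivors = stripe[:2] + stripe[3:] + [parity]
recovered = xor_all(survivors)
print(format(parity, "02b"), format(recovered, "02b"))
```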

16 RAID: Level 4 (block-interleaved parity)
Two update schemes:
- Read-modify-write
- Regenerate-write

17 Read-modify-write
[Figure: the file striped across the data disks, with its parity disk (11 01). The first four bits of the file, 00 01, are to be changed to 11 11.]

18 Read-modify-write
Read: the old data (00 01) and the old parity (11).

19 Read-modify-write
Modify: compute the new parity (11 becomes 10).

20 Read-modify-write
Write: the new data and the new parity are written; the parity disk now reads 10 01.
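
The read-modify-write steps can be sketched as follows. This is a minimal illustration, not a controller implementation: it assumes XOR parity and 2-bit units, and uses values read off the slides' figure (two units of one stripe changing from 00 01 to 11 11, old parity 11). The function name is hypothetical.

```python
def rmw_parity(old_parity: int, old_data: int, new_data: int) -> int:
    """New parity from the old parity and the old/new images of one data unit.

    Only the changed data disk and the parity disk are read; the other
    data disks in the stripe are not touched.
    """
    return old_parity ^ old_data ^ new_data


parity = 0b11
# Apply the update once per changed unit.
for old, new in [(0b00, 0b11), (0b01, 0b11)]:
    parity = rmw_parity(parity, old, new)
print(format(parity, "02b"))
```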

21 Regenerate write
[Figure: the same file and parity disk (11 01). The first stripe is to be changed to 11 11 00 00.]

22 Regenerate write
Write: the new stripe is written and its parity regenerated; the parity disk now reads 00 01.
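
A regenerate write can be read as computing the parity afresh from the stripe's data rather than patching the old parity. The sketch below assumes a full-stripe write of four 2-bit units (11 11 00 00, as in the figure) and XOR parity; the function name is hypothetical.

```python
from functools import reduce


def regenerate_parity(stripe_units: list[int]) -> int:
    """Parity computed from the stripe's data alone; no old data or parity is read."""
    return reduce(lambda a, b: a ^ b, stripe_units, 0)


new_stripe = [0b11, 0b11, 0b00, 0b00]
new_parity = regenerate_parity(new_stripe)
print(format(new_parity, "02b"))
```

When the write covers the whole stripe, as here, all the inputs are already in hand, which is one way to approach the slide's earlier question about large writes.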

23 Enhancing RAID-4

Parity Disk
  Disk 1  Disk 2  Disk 3  Disk 4
  A1      A2      A3      Ap
  B1      B2      B3      Bp
  C1      C2      C3      Cp
  D1      D2      D3      Dp

Striped Parity
  Disk 1  Disk 2  Disk 3  Disk 4
  A1      A2      A3      Ap
  B1      B2      Bp      B3
  C1      Cp      C2      C3
  Dp      D1      D2      D3

24 Distributing parity
- Parity disk:
  - simplifies the mapping of logical addresses to disk addresses
  - every write must update the associated bits on the single parity disk
- Striped parity:
  - parity updates can be performed in parallel

25 RAID: Level 5 (block-interleaved distributed parity)
- eliminates the parity disk bottleneck by distributing the parity uniformly over all of the disks
- improves read performance by allowing all disks to be used to serve read requests
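
Distributed parity needs a rule for which disk holds the parity of each stripe. The sketch below shows one possible rotation, chosen because it reproduces the striped-parity layout in the "Enhancing RAID-4" slide (Ap on the last disk, then one disk to the left per stripe). Other rotations are used in practice, and the function names are hypothetical.

```python
def parity_disk(stripe: int, num_disks: int) -> int:
    """Index of the disk holding parity for this stripe (rotates right to left)."""
    return (num_disks - 1 - stripe) % num_disks


def stripe_row(stripe: int, num_disks: int) -> list[str]:
    """Labels for one stripe, e.g. ['B1', 'B2', 'Bp', 'B3']."""
    name = chr(ord("A") + stripe)
    p = parity_disk(stripe, num_disks)
    row, k = [], 1
    for disk in range(num_disks):
        if disk == p:
            row.append(name + "p")
        else:
            row.append(f"{name}{k}")
            k += 1
    return row


for s in range(4):
    print(" ".join(stripe_row(s, 4)))
```

Because consecutive stripes keep their parity on different disks, small writes to different stripes do not all queue on one disk.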

26 RAID: Level 6 (P+Q redundancy)
- uses 2 redundant disks to protect against up to two disk failures
- computes 2 different parities instead of 1
- read performance is similar to Level 5, but write performance is slightly worse

27 Optical media
- Well accepted because of:
  - high storage capacity
  - random access to data
  - life span of more than 30 years (cf. << 20 years for magnetic media)
  - portability
- The optical videodisk was invented by Friebus in 1929. A prototype using a laser to record and read was demonstrated by Philips and MCA in 1972.

28 Optical media
- Videodisks developed by Philips have been commercially available since 1978.
- Compact disk technology for digital audio (CD-DA) then came out in the early 1980s. The use of optical disks for digital data storage came with the introduction and improvement of CD-ROM during the 1980s.

29 Optical disk technology
- Optical storage media use the intensity of reflected laser light as an information source.

30 Optical disk technology
- An optical disk consists of 3 layers:
  - Protective layer (a very thin layer on the label side)
  - Reflective layer (aluminum coating)
  - Substrate layer (transparent)
- In the factory, depressions are cut on the disk surface, forming "lands" and "pits" (0.12 µm difference in height).

31 Optical disk technology
- Laser light reflected from the lands and pits gives different readings.
- Do you know:
  - that data are read from the disk "inside-out"?
  - that the spiral track is about 3.5 miles long?
  - that you should never write on the label side?
  - that a CD should be cleaned "radially"?

32 Advantages of optical media
- High density (for a portable device). The distance between tracks is 1.6 µm and each track is 0.6 µm wide, i.e., 1 bit per sq. µm or 1 Mb per sq. mm. A floppy disk has 96 tracks per inch while an optical disk has 16000 tracks per inch.
- Long life. Magnetization can decrease over time. "Lands" and "pits" do not change unless physically damaged. (However, poor manufacturing or mishandling may lead to "laser rot".)
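
The density figures above can be checked with unit conversions alone. This is a small sketch; the helper names are illustrative.

```python
MICRONS_PER_INCH = 25_400


def tracks_per_inch(track_pitch_um: float) -> float:
    """Number of tracks that fit in one inch at the given track pitch."""
    return MICRONS_PER_INCH / track_pitch_um


def bits_per_sq_mm(bits_per_sq_um: float) -> float:
    """1 mm = 1000 µm, so 1 sq. mm = 1,000,000 sq. µm."""
    return bits_per_sq_um * 1_000 * 1_000


print(round(tracks_per_inch(1.6)))   # close to the 16000 quoted on the slide
print(bits_per_sq_mm(1))             # 1 bit per sq. µm expressed per sq. mm
```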

33 Advantages of optical media
- Low wear. The laser source in the head can be positioned 1 mm from the disk surface. The disk head does not have to be as close to the surface as with magnetic disks. This reduces friction and increases life span.

34 Digital optical disks
- Audio CD was developed by Philips and Sony in 1982.
- Basic technology extended to the 650 MB CD-ROM in 1985. (1 MB = 1024^2 bytes)
- At single speed, the data rate is 150 KBps. (1 KB = 1024 bytes)
- CD-ROM/XA was announced in 1986 to support applications of text, images, audio and FSFM (full-screen full-motion) video.
- Recent developments include CD-R, CD-RW, MO (magneto-optical), and DVD.

35 CD-DA (Compact Disk Digital Audio)
- 1982, by Philips and Sony.
- 12 cm diameter, 1.2 mm thick optical disk; stores/plays in CLV (constant linear velocity). Spiral track of about 20,000 windings in total.
- Data are recorded such that pit-to-land and land-to-pit transitions code '1's. '0's are coded as no transition.
- Redundancy is added to break up consecutive '1's and '0's.
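
The transition rule can be illustrated with a short decoder. This is a sketch only: it models the disc surface as a string with one character per channel-bit cell ('P' for pit, 'L' for land), which is an assumed representation, and the function name is hypothetical.

```python
def channel_bits(surface: str) -> str:
    """A change between neighbouring cells reads as '1', no change as '0'."""
    return "".join(
        "1" if previous != current else "0"
        for previous, current in zip(surface, surface[1:])
    )


# Two lands, then a four-cell pit, then lands again:
# the only 1s are at the edges of the pit.
print(channel_bits("LLLPPPPLLL"))
```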

36 CD-DA
- Data rate: 44.1 kHz sampling, 16-bit quantization, 172.27 KBps
- Capacity: 747 MB, 74 min of high-quality sound
- Capability of random access to tracks and index points.
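
The data rate and capacity follow from the sampling parameters. The sketch below assumes two audio channels (stereo), which the slide does not state, and uses 1 KB = 1024 bytes and 1 MB = 1024^2 bytes.

```python
SAMPLE_RATE_HZ = 44_100
BYTES_PER_SAMPLE = 2        # 16-bit quantization
CHANNELS = 2                # assumption: stereo
PLAY_TIME_S = 74 * 60

bytes_per_second = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS
rate_kb = bytes_per_second / 1024
capacity_mb = bytes_per_second * PLAY_TIME_S / 1024**2

print(f"{rate_kb:.2f} KBps, {capacity_mb:.0f} MB")
```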

37 8 to 14 modulation (EFM)
- Pits and lands may not follow too closely one after another on a CD-DA.
  - Rule 1: between any 2 '1's, there are at least 2 '0's.
- For synchronization, pit or land sequences are not allowed to be too long.
  - Rule 2: at most 10 '0's can follow one after another.
- Solution: map every 8-bit pattern into a 14-bit pattern that satisfies the 2 rules. Among the 2^14 patterns, 267 of them are valid.
- Also, between consecutive 14-bit sequences, 3 merging bits are added to enforce the rules.
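
The count of valid 14-bit patterns can be reproduced by brute force. The sketch below applies the two rules to each 14-bit word on its own (the merging bits between words are not modelled); the function name is illustrative.

```python
def satisfies_efm_rules(word: str) -> bool:
    """Check one 14-bit word, given as a string of '0' and '1'."""
    # Rule 1: at least two 0s between any two 1s.
    if "11" in word or "101" in word:
        return False
    # Rule 2: at most ten 0s in a row.
    if "0" * 11 in word:
        return False
    return True


valid = [w for w in (format(i, "014b") for i in range(2**14)) if satisfies_efm_rules(w)]
print(len(valid))
```

Since an 8-bit byte needs only 256 code words, the valid patterns are enough to cover every byte value.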

38 8 to 14 modulation (Example)

39 Low level data encoding
- Thus, an eight-bit byte of actual data is encoded into a total of 17 channel bits.
- For synchronization and error correction, every 24 bytes of data are packaged into a frame (numbers in brackets are channel bits):
  - sync pattern (24 + 3 bits)
  - control byte (17 bits)
  - 12 data bytes (12 * 17 bits)
  - 4 error correction bytes (4 * 17 bits)
  - 12 data bytes (12 * 17 bits)
  - 4 error correction bytes (4 * 17 bits)
- Total: 588 channel bits for 192 actual data bits.
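
The frame total can be verified by adding up the fields listed above. This is a plain arithmetic check; the variable names are illustrative.

```python
CHANNEL_BITS_PER_BYTE = 14 + 3   # 14-bit code word plus 3 merging bits

frame_fields = [
    ("sync pattern", 24 + 3),
    ("control byte", 1 * CHANNEL_BITS_PER_BYTE),
    ("data bytes", 12 * CHANNEL_BITS_PER_BYTE),
    ("error correction bytes", 4 * CHANNEL_BITS_PER_BYTE),
    ("data bytes", 12 * CHANNEL_BITS_PER_BYTE),
    ("error correction bytes", 4 * CHANNEL_BITS_PER_BYTE),
]

channel_bits_per_frame = sum(bits for _, bits in frame_fields)
data_bits_per_frame = 24 * 8

print(channel_bits_per_frame, data_bits_per_frame)
print(f"{data_bits_per_frame / channel_bits_per_frame:.1%} of channel bits carry user data")
```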

40 First level error correction
- Recall that each frame contains 24 data bytes and 8 error correction bytes.
- The first 4 correction bytes cover the frame's data. The other 4 correction bytes cover data over 7 frames.
- When a frame is read, the first 4 correction bytes are checked. If they are not OK, the decoder decodes the data bytes after subsequent correction codes are read.
- 7 frames = 7.7 mm track length. Try covering part of your CD with a small piece of paper and see if it still works.

41 Interleaved coding
- An audio CD records samples from an audio wave. Successive sample values are closely related.
- A scratch on a CD may wipe out a continuous segment of data.
- With interleaving, the samples are dispersed:
  samples:      1 2 3 4 5 6 7 6 5 4 3 4 5 4 5 4 5 6 7 8 9 8 7 6 7
  data on a CD: 1 6 3 4 9 2 7 4 5 8 3 6 5 6 7 4 5 4 7 6 5 4 5 8 7

42 Interleaved coding
- Suppose a burst error occurs, destroying 4 samples:
  samples:      1 2 3 4 5 6 7 6 5 4 3 4 5 4 5 4 5 6 7 8 9 8 7 6 7
  data on a CD: 1 6 3 4 9 2 7 4 5 8 3 6 5 6 7 4 5 4 7 6 5 4 5 8 7
- Interpolation restores (approximately) the sample values:
  restored:     1 2 3 4 5 6 6 6 5 4 3 4 5 4 5 4 5 6 7 8 8 8 7 6 7
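
The slide's numbers are consistent with a simple 5 x 5 block interleave: write the 25 samples row by row and read them out column by column. The sketch below assumes that scheme and assumes the burst hits positions 4 to 7 of the data on the CD; both are inferences from the numbers, not something the slides state. Real CD-DA interleaving is more elaborate.

```python
N = 5  # assumed block size: 25 samples arranged as a 5 x 5 matrix

samples = [1, 2, 3, 4, 5, 6, 7, 6, 5, 4, 3, 4, 5, 4, 5, 4, 5, 6, 7, 8, 9, 8, 7, 6, 7]


def interleave(xs, n=N):
    """Write row by row, read column by column."""
    return [xs[r * n + c] for c in range(n) for r in range(n)]


def deinterleave(ys, n=N):
    """Inverse of interleave()."""
    return [ys[c * n + r] for r in range(n) for c in range(n)]


on_disc = interleave(samples)

# A burst error destroys four consecutive values on the disc.
damaged = list(on_disc)
for i in range(4, 8):
    damaged[i] = None

# After de-interleaving, the losses are isolated, so each one can be
# estimated as the average of its two intact neighbours.
received = deinterleave(damaged)
restored = [
    (received[i - 1] + received[i + 1]) // 2 if value is None else value
    for i, value in enumerate(received)
]
print(restored)
```

Without interleaving, the same burst would remove four neighbouring samples and leave nothing nearby to interpolate from.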

43 CD-ROM (Compact Disk Read Only)
- 1985, by Philips and Sony.
- Tracks are divided into audio and data types. Disks containing both types are called Mixed Mode Disks.
- It operates in 2 modes:
  - Mode 1: for computer data
  - Mode 2: for media data

44 CD-ROM (Compact Disk Read Only)
Mode 1
- Mode 1 achieves a better (lower) error rate by using a second level of error correction.
- Random access to sub-track units called blocks (2352 bytes). (For CD-DA, random access is on track level only.)
- Each block has an address of the form min:sec:block. This is specified in the Header field.

45 CD-ROM
Mode 1, for computer data.
- A capacity of 333,000 blocks to be played in 74 min, i.e., 650 MB of storage with a data rate of 150 KBps.
- Block layout (2352 bytes):
  Sync 12 | Header 4 | User Data 2048 | EDC 4 | Blanks 8 | ECC 276
- Note: each 24 of the 2352 bytes are represented physically on the disc as a frame (i.e., 588 channel bits). Each block is thus represented by 98 frames.
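
The Mode 1 figures can be derived from the block count and the block layout. This is a short arithmetic sketch with illustrative variable names, using 1 KB = 1024 bytes and 1 MB = 1024^2 bytes.

```python
BLOCKS = 333_000
PLAY_TIME_S = 74 * 60
BLOCK_BYTES = 2352
USER_BYTES_MODE1 = 2048
FRAME_DATA_BYTES = 24

blocks_per_second = BLOCKS / PLAY_TIME_S
capacity_mb = BLOCKS * USER_BYTES_MODE1 / 1024**2
rate_kb = blocks_per_second * USER_BYTES_MODE1 / 1024
frames_per_block = BLOCK_BYTES // FRAME_DATA_BYTES

print(blocks_per_second, "blocks per second")
print(f"{capacity_mb:.0f} MB at {rate_kb:.0f} KBps, {frames_per_block} frames per block")
```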

46 CD-ROM
Mode 2
- Mode 2 holds data of any media. Additional error correction is not crucial, so it is not used.
- The disk has a capacity of ~740 MB and a data rate of 171 KBps.
- Block layout (2352 bytes):
  Sync 12 | Header 4 | User Data 2336
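
The Mode 2 numbers follow from the larger user-data field. The sketch below assumes the same 333,000 blocks and 74 minutes of play time quoted for Mode 1.

```python
BLOCKS = 333_000             # assumption: same block count as Mode 1
PLAY_TIME_S = 74 * 60
USER_BYTES_MODE2 = 2336

capacity_mb = BLOCKS * USER_BYTES_MODE2 / 1024**2
rate_kb = (BLOCKS / PLAY_TIME_S) * USER_BYTES_MODE2 / 1024

print(f"{capacity_mb:.0f} MB at {rate_kb:.0f} KBps")
```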

47 CD-ROM
- CD-ROM is a very economical medium for publication and distribution.
- Limitations of CD-ROM:
  - Random access to a CD track takes anywhere from 200 ms up to 1 sec. Optical disk heads are heavier than magnetic heads, and the greater inertia means longer seek times for head movements.
  - Continuous media are stored sequentially in CD-ROM tracks. Simultaneous playback of audio and other data, although important for multimedia applications, is not possible.

48 CD-ROM/XA (Extended Architecture)
- 1989, established by Microsoft, Philips and Sony.
- Goal: concurrent output of several media. Within one track, blocks of different media can be stored. This allows interleaved storage and retrieval of multimedia data.
- A sub-header is added to each block to describe the block.
- CD-ROM/XA uses CD-ROM mode 2 to define actual blocks. There are two forms:

49 CD-ROM/XA
- Form 1 provides more error detection/correction at the expense of redundancy. 2048 bytes (of 2352) are for user data.
  Sync 12 | Header 4 | Subheader 8 | User Data 2048 | EDC 4 | ECC 276
- Form 2 allows 13% more storage for user data, but at the expense of error correction.
  Sync 12 | Header 4 | Subheader 8 | User Data 2324 | EDC 4
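
A quick consistency check on the block layouts: each set of fields should add up to 2352 bytes, and the ratio of Form 2 to Form 1 user data should give the quoted 13%. The dictionary keys are illustrative labels.

```python
BLOCK_BYTES = 2352

layouts = {
    "Mode 1": {"sync": 12, "header": 4, "user": 2048, "edc": 4, "blanks": 8, "ecc": 276},
    "Mode 2": {"sync": 12, "header": 4, "user": 2336},
    "XA Form 1": {"sync": 12, "header": 4, "subheader": 8, "user": 2048, "edc": 4, "ecc": 276},
    "XA Form 2": {"sync": 12, "header": 4, "subheader": 8, "user": 2324, "edc": 4},
}

for name, fields in layouts.items():
    total = sum(fields.values())
    print(f"{name}: {total} bytes, {fields['user'] / BLOCK_BYTES:.0%} user data")

extra = layouts["XA Form 2"]["user"] / layouts["XA Form 1"]["user"] - 1
print(f"Form 2 holds {extra:.0%} more user data than Form 1")
```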

50 CD-R (Compact Disk Recordable)
- CD-R allows tracks to be recorded once.
- 4 layers: protective, reflective, dye, and substrate.
[Figure: layer comparison]
- Traditional CD-ROM: lacquer, aluminum, polycarbonate. "Molded" by a stamper.
- CD-R media: lacquer, gold (24K), dye, polycarbonate. Burned by a high-power laser beam. Don't leave out in sunlight.

51 CD-R
- Land and pit reflections are realized by an irreversible thermal effect (above 250 °C) on the dye.
- Playable on CD players.

52 CD-R
Recording sessions
- A CD has 3 areas: lead-in, actual data, lead-out.
  - Lead-in includes the table of contents: directory, indices to individual tracks.
  - Data area includes all tracks where actual data is stored.
  - Lead-out marks the end of the data area.
[Figure: Session 1, Session 2, ...; each session consists of lead-in, information, lead-out]

53 CD-R
Recording sessions
- Multiple sessions of lead-in, data, lead-out can be written separately over time.
- During one write activity, all data for a session are written together with their table of contents, after which the session can be read by a CD-ROM drive.
[Figure: Session 1, Session 2, ...; each session consists of lead-in, information, lead-out]

54 CD-MO (Compact Disk Magneto Optical)
- Specification published by Philips and Sony in 1991.
- The working principle is different from other CD technologies (incompatible with other CD formats).
  - Based on the polarization of light by a magnetic field.
  - The disk surface is a light-reflecting magnetic substrate.
  - During writing, the surface is heated to above 150 °C, and a magnetic field polarizes individual dipoles.
  - During reading, the surface is irradiated with a laser beam, and the polarization of the laser light changes according to the magnetization.

55 CD-RW
- An alloy is used in CD-RW that can take on two states:
  - crystalline: reflects light well
  - amorphous: doesn't reflect light well
- To change the alloy into the crystalline state or the amorphous state, laser beams of different power are used:
  - "Write power": amorphous state
  - "Erase power": crystalline state
  - "Read power": no state change, used to pick up the readings from the disk

56 Sony MiniDisc
- A small MO disk with a data capacity of about 1/5 of a CD.
- An MD can record about the same length of music as a CD since the audio is compressed.
- The compression technique used is called ATRAC (Adaptive Transform Acoustic Coding). Typical compression ratio is 5:1.
- Like MP3, ATRAC uses psychoacoustics.

57 Digital Versatile Disk (DVD)
- also called Digital Video Disk
- capacity: 4.7 to 17 billion bytes (25 CDs)
- digital video can be stored and distributed more cheaply than on tape; it also allows interactivity
- can be used to store up to 133 minutes (8-9 hrs for high capacity DVDs) of studio-quality video and multi-channel surround-sound audio, or 30 hours of CD-quality audio

58 DVD
DVD achieves a greater capacity by:
- reducing the minimum pit length from 0.834 micron (CD) to 0.4 micron (DVD)
- reducing the inter-track space from 1.6 micron (CD) to 0.74 micron (DVD)
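
As a rough estimate, the two reductions multiply: shorter pits pack more bits along the track, and a tighter track pitch packs more tracks across the disc. The sketch below assumes capacity scales linearly with each factor and takes the 650 MB CD-ROM as the baseline; it ignores differences in coding and usable area.

```python
CD_PIT_UM, DVD_PIT_UM = 0.834, 0.4
CD_TRACK_PITCH_UM, DVD_TRACK_PITCH_UM = 1.6, 0.74
CD_CAPACITY_MB = 650   # assumption: CD-ROM Mode 1 capacity as the baseline

along_track_gain = CD_PIT_UM / DVD_PIT_UM
across_track_gain = CD_TRACK_PITCH_UM / DVD_TRACK_PITCH_UM
density_gain = along_track_gain * across_track_gain

print(f"density gain: {density_gain:.2f}x")
print(f"estimated capacity from geometry alone: {CD_CAPACITY_MB * density_gain / 1000:.1f} GB")
```

The geometry alone falls short of 4.7 GB under these assumptions, so part of the gain has to come from other factors, such as coding efficiency.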

59 DVD
- To read the condensed pits, DVD uses a laser of a shorter wavelength (635-650 nm; for CD it is 780 nm).
- Reducing the pit size and track distance increases the disc's capacity to 4.7GB.
- Dual layering. A semi-reflective layer on top of a fully reflective layer → 8.5GB total.
- Double side. Two substrates bonded back-to-back. Each side could have one layer or two layers → capacity ranges from 9.4GB to 17GB.

60 DVD
[Figure: disc cross-sections, labelled 1.2mm and 0.6mm]
- single-sided, single layer: 4.7GB
- single-sided, dual layer: 8.5GB
- double-sided, dual layer: 17GB

61 DVD
Other factors that improve the storage capacity:
- 8-16 modulation instead of 8-14 + 3 merge bits
- slightly larger "usable" surface area
- more efficient error coding

62 DVD
Other features:
- Error correction is about 10 times better than that of CD.
- Some DVDs are recorded using opposite track path (OTP), i.e., one spiral layer starts from the center, followed by another spiral layer that runs from the rim towards the center. OTP reduces the seek time when the player switches layers during video playback.

