Using Algebraic Signatures in Storage Applications Thomas Schwarz, S.J. Associate Professor, Santa Clara University Associate, SSRC UCSC Storage Systems.


Using Algebraic Signatures in Storage Applications. Thomas Schwarz, S.J., Associate Professor, Santa Clara University; Associate, SSRC. UCSC Storage Systems Research Center, University of California, Santa Cruz. Retreat, June 1-2, 2004.

Signatures. Small strings that characterize objects, calculated from the object. Distinct signatures ⇒ the objects are different. Same signatures ⇒ the objects are the same, with high probability. The error probability is $2^{-f}$, where f is the length of the signature in bits. Also known as checksums, hashes, fingerprints, condensed representations, …

Signatures: Examples. Tripwire: protection against malware. Maintains the signatures of all system libraries in a secure location. Before a library module is called, verifies the signature of the module.

Signatures: Examples. Remote comparison of files: the problem arose out of the first prototypes of replicated databases. Divide records into pages. Calculate and compare signatures for all pages. Do this efficiently by combining the signatures of a set of pages into a super-signature.

Signatures. Integrity check for archival storage: keep two copies of archived data. Maintain the signatures of tape contents. Periodically “scrub” tapes.

Signatures. Similarity measurement between files: similarity of web pages; similarity of files in Deep-Store.

Signatures For Scalable Distributed Data Structures. An SDDS implements a large file of records in buckets distributed over a network. SDDS operations (insert, update, delete, read, scan) have execution times independent of SDDS file size. Use signatures of blocks to decide which portions of the bucket need to be backed up. More secure than a dirty bit. Litwin, W., Mokadem, R., Schwarz, T.: Disk Backup through algebraic signatures in scalable and distributed data structures. Proc. 5th Workshop on Distributed Data and Structures, Thessaloniki, June 2003 (WDAS 2003).

Signatures For Scalable Distributed Data Structures. Use signatures of records to test whether they have been changed. This leads to a read-verification based concurrency scheme: read the record, process the record, then verify by signature that the record has not changed. Litwin, W., Mokadem, R., Schwarz, T.: Disk Backup through algebraic signatures in scalable and distributed data structures. Proc. 5th Workshop on Distributed Data and Structures, WDAS'03, Thessaloniki, June 2003. Schwarz, T., Holliday, J.: A Signature Based Concurrency Scheme for Scalable Distributed Data Structures. Workshop on Distributed Data and Structures, WDAS'04, Lausanne, 2004.
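
The read, process, verify sequence above can be sketched as follows. This is only an illustration under stated assumptions: `Bucket`, `write_if_unchanged`, and `update` are hypothetical names, the bucket is a plain in-memory dictionary, and `zlib.crc32` is used purely as a stand-in signature function, not as the algebraic signature discussed in the talk.

```python
import zlib


class Bucket:
    """Hypothetical in-memory stand-in for an SDDS bucket."""

    def __init__(self):
        self.records = {}

    def read(self, key):
        return self.records[key]

    def write_if_unchanged(self, key, new_value, expected_sig, sig=zlib.crc32):
        """Commit only if the stored record still has the signature seen at read time."""
        if sig(self.records[key]) != expected_sig:
            return False  # someone else changed the record in the meantime
        self.records[key] = new_value
        return True


def update(bucket, key, process, sig=zlib.crc32):
    """Read the record, process it, then verify by signature before committing."""
    value = bucket.read(key)
    seen_sig = sig(value)          # signature of the record as read
    new_value = process(value)     # arbitrary client-side processing
    return bucket.write_if_unchanged(key, new_value, seen_sig, sig)
```

A caller that gets `False` back would typically re-read the record and retry; only the signature, not the whole record, has to be compared at commit time.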

Cryptographically Secure Signatures. It is computationally infeasible to find another object with the same signature, which protects against malicious attacks. Used to protect data integrity or to sign data: object → signature → apply private key K → K(signature); store the encrypted signature K(signature) with the object.

Cryptographically Secure Signatures. MD (Rivest): 16 B. SHA (NSA; FIPS 180 / ANSI x B). … Each implements a one-way hash.

Signatures with Algebraic Properties. Composable signatures*: the signature of an object can be calculated from the signatures of its component objects. Updatable signatures: the new signature of a changed object can be calculated from the old signature and the signature and location of the change. * Suel, T., Noel, P., and Trendafilov, D.: Improved File Synchronization for Maintaining Large Replicated Collections over Slow Networks. In Proc. 20th Int. Conf. on Data Engineering, ICDE, Boston, 2004. Litwin, W., Schwarz, T.: Algebraic Signatures for Scalable Distributed Data Structures. Proc. of the 20th International Conference on Data Engineering (ICDE), Boston, 2004.

Signatures with Algebraic Properties. Algebraic properties prevent cryptographic security: a fundamental design trade-off.

Algebraic Signatures. Karp-Rabin signatures over Galois fields. A Galois field defines addition, subtraction, multiplication, and division over bit strings of length f, with the same mathematical rules as for rational numbers, real numbers, complex numbers, etc. Single and compound signature of $P = (p_1, p_2, \ldots)$. Karp, R. and Rabin, M.: Efficient randomized pattern-matching algorithms. IBM Journal of Research and Development, Vol. 31, No. 2, March. Schwarz, T., Bowdidge, R. and Burkhard, W.: Low Cost Comparison of File Copies. In Proc. Intern. Conf. on Distributed Computing Systems, Paris, Fr., 1990 (ICDCS 5 Proceedings).
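
As a rough illustration of single and compound signatures, the sketch below assumes f = 8 (symbols are bytes), the field GF(2^8) built from the polynomial 0x11D with α = 2 as primitive element, the single signature sig_α(P) = p_1 + p_2·α + p_3·α² + …, and the compound signature (sig_α, sig_{α²}, …, sig_{α^m}). The field parameters, the exact indexing, and the function names are assumptions made for illustration; the slides do not fix them.

```python
def gf_mul(a, b, poly=0x11D):
    """Multiply two elements of GF(2^8): carry-less product reduced modulo poly."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2^f) is XOR
        a <<= 1
        if a & 0x100:
            a ^= poly            # reduce when the degree reaches 8
        b >>= 1
    return result


def gf_pow(a, n):
    """Raise a field element to a non-negative integer power."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def sig(page, alpha=2):
    """Single signature: XOR-sum of p_i * alpha^(i-1) over the symbols of the page."""
    total, power = 0, 1
    for symbol in page:
        total ^= gf_mul(symbol, power)
        power = gf_mul(power, alpha)
    return total


def compound_sig(page, m, alpha=2):
    """Compound signature: single signatures taken at alpha, alpha^2, ..., alpha^m."""
    return tuple(sig(page, gf_pow(alpha, k)) for k in range(1, m + 1))
```

With f = 8 a single signature is one byte, so a practical system would presumably use a larger field or a larger m; the structure of the computation stays the same.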

Algebraic Signatures. Properties of the compound signature: its size is mf bits. It detects with certainty any change of up to m symbols, where a symbol is a GF element, i.e. a bit string of length f. The collision probability is $2^{-fm}$.

Algebraic Signatures: Properties. Can update the signature after a simple change. Discovers changes from a cut-and-paste operation.
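
The update property can be illustrated with a small sketch. It assumes byte symbols in GF(2^8) (polynomial 0x11D, α = 2) and a signature of the form sig_α(P) = Σ p_i·α^i with symbols indexed from 0; under those assumptions, overwriting a region that starts at symbol index r changes the signature by α^r times the signature of the XOR difference between the old and new region. All names and parameters here are hypothetical choices, not taken from the slides.

```python
def gf_mul(a, b, poly=0x11D):
    """Multiply two elements of GF(2^8): carry-less product reduced modulo poly."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result


def gf_pow(a, n):
    """Raise a field element to a non-negative integer power."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def sig(page, alpha=2):
    """XOR-sum of p_i * alpha^i, with symbols indexed from 0."""
    total, power = 0, 1
    for symbol in page:
        total ^= gf_mul(symbol, power)
        power = gf_mul(power, alpha)
    return total


def updated_sig(old_sig, offset, old_region, new_region, alpha=2):
    """Signature after overwriting old_region with new_region at symbol index offset."""
    delta = bytes(o ^ n for o, n in zip(old_region, new_region))
    return old_sig ^ gf_mul(gf_pow(alpha, offset), sig(delta, alpha))
```

Only the changed region and the old signature are needed; the rest of the object does not have to be re-read.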

Algebraic Signatures: Properties. Can calculate the signature of a parity object from the signatures of the data objects. This holds for normal parity (RAID Level 5), but also for some forms of generalized parity: Reed-Solomon codes and convolutional array codes. Thomas Schwarz, S.J.: Verification of Parity Data in Large Scale Storage Systems, PDPTA 2004, Las Vegas.
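
For the normal-parity case, the property follows from the signature being linear over the field: the signature of the XOR of several blocks equals the XOR of their signatures. The sketch below checks this for byte symbols in GF(2^8) (polynomial 0x11D, α = 2); the field parameters and helper names are assumptions for illustration, and the generalized-parity cases are not covered.

```python
from functools import reduce


def gf_mul(a, b, poly=0x11D):
    """Multiply two elements of GF(2^8): carry-less product reduced modulo poly."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result


def sig(page, alpha=2):
    """XOR-sum of p_i * alpha^i over the symbols of the page."""
    total, power = 0, 1
    for symbol in page:
        total ^= gf_mul(symbol, power)
        power = gf_mul(power, alpha)
    return total


def xor_parity(blocks):
    """Bytewise XOR of equally sized data blocks (RAID Level 5 style parity)."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def parity_sig_from_data_sigs(data_sigs):
    """Expected parity signature, computed from the data signatures alone."""
    return reduce(lambda x, y: x ^ y, data_sigs)
```

Comparing `sig(parity)` with `parity_sig_from_data_sigs(...)` then needs only the stored signatures of the data blocks, not the data blocks themselves.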

Algebraic Signatures in Large Scale Storage Systems. Data-parity coherency: if we miss an update to parity data, then we can no longer reconstruct data. [Figure: stripe D1 D2 D3 D4 D5 P; D1 is updated to D1', giving D1' D2 D3 D4 D5 P; reconstruction from D1' D2 D3 D4 D5 P yields "?".]

Protecting Data in a Large Archival Storage System. Disk-based archival storage system: data is cold, so power down disks between accesses. Data on disk storage systems is lost because of device failure or block failure. Periodically check whether we can access disks. Periodically check whether we can still read all data on disks.

Protecting Data in a Large Archival Storage System. Since we need to read all the data anyway, and since we also need to be concerned about software failures, check the signatures of the data.

Protecting Data in a Large Archival Storage System. Divide disks into scrubbing blocks. Assume that the redundancy scheme creates generalized parity blocks for scrubbing blocks. Maintain a map of the signatures of the scrubbing blocks. [Figure: disks laid out as scrubbing blocks, mixing data blocks (e.g. D1, D9, D15, D23) with generalized parity blocks (e.g. P 2,5,13 and P 1,4,7).]

Protecting Data in a Large Archival Storage System. When data in a scrubbing block is updated, update its signature. This happens rarely. When we scrub, check whether the actual signature of the block coincides with the signature in the metadata. If not, something bad has happened: typically a software error, but occasionally data corruption. This comes at almost no cost, since we need to read the data anyway.
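
The scrubbing check described above might look roughly like this. The sketch assumes scrubbing blocks held in a dictionary keyed by a block id, a separate signature map kept as metadata, and a one-byte algebraic signature over GF(2^8) (polynomial 0x11D, α = 2); all of these names and parameters are hypothetical, and a real system would presumably use a longer signature.

```python
def gf_mul(a, b, poly=0x11D):
    """Multiply two elements of GF(2^8): carry-less product reduced modulo poly."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result


def sig(block, alpha=2):
    """XOR-sum of p_i * alpha^i over the symbols of the block."""
    total, power = 0, 1
    for symbol in block:
        total ^= gf_mul(symbol, power)
        power = gf_mul(power, alpha)
    return total


def scrub(blocks, signature_map):
    """Return the ids of scrubbing blocks whose signature differs from the stored one."""
    return [block_id for block_id, data in blocks.items()
            if sig(data) != signature_map[block_id]]
```

Because the scrub pass reads every block anyway, the only extra work is the signature computation and one lookup per block.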

Protecting Data in a Large Archival Storage System. Periodically check whether parity blocks and data blocks cohere. Access the signatures of the data blocks. From them, calculate the signature of the parity block(s). Compare with the actual signature on file.

Protecting Data in a Large Archival Storage System. Conclusion: a low-cost scheme that protects against data corruption and parity/data incoherence.