
2 Using Classes to Manipulate Buffers  Examples of three C++ classes to encapsulate operation of buffer object Function : Pack, Unpack, Read, Write Output:





2 Using Classes to Manipulate Buffers
• Examples of three C++ classes that encapsulate the operations of a buffer object
  - Functions: Pack, Unpack, Read, Write
  - Output: pack into a buffer, then write the buffer to a file
  - Input: read into a buffer from a file, then unpack the buffer
  - Each Pack or Unpack call deals with only one field
  - DelimTextBuffer class for delimited fields
  - LengthTextBuffer class for length-based fields
  - FixedTextBuffer class for fixed-length fields
• Appendix E: Full implementation (Buggy)

3 Buffer Class for Delimited Text Fields (1)
• Variable-length buffer
• Fields are represented as delimited text
    class DelimTextBuffer
    {
    public:
        DelimTextBuffer (char Delim = '|', int maxBytes = 1000);
        int Read (istream & file);
        int Write (ostream & file) const;
        int Pack (const char * str, int size = -1);
        int Unpack (char * str);
    private:
        char Delim;      // delimiter character
        char * Buffer;   // character array to hold field values
        int BufferSize;  // current size of packed fields
        int MaxBytes;    // maximum # of chars in the buffer
        int NextByte;    // packing/unpacking position in buffer
    };

4 Buffer Class for Delimited Text Fields (2)
    int DelimTextBuffer::Unpack(char *str)
    {
        start = NextByte
        search from start to the end of the buffer for the delimiter
        if not found, return FALSE
        if found:
            copy from start up to the delimiter into str
            update NextByte
            if NextByte is still within the buffer, return TRUE; else return FALSE
    }
Unpack() extracts one field from a record in a buffer.

5 Buffer Class for Delimited Text Fields (3)
    int DelimTextBuffer::Unpack(char *str)
    // extract the value of the next field of the buffer
    {
        int len = -1;          // length of packed string
        int start = NextByte;  // first character to be unpacked
        for (int i = start; i < BufferSize; i++)
            if (Buffer[i] == Delim)
                { len = i - start; break; }
        if (len == -1) return FALSE;  // delimiter not found
        NextByte += len + 1;
        if (NextByte > BufferSize) return FALSE;
        strncpy (str, &Buffer[start], len);
        str[len] = 0;  // zero termination for string
        return TRUE;
    }
Unpack() extracts one field from a record in a buffer.

6 Buffer Class for Delimited Text Fields (4)
    int DelimTextBuffer::Pack(const char * str, int size)
    {
        if str is shorter than size, return FALSE
        if the field would overflow the buffer, return FALSE
        else:
            copy the string into the buffer starting at NextByte
            add the delimiter
            update NextByte
            return TRUE
    }
Pack() copies the characters of its argument to the buffer and then adds the delimiter character.

7 Buffer Class for Delimited Text Fields (5)
    int DelimTextBuffer::Pack (const char * str, int size)
    // set the value of the next field of the buffer;
    // if size = -1 (default) use strlen(str) as length of field
    {
        int len;  // length of string to be packed
        if (size >= 0) len = size;
        else len = strlen (str);       // C-string length: # chars before \0
        if (len > (int) strlen (str))  // str is too short!
            return FALSE;
        int start = NextByte;  // first character to be packed
        if (start + len + 1 > MaxBytes)  // field plus delimiter must fit
            return FALSE;                // NextByte is left unchanged on failure
        NextByte = start + len + 1;
        memcpy (&Buffer[start], str, len);
        Buffer[start + len] = Delim;  // add delimiter
        BufferSize = NextByte;
        return TRUE;
    }
• Pack() copies the characters of its argument to the buffer and then adds the delimiter character.
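The Pack and Unpack logic on these slides can be tried in isolation with a much smaller class. The following is only a sketch under stated assumptions: `SketchDelimBuffer`, `Rewind`, and `Data` are hypothetical names that do not appear in the slides, and `std::string` replaces the raw `char` array so that no manual memory management is needed.

```cpp
#include <cstddef>
#include <string>

// Hypothetical, simplified stand-in for DelimTextBuffer: same Pack/Unpack
// idea, but std::string replaces the raw char array.
class SketchDelimBuffer {
public:
    explicit SketchDelimBuffer(char delim = '|', std::size_t maxBytes = 1000)
        : delim_(delim), maxBytes_(maxBytes) {}

    // Append one field followed by the delimiter; fail if it would not fit.
    bool Pack(const std::string& field) {
        if (buffer_.size() + field.size() + 1 > maxBytes_) return false;
        buffer_ += field;
        buffer_ += delim_;
        return true;
    }

    // Extract the next field; fail if no delimiter remains after next_.
    bool Unpack(std::string& field) {
        std::size_t end = buffer_.find(delim_, next_);
        if (end == std::string::npos) return false;
        field = buffer_.substr(next_, end - next_);
        next_ = end + 1;
        return true;
    }

    void Rewind() { next_ = 0; }                         // restart unpacking
    const std::string& Data() const { return buffer_; }  // packed bytes so far

private:
    char delim_;
    std::size_t maxBytes_;
    std::string buffer_;
    std::size_t next_ = 0;  // plays the role of NextByte when unpacking
};
```

Packing "Ames" and then "Mary" leaves `Ames|Mary|` in the buffer, and two Unpack calls return the fields in the same order. As in the slide code, a field that itself contains the delimiter is not handled.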

8 Buffer Class for Delimited Text Fields (6)
• Read method of DelimTextBuffer
  - Clears the current buffer contents
  - Extracts the record size
  - Reads the proper number of bytes into the buffer
  - Sets the buffer size
    int DelimTextBuffer::Read (istream & stream)
    {
        Clear ();
        stream.read ((char *) &BufferSize, sizeof (BufferSize));
        if (stream.fail ()) return FALSE;
        if (BufferSize > MaxBytes) return FALSE;
        stream.read (Buffer, BufferSize);
        return stream.good ();
    }

9 Buffer Class for Delimited Text Fields (7)
• Write method of DelimTextBuffer
  - Writes the size data
  - Writes the buffer content
    int DelimTextBuffer::Write (ostream & stream) const
    {
        stream.write ((char *) &BufferSize, sizeof (BufferSize));
        stream.write (Buffer, BufferSize);
        return stream.good ();
    }
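Read and Write together define a simple on-disk layout: a binary size followed by that many bytes. A minimal sketch of the same layout as free functions is shown here. `WriteRecord`, `ReadRecord`, and `RoundTrip` are hypothetical names, the size is assumed to be a 4-byte unsigned integer in host byte order, and an in-memory `std::stringstream` stands in for the file.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Write a record as: 4-byte size (host byte order) followed by the bytes.
bool WriteRecord(std::ostream& out, const std::string& record) {
    std::uint32_t size = static_cast<std::uint32_t>(record.size());
    out.write(reinterpret_cast<const char*>(&size), sizeof(size));
    out.write(record.data(), size);
    return out.good();
}

// Read one record; reject sizes above maxBytes, as Read() does with MaxBytes.
bool ReadRecord(std::istream& in, std::string& record,
                std::uint32_t maxBytes = 1000) {
    std::uint32_t size = 0;
    in.read(reinterpret_cast<char*>(&size), sizeof(size));
    if (in.fail()) return false;        // no size field left to read
    if (size > maxBytes) return false;  // record would not fit the buffer
    record.resize(size);
    if (size > 0) in.read(&record[0], size);
    return in.good();
}

// Round-trip check through an in-memory stream (stands in for a file).
bool RoundTrip(const std::string& record) {
    std::stringstream file;
    std::string back;
    return WriteRecord(file, record) && ReadRecord(file, back) && back == record;
}
```

Because the size is written in host byte order, a file produced this way is not portable between machines with different byte orders.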


11 CS215: File Structure and Processing, Chapter 5: Managing Files of Records


13 Chapter Objectives
• Extend the file structure concepts of Chapter 4:
  - Search keys and canonical forms
  - Sequential search and direct access
  - File access and file organization
• Examine other kinds of file structures in terms of:
  - Abstract data models
  - Metadata
  - Object-oriented file access
  - Extensibility
• Examine issues of portability and standardization.

14 Record Access
• Record key
  - Canonical form: a standard form of a key, e.g. Ames, ames, and AMES all need conversion to one form
  - Distinct keys: uniquely identify a single record
  - Primary keys, secondary keys, candidate keys
  - Primary keys should be dataless (not updatable)
  - Primary keys should be unchanging
  - Social security number: a good primary key, but not everyone has one (e.g. non-registered aliens)
• Measurement of work:
  - Comparisons: occur in main memory
  - Disk accesses: the main bottleneck
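To make the canonical-form idea concrete, here is a small sketch. The convention used (trim surrounding blanks and tabs, then upper-case) is an assumption chosen for illustration, and `CanonicalKey` is a hypothetical name; the slides only say that Ames, ames, and AMES need conversion to a single form.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Canonical form under an assumed convention: trim surrounding blanks and
// tabs, then convert every letter to upper case.
std::string CanonicalKey(const std::string& key) {
    std::size_t first = key.find_first_not_of(" \t");
    if (first == std::string::npos) return "";  // key is empty or all blanks
    std::size_t last = key.find_last_not_of(" \t");
    std::string out = key.substr(first, last - first + 1);
    for (char& c : out)
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    return out;
}
```

Both the stored key and the search key should pass through the same function before they are compared, so that "ames" finds a record stored under "Ames".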

15 Sequential Search
• Sequential search is the least efficient method. Our main pursuit for the duration of the term is to present improved search methods.
• O(n), where n is the number of records
• Use record blocking to reduce work
  - A block holds several records: fields < records < blocks
  - Still O(n), but blocking decreases the number of seeks
  - Search is sequential within each block
  - e.g. records of 512 bytes each, sector size 512 bytes
      Unblocked (sector-sized ½K buffer): average 2000 READ() calls
      Blocked (16 records per block, 8K buffer): average 125 READ() calls
• Performance can be improved further by giving each block a key equal to its last record key, so that blocks that cannot contain the target are not searched internally
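The two averages on this slide follow from simple arithmetic, sketched below. The record count is not legible in the transcript; 4,000 is inferred here because it is the count that reproduces both quoted averages, so treat it as an assumption. `AverageReads` is a hypothetical helper name.

```cpp
#include <cassert>

// Average number of READ() calls for a successful sequential search,
// assuming every record is equally likely to be the target: roughly half
// of the blocks are read before the record turns up.
double AverageReads(long recordCount, long recordsPerBlock) {
    assert(recordCount >= 0 && recordsPerBlock > 0);
    long blocks = (recordCount + recordsPerBlock - 1) / recordsPerBlock;  // ceiling
    return blocks / 2.0;
}
```

With the assumed 4,000 records, `AverageReads(4000, 1)` gives 2000 and `AverageReads(4000, 16)` gives 125, matching the unblocked and blocked figures. The number of comparisons in memory is unchanged; only the number of disk reads drops.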

16 Sequential Search: Best Uses
• When is sequential search superior?
  - Repetitive hits
      Searching for patterns in ASCII files
      Searching records with a certain secondary key value
  - Small search set
      Processing files with few records
  - Devices/media most hospitable to sequential access
      Tape


18 Direct Access
• Access a record without searching: an O(1) operation
• RRN (Relative Record Number)
  - Gives the relative position of the record
  - O(n) process with variable-length records
  - Easy with fixed-length records: RRN * sizeof(record)
• View the file as a collection of records, not bytes; all byte information is internal
• Byte offset = n × r
  - r: record size
  - n: RRN value
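The byte-offset formula translates directly into a seek followed by a read. This is a sketch, not the course's implementation: `ByteOffset` and `ReadByRRN` are hypothetical names, RRNs are assumed to start at 0, and the optional `headerSize` parameter is an added assumption for files that begin with a header record.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>  // std::istringstream is handy for trying this without a file
#include <string>

// Byte offset of record number rrn (0-based) in a file of fixed-length
// records, optionally preceded by a header of headerSize bytes.
std::streamoff ByteOffset(long rrn, long recordSize, long headerSize = 0) {
    assert(rrn >= 0 && recordSize > 0 && headerSize >= 0);
    return static_cast<std::streamoff>(headerSize) +
           static_cast<std::streamoff>(rrn) * recordSize;
}

// Seek straight to the record and read it; no earlier record is touched.
bool ReadByRRN(std::istream& file, long rrn, long recordSize,
               std::string& record) {
    file.clear();  // reset any EOF/fail state left by an earlier read
    file.seekg(ByteOffset(rrn, recordSize), std::ios::beg);
    record.resize(static_cast<std::size_t>(recordSize));
    file.read(&record[0], recordSize);
    return file.good();
}
```

For a stream holding `AAAABBBBCCCC` with a record size of 4, RRN 1 yields `BBBB` after a single seek, regardless of how many records precede it.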


20 Direct Access
• Class IOBuffer includes
  - direct read (DRead)
  - direct write (DWrite)
  - both take a byte offset as an argument, along with the stream

21 Choosing Record Length and Structure
• Record length is related to the size of the fields
• Trade-offs: access vs. fragmentation vs. implementation
• Fixed-length record
  - fixed-length fields
  - variable-length fields (e.g. delimited); the unused portion is filled with null characters in C
• e.g. OHIO COLUMBUS
    OHIO| |7|264.9|41330|35|3|1|1803|17|COLUMBUS\0....\0

22 Header Records
• File as a self-describing object
• General information about the file
  - date and time of most recent update, number of records
  - size of record and fields (fixed-length records and fields)
  - delimiter (variable-length fields)
• Often placed at the beginning of the file

23 IO Buffer Class Definition
    class IOBuffer
    // Abstract base class for file buffers
    {
    public:
        virtual int Read (istream &) = 0;         // read a buffer from the stream
        virtual int Write (ostream &) const = 0;  // write a buffer to the stream
        // these are the direct access read and write operations
        virtual int DRead (istream &, int recref);         // read specified record
        virtual int DWrite (ostream &, int recref) const;  // write specified record
        // these header operations return the size of the header
        virtual int ReadHeader (istream &);
        virtual int WriteHeader (ostream &) const;
    protected:
        int Initialized;  // TRUE if buffer is initialized
        char * Buffer;    // character array to hold field values
    };

24 IO Buffer Class Definition
Full definition of buffer class hierarchy
• WriteHeader method: writes the header string at the beginning of the file
  - Possible strings: "Variable", "Fixed"
  - Returns the size of the header written
• ReadHeader method: reads the header id string
  - It must match the expected record type, variable or fixed length
  - If the string matches that subclass's header string, returns the size of the header
  - Any other string causes a return of -1: the header doesn't match the buffer
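The header check described here can be sketched as two free functions. These are hypothetical stand-ins for the member functions on the slide: they take the header string as a parameter instead of reading it from a subclass, and `WriteHeader` assumes the stream is already positioned at the start of the file.

```cpp
#include <cstddef>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>  // std::stringstream can stand in for a file when experimenting
#include <string>

// Write the id string (e.g. "Variable" or "Fixed") at the current position,
// assumed to be the start of the file. Returns the header size, or -1.
int WriteHeader(std::ostream& out, const char* headerStr) {
    const std::size_t size = std::strlen(headerStr);
    out.write(headerStr, static_cast<std::streamsize>(size));
    return out.good() ? static_cast<int>(size) : -1;
}

// Read the id string from the start of the file and compare it with the
// one this buffer type expects. Returns the header size, or -1 on mismatch.
int ReadHeader(std::istream& in, const char* expected) {
    const std::size_t size = std::strlen(expected);
    std::string found(size, '\0');
    in.clear();
    in.seekg(0, std::ios::beg);
    if (size > 0) in.read(&found[0], static_cast<std::streamsize>(size));
    if (!in.good()) return -1;  // file shorter than the header
    return found == expected ? static_cast<int>(size) : -1;
}
```

Opening a file written with "Variable" through a reader that expects "Fixed" returns -1, which lets the caller refuse to interpret the records with the wrong layout.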

25 IO Buffer Class Definition
Full definition of buffer class hierarchy
• DWrite/DRead methods: operate using the byte address of the record as the record reference
• Both methods begin by seeking to the requested spot


27 File Access and File Organization
• There is a difference between file access and file organization.
  - Variable-length records: sequential access is suitable
  - Fixed-length records: direct access and sequential access are possible
    File organization          File access
    Variable-length records    Sequential access
    Fixed-length records       Direct access

28 Abstract Data Model
• Data objects such as documents, images, sound
• An abstract data model does not view data as it appears on a particular medium
  - application-oriented view
  - application shielded from the details of storage on the medium
• How to specify a file's content?
  - Headers and self-describing files
  - e.g. images: jpg: ÿØÿà JFIF; gif: GIF89a
  - e.g. sounds: mp3: ÿûD EQ¹à; wav: RIFF$P WAVEfmt
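The byte strings on this slide are the leading bytes ("magic numbers") that make these files self-describing. A minimal sketch of checking them follows. `DetectType` is a hypothetical name, and only three signatures are tested: JPEG files start with the bytes FF D8 FF, GIF files with `GIF87a` or `GIF89a`, and WAV files with `RIFF`, a 4-byte size, then `WAVE`. MP3 is left out because its leading bytes vary.

```cpp
#include <string>

// Guess a file type from its first bytes (its "magic number").
// Only the three signatures below are checked; anything else is "unknown".
std::string DetectType(const std::string& head) {
    if (head.compare(0, 3, "\xFF\xD8\xFF") == 0) return "jpeg";
    if (head.compare(0, 6, "GIF89a") == 0 || head.compare(0, 6, "GIF87a") == 0)
        return "gif";
    if (head.size() >= 12 && head.compare(0, 4, "RIFF") == 0 &&
        head.compare(8, 4, "WAVE") == 0)
        return "wav";
    return "unknown";
}
```

An application built on the abstract data model would call something like this once on open and then hand the file to the matching reader, so the rest of the program never inspects raw bytes.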

29 Example: GIF Graphics Interchange Format Industry standard graphic format for on-screen viewing through the Internet and Web. Not meant to be used for printing. The best format for all images except scanned photographic images (use JPEG for these). GIF supports lossless compression.

30 CSc402/demo/States/DelimText/

31 Metadata
• Data that describe the primary data in a file, e.g. in HTML
• Stored in the header record
• Standard format, as shown on the next slide

32 HTML: Metadata

33 Metadata
All meta information goes in the head section...

34 Mixing Object Types in a File
• Each field is identified using "keyword = value"
• Index table with tags e.g.
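The "keyword = value" convention can be sketched as a small parser. The record layout used here (pairs separated by `|`, with `=` between keyword and value) is an assumption for illustration, since the slide's own example did not survive in the transcript; `ParseTagged` is a hypothetical name.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Parse "keyword=value" pairs separated by delim into a map.
// Items without '=' are skipped rather than treated as errors.
std::map<std::string, std::string> ParseTagged(const std::string& record,
                                               char delim = '|') {
    std::map<std::string, std::string> fields;
    std::istringstream in(record);
    std::string item;
    while (std::getline(in, item, delim)) {
        std::size_t eq = item.find('=');
        if (eq == std::string::npos) continue;  // malformed item
        fields[item.substr(0, eq)] = item.substr(eq + 1);
    }
    return fields;
}
```

Because each field carries its own keyword, fields may appear in any order and a reader can ignore keywords it does not recognize.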



37 Object-Oriented File Access
• Separate translation to and from the physical format from the application (representation-independent file access)
  - provide a function to handle access (OO style)
  - encapsulate details
  - read_image() is independent of image file type; the method determines the file type
• Diagram: RAM holds image; disk holds star1 and star2
    Program find_star
        :
        read_image("star1", image)
        process image
        :
    end find_star

38 Extensibility
• Advantages of using tags
  - Identify objects within files
  - Do not require a priori knowledge of the types of objects
• For a new type of object
  - implement methods for reading and writing in the appropriate module (separation of concerns)
  - call the method
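One way to picture tag-based extensibility is a table that maps each tag to the function that knows how to read that object type. This is a sketch of the general idea, not the course's design: `ReaderRegistry`, `Register`, and the string-returning reader signature are all hypothetical.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// A reader turns the raw payload of one tagged object into some result;
// a plain string is used here only to keep the sketch short.
using Reader = std::function<std::string(const std::string& payload)>;

class ReaderRegistry {
public:
    // Supporting a new object type means registering one more reader;
    // existing readers and the dispatch code stay unchanged.
    void Register(const std::string& tag, Reader reader) {
        readers_[tag] = std::move(reader);
    }

    // Dispatch on the tag; returns false if no reader knows this tag.
    bool Read(const std::string& tag, const std::string& payload,
              std::string& out) const {
        auto it = readers_.find(tag);
        if (it == readers_.end()) return false;
        out = it->second(payload);
        return true;
    }

private:
    std::map<std::string, Reader> readers_;
};
```

A file containing an unknown tag is still readable: the dispatcher reports the tag as unhandled and the caller can skip that object.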

39 Factors Affecting Portability
• Differences among operating systems, e.g. CR/LF line endings in DOS
• Differences among languages: the physical layout of files may be constrained by language limitations
• Differences in machine architectures: byte order, e.g. the hton/ntoh family of functions on Unix (htonl, htons, ntohl, ntohs)
• Differences among platforms, e.g. EBCDIC vs. ASCII
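The byte-order problem can be avoided by never writing an integer's in-memory bytes directly. The sketch below fixes the on-disk order to big-endian (the same order `htonl` produces) using shifts, so it behaves the same on any host. `PutUint32BE` and `GetUint32BE` are hypothetical helper names.

```cpp
#include <cstdint>

// Store v in out[0..3], most significant byte first (big-endian),
// independent of the byte order of the machine running the code.
void PutUint32BE(std::uint32_t v, unsigned char out[4]) {
    out[0] = static_cast<unsigned char>(v >> 24);
    out[1] = static_cast<unsigned char>(v >> 16);
    out[2] = static_cast<unsigned char>(v >> 8);
    out[3] = static_cast<unsigned char>(v);
}

// Rebuild the value from four big-endian bytes.
std::uint32_t GetUint32BE(const unsigned char in[4]) {
    return (static_cast<std::uint32_t>(in[0]) << 24) |
           (static_cast<std::uint32_t>(in[1]) << 16) |
           (static_cast<std::uint32_t>(in[2]) << 8) |
            static_cast<std::uint32_t>(in[3]);
}
```

Writing a record size through `PutUint32BE` instead of casting the integer's address to `char *` makes the size field readable on both little-endian and big-endian machines.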

40 Achieving Portability
• Standardization
  - Standard physical record format: extensible, simple
  - Standard binary encoding for data elements: IEEE, XDR
• File structure conversion
• Number and text conversion
• Established, well-known methods of conversion

41 Achieving Portability
• File system differences
  - Block size is 512 bytes on UNIX systems
  - Block size is 2880 bytes on many non-UNIX systems
• UNIX and portability
  - UNIX supports portability by being commonly available on a large number of platforms
  - UNIX provides a utility called dd, which facilitates data conversion

