Digital Archeology: Recovering Digital Objects from Audio Waveforms Mark Guttenbrunner University of.

Presentation transcript:

Digital Archeology: Recovering Digital Objects from Audio Waveforms
Mark Guttenbrunner
University of Technology Vienna
October 6th,

Overview
• Introduction
• Analyzed System
• Reengineering the Waveform
• Bitstream Formats
• Migration Tool
• Evaluation of Extraction Methods
• Evaluation Results
• Conclusions

Introduction
• First home computers appeared in the late 1970s and early 1980s
• Data was usually saved on magnetic tapes
• No special hardware or media needed (e.g. a standard audio recorder connected to the system, and audio tapes to store the data)
• Data and software are still available on tapes in private collections (maybe even in archives)
• Future migration is impossible without a working specimen and/or knowledge about the system

How can we migrate the data without the original system in the future?

Analyzed System
• Philips Videopac+ G7400 video game system, released 1983
• C7420 Home Computer Module
  - Microsoft Basic
  - Data save/load using compact cassettes

Reengineering the Waveform
• Data is encoded in bitstreams
• Bitstreams are encoded in an analogue waveform
• Writing different test programs on the original system
• Recording the digitized waveform using the software Audacity
• Analyzing waveforms and changes between waveforms
• Format of a stored byte:
  - sine waves, 4.8 kHz
  - 4 waves per bit, 1200 bps
  - 1 start bit, 8 data bits, 2½ stop bits
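The figures on this slide fit together arithmetically, which the following minimal sketch makes explicit. Only the carrier frequency, waves per bit and framing come from the slide; `SAMPLE_RATE` and the helper `seconds_for` are assumptions introduced for illustration.

```python
# Timing implied by the slide's figures.
CARRIER_HZ = 4800          # sine-wave frequency (from the slide)
WAVES_PER_BIT = 4          # from the slide
START_BITS = 1             # from the slide
DATA_BITS = 8              # from the slide
STOP_BITS = 2.5            # from the slide
SAMPLE_RATE = 44100        # ASSUMPTION: a typical recording rate, not stated in the slides

bit_rate = CARRIER_HZ / WAVES_PER_BIT               # bits per second
bits_per_byte = START_BITS + DATA_BITS + STOP_BITS  # framed bits per stored byte
bytes_per_second = bit_rate / bits_per_byte         # payload throughput
samples_per_bit = SAMPLE_RATE / bit_rate            # digitized samples per bit


def seconds_for(n_bytes: int) -> float:
    """Tape time needed for n_bytes of payload (hypothetical helper)."""
    return n_bytes * bits_per_byte / bit_rate


if __name__ == "__main__":
    print(f"bit rate:         {bit_rate:.0f} bps")
    print(f"bits per byte:    {bits_per_byte}")
    print(f"bytes per second: {bytes_per_second:.1f}")
    print(f"samples per bit:  {samples_per_bit:.2f}")
```

With 11.5 framed bits per byte, the payload rate is roughly 104 bytes per second, so even small programs occupy several seconds of tape.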

Bitstream Formats
• System is able to store different file types
  - BASIC programs, screenshots, arrays, strings, raw memory dumps
• Writing test programs using original documentation
  - Changes in the programs result in changes in the bitstream
• File format
  - 32-byte file header, variable data block
  - fixed start block (256x 0xFF), separator (128x 0xFF), end block (10x 0x00)
• File header
  - filename, file type, length, checksum, start address
• Data block
  - structure dependent on the file type
  - e.g. BASIC program: for every line, the line number, the line in ASCII (commands encoded) and the start address in memory of the next line
  - e.g. 10 PRINT “HALLO” 0A C 4C 4F 22 00
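The file format described above can be sketched as a simple splitter over a decoded bitstream. The slide gives the block sizes and fill values but not their order; the sketch assumes the order start block, header, separator, data block, end block. The function name `split_file` is hypothetical, and the header is returned as raw bytes because the field offsets are not given in the slides.

```python
# Block sizes from the slide; the ORDER of the blocks is an assumption.
START_LEN = 256      # start block: 256 x 0xFF
HEADER_LEN = 32      # file header
SEPARATOR_LEN = 128  # separator: 128 x 0xFF
END_LEN = 10         # end block: 10 x 0x00


def split_file(stream: bytes):
    """Split one decoded file into (header, data).

    Assumes the layout: start block, header, separator, data, end block.
    Raises ValueError when a fixed block is missing or damaged.
    """
    minimum = START_LEN + HEADER_LEN + SEPARATOR_LEN + END_LEN
    if len(stream) < minimum:
        raise ValueError("stream too short for one file")

    pos = 0
    if stream[pos:pos + START_LEN] != b"\xff" * START_LEN:
        raise ValueError("start block missing or damaged")
    pos += START_LEN

    header = stream[pos:pos + HEADER_LEN]
    pos += HEADER_LEN

    if stream[pos:pos + SEPARATOR_LEN] != b"\xff" * SEPARATOR_LEN:
        raise ValueError("separator missing or damaged")
    pos += SEPARATOR_LEN

    if stream[-END_LEN:] != b"\x00" * END_LEN:
        raise ValueError("end block missing or damaged")

    data = stream[pos:len(stream) - END_LEN]
    return header, data


if __name__ == "__main__":
    # Synthetic example, not real tape data.
    header = bytes(range(HEADER_LEN))
    data = b"example payload"
    stream = (b"\xff" * START_LEN + header + b"\xff" * SEPARATOR_LEN
              + data + b"\x00" * END_LEN)
    print(split_file(stream) == (header, data))
```

Working with fixed offsets rather than searching for 0xFF runs avoids misreading data bytes that happen to equal 0xFF. A tool for noisy tapes would need a more tolerant match than the exact comparison used here.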

Migration Tool
• Features
  - Tool to migrate data from audio streams to non-obsolete formats
  - Either from an audio file or directly from the original system
  - Support for all file types of the G7400
  - Data can be encoded in bitstream and/or audio stream
• Two methods for extracting data from the waveform
  - existing method from the retro community: works for data transfer from and to the original system, but is unusable for tapes (too much noise, lost data)
  - novel method that interprets the arc length of the curve instead of counting highs and lows in the signal
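The idea of measuring arc length can be illustrated with a short sketch. This is not the tool's implementation: the slides do not say how arc length is mapped to bits, so the code only shows that the arc length of a sampled window grows with the amount of oscillation inside it. The names `arc_length` and `sine`, the horizontal step `dx`, the 48 kHz sample rate and the 2.4 kHz comparison tone are all assumptions.

```python
import math


def arc_length(samples, dx):
    """Length of the polyline through equally spaced samples.

    dx is the horizontal distance between neighbouring samples; its scale
    relative to the amplitude is a free choice (ASSUMPTION).
    """
    return sum(math.hypot(dx, b - a) for a, b in zip(samples, samples[1:]))


def sine(freq_hz, n_samples, sample_rate):
    """Synthetic test tone with amplitude 1 (hypothetical helper)."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]


if __name__ == "__main__":
    rate = 48000                # ASSUMPTION: 40 samples per bit at 1200 bps
    window = 41                 # one bit period plus the closing sample
    dx = 0.1

    silence = [0.0] * window
    low = sine(2400, window, rate)    # 2 waves in the window (assumed tone)
    high = sine(4800, window, rate)   # 4 waves in the window (slide's carrier)

    for name, w in (("silence", silence), ("2.4 kHz", low), ("4.8 kHz", high)):
        print(f"{name:8s} arc length = {arc_length(w, dx):.2f}")
```

Because every sample contributes to the sum, a measure of this kind does not depend on locating individual peaks, which may be why it copes better with noisy tape signals than counting highs and lows.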

Evaluation of Extraction Methods
• Tests with new files
  - readable by the original system and by both methods
• Tests with 20-year-old tapes
  - no files recognized by method 1 (the existing method)
  - 6/23 files recognized by the original system (0 without errors)
  - 22/23 files recognized by method 2 (3 without errors)
  - all files recovered from the tapes are BASIC programs

Evaluation Results
• Validity of files
  - original system: files loaded were unusable
  - checked by re-encoding and testing them on the original system (original data for comparison not available)
  - most files had small errors that could be manually restored
• Data is no longer readable on the original system
• Evaluated tapes were successfully migrated using the migration tool with the method of analyzing the arc length of the signal

Conclusions
• Data successfully preserved
  - some data lost due to age of tapes (20 years is expected lifetime)
  - media refresh for the original system possible by decoding / encoding
• Reengineering easier if done now
  - test programs can be written and analyzed with access to the original system
  - expert knowledge still available in “retro” communities
• Using bitstream for emulation
  - converting only the waveform to bitstream allows use of data in future emulators of the system
  - no emulation of C7420 available yet
• Results interpreted for other systems/media types
  - findings valid for all systems that encode data in audio waveforms
  - in most other cases special devices are necessary to read the media
  - logical reengineering and extracting data from bitstreams is possible

Thank you for your attention. Tool and sample files can be downloaded from: