Working with Files CSC 161: The Art of Programming Prof. Henry Kautz 11/9/2009.



What is Text?
By "plain text" we mean text stored using one of the standard methods for encoding characters as numeric values
ASCII: the most common; a 7-bit code, normally stored in one 8-bit byte per character
Example: 'A' is
  1000001 in binary (base 2)
  101 in octal (base 8)
  65 in decimal (base 10)
  41 in hexadecimal (base 16)
2
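The four representations of 'A' listed above can be checked directly. This is a minimal sketch, assuming Python 3 (the slides themselves use the older Python 2 print statement); the variable names are illustrative only.

```python
# Look up the numeric code behind the character 'A'.
code = ord("A")                    # character -> integer code

binary_form = format(code, "b")    # base 2
octal_form = format(code, "o")     # base 8
decimal_form = format(code, "d")   # base 10
hex_form = format(code, "x")       # base 16

print(binary_form, octal_form, decimal_form, hex_form)

# chr() goes the other way: integer code -> character.
letter = chr(65)
print(letter)
```

All four strings describe the same number; only the base used to write it down changes.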

3

Beyond ASCII
ASCII itself defines 128 characters; even using all 8 bits of a byte gives only 256 possible characters
How do we handle languages with larger character sets (e.g. Chinese)?
What if we want to mix languages on a page?
Unicode: international standard using 16-bit or 32-bit characters
  16 bits = 65,536 (about 64K) different characters
  32 bits = about 4.3 billion different characters
  Unique code for every character of every human language
4
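To make the size difference concrete, the sketch below looks up Unicode code numbers for characters from several scripts and counts how many bytes the same short text needs in different encodings. It assumes Python 3, where every string is Unicode; the sample characters and variable names are chosen for illustration and do not come from the slides.

```python
# Characters from different scripts, written with escapes so the
# source file itself stays plain ASCII.
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]   # A, e-acute, a Chinese character, an emoji
code_points = [ord(ch) for ch in samples]

# The same two-character text takes a different number of bytes
# depending on the encoding used to store it.
text = "A\u4e2d"
utf8_bytes = text.encode("utf-8")        # variable width: 1 to 4 bytes per character
utf16_bytes = text.encode("utf-16-be")   # 16-bit units, big-endian, no byte-order mark
utf32_bytes = text.encode("utf-32-be")   # fixed 32 bits per character

print(code_points)
print(len(utf8_bytes), len(utf16_bytes), len(utf32_bytes))
```

UTF-8 is not mentioned on the slide; it is included here only because it is the encoding most text files use in practice.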

Unicode Examples 5

Review: Working with Text Files
f = open("My file.txt", "r")
  f gets a file object
  Mode is "r" for reading, "w" for writing an ASCII (8-bit) text file
contents = f.read()
  Reads entire file, returns entire text as one long string
  Special characters '\r' or '\n' (or both) separate lines
  Note: each special character is written as two characters (a backslash and a letter), but is stored as one 8-bit character!
6
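A short sketch of the review points, assuming Python 3. So that it runs anywhere, it first writes a small sample file into a temporary directory; that setup step and the sample text are assumptions added for illustration.

```python
import os
import tempfile

# Setup (illustrative): create a small file to read back.
path = os.path.join(tempfile.mkdtemp(), "My file.txt")
with open(path, "w") as out:
    out.write("first line\nsecond line\n")

# Read the whole file as one long string.
f = open(path, "r")
contents = f.read()
f.close()

# The line separator is typed as backslash + n but is a single character.
newline_length = len("\n")

# splitlines() breaks the long string back into individual lines.
lines = contents.splitlines()
print(repr(contents), newline_length, lines)
```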

Reading a Text File Incrementally
f.readline()
  Reads one line from the file, returns that line as a string
  Next call to readline() will return the next line of the file
Special Python shortcut:
    for line in <file object>:
        <do something with line>
Example:
    dickens = open('two-cities.txt','r')
    count = 0
    for line in dickens:
        count = count + 1
    print "The number of lines is ", count
7
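The two styles of incremental reading can be compared side by side. This is a sketch assuming Python 3 (so print is a function); the three-line sample file stands in for two-cities.txt, which is not available here.

```python
import os
import tempfile

# Setup (illustrative): a tiny stand-in for the novel's text file.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as out:
    out.write("It was the best of times,\n"
              "it was the worst of times,\n"
              "it was the age of wisdom\n")

# Style 1: explicit readline() calls, one line per call.
book = open(path, "r")
first = book.readline()    # includes the trailing "\n"
second = book.readline()   # picks up where the previous call stopped
book.close()

# Style 2: the for-loop shortcut visits every line in turn.
count = 0
book = open(path, "r")
for line in book:
    count = count + 1
book.close()

print("The number of lines is", count)
```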

Text Output
f = open("results.txt", "w")
  Mode is 'w' for writing text files
  If the file already exists, it is erased
f.write(data)
  Writes data to (end of) file
f.close()
  Tells the computer you are done using file object f
  If you leave it out, your file might not be properly stored on the computer's hard disk
8
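A small sketch of the write cycle, assuming Python 3. It uses the `with` statement, which is not on the slide: it calls close() automatically when the block ends, so the file is reliably flushed to disk. The temporary path and the sample values are illustrative.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "results.txt")

# First write: creates the file.
with open(path, "w") as f:
    f.write("old contents\n")

# Opening again in "w" mode erases what was there before.
with open(path, "w") as f:
    f.write("score: ")      # each write() appends to the end of the file
    f.write(str(42))        # write() takes a string, so convert numbers first
    f.write("\n")           # newlines are not added automatically

# Read the file back to see what was actually stored.
with open(path, "r") as f:
    written = f.read()
print(repr(written))
```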

Performing String Operations on Files
Replacing a phrase in a file:
9
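The code for this slide did not survive in the transcript. One way to replace a phrase is sketched below, assuming Python 3; the function name `replace_in_file`, the phrases, and the sample text are hypothetical, and only the file names file-test.txt and file-result.txt are borrowed from the following slides.

```python
import os
import tempfile

def replace_in_file(source_path, result_path, old, new):
    """Copy source_path to result_path, replacing every occurrence of old with new."""
    with open(source_path, "r") as src:
        text = src.read()                  # whole file as one string
    with open(result_path, "w") as dst:
        dst.write(text.replace(old, new))  # str.replace returns a new string

# Illustrative usage with a temporary directory.
folder = tempfile.mkdtemp()
source = os.path.join(folder, "file-test.txt")
result = os.path.join(folder, "file-result.txt")

with open(source, "w") as f:
    f.write("The quick brown fox.\nThe lazy dog.\n")

replace_in_file(source, result, "The", "A")

with open(result, "r") as f:
    replaced = f.read()
print(replaced)
```

Writing to a separate result file leaves the original untouched, which makes it easy to compare the two.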

file-test.txt 10

file-result.txt 11

Performing Mathematical Operations on Files
Task: Take the average of 1,000,000 numbers
Scientific data is often this large or larger
Warning: Don't try this in Excel!
12
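The slide's own program is not in the transcript. A possible approach is sketched below, assuming Python 3 and assuming the data file holds one number per line (the actual file layout is not stated). The function name `average_of_file` is hypothetical, and the demo uses 1,000 numbers rather than 1,000,000 to keep it quick.

```python
import os
import tempfile

def average_of_file(path):
    """Average the numbers in a text file that has one number per line."""
    total = 0.0
    count = 0
    with open(path, "r") as f:
        for line in f:              # read line by line, never the whole file at once
            line = line.strip()
            if line:                # skip blank lines
                total += float(line)
                count += 1
    if count == 0:
        raise ValueError("no numbers found in " + path)
    return total / count

# Illustrative data: the integers 1 through 1000, one per line.
data_path = os.path.join(tempfile.mkdtemp(), "numbers.txt")
with open(data_path, "w") as out:
    for n in range(1, 1001):
        out.write(str(n) + "\n")

average = average_of_file(data_path)
print("The average is", average)
```

Because the loop keeps only a running total and a count, memory use stays constant no matter how many lines the file has.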

Result 13

Where Did the Data Come From? 14

More Complex Pattern Matching
Task: find telephone numbers in a file
Example pattern: three digits, a dash (-), four digits
Patterns like this can be written as what are called "regular expressions" in linguistics and computer science
    [0-9]{3}-[0-9]{4}
  [0-9]  Match any character from '0' to '9'
  {3}    Match previous part 3 times
  {4}    Match previous part 4 times
15
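The pattern above can be tried out with Python's re module. This is a sketch assuming Python 3; the sample lines and phone numbers are invented for illustration, and a list of strings stands in for the lines of a file.

```python
import re

# The pattern from the slide: three digits, a dash, four digits.
phone_pattern = re.compile(r"[0-9]{3}-[0-9]{4}")

# Stand-in for the lines of a file.
sample_lines = [
    "Call me at 555-1234 tomorrow.",
    "No number on this line.",
    "Office: 275-0001, home: 442-9876",
]

# Keep only the lines that contain a match anywhere in them.
matching_lines = [line for line in sample_lines if phone_pattern.search(line)]

# Pull out every matching number from all the lines.
numbers = []
for line in sample_lines:
    numbers.extend(phone_pattern.findall(line))

for line in matching_lines:
    print(line)
print(numbers)
```

Note that this pattern also matches inside longer runs of digits (for example, part of "12345-67890"), so real data may need a stricter pattern.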

Print Matching Lines 16

Python re Module
Almost any kind of text mangling can be easily programmed using the re module
Can find complicated patterns, pull out pieces of the match, replace them with other strings...
Many applications in day-to-day data processing as well as in linguistics, artificial intelligence, and (of course) literary analysis
17
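Two of the capabilities mentioned here, pulling out pieces of a match and replacing matches, can be sketched with parenthesized groups and re.sub. The example assumes Python 3; the sample text and the masking scheme are invented for illustration.

```python
import re

# Parentheses mark the pieces of the match we want to pull out.
pattern = re.compile(r"([0-9]{3})-([0-9]{4})")

text = "Call 555-1234 or 275-0001."

# search() returns a match object for the first match (or None if there is none).
match = pattern.search(text)
prefix = match.group(1)        # the three digits before the dash
line_number = match.group(2)   # the four digits after the dash

# sub() replaces every match; \1 refers back to the first group.
masked = pattern.sub(r"\1-XXXX", text)

print(prefix, line_number)
print(masked)
```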

Movies of the Day 18

Status
Wednesday: Quiz on Functions, Lists, & Dictionaries
Lecture: Working with Audio
Saturday: Assignment 7 due
19