Presentation is loading. Please wait.

Presentation is loading. Please wait.

INTRODUCTION TO PYTHON PART 4 – TEXT AND FILE PROCESSING CSC482 Introduction to Text Analytics Thomas Tiahrt, MA, PhD.

Similar presentations


Presentation on theme: "INTRODUCTION TO PYTHON PART 4 – TEXT AND FILE PROCESSING CSC482 Introduction to Text Analytics Thomas Tiahrt, MA, PhD."— Presentation transcript:

1

2 INTRODUCTION TO PYTHON PART 4 – TEXT AND FILE PROCESSING CSC482 Introduction to Text Analytics Thomas Tiahrt, MA, PhD

3 Text and File Processing

4 Strings string: A sequence of text characters in a program. Strings start and end with quotation mark " or apostrophe ' characters. Examples: "hello" "This is a string" "This, too, is a string. It can be very long!" A string may not span across multiple lines or contain a " character. "This is not a legal String." "This is not a "legal" String either." A string can represent characters by preceding them with a backslash. \t tab character \n new line character \" quotation mark character \\ backslash character Example: "Hello\tthere\nHow are you?"

5 Strings Characters in a string are numbered with indexes starting at 0: Example: name = "P. Diddy" Accessing an individual character of a string: variableName [ index ] Example: print name, "starts with", name[0] Output: P. Diddy starts with P index01234567 character P. Diddy

6 String Operations len( string ) - number of characters in a string (including spaces) str.lower( string ) - lowercase version of a string str.upper( string ) - uppercase version of a string Example: name = "Martin Luther King" length = len(name) big_name = str.upper(name) print big_name, "has", length, "characters" Output: MARTIN LUTHER KING has 18 characters

7 raw_input raw_input : Reads a string of text from user input without evaluation. Example: name = raw_input("Howdy, pardner. What's yer name? ") print name, "... what a nice name!" Output: Howdy, pardner. What's yer name? Tweedle Dum Tweedle Dum... what a nice name!

8 text processing: Examining, editing, formatting text. often uses loops that examine the characters of a string one by one A for loop can examine each character in a string in sequence. Example: for c in "booyah": print c Output: b o y a h Text Processing

9 str( number)- converts a number into a string. Example: str(99) is “99" Text Processing

10 ord( text ) - converts a string into a number. Example: ord("a") is 97, ord("b") is 98,... Characters map to numbers using standardized mappings such as ASCII and Unicode. chr( number)- converts a number into a string. Example: chr(99) is "c“ Processing Strings and Numbers

11 name[start:end]# end is exclusive name[start:]# to end of list name[:end]# from start of list name[start:end:step] # every step'th value # lists can be printed # (or converted to string with str()) len(list) # returns a list’s length List Slicing

12 #Lists can be indexed with positive (or negative) # numbers index 0 1 2 3 4 5 6 7 value 9 14 12 19 16 18 24 15 index -8 -7 -6 -5 -4 -3 -2 -1 Indexing

13 Exercise: Write a program that performs a rotation cypher. Allow the rotation value to be input Ignore case e.g. "Attack" when rotated by 1 becomes "buubdl" Python Exercise 5

14 Many programs handle data, which often comes from files. Reading the entire contents of a file: variableName = open(" filename ").read() Example: file_text = open("bankaccount.txt").read() Reading a File

15 Reading a file line-by-line: for line in open(" filename ").readlines(): statements Example: count = 0 for line in open("bankaccount.txt").readlines(): count = count + 1 print "The file contains", count, "lines." Reading a File Line By Line

16 Exercise: Write a program to process a file of DNA text, such as: ATGCAATTGCTCGATTAG Compute the percent of C+G present in the DNA. Python Exercise 6

17 Suppose you have a list named ‘data’ containing: 21 48 59 … To write a file and isolate each entry on its own line: # Create a string with all of the items in the list # named ‘data’ separated by new-line characters with open('my_file.txt', 'w') as out_file: out_file.write('\n'.join(data)) To read a file and separate each line: with open('my_file.txt', 'rU') as in_file: data = in_file.read().split('\n') Reading and Writing with Line Breaks

18 name = open("filename", "w") # write name = open("filename", "a") # append # opens file for write (deletes previous contents), or # opens file for append (new data written at end of file) name.write(str) - writes the given string to the file name.close() - closes file once writing is done >>> out = open("output.txt", "w") >>> out.write("Hello, world!\n") >>> out.write("How are you?") >>> out.close() >>> open("output.txt").read() 'Hello, world!\nHow are you?' Writing Files

19 Conclusion of Python Part 4  The end has come.


Download ppt "INTRODUCTION TO PYTHON PART 4 – TEXT AND FILE PROCESSING CSC482 Introduction to Text Analytics Thomas Tiahrt, MA, PhD."

Similar presentations


Ads by Google