COMPUTATION WITH STRINGS 2 DAY 2 - 8/29/14 LING 3820 & 6820 Natural Language Processing Harry Howard Tulane University.

Course organization
• The syllabus is under construction.
• Is there anyone here who wasn't here on Wednesday?
• I didn't put together any practice, because we have done too little.
• I will send you some practice to do over the weekend.
29-Aug-2014 NLP, Prof. Howard, Tulane University

Computer hygiene
• You must turn your computer off every now and then, so that it can clean itself.
• By the same token, you should close applications every now and then.

Review
What is a string?
What is an escape character?
What do these do: +, *, len(), sorted(), set()?
What is the difference between a type & a token?
Does Python know what you mean?

§3. Computation with strings
A string is a sequence of characters delimited by single or double quotes.

Open Spyder

Method notation
• The material supplied to a method in parentheses is called its argument(s).
• In the examples above, the argument S can be thought of linguistically as the object of a noun: the length of S, the alphabetical sorting of S, the set of S. But what if two pieces of information are needed for a method to work, for instance, to count the number of o’s in otolaryngologist?
• To do so, Python allows information to be prefixed to a method with a dot:
>>> S.count('o')
• The example can be read as “in S, count the o’s”, with the argument being the substring to be counted, 'o', and the attribute being the string over which the count proceeds, or more generally:
attribute.method(argument)
• What can serve as attribute and argument varies from method to method and so must be memorized.
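To make the dot notation concrete, here is a minimal sketch using the slide's own word. The variable name n_o is ours and purely illustrative.

```python
S = 'otolaryngologist'

# attribute.method(argument): "in S, count the o's"
n_o = S.count('o')
print(n_o)        # 4

# For comparison, len() takes the string as its argument in parentheses
print(len(S))     # 16
```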

How to clean up a string
• There is a group of methods for modifying the properties of a string, illustrated below. You can guess what they do from their names:
>>> S = 'i lOvE yOu'
>>> S
>>> S.lower()
>>> S.upper()
>>> S.swapcase()
>>> S.capitalize()
>>> S.title()
>>> S.replace('O','o')
>>> S.strip('i')
>>> S2 = ' '+S+' '
>>> S2
>>> S2.strip()
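If you want to check your guesses, the sketch below runs the same calls and shows what each one returns. Note that every method hands back a new string; S itself is left unchanged.

```python
S = 'i lOvE yOu'

print(S.lower())             # i love you
print(S.upper())             # I LOVE YOU
print(S.swapcase())          # I LoVe YoU
print(S.capitalize())        # I love you
print(S.title())             # I Love You
print(S.replace('O', 'o'))   # i lovE you
print(repr(S.strip('i')))    # ' lOvE yOu'  (only the leading i is removed)

S2 = ' ' + S + ' '
print(repr(S2))              # ' i lOvE yOu '
print(repr(S2.strip()))      # 'i lOvE yOu'  (surrounding whitespace removed)

print(S)                     # i lOvE yOu  (the original is untouched)
```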

3.3. How to find your way around a string

index() or rindex()
• You can ask Python for a character’s index with the index() or rindex() methods, which take the string as an attribute and the character as an argument:
>>> S = 'otolaryngologist'
>>> S.index('o')
>>> S.rindex('o')
>>> S.index('t')
>>> S.rindex('t')
>>> S.index('l')
>>> S.rindex('l')
>>> S.index('a')
>>> S.rindex('a')
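As a quick check, this sketch loops over the same four characters and prints both results side by side. The loop is our shorthand, not part of the slide: index() reports the first occurrence from the left, rindex() the last.

```python
S = 'otolaryngologist'

for ch in 'otla':
    print(ch, S.index(ch), S.rindex(ch))
# o 0 11
# t 1 15
# l 3 10
# a 4 4   (only one a, so both methods agree)
```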

find() & rfind()
• Python also has a method find(), which appears to do the same thing as index():
>>> S.find('o')
>>> S.rfind('o')
>>> S.find('t')
>>> S.rfind('t')
>>> S.find('l')
>>> S.rfind('l')
>>> S.find('a')
>>> S.rfind('a')

index() or find()
• Where they differ is in how they handle null responses:
>>> S.find('z')
-1
>>> S.index('z')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: substring not found
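In a script, the difference matters for how you react to a failed search. The following is a minimal sketch; the names position and message are illustrative only.

```python
S = 'otolaryngologist'

# When the character is present, find() and index() agree
print(S.find('o') == S.index('o'))   # True

# find() signals failure with the value -1
position = S.find('z')
print(position)                      # -1

# index() signals failure by raising an error, which can be caught
try:
    S.index('z')
    message = 'found'
except ValueError as err:
    message = str(err)
print(message)                       # substring not found
```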

How to find substrings
• These two methods can also find substrings:
>>> S.find('oto')
>>> S.index('oto')
>>> S.find('ist')
>>> S.index('ist')
>>> S.find('ly')
>>> S.index('ly')
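For reference, here is what the substring searches return. The number is the position of the substring's first character. Since 'ly' does not occur in the word, S.index('ly') would raise ValueError, so it is left as a comment.

```python
S = 'otolaryngologist'

print(S.find('oto'))   # 0
print(S.find('ist'))   # 13
print(S.find('ly'))    # -1  (not found)

# S.index('ly') would stop with: ValueError: substring not found

# If you only need a yes/no answer, the in operator gives one
print('ly' in S)       # False
print('ist' in S)      # True
```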

Limiting the search to a substring
• index() and find() allow optional arguments for the beginning and end positions of a substring, in order to limit searching to a substring’s confines:
>>> S.index('oto', 0, 3)
>>> S.index('oto', 3)
>>> S.find('oto', 0, 3)
>>> S.find('oto', 3)
• index/find(string, beginning, end)
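The sketch below shows the results, plus two extra searches for 'o' that we added to illustrate a point: the position returned is still counted from the start of the whole string, and the end position itself is not searched.

```python
S = 'otolaryngologist'

print(S.index('oto', 0, 3))   # 0
print(S.find('oto', 0, 3))    # 0
print(S.find('oto', 3))       # -1  (no 'oto' from position 3 onward)
# S.index('oto', 3) would raise ValueError for the same reason

# Extra illustration (not on the slide)
print(S.find('o', 3))         # 9   (counted from the start of S, not from 3)
print(S.find('o', 3, 9))      # -1  (positions 3 to 8 hold 'laryng')
```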

Zero-based indexation

0 = 1
• You probably thought that the first character in a string should be given the number 1, but Python actually gives it 0, and the second character gets 1.
• There are some advantages to this format which do not concern us here, but we will mention a real-world example.
• In Europe, the floors of buildings are numbered in such a way that the ground floor is considered the zeroth one, so that the first floor up from the ground is the first floor, though in the USA, it would be called the second floor.
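One way to see the numbering for yourself is the built-in enumerate(), which pairs each character with its position. This is only a sketch; enumerate() is our choice of tool here.

```python
S = 'otolaryngologist'

# The first character sits at position 0
print(S.index('o'))              # 0

# enumerate() pairs each character with its position
print(list(enumerate('oto')))    # [(0, 'o'), (1, 't'), (2, 'o')]

# A 16-character string therefore ends at position 15, not 16
print(len(S), S.rindex('t'))     # 16 15
```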

In a picture

Finding characters given a position
>>> S = 'abcdefgh'
>>> S[2]
>>> S[5]
>>> S[2:5]
>>> S[-6]
>>> S[-3]
>>> S[-6:-3]
>>> S[-6:-3] == S[2:5]
>>> S[-6:5]
>>> S[5:-6]
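The expected results are sketched below. Two things to notice: a negative position counts back from the end (for this 8-character string, -6 is the same place as 2), and a slice includes its start position but stops just before its end position.

```python
S = 'abcdefgh'

print(S[2])                  # c
print(S[5])                  # f
print(S[2:5])                # cde  (positions 2, 3, 4; position 5 is excluded)
print(S[-6])                 # c    (same place as position 2)
print(S[-3])                 # f    (same place as position 5)
print(S[-6:-3])              # cde
print(S[-6:-3] == S[2:5])    # True
print(S[-6:5])               # cde  (positive and negative positions can be mixed)
print(repr(S[5:-6]))         # ''   (the start lies after the end, so nothing is selected)
```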

More slicing
• If no beginning or end position is mentioned for a slice, Python defaults to the beginning or end of the string:
>>> S[2:]
>>> S[-2:]
>>> S[:2]
>>> S[:-2]
>>> S[:]
• The result of a slice is a string object, so it can be concatenated with another string or repeated:
>>> S[:-1] + '!'
>>> S[:2] + S[2:]
>>> S[:2] + S[2:] == S
>>> S[-2:] * 2
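For checking your answers, a short sketch of what these slices produce with S = 'abcdefgh':

```python
S = 'abcdefgh'

# Omitted positions default to the beginning or end of the string
print(S[2:])                 # cdefgh
print(S[-2:])                # gh
print(S[:2])                 # ab
print(S[:-2])                # abcdef
print(S[:])                  # abcdefgh

# Slices are strings, so + and * work on them
print(S[:-1] + '!')          # abcdefg!
print(S[:2] + S[2:])         # abcdefgh
print(S[:2] + S[2:] == S)    # True
print(S[-2:] * 2)            # ghgh
```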

Extended slicing
• Slice syntax allows a mysterious third argument, by appending an additional colon and integer. What do these do?
>>> S[::1]
>>> S[::2]
>>> S[::3]
>>> S[::4]
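Once you have made your guess, the loop below (our shorthand for the four calls) shows the answer: the third number is a step, so Python takes every step-th character starting from the first.

```python
S = 'abcdefgh'

for step in (1, 2, 3, 4):
    print(step, S[::step])
# 1 abcdefgh
# 2 aceg
# 3 adg
# 4 ae
```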

All three arguments together
• Of course, you can still use the first two arguments to slice out a substring, which the third one steps through:
>>> S[1:7:1]
>>> S[1:7:2]
>>> S[1:7:3]
>>> S[1:7:6]
• Thus the overall format of a slice is:
string[start:end:step]
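A sketch of the results, again written as a loop for brevity. The last line checks the description above: for a positive step, slicing with three arguments gives the same result as cutting out the substring first and then stepping through it.

```python
S = 'abcdefgh'

for step in (1, 2, 3, 6):
    print(step, S[1:7:step])
# 1 bcdefg
# 2 bdf
# 3 be
# 6 b

# Same result in two stages: cut the substring, then step through it
print(S[1:7:2] == S[1:7][::2])   # True
```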

How to reverse a string
>>> S[::-1]
>>> S[::-2]
>>> S[::-3]
>>> S[::-4]
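With a negative step, Python walks through the string from the end toward the beginning. A brief sketch of the four results, with a round-trip check that we added:

```python
S = 'abcdefgh'

for step in (-1, -2, -3, -4):
    print(step, S[::step])
# -1 hgfedcba
# -2 hfdb
# -3 heb
# -4 hd

# Reversing twice gives back the original string
print(S[::-1][::-1] == S)   # True
```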

Next time
The rest of §3
I will send you some practice for what we have done this week.