NLTK & Python Day 6 LING 681.02 Computational Linguistics Harry Howard Tulane University.

Slides:



Advertisements
Similar presentations
Programming with App Inventor Computing Institute for K-12 Teachers Summer 2012 Workshop.
Advertisements

Programming for Linguists
Regular expressions Day 2
Text Corpora and Lexical Resources Chapter 2 of Natural Language Processing with Python.
Analysis of Algorithms
Chapter 3 DATA: TYPES, CLASSES, AND OBJECTS. Chapter 3 Data Abstraction Abstract data types allow you to work with data without concern for how the data.
NLTK & Python Day 4 LING Computational Linguistics Harry Howard Tulane University.
Strings and regular expressions Day 10 LING Computational Linguistics Harry Howard Tulane University.
Loops (Part 1) Computer Science Erwin High School Fall 2014.
TEXT STATISTICS 1 DAY /20/14 LING 3820 & 6820 Natural Language Processing Harry Howard Tulane University.
Geography 465 Assignments, Conditionals, and Loops.
More Summary Statistics March 5, 2013 CS Intro. to Comp. for the Humanities and Social Sciences 1.
Decisions in Python Comparing Strings – ASCII History.
Python Control of Flow.
1 Python Control of Flow and Defining Classes LING 5200 Computational Corpus Linguistics Martha Palmer.
Line Continuation, Output Formatting, and Decision Structures CS303E: Elements of Computers and Programming.
NLTK & BASIC TEXT STATS DAY /08/14 LING 3820 & 6820 Natural Language Processing Harry Howard Tulane University.
COMPUTATION WITH STRINGS 4 DAY 5 - 9/05/14 LING 3820 & 6820 Natural Language Processing Harry Howard Tulane University.
Lists and More About Strings CS303E: Elements of Computers and Programming.
Structured programming 4 Day 34 LING Computational Linguistics Harry Howard Tulane University.
Python – Making Decisions Lecture 02. Control Structures A program that only has one flow is useful but limited. We can use if statements to make these.
UNICODE DAY /22/14 LING 3820 & 6820 Natural Language Processing Harry Howard Tulane University.
Data Collections: Dictionaries CSC 161: The Art of Programming Prof. Henry Kautz 11/4/2009.
NLTK & Python Day 7 LING Computational Linguistics Harry Howard Tulane University.
Structured programming 3 Day 33 LING Computational Linguistics Harry Howard Tulane University.
SCRIPTS & FUNCTIONS DAY /06/14 LING 3820 & 6820 Natural Language Processing Harry Howard Tulane University.
TWITTER DAY /07/14 LING 3820 & 6820 Natural Language Processing Harry Howard Tulane University.
WEB TEXT DAY /14/14 LING 3820 & 6820 Natural Language Processing Harry Howard Tulane University.
Data Collections: Lists CSC 161: The Art of Programming Prof. Henry Kautz 11/2/2009.
NLTK & Python Day 5 LING Computational Linguistics Harry Howard Tulane University.
REGULAR EXPRESSIONS 4 DAY 9 - 9/15/14 LING 3820 & 6820 Natural Language Processing Harry Howard Tulane University.
COMPUTATION WITH STRINGS 1 DAY 2 - 8/27/14 LING 3820 & 6820 Natural Language Processing Harry Howard Tulane University.
More about Strings. String Formatting  So far we have used comma separators to print messages  This is fine until our messages become quite complex:
Finite-state automata Day 12 LING Computational Linguistics Harry Howard Tulane University.
NLTK & Python Day 8 LING Computational Linguistics Harry Howard Tulane University.
Python Mini-Course University of Oklahoma Department of Psychology Day 3 – Lesson 11 Using strings and sequences 5/02/09 Python Mini-Course: Day 3 – Lesson.
COMPE 111 Introduction to Computer Engineering Programming in Python Atılım University
CONTROL 2 DAY /26/14 LING 3820 & 6820 Natural Language Processing Harry Howard Tulane University.
Tutorial 9 Iteration. Reminder Assignment 8 is due Wednesday.
COMPUTATION WITH STRINGS 3 DAY 4 - 9/03/14 LING 3820 & 6820 Natural Language Processing Harry Howard Tulane University.
Regular expressions Day 11 LING Computational Linguistics Harry Howard Tulane University.
Control Flow (Python) Dr. José M. Reyes Álamo. 2 Control Flow Sequential statements Decision statements Repetition statements (loops)
Control Flow (Python) Dr. José M. Reyes Álamo. 2 Control Flow Sequential statements Decision statements Repetition statements (loops)
CMPT 120 Topic: Searching – Part 2 and Intro to Time Complexity (Algorithm Analysis)
LECTURE 5 Strings. STRINGS We’ve already introduced the string data type a few lectures ago. Strings are subtypes of the sequence data type. Strings are.
CONTROL 3 DAY /29/14 LING 3820 & 6820 Natural Language Processing Harry Howard Tulane University.
String and Lists Dr. José M. Reyes Álamo. 2 Outline What is a string String operations Traversing strings String slices What is a list Traversing a list.
Guide to Programming with Python Chapter Four Strings, and Tuples; for Loops: The Word Jumble Game.
Lecture 4 CS140 Dick Steflik. Reading Keyboard Input Import java.util.Scanner – A simple text scanner which can parse primitive types and strings using.
7 - Programming 7J, K, L, M, N, O – Handling Data.
String and Lists Dr. José M. Reyes Álamo.
Control Flow (Python) Dr. José M. Reyes Álamo.
Warm-up Program Use the same method as your first fortune cookie project and write a program that reads in a string from the user and, at random, will.
C-Character Set Dept. of Computer Applications Prof. Harpreet Kaur
Computation with strings 3 Day 4 - 9/07/16
Intro to Computer Science CS1510 Dr. Sarah Diesburg
For loops Genome 559: Introduction to Statistical and Computational Genomics Prof. William Stafford Noble Notes for 2010: I skipped slide 10. This is.
Introduction to pseudocode
control 4 Day /01/14 LING 3820 & 6820 Natural Language Processing
LING 388: Computers and Language
Selection (IF Statements)
© A+ Computer Science - What is a LOOP? © A+ Computer Science -
CEV208 Computer Programming
String and Lists Dr. José M. Reyes Álamo.
Regular expressions 3 Day /26/16
Controlling Program Flow
Introduction to Computer Science
Control 1 Day /30/16 LING 3820 & 6820 Natural Language Processing
COMPUTING.
Presentation transcript:

NLTK & Python Day 6 LING Computational Linguistics Harry Howard Tulane University

31-Aug-2009LING , Prof. Howard, Tulane University2 Course organization  I have requested that Python and NLTK be installed on the computers in this room.

NLPP §1.3. Computing with Language: Simple Statistics

31-Aug-2009LING , Prof. Howard, Tulane University4 Summary of freq dist Table 1.2 ExampleDescription fdist = FreqDist(samples) create a frequency distribution containing the samples given fdist.inc(sample) increment the count for this sample fdist['monstrous'] count of the number of times a sample occurred fdist.freq('monstrous') frequency of a sample fdist.N() total number of samples fdist.keys() the samples sorted in order of decreasing frequency for sample in fdist: iterate over the samples, in order of decreasing frequency fdist.max() sample with the greatest count fdist.tabulate() tabulate the frequency distribution fdist.plot() graphical plot of the frequency distribution fdist.plot(cumulative=True) cumulative plot of the frequency distribution fdist1 < fdist2 test if samples in fdist1 occur less frequently than in fdist2

NLPP 1.4 Back to Python: Making Decisions and Taking Control

31-Aug-2009LING , Prof. Howard, Tulane University6 Relational operators Table 1.3 OperatorRelationship <less than <=less than or equal to ==equal to (note this is two "=" signs, not one) !=not equal to >greater than >=greater than or equal to

31-Aug-2009LING , Prof. Howard, Tulane University7 String comparison operators Table 1.4 FunctionMeaning s.startswith(t) test if s starts with t s.endswith(t) test if s ends with t t in s test if t is contained inside s s.islower() test if all cased characters in s are lowercase s.isupper() test if all cased characters in s are uppercase s.isalpha() test if all characters in s are alphabetic s.isalnum() test if all characters in s are alphanumeric s.isdigit() test if all characters in s are digits s.istitle() test if s is titlecased (all words have initial caps)

31-Aug-2009LING , Prof. Howard, Tulane University8 Conditions  Both sets of operators return a value of yes or no, true or false.  Thus they can be used as conditions to test something: [w for w in text if condition]

31-Aug-2009LING , Prof. Howard, Tulane University9 Some examples >>> sorted([w for w in set(text1) if w.endswith('ableness')]) ['comfortableness', 'honourableness', 'immutableness', 'indispensableness',...] >>> sorted([term for term in set(text4) if 'gnt' in term]) ['Sovereignty', 'sovereignties', 'sovereignty'] >>> sorted([item for item in set(sent7) if item.isdigit()]) ['29', '61']  Get rid of punctuation marks: [word.lower() for word in text1 if word.isalpha()])

31-Aug-2009LING , Prof. Howard, Tulane University10 Making more complex conditions  not condition  condition and condition  condition or condition

31-Aug-2009LING , Prof. Howard, Tulane University11 Conditionals  What if you want to do something when the condition is true beside build a list?  Follow the condition with a colon: >>> word = 'cat' >>> if len(word) < 5:... print 'word length is less than 5'... word length is less than 5 >>> >>> if len(word) >= 5:... print 'word length is greater than or equal to 5'... >>>

31-Aug-2009LING , Prof. Howard, Tulane University12 More complex conditionals  if, elif, else >>> if len(word) < 5:... print 'word length is less than 5'... elif len(word) > 5:... print 'word length is greater than 5'... else len(word) == 5:... print 'word length is equal to 5'... >>>

31-Aug-2009LING , Prof. Howard, Tulane University13 Looping with for  We have already seen for as the operator for iterating through a list.  More generally, for starts a loop that runs through a number of elements: >>> for word in ['Call', 'me', 'Ishmael', '.']:... print word... Call me Ishmael. >>>

31-Aug-2009LING , Prof. Howard, Tulane University14 Conditions with for >>> sent1 = ['Call', 'me', 'Ishmael', '.'] >>> for token in sent1:... if token.islower():... print token, 'is a lowercase word'... elif token.istitle():... print token, 'is a titlecase word'... else:... print token, 'is punctuation'...

Next time First quiz/project NLPP: §2