String Processing Eric Roberts CS 106A February 1, 2016.

Slides:



Advertisements
Similar presentations
AP Computer Science Anthony Keen. Computer 101 What happens when you turn a computer on? –BIOS tries to start a system loader –A system loader tries to.
Advertisements

Chapter 1 Object-Oriented Concepts. A class consists of variables called fields together with functions called methods that act on those fields.
Programmer-defined classes Part 2. Topics Returning objects from methods The this keyword Overloading methods Class methods Packaging classes Javadoc.
1 Chapter Ten Modular Development. 2 Stepwise Refinement At first, simple programs consist of only one source file and one function (the function main.
Self Check 1.Which are the most commonly used number types in Java? 2.Suppose x is a double. When does the cast (long) x yield a different result from.
Interfaces A Java interface is a collection
Computer Science 101 Data Encryption And Computer Networks.
Chapter 7: User-Defined Functions II
Team Name: team13 Programmer: 陳則凱 b Tester: 劉典恆 b
CSCI 1100/1202 April 3, Testing A program should be executed multiple times with various input in an attempt to find errors Debugging is the process.
1 ICS103 Programming in C Lecture 3: Introduction to C (2)
String Processing Eric Roberts CS 106A February 3, 2010.
Chapter 4: Writing Classes Presentation slides for Java Software Solutions Foundations of Program Design Second Edition by John Lewis and William Loftus.
Chapter 4: Writing Classes Presentation slides for Java Software Solutions Foundations of Program Design Third Edition by John Lewis and William Loftus.
© 2006 Pearson Addison-Wesley. All rights reserved2-1 Console Input Using the Scanner Class Starting with version 5.0, Java includes a class for doing.
©The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 4 th Ed Chapter Chapter 9 Characters and Strings (sections ,
1 10/25/06CS150 Introduction to Computer Science 1 Reading from and Writing to Files.
Methods in Java Selim Aksoy Bilkent University Department of Computer Engineering
Chapter 8—Strings and Characters The Art and Science of An Introduction to Computer Science ERIC S. ROBERTS Java Strings and Characters C H A P T E R 8.
Chapter 8 Characters and Strings. Principle of enumeration Computers tend to be good at working with numeric data. The ability to represent an integer.
© The McGraw-Hill Companies, 2006 Chapter 1 The first step.
Introduction to Java Appendix A. Appendix A: Introduction to Java2 Chapter Objectives To understand the essentials of object-oriented programming in Java.
Functions in C++ Eric Roberts CS 106B January 9, 2013.
The Turing machine Olena Lastivka. Definition Turing machine is a theoretical device that manipulates symbols on a strip of tape according to a table.
Team Name: team13 Programmer: 陳則凱 b Tester: 劉典恆 b
Mastering Char to ASCII AND DOING MORE RELATED STRING MANIPULATION Why VB.Net ?  The Language resembles Pseudocode - good for teaching and learning fundamentals.
Program A computer program (also software, or just a program) is a sequence of instructions written in a sequence to perform a specified task with a computer.
Fall Week 4 CSCI-141 Scott C. Johnson.  Computers can process text as well as numbers ◦ Example: a news agency might want to find all the articles.
EXAM 1 REVIEW. days until the AP Computer Science test.
A Balanced Introduction to Computer Science, 3/E David Reed, Creighton University ©2011 Pearson Prentice Hall ISBN Chapter 15 JavaScript.
CS50 Week 2. RESOURCES Office hours ( Lecture videos, slides, source code, and notes (
1 Week 12 l Overview of Streams and File I/O l Text File I/O Streams and File I/O.
Textbook: Data Structures and the Java Collections Framework 3rd Edition by William Collins William Collins.
1 Chapter 3 Syntax, Errors, and Debugging Fundamentals of Java: AP Computer Science Essentials, 4th Edition Lambert / Osborne.
CMSC 202 Java Console I/O. July 25, Introduction Displaying text to the user and allowing the user to enter text are fundamental operations performed.
1 CS161 Introduction to Computer Science Topic #9.
Fall 2002CS 150: Intro. to Computing1 Streams and File I/O (That is, Input/Output) OR How you read data from files and write data to files.
 2008 Pearson Education, Inc. All rights reserved JavaScript: Introduction to Scripting.
CSC 1010 Programming for All Lecture 3 Useful Python Elements for Designing Programs Some material based on material from Marty Stepp, Instructor, University.
CS100A, Lecture 13, 15 October CS100A Lecture Oct Discussion of Prelim 2 (Tuesday, 20 October, 7:30-9PM) Rooms for Prelim 2 A-K: Hollister.
Expressions Eric Roberts CS 106A January 13, 2016.
The Enigma Machine Eric Roberts CS 106A February 3, 2016.
 2007 Pearson Education, Inc. All rights reserved. A Simple C Program 1 /* ************************************************* *** Program: hello_world.
Slides prepared by Rose Williams, Binghamton University Console Input and Output.
CS 154 Formal Languages and Computability April 5 Class Meeting Department of Computer Science San Jose State University Spring 2016 Instructor: Ron Mak.
Intelligent Data Systems Lab. Department of Computer Science & Engineering Practices 컴퓨터의 개념 및 실습 4 월 11 일.
Programming – Lecture 9 Strings and Characters (Chapter 8) Enumeration ASCII / Unicode, special characters Character class Character arithmetic Strings.
Sorts, CompareTo Method and Strings
CS 106A, Lecture 4 Introduction to Java
Programming – Lecture 9 Strings and Characters (Chapter 8) Enumeration
Eric Roberts and Jerry Cain
Chapter 2 Basic Computation
Jerry Cain and Eric Roberts
ICS103 Programming in C Lecture 3: Introduction to C (2)
Introduction to Scripting
Section 3.2c Strings and Method Signatures
Characters Computers use the principle of enumeration to represent character data inside the memory of the machine. There are, after all, a finite number.
CS 106A, Lecture 9 Problem-Solving with Strings
Introduction to C++ Programming
Encryption and Decryption
CSC 221: Introduction to Programming Fall 2018
Chapter 8—Strings and Characters
CS139 October 11, 2004.
slides courtesy of Eric Roberts
slides courtesy of Eric Roberts and Jerry Cain
String Processing 1 MIS 3406 Department of MIS Fox School of Business
Unit 3: Variables in Java
Python Strings.
Presentation transcript:

String Processing Eric Roberts CS 106A February 1, 2016

Once upon a time...

Alan Turing Alan Turing ( ) The film The Imitation Game celebrated the life of Alan Turing, who made many important contributions in many areas of computer science, including hardware design, computability, and AI. During World War II, Turing headed the mathematics division at Bletchley Park in England, which broke the German Enigma code—a process you’ll simulate in Assignment #4. Tragically, Turing committed suicide in 1954 after being convicted on a charge of “gross indecency” for homosexual behavior. Prime Minister Gordon Brown issued a public apology in 2009.

String Processing

Translating Pig Latin to English Section 8.5 works through the design and implementation of a program to convert a sentence from English to Pig Latin. In this dialect, the Pig Latin version of a word is formed by applying the following rules: If the word begins with a consonant, the translateWord method moves the initial consonant string to the end of the word and then adds the suffix ay, as follows: 1. scram scram scr am scr am scr am scr am scr am scr am scr am scr am scr am scr am scr am scr am scr am scr am scr am scr am scr am scr am scr am scr am scr am scr am scr am scr amscram ay If the word begins with a vowel, translateWord generates the Pig Latin version simply by adding the suffix way, like this: 2. apple way If the word contains no vowels at all, translateWord returns the original word unchanged. 3.

Pseudocode for the Pig Latin Program public void run() { Tell the user what the program does. Ask the user for a line of text. Translate the line into Pig Latin and print it on the console. } private String translateLine(String line) { Initialize a string variable called result to hold the growing string. Divide the line into individual words and separating characters called tokens. while ( there are still tokens to be processed ) { Get the next token. If the token is a word, call translateWord to translate it to Pig Latin. Concatenate the token to the end of the result variable. } private String translateWord(String word) { Find the first vowel in the word. If there are no vowels, return the original word unchanged. If the vowel appears in the first position, return the word concatenated with "way". Divide the string into two parts (head and tail) before the vowel. Return the result of concatenating the tail, the head, and the string "ay". }

Designing translateLine The translateLine method must divide the input line into words, translate each word, and then reassemble those words. Although it is not hard to write code that divides a string into words, it is easier still to make use of existing facilities in the Java library to perform this task. One strategy is to use the StringTokenizer class in the java.util package, which divides a string into independent units called tokens. The client then reads these tokens one at a time. The set of tokens delivered by the tokenizer is called the token stream. The precise definition of what constitutes a token depends on the application. For the Pig Latin problem, tokens are either words or the characters that separate words, which are called delimiters. The application cannot work with the words alone, because the delimiter characters are necessary to ensure that the words don’t run together in the output.

The StringTokenizer Class The constructor for the StringTokenizer class takes three arguments, where the last two are optional: –A string indicating the source of the tokens. –A string which specifies the delimiter characters to use. By default, the delimiter characters are set to the whitespace characters. –A flag indicating whether the tokenizer should return delimiters as part of the token stream. By default, a StringTokenizer ignores the delimiters. Once you have created a StringTokenizer, you use it by setting up a loop with the following general form: while (tokenizer.hasMoreTokens()) { String token = tokenizer.nextToken(); code to process the token }

The translateLine Method private String translateLine(String line) { String result = ""; StringTokenizer tokenizer = new StringTokenizer(line, DELIMITERS, true); while (tokenizer.hasMoreTokens()) { String token = tokenizer.nextToken(); if (isWord(token)) { token = translateWord(token); } result += token; } return result; } The existence of the StringTokenizer class makes it easy to code the translateLine method, which looks like this: The DELIMITERS constant is a string containing all the legal punctuation marks to ensure that they aren’t combined with the words.

The translateWord Method private String translateWord(String word) { int vp = findFirstVowel(word); if (vp == -1) { return word; } else if (vp == 0) { return word + "way"; } else { String head = word.substring(0, vp); String tail = word.substring(vp); return tail + head + "ay"; } The translateWord method consists of the rules for forming Pig Latin words, translated into Java: The remaining methods ( isWord and findFirstVowel ) are both straightforward. The simulation on the following slide simply assumes that these methods work as intended.

The PigLatin Program public void run() { println("This program translates a line into Pig Latin."); String line = readLine("Enter a line: "); println( translateLine(line) ); } skip simulation line this is pig latin. This program translates a line into Pig Latin. isthay isway igpay atinlay. this is pig latin.Enter a line: PigLatin private String translateLine(String line) { String result = ""; StringTokenizer tokenizer = new StringTokenizer(line, DELIMITERS, true); while ( tokenizer.hasMoreTokens() ) { String token = tokenizer.nextToken(); if ( isWord(token) ) token = translateWord(token); result += token; } return result; } lineresulttoken tokenizer this is pig latin. thisispiglatin.isthay isthay isway isthay isway igpay isthay isway igpay atinlayisthay isway igpay atinlay.isthayiswayigpayatinlay this is pig latin. private String translateWord(String word) { int vp = findFirstVowel(word); if ( vp == -1 ) { return word; } else if ( vp == 0 ) { return word + "way"; } else { String head = word.substring(0, vp); String tail = word.substring(vp); return tail + head + "ay"; } word this tailheadvp isth2 private String translateWord(String word) { int vp = findFirstVowel(word); if ( vp == -1 ) { return word; } else if ( vp == 0 ) { return word + "way"; } else { String head = word.substring(0, vp); String tail = word.substring(vp); return tail + head + "ay"; } word is tailheadvp 0 private String translateWord(String word) { int vp = findFirstVowel(word); if ( vp == -1 ) { return word; } else if ( vp == 0 ) { return word + "way"; } else { String head = word.substring(0, vp); String tail = word.substring(vp); return tail + head + "ay"; } word pig tailheadvp igp1 private String translateWord(String word) { int vp = findFirstVowel(word); if ( vp == -1 ) { return word; } else if ( vp == 0 ) { return word + "way"; } else { String head = word.substring(0, vp); String tail = word.substring(vp); return tail + head + "ay"; } word latin tailheadvp atinl1 public void run() { println("This program translates a line into Pig Latin."); String line = readLine("Enter a line: "); println( translateLine(line) ); } line this is pig latin.

Encryption Twas brillig, and the slithy toves,Did gyre and gimble in the wabe:Doo plpvb zhuh wkh erurjryhv,Lfr gax wvwx clgat vngeclzx. Twas brillig, and the slithy toves, Did gyre and gimble in the wabe: All mimsy were the borogoves, And the mome raths outgrabe. Twas brillig, and the slithy toves, Did gyre and gimble in the wabe: All mimsy were the borogoves, And the mome raths outgrabe. ABCDEFGHIJKLMNOPQRSTUVWXYZ LZDRXPEAJYBQWFVIHCTGNOMKSU

Creating a Caesar Cipher public void run() { println("This program implements a Caesar cipher."); int key = readInt("Character positions to shift: "); String plaintext = readLine("Enter a message: "); String ciphertext = encodeCaesarCipher(plaintext, key); println("Encoded message: " + ciphertext); } CaesarCipher skip simulation This program implements a Caesar cipher. Character positions to shift:3 Enter a message:JABBERWOCKY Encoded message: MDEEHUZRFNB plaintext JABBERWOCKY key 3 ciphertext MDEEHUZRFNB private String encodeCaesarCipher(String plaintext, int key) { if (key < 0) key = 26 - (-key % 26); String result = ""; for (int i = 0; i < plaintext.length(); i++) { char ch = plaintext.charAt(i); if (Character.isUpperCase(ch)) { ch = (char) ('A' + (ch - 'A' + key) % 26); } result += ch; } return result; } plaintext JABBERWOCKY key 3 iresultch MMDMDEMDEEMDEEHMDEEHUMDEEHUZMDEEHUZRMDEEHUZRFMDEEHUZRFNMDEEHUZRFNB'J' 'M' 'A''D''B''E''B''E' 'H''R''U''W''Z''O''R''C''F''K''N''Y''B' public void run() { println("This program implements a Caesar cipher."); int key = readInt("Character positions to shift: "); String plaintext = readLine("Enter a message: "); String ciphertext = encodeCaesarCipher(plaintext, key); println("Encoded message: " + ciphertext); } ciphertext MDEEHUZRFNB plaintext JABBERWOCKY key 3

Exercise: Letter-Substitution Cipher One of the simplest types of codes is a letter-substitution cipher, in which each letter in the original message is replaced by some different letter in the coded version of that message. In this type of cipher, the key is often presented as a sequence of 26 letters that shows how each of the letters in the standard alphabet are mapped into their enciphered counterparts: LetterSubstitutionCipher Letter-substitution cipher. Enter 26-letter key:LZDRXPEAJYBQWFVIHCTGNOMKSU Plaintext:LEWIS CARROLL Ciphertext: QXMJT DLCCVQQ ABCDEFGHIJKLMNOPQRSTUVWXYZ LZDRXPEAJYBQWFVIHCTGNOMKSU ||||||||||||||||||||||||||

Starting at the Top The principle of top-down design suggests starting with the run method, which has the following pseudocode form: public void run() { Tell the user what the program does. Ask the user to enter a 26-letter key. Ask the user for a line of text. Call a method to encrypt the line using the specified key. Print the result of that encryption. } This pseudocode is easy to translate to Java: public void run() { println("Letter-substitution cipher."); String key = readLine("Enter 26-letter key: "); String plaintext = readLine("Plaintext: "); String ciphertext = encrypt(plaintext, key); println("Ciphertext: " + ciphertext); } All that remains is to write the code for encrypt.

The End