Strings, Characters and Regular Expressions

Slides:



Advertisements
Similar presentations
Liang, Introduction to Java Programming, Ninth Edition, (c) 2013 Pearson Education, Inc. All rights reserved. 1 Chapter 9 Strings.
Advertisements

Chapter 7 Strings F To process strings using the String class, the StringBuffer class, and the StringTokenizer class. F To use the String class to process.
1 Strings and Text I/O. 2 Motivations Often you encounter the problems that involve string processing and file input and output. Suppose you need to write.
Chapter 7 Strings F Processing strings using the String class, the StringBuffer class, and the StringTokenizer class. F Use the String class to process.
Java Programming Strings Chapter 7.
©2004 Brooks/Cole Chapter 7 Strings and Characters.
Primitive Data Types There are a number of common objects we encounter and are treated specially by almost any programming language These are called basic.
1 Chapter 4 Language Fundamentals. 2 Identifiers Program parts such as packages, classes, and class members have names, which are formally known as identifiers.
Chapter 9 Characters and Strings. Topics Character primitives Character Wrapper class More String Methods String Comparison String Buffer String Tokenizer.
©The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 4 th Ed Chapter Chapter 9 Characters and Strings (sections ,
Fundamental Programming Structures in Java: Strings.
28-Jun-15 String and StringBuilder Part I: String.
 2006 Pearson Education, Inc. All rights reserved Strings, Characters and Regular Expressions.
©The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 4 th Ed Chapter Chapter 9 Characters and Strings (sections ,
 Pearson Education, Inc. All rights reserved Strings, Characters and Regular Expressions.
String Escape Sequences
Lesson 3 – Regular Expressions Sandeepa Harshanganie Kannangara MBCS | B.Sc. (special) in MIT.
Advanced Programming Collage of Information Technology University of Palestine, Gaza Prepared by: Mahmoud Rafeek Alfarra Lecture 16: Working with Text.
Introduction to Java Appendix A. Appendix A: Introduction to Java2 Chapter Objectives To understand the essentials of object-oriented programming in Java.
Characters, String and Regular expressions. Characters char data type is used to represent a single character. Characters are stored in a computer memory.
Java How to Program, 9/e © Copyright by Pearson Education, Inc. All Rights Reserved.
©The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 4 th Ed Chapter Characters In Java, single characters are represented.
Week 1 Algorithmization and Programming Languages.
Liang, Introduction to Java Programming, Seventh Edition, (c) 2009 Pearson Education, Inc. All rights reserved Chapter 8 Strings 1.
Regular Expressions.
Portions adapted with permission from the textbook author. CS-1020 Dr. Mark L. Hornick 1 Regular Expressions and String processing Animated Version.
An Introduction to Java Programming and Object-Oriented Application Development Chapter 7 Characters, Strings, and Formatting.
When you read a sentence, your mind breaks it into tokens—individual words and punctuation marks that convey meaning. Compilers also perform tokenization.
Introduction to Java Java Translation Program Structure
Chapter 7: Characters, Strings, and the StringBuilder.
1 Textual Data Many computer applications manipulate textual data word processors web browsers online dictionaries.
String Handling StringBuffer class character class StringTokenizer class.
Strings Chapter 7 CSCI CSCI 1302 – Strings2 Outline Introduction The String class –Constructing a String –Immutable and Canonical Strings –String.
Jaeki Song JAVA Lecture 07 String. Jaeki Song JAVA Outline String class String comparisons String conversions StringBuffer class StringTokenizer class.
Strings JavaMethods An Introduction to Object-Oriented Programming Maria Litvin Gary Litvin Copyright © 2003 by Maria Litvin, Gary Litvin, and Skylight.
Strings Java Methods A & AB Object-Oriented Programming and Data Structures Maria Litvin ● Gary Litvin Copyright © 2006 by Maria Litvin, Gary Litvin, and.
String String Builder. System.String string is the alias for System.String A string is an object of class string in the System namespace representing.
Strings and Related Classes String and character processing Class java.lang.String Class java.lang.StringBuffer Class java.lang.Character Class java.util.StringTokenizer.
17-Feb-16 String and StringBuilder Part I: String.
Strings, Characters, and Regular Expressions Session 10 Mata kuliah: M0874 – Programming II Tahun: 2010.
String and StringBuffer classes
Lecture 19 Strings and Regular Expressions
Object-Oriented Java Programming
Strings, Characters and Regular Expressions
Strings, StringBuilder, and Character
University of Central Florida COP 3330 Object Oriented Programming
Strings, Characters and Regular Expressions
String and String Buffers
String String Builder.
Primitive Types Vs. Reference Types, Strings, Enumerations
Modern Programming Tools And Techniques-I Lecture 11: String Handling
String and StringBuilder
Chapter 7: Strings and Characters
Object Oriented Programming
MSIS 655 Advanced Business Applications Programming
Introduction to C++ Programming
Part a: Fundamentals & Class String
String and StringBuilder
String and StringBuilder
Java – String Handling.
Lecture 07 String Jaeki Song.
String and StringBuilder
JavaScript: Objects.
Dr. Sampath Jayarathna Cal Poly Pomona
Visual Programming COMP-315
Chapter 10 Thinking in Objects Part 2
Object-Oriented Java Programming
In Java, strings are objects that belong to class java.lang.String .
Unit-2 Objects and Classes
Presentation transcript:

Strings, Characters and Regular Expressions 30 Strings, Characters and Regular Expressions

The chief defect of Henry King Was chewing little bits of string. —Hilaire Belloc Vigorous writing is concise. A sentence should contain no unnecessary words, a paragraph no unnecessary sentences. —William Strunk, Jr. I have made this letter longer than usual, because I lack the time to make it short. —Blaise Pascal

OBJECTIVES In this chapter you will learn: To create and manipulate immutable character string objects of class String. To create and manipulates mutable character string objects of class StringBuilder. To create and manipulate objects of class Character. To use a StringTokenizer object to break a String object into tokens. To use regular expressions to validate String data entered into an application.

30.1   Introduction 30.2   Fundamentals of Characters and Strings 30.3   Class String 30.3.1  String Constructors 30.3.2  String Methods length, charAt and getChars 30.3.3  Comparing Strings 30.3.4  Locating Characters and Substrings in Strings 30.3.5  Extracting Substrings from Strings 30.3.6  Concatenating Strings 30.3.7  Miscellaneous String Methods 30.3.8  String Method valueOf 30.4   Class StringBuilder 30.4.1  StringBuilder Constructors 30.4.2  StringBuilder Methods length, capacity, setLength and ensureCapacity 30.4.3  StringBuilder Methods charAt, setCharAt, getChars and reverse 30.4.4  StringBuilder append Methods 30.4.5  StringBuilder Insertion and Deletion Methods 30.5   Class Character 30.6   Class StringTokenizer 30.7   Regular Expressions, Class Pattern and Class Matcher 30.8   Wrap-Up

30.1 Introduction String and character processing Regular expressions Class java.lang.String Class java.lang.StringBuilder Class java.lang.Character Class java.util.StringTokenizer Regular expressions Valid input Package java.util.regex Classes Matcher and Pattern

30.2 Fundamentals of Characters and Strings “Building blocks” of Java source programs Character literals ‘a’ ‘2’ ‘+’ Unicode character set ( 2 byte ) – international String Series of characters treated as single unit May include letters, digits, special characters. Object of class String String literals “12345” “” “Hello”

Performance Tip 30.1 1Java treats all string literals with the same contents as a single String object that has many references to it. This conserves memory.

30.3 String Constructors Fig. 30.1 demonstrates four constructors No-argument constructor Creates a String object containing an empty String One-argument constructor A String object A char array Three-argument constructor An integer specifies the starting position An integer specifies the number of characters to access

Software Engineering Obervation 30.1 It is not necessary to copy an existing String object. String objects are immutable—their character contents cannot be changed after they are created class String does not provide any methods that allow the contents of a String object to be modified.

Common Programming Error 30.1 Attempting to access a character that is outside the bounds of a string (i.e., an index less than 0 or an index greater than or equal to the string’s length) results in a StringIndexOutOfBoundsException.

30.3.2 String Methods length, charAt and getChars Method length() Determine the length of a String Like arrays, Strings always “know” their size Unlike arrays, Strings do not have a length instance variable Method charAt() Get character at specific location in String Method getChars() Get entire set of characters in String

Copy (some of) s1’s characters to charArray

30.3 Comparing Strings Comparing String objects Method equals Method equalsIgnoreCase Method compareTo Method regionMatches

Common Programming Error 30.2 Comparing references with == can lead to logic errors, because == compares the references to determine whether they refer to the same object, not whether two objects have the same contents. When two identical (but separate) objects are compared with ==, the result will be false. When comparing objects to determine whether they have the same contents, use method equals.

30.3.4 Locating Characters and Substrings in Strings Search for characters in String Method indexOf Method lastIndexOf

30.3.5 Extracting Substrings from Strings Create Strings from other Strings Method substring

30.3.6 Concatenating Strings Method concat Concatenate two String objects

30.3.7 Miscellaneous String Methods Return modified copies of String Return character array

30.3.8 String Method valueOf String provides static class methods Returns String representation of object, data, etc.

static method valueOf of class String returns String representation of various types

30.4 Class StringBuilder Class StringBuilder When String object is created, its contents cannot change Used for creating and manipulating dynamic string data i.e., modifiable Strings Can store characters based on capacity Capacity expands dynamically to handle additional characters Uses operators + and += for String concatenation

Performance Tip 30.2 Java can perform certain optimizations involving String objects (such as sharing one String object among multiple references) because it knows these objects will not change. Strings (not StringBuilders) should be used if the data will not change.

Performance Tip 30.3 In programs that frequently perform string concatenation, or other string modifications, it is more efficient to implement the modifications with class StringBuilder (covered in Section 30.4).

30.4.1 StringBuilder Constructors Four StringBuilder constructors No-argument constructor Creates StringBuilder with no characters Initial capacity of 16 characters One-argument constructor int argument Specifies the initial capacity String argument Creates StringBuilder containing the characters in the String argument

30.4.2 StringBuilder Methods Method length() Return StringBuilder length Method capacity() Return StringBuilder capacity Method setLength() Increase or decrease StringBuilder length Method ensureCapacity() Set StringBuilder capacity Guarantee that StringBuilder has minimum capacity

Performance Tip 30.4 Dynamically increasing the capacity of a StringBuilder can take a relatively long time. Executing a large number of these operations can degrade the performance of an application. If a StringBuilder is going to increase greatly in size, possibly multiple times, setting its capacity high at the beginning will increase performance.

19.4.3 StringBuilder Methods Manipulating StringBuilder characters Method charAt() Return StringBuffer character at specified index Method setCharAt() Set StringBuffer character at specified index Method getChars() Return character array from StringBuffer Method reverse() Reverse StringBuffer contents Method append() Allow data values to be added to StringBuilder

Common Programming Error 30.3 Attempting to access a character that is outside the bounds of a StringBuilder (i.e., with an index less than 0 or greater than or equal to the StringBuilder’s length) results in a StringIndexOutOfBoundsException.

30.4.5 StringBuilder Insertion and Deletion Methods Method insert() Allow data-type values to be inserted into StringBuilder Methods delete() and deleteCharAt() Allow characters to be removed from StringBuilder

30.5 Class Character Treat primitive variables as objects Type wrapper classes Boolean Character ( char ) Double Float Byte Short Integer ( int ) Long We examine class Character

Use method forDigit to convert int digit to number-system character specified by int radix

Use method digit to convert char c to number-system integer specified by int radix

Assign two character literals to two Character objects Assign two character literals to two Character objects. Auto-boxing occurs.

30.6 Class StringTokenizer Partition String into individual substrings Use delimiter to separate strings Typically whitespace characters (space, tab, newline, etc) Java offers java.util.StringTokenizer

Display next token as long as tokens exist Use StringTokenizer to parse String using default delimiter “ \n\t\r” Display next token as long as tokens exist

30.7 Regular Expressions Regular expression Sequence of characters and symbols Useful for validating input and ensuring data format E.g., ZIP code Facilitate the construction of a compiler Regular-expression operations in String Method matches Matches the contents of a String to regular expression Returns a boolean indicating whether the match succeeded

30.7 Regular Expressions Predefined character classes Escape sequence that represents a group of characters Digit Numeric character Word character Any letter, digit, underscore Whitespace character Space, tab, carriage return, newline, form feed

Fig. 30.19 | Predefined character classes. Backslashes must be doubled in a String literal. Fig. 30.19 | Predefined character classes.

30.7 Regular Expressions Other patterns Square brackets ([]) Dash (-) Match characters that do not have a predefined character class E.g., [aeiou] matches a single character that is a vowel Dash (-) Ranges of characters E.g., [A-Z] matches a single uppercase letter ^ Don’t include the indicated characters E.g., [^Z] matches any character other than Z

30.7 Regular Expressions Quantifiers Plus (+) Asterisk (*) Match one or more occurrences E.g., A+ Matches AAA but not empty string Asterisk (*) Match zero or more occurrences E.g., A* Matches both AAA and empty string Others in Fig. 30.22

Fig. 30.22 | Quantifiers used in regular expressions.

Method matches returns true if the String matches the regular expression

Method matches returns true if the String matches the regular expression

Indicate that the entry for “zip” was invalid

30.7 Regular Expressions Replacing substrings and splitting strings String method replaceAll Replace text in a string with new text String method replaceFirst Replace the first occurrence of a pattern match String method split Divides string into several substrings

30.7 Regular Expressions Class Pattern Class Matcher Represents a regular expression Class Matcher Contains a regular-expression pattern and a CharSequence Interface CharSequence Allows read access to a sequence of characters String and StringBuilder implement CharSequence

Common Programming Error 30.4 A regular expression can be tested against an object of any class that implements interface CharSequence, but the regular expression must be a String. Attempting to create a regular expression as a StringBuilder is an error.

Common Programming Error 30.5 Method matches (from class String, Pattern or Matcher) will return true only if the entire search object matches the regular expression. Methods find and lookingAt (from class Matcher) will return true if a portion of the search object matches the regular expression.