Chapter 12: Searching and Sorting. The Art and Science of Java: An Introduction to Computer Science, by Eric S. Roberts

1 Chapter 12: Searching and Sorting
The Art and Science of Java: An Introduction to Computer Science, Eric S. Roberts
    “I weep for you,” the Walrus said,
    “I deeply sympathize.”
    With sobs and tears he sorted out
    Those of the largest size
    (Lewis Carroll, Through the Looking Glass, 1872)
12.1 Searching
12.2 Sorting
12.3 Assessing algorithmic efficiency
12.4 Using data files

2 Searching This chapter looks at two operations on arrays, searching and sorting, both of which turn out to be important in a wide range of practical applications. The simpler of these two operations is searching, which is the process of finding a particular element in an array or some other kind of sequence. Typically, a method that implements searching will return the index at which a particular element appears, or -1 if that element does not appear at all. The element you’re searching for is called the key. The goal of Chapter 12, however, is not simply to introduce searching and sorting but rather to use these operations to talk about algorithms and efficiency. Many different algorithms exist for both searching and sorting; choosing the right algorithm for a particular application can have a profound effect on how efficiently that application runs.

3 Linear Search The simplest strategy for searching is to start at the beginning of the array and look at each element in turn. This algorithm is called linear search. Linear search is straightforward to implement, as illustrated in the following method that returns the first index at which the value key appears in array, or -1 if it does not appear at all:
    private int linearSearch(int key, int[] array) {
        for (int i = 0; i < array.length; i++) {
            if (key == array[i]) return i;
        }
        return -1;
    }

4 Simulating Linear Search
    public void run() {
        int[] primes = { 2, 3, 5, 7, 11, 13, 17, 19, 23, 29 };
        println("linearSearch(17) -> " + linearSearch(17, primes));
        println("linearSearch(27) -> " + linearSearch(27, primes));
    }
The primes array holds 2, 3, 5, 7, 11, 13, 17, 19, 23, 29 at indices 0 through 9. The program prints:
    linearSearch(17) -> 6
    linearSearch(27) -> -1

5 A Larger Example To illustrate the efficiency of linear search, it is useful to work with a somewhat larger example. The example on the next slide works with an array containing the 286 telephone area codes assigned to the United States. The specific task in this example is to search this list to find the area code for the Silicon Valley area, which is 650. The linear search algorithm needs to examine each element in the array to find the matching value. As the array gets larger, the number of steps required for linear search grows in the same proportion. As you watch the slow process of searching for 650 on the next slide, try to think of a more efficient way in which you might search this particular array for a given area code.

6 Linear Search (Area Code Example) [Animation: linear search through the area code array for the key 650.]

7 The Idea of Binary Search The fact that the area code array is in ascending order makes it possible to find a particular value much more efficiently. The key insight is that you get more information by starting at the middle element than you do by starting at the beginning. When you look at the middle element in relation to the value you’re searching for, there are three possibilities:
– If the value you are searching for is greater than the middle element, you can discount every element in the first half of the array.
– If the value you are searching for is less than the middle element, you can discount every element in the second half of the array.
– If the value you are searching for is equal to the middle element, you can stop because you’ve found the value you’re looking for.
You can repeat this process on the elements that remain after each cycle. Because this algorithm proceeds by dividing the list in half each time, it is called binary search.

8 Binary Search (Area Code Example) The key is 650, and the array occupies indices 0 through 285.
Start with the element at index (0 + 285) / 2, which is the 602 at index 142. The key 650 is greater than 602, so discard the first half.
Continue with element (143 + 285) / 2, which is the 805 at index 214. The key 650 is less than 805, so discard the second half.
Continue with element (143 + 213) / 2, which is the 708 at index 178. The key 650 is less than 708, so discard the second half.
Continue with element (143 + 177) / 2, which is the 630 at index 160. The key 650 is greater than 630, so discard the first half.
Continue with element (161 + 177) / 2, which is the 662 at index 169. The key 650 is less than 662, so discard the second half.
Continue with element (161 + 168) / 2, which is the 646 at index 164. The key 650 is greater than 646, so discard the first half.
Continue with element (165 + 168) / 2, which is the 651 at index 166. The key 650 is less than 651, so discard the second half.
Continue with element (165 + 165) / 2, which is the 650 at index 165. The key 650 is equal to 650, so the process is finished.
Binary search needs to look at only eight elements to find 650.

9 Implementing Binary Search The following method implements the binary search algorithm for an integer array.
    private int binarySearch(int key, int[] array) {
        int lh = 0;                      // left end of the range still under consideration
        int rh = array.length - 1;       // right end of that range
        while (lh <= rh) {
            int mid = (lh + rh) / 2;
            if (key == array[mid]) return mid;
            if (key < array[mid]) {
                rh = mid - 1;            // key can only be in the left half
            } else {
                lh = mid + 1;            // key can only be in the right half
            }
        }
        return -1;
    }
The text contains a similar implementation of binarySearch that operates on strings. The algorithm is the same.

10 Efficiency of Linear Search As the area code example makes clear, the running time of the linear search algorithm depends on the size of the array. The idea that the time required to search a list of values depends on how many values there are is not at all surprising. The running time of most algorithms depends on the size of the problem to which that algorithm is applied. In many applications, it is easy to come up with a numeric value that specifies the problem size, which is generally denoted by the letter N. For most array applications, the problem size is simply the size of the array. In the worst case—which occurs when the value you’re searching for comes at the end of the array or does not appear at all—linear search requires N steps. On average, it takes approximately half that time.

11 Efficiency of Binary Search The running time of binary search also depends on the number of elements, but in a profoundly different way. On each step in the process, the binary search algorithm rules out half of the remaining possibilities. In the worst case, the number of steps required is equal to the number of times you can divide the original size of the array in half until there is only one element remaining. In other words, what you need to find is the value of k that satisfies the following equation:
    $1 = N / \underbrace{2 / 2 / 2 / \cdots / 2}_{k \text{ times}}$
You can simplify this formula using basic mathematics:
    $1 = N / 2^k$
    $2^k = N$
    $k = \log_2 N$

12 Comparing Search Efficiencies The difference in the number of steps required for the two search algorithms is illustrated by the following table, which compares the values of N and the closest integer to $\log_2 N$:
    N                 log2 N
    10                3
    100               7
    1000              10
    1,000,000         20
    1,000,000,000     30
For large values of N, the difference in the number of steps required is enormous. If you had to search through a list of a million elements, binary search would run 50,000 times faster than linear search. If there were a billion elements, that factor would grow to 33,000,000.
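The figures quoted on this slide can be checked with a few lines of Java. The sketch below is only an illustration: the class name SearchSteps and the helper roundedLog2 are hypothetical names introduced here, and the step counts are the idealized ones from the slide (N for linear search, the closest integer to log2 N for binary search), not measured running times.

```java
class SearchSteps {

    /** Closest integer to log2(n), the quantity tabulated on the slide. */
    static long roundedLog2(long n) {
        return Math.round(Math.log(n) / Math.log(2));
    }

    public static void main(String[] args) {
        long[] sizes = { 10, 100, 1000, 1_000_000, 1_000_000_000 };
        for (long n : sizes) {
            long binarySteps = roundedLog2(n);
            // Linear search needs about n steps in the worst case,
            // so the ratio n / binarySteps is the idealized speedup.
            System.out.println("N = " + n
                + ", log2 N ~ " + binarySteps
                + ", ratio ~ " + (n / binarySteps));
        }
    }
}
```

For a million elements the ratio works out to 1,000,000 / 20 = 50,000, and for a billion elements to roughly 33 million, which matches the factors stated above.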

13 Sorting Binary search works only on arrays in which the elements are arranged in order. The process of putting the elements of an array in order is called sorting. There are many algorithms that one can use to sort an array. As with searching, these algorithms can vary substantially in their efficiency, particularly as the arrays become large. Of all the algorithms presented in this text, sorting is by far the most important in terms of its practical applications. Alphabetizing a telephone directory, arranging library records by catalogue number, and organizing a bulk mailing by ZIP code are all examples of sorting that involve reasonably large collections of data.

14 The Selection Sort Algorithm Of the many sorting algorithms, the easiest one to describe is selection sort, which is implemented by the following code:
    private void sort(int[] array) {
        for (int lh = 0; lh < array.length; lh++) {
            int rh = findSmallest(array, lh, array.length);
            swapElements(array, lh, rh);
        }
    }
The variables lh and rh indicate the positions of the left and right hands if you were to carry out this process manually. The left hand points to each position in turn; the right hand points to the smallest value in the rest of the array. The method findSmallest(array, p1, p2) returns the index of the smallest value in the array from position p1 up to but not including p2. The method swapElements(array, p1, p2) exchanges the elements at the specified positions.

15 Simulating Selection Sort
    public void run() {
        int[] test = { 809, 503, 946, 367, 987, 838, 259, 236, 659, 361 };
        sort(test);
    }
The simulation steps through the sort method, which calls findSmallest on each cycle of its loop:
    private int findSmallest(int[] array, int p1, int p2) {
        int smallestIndex = p1;
        for (int i = p1 + 1; i < p2; i++) {
            if (array[i] < array[smallestIndex]) smallestIndex = i;
        }
        return smallestIndex;
    }

16 Efficiency of Selection Sort As with the search algorithms presented on earlier slides, it is useful to understand how the running time of selection sort depends on the size of the array. One strategy is to measure the actual time it takes to run for arrays of different sizes. In Java, you can measure elapsed time by calling System.currentTimeMillis before and after some operation and noting the difference. Using this strategy, however, requires some care:
– If an algorithm runs very quickly, the system clock will not be precise enough for accurate measurement. In such cases, you may need to run the algorithm several times and divide by the number of repetitions.
– Most algorithms show some variability depending on the data. To avoid having this variability distort the results, it makes sense to run several independent trials with different data and average the results.
– It is important to take account of the fact that Java will sometimes need to take actions, most notably garbage collection, that can take large amounts of time but are not actually relevant to the algorithm.
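The measurement strategy described on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used to produce the timings in the text: the class name SortTimer, the method averageMillis, the seed, and the trial counts are all hypothetical, and selection sort is written inline so the example stands alone. The test data is generated before the clock starts, several independent random arrays are sorted, and the total elapsed time is divided by the number of trials.

```java
import java.util.Random;

class SortTimer {

    /** Selection sort, written inline so that the sketch is self-contained. */
    static void selectionSort(int[] array) {
        for (int lh = 0; lh < array.length; lh++) {
            int rh = lh;
            for (int i = lh + 1; i < array.length; i++) {
                if (array[i] < array[rh]) rh = i;
            }
            int temp = array[lh];
            array[lh] = array[rh];
            array[rh] = temp;
        }
    }

    /** Average milliseconds per sort over several independent random arrays of size n. */
    static double averageMillis(int n, int trials, long seed) {
        Random rgen = new Random(seed);
        int[][] data = new int[trials][n];
        for (int[] row : data) {
            for (int i = 0; i < n; i++) row[i] = rgen.nextInt(1_000_000);
        }
        // Only the sorting itself is timed; building the data happens beforehand.
        long start = System.currentTimeMillis();
        for (int[] row : data) selectionSort(row);
        long elapsed = System.currentTimeMillis() - start;
        return (double) elapsed / trials;
    }

    public static void main(String[] args) {
        int[] sizes = { 10, 100, 1000, 10000 };
        for (int n : sizes) {
            System.out.println("N = " + n + ": " + averageMillis(n, 20, 42L) + " ms per sort");
        }
    }
}
```

For very small arrays the total may still round to zero milliseconds, in which case the number of trials needs to be raised. The sketch makes no attempt to filter out garbage-collection pauses or other outliers.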

17 Measuring Sort Timings Because timing measurements are subject to various inaccuracies, it is best to run several trials and then to use statistics to interpret the results. The table below shows the actual running time for the selection sort algorithm for several different values of N, along with the mean (μ) and standard deviation (σ).
[Table: per-trial running times for each value of N, with μ and σ; the individual entries are not recoverable from this transcript.]
The table entries shown in red indicate timing measurements that differ by more than two standard deviations from the average of the other trials (trial #8 for 1000 elements, for example, is more than five times larger than any other trial). Because these outliers probably include extraneous operations, it is best to discard them. The following table shows the average timing of the selection sort algorithm after removing outlying trials that differ by more than two standard deviations from the mean. The column labeled μ (the Greek letter mu, which is the standard statistical symbol for the mean) is a reasonably good estimate of running time.

18 Selection Sort Running Times The linear search algorithm has the property that the running time of the algorithm is proportional to the size of the array. If you multiply the number of values by ten, you would expect the linear search algorithm to take ten times as long. As the running times on the preceding slide make clear, the situation for selection sort is very different. The table below shows the average running time when selection sort is applied to 10, 100, 1000, and 10000 values.
    N        time
    10       0.0024
    100      0.162
    1000     14.32
    10000    1332.
As a rough approximation, particularly as you work with larger values of N, it appears that every ten-fold increase in the size of the array means that selection sort takes about 100 times as long.

19 Counting Operations Another way to estimate the running time is to count how many operations are required to sort an array of size N. In the selection sort implementation, the section of code that is executed most frequently (and therefore contributes the most to the running time) is the body of the findSmallest method. The number of operations involved in each call to findSmallest changes as the algorithm proceeds:
– N values are considered on the first call to findSmallest.
– N - 1 values are considered on the second call.
– N - 2 values are considered on the third call, and so on.
In mathematical notation, the number of values considered in findSmallest can be expressed as a summation, which can then be transformed into a simple formula:
    $1 + 2 + 3 + \cdots + (N - 1) + N = \sum_{i=1}^{N} i = \frac{N \times (N + 1)}{2}$
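The counting argument on this slide can be checked by letting a program do the tally. In the sketch below, the class OperationCount and its two methods are hypothetical names introduced for illustration. The loop mirrors the calls sort makes to findSmallest(array, lh, N), each of which considers the N - lh values at positions lh through N - 1, and the result is compared with the closed-form formula.

```java
class OperationCount {

    /** Total number of values findSmallest considers while sorting n elements. */
    static long valuesConsidered(int n) {
        long count = 0;
        for (int lh = 0; lh < n; lh++) {
            count += n - lh;    // the call with p1 = lh, p2 = n looks at n - lh values
        }
        return count;
    }

    /** The closed form N x (N + 1) / 2 from the slide. */
    static long closedForm(int n) {
        return (long) n * (n + 1) / 2;
    }

    public static void main(String[] args) {
        int[] sizes = { 10, 100, 1000, 10000 };
        for (int n : sizes) {
            System.out.println("N = " + n
                + ": counted " + valuesConsidered(n)
                + ", formula " + closedForm(n));
        }
    }
}
```

For N = 10 both columns give 55, and for N = 100 both give 5050.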

20 A Geometric Insight You can convince yourself that
    $1 + 2 + 3 + \cdots + (N - 2) + (N - 1) + N = \frac{N \times (N + 1)}{2}$
by thinking about the problem geometrically. The terms on the left side of the formula can be arranged into a triangle, as shown at the bottom of this slide for N = 6. If you duplicate the triangle and rotate it by 180°, you get a rectangle that in this case contains 6 x 7 dots, half of which belong to each triangle.

21 Quadratic Growth The reason behind the rapid growth in the running time of selection sort becomes clear if you make a table showing the value of $\frac{N \times (N + 1)}{2}$ for various values of N:
    N        N x (N + 1) / 2
    10       55
    100      5050
    1000     500,500
    10000    50,005,000
The growth pattern in the right column is similar to that of the measured running time of the selection sort algorithm. As the value of N increases by a factor of 10, the value of $\frac{N \times (N + 1)}{2}$ increases by a factor of around 100, which is $10^2$. Algorithms whose running times increase in proportion to the square of the problem size are said to be quadratic.

22 Finding a More Efficient Strategy As long as arrays are small, selection sort is a perfectly workable strategy. Even for 10,000 elements, the average running time of selection sort is just over a second. The quadratic behavior of selection sort, however, makes it less attractive for the very large arrays that one encounters in commercial applications. Assuming that the quadratic growth pattern continues beyond the timings reported in the table, sorting 100,000 values would require two minutes, and sorting 1,000,000 values would take more than three hours. As it turns out, there are sorting strategies that are vastly more efficient for large arrays than selection sort. Most of these algorithms, unfortunately, use programming techniques beyond the scope of this text. The next few slides, however, resurrect a sorting algorithm that was popular in the early days of computing to show that massive improvements in efficiency are possible.

23 Sorting Punched Cards From the 1890 census onward, information was often stored on punched cards like the one shown at the right, in which the number 236 has been punched in the first three columns.
[Figure: an 80-column punched card; each column offers the digits 0 through 9.]
The IBM 083 Sorter Computer companies built machines to sort stacks of punched cards, such as the IBM 083 sorter on the left. The stack of cards was loaded in a large hopper at the right end of the machine, and the cards would then be distributed into the various bins on the front of the sorter according to what value was punched in a particular column.

24 The Radix Sort Algorithm The IBM 083 sorter was a significant commercial success because it made it possible to sort large sets of punched cards quickly and efficiently. The algorithm used by the IBM 083 is called radix sort and consists of the following steps:
1. Set the machine so that it sorts on the last digit of the number.
2. Put the entire stack of cards in the hopper.
3. Run the machine so that the cards are distributed into the bins.
4. Put the cards from the bins back in the hopper, making sure that the cards from the 0 bin are on the bottom, the cards from the 1 bin come on top of those, and so on.
5. Reset the machine so that it sorts on the preceding digit.
6. Repeat steps 3 through 5 until all the digits are processed.
The next slide illustrates this process for a set of three-digit numbers.
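The card-sorting procedure above translates fairly directly into code. The following is a minimal sketch rather than the implementation from the text: the class RadixSortSketch and the method radixSort are hypothetical names, the keys are assumed to be non-negative integers with at most the given number of decimal digits, and each bin is modeled as a list so that cards keep their relative order within a bin.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RadixSortSketch {

    /** Sorts non-negative integers that have at most `digits` decimal digits. */
    static void radixSort(int[] cards, int digits) {
        int divisor = 1;                              // selects the last digit first
        for (int pass = 0; pass < digits; pass++) {
            // Ten empty bins, one for each digit value 0 through 9.
            List<List<Integer>> bins = new ArrayList<List<Integer>>();
            for (int b = 0; b < 10; b++) bins.add(new ArrayList<Integer>());

            // Distribute the cards into bins according to the current digit.
            for (int card : cards) {
                bins.get((card / divisor) % 10).add(card);
            }

            // Refill the "hopper" by emptying the bins in order, 0 bin first.
            int next = 0;
            for (List<Integer> bin : bins) {
                for (int card : bin) cards[next++] = card;
            }
            divisor *= 10;                            // move to the preceding digit
        }
    }

    public static void main(String[] args) {
        int[] cards = { 809, 503, 946, 367, 987, 838, 259, 236, 659, 361 };
        radixSort(cards, 3);
        System.out.println(Arrays.toString(cards));
    }
}
```

The method depends on each pass preserving the order established by the previous one, which is why the bins are emptied in order and cards within a bin are never rearranged.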

25 Simulating Radix Sort
Cards to sort: 809, 503, 946, 367, 987, 838, 259, 236, 659, 361. The bins are labeled 0 through 9.
Step 1a. Sort the cards using the last digit.
Step 1b. Refill the hopper by emptying the bins in order. Note that the list is now sorted by the last digit.
Step 2a. Sort the cards using the middle digit.
Step 2b. Again refill the hopper by emptying the bins. The list is now sorted by the last two digits.
Step 3a. Sort the cards using the first digit.
Step 3b. Refill the hopper one last time. The list is now completely sorted.

26 Efficiency of Radix Sort The advantage of radix sort over selection sort is that the time required to sort a stack of cards no longer grows in proportion to the square of the number of cards. The running time is instead proportional to N x D, where N is the number of cards and D is the number of digits in the numeric field on which the sort is performed. To take account of the fact that larger data sets tend to use longer sort keys, analyses of sorting algorithms assume that the range of the values being sorted is proportional to the number of values. Because the number of digits in a number N is proportional to the logarithm of N, computer scientists say that the performance of radix sort is proportional to N log N.

27 Assessing Algorithmic Efficiency The discussion of the efficiency of the various searching and sorting algorithms illustrates a fundamental computer science technique called algorithmic analysis. The primary purpose of algorithmic analysis is to make qualitative assessments about the efficiency of an algorithm rather than to offer precise quantitative predictions about running time. The running time of a program depends, for example, on the speed of the hardware and the extent to which the compiler is able to optimize the code. The qualitative performance of the algorithm itself is largely independent of such considerations. One of the most important problems in algorithmic analysis is deducing the computational complexity of an algorithm, which is the relationship between the size of the problem and the expected running time.

28 Big-O Notation The most common way to express computational complexity is to use big-O notation, which was introduced by the German mathematician Paul Bachmann in 1892. Big-O notation consists of the letter O followed by a formula that offers a qualitative assessment of running time as a function of the problem size, traditionally denoted as N. For example, the computational complexity of linear search is O(N) and the computational complexity of radix sort is O(N log N). If you read these formulas aloud, you would pronounce them as “big-O of N” and “big-O of N log N” respectively.

29 Common Simplifications of Big-O Given that big-O notation is designed to provide a qualitative assessment, it is important to make the formula inside the parentheses as simple as possible. When you write a big-O expression, you should always make the following simplifications:
1. Eliminate any term whose contribution to the running time ceases to be significant as N becomes large.
2. Eliminate any constant factors.
The computational complexity of selection sort is therefore $O(N^2)$ and not $O\left(\frac{N \times (N + 1)}{2}\right)$.
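Applied to the selection sort formula, the two simplifications can be worked through one at a time. The derivation below is a sketch of that reasoning in LaTeX; the arrows simply mark the order in which the rules are applied.

```latex
\frac{N \times (N + 1)}{2}
  \;=\; \tfrac{1}{2} N^2 + \tfrac{1}{2} N
  \;\xrightarrow{\text{rule 1: drop } \frac{1}{2} N}\; \tfrac{1}{2} N^2
  \;\xrightarrow{\text{rule 2: drop } \frac{1}{2}}\; N^2
```

The term $\tfrac{1}{2} N$ is eliminated first because it becomes insignificant next to $\tfrac{1}{2} N^2$ as N grows; for N = 10000, for example, it contributes 5,000 out of 50,005,000. Removing the constant factor then leaves $N^2$.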

30 Deducing Complexity from the Code In many cases, you can deduce the computational complexity of a program directly from the structure of the code. The standard approach to doing this type of analysis begins with looking for any section of code that is executed more often than other parts of the program. As long as the individual operations involved in an algorithm take roughly the same amount of time, the operations that are executed most often will come to dominate the overall running time. In the selection sort implementation, for example, the most commonly executed statement is the if statement inside the findSmallest method. This statement is part of two for loops, one in findSmallest itself and one in sort. The total number of executions is 1 + 2 + 3 + ... + (N - 1) + N, which is $O(N^2)$.

31 Exercise: Computational Complexity Assuming that none of the steps in the body of the following for loops depend on the problem size stored in the variable n, what is the computational complexity of each of the following examples?
a)  for (int i = 0; i < n; i++) {
        for (int j = 0; j < i; j++) {
            ... loop body ...
        }
    }
    Answer: $O(N^2)$. This loop follows the pattern from selection sort.
b)  for (int k = 1; k <= n; k *= 2) {
        ... loop body ...
    }
    Answer: $O(\log N)$. This loop follows the pattern from binary search.
c)  for (int i = 0; i < 100; i++) {
        for (int j = 0; j < i; j++) {
            ... loop body ...
        }
    }
    Answer: $O(1)$. This loop does not depend on the value of n at all.
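One way to build confidence in these answers is to count how many times each loop body runs for several values of n. The sketch below is illustrative only: the class LoopCounts and the methods countA, countB, and countC are hypothetical names, and loop (b) uses a long counter instead of the int in the exercise so that doubling cannot overflow for large n.

```java
class LoopCounts {

    /** Loop (a): the body runs 0 + 1 + ... + (n - 1) times, which grows as N squared. */
    static long countA(int n) {
        long count = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < i; j++) {
                count++;
            }
        }
        return count;
    }

    /** Loop (b): k doubles on each cycle, so the body runs about log2 N times. */
    static long countB(int n) {
        long count = 0;
        for (long k = 1; k <= n; k *= 2) {
            count++;
        }
        return count;
    }

    /** Loop (c): the bounds are fixed at 100, so the count never changes with n. */
    static long countC(int n) {
        long count = 0;
        for (int i = 0; i < 100; i++) {
            for (int j = 0; j < i; j++) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int[] sizes = { 10, 100, 1000 };
        for (int n : sizes) {
            System.out.println("n = " + n
                + ": a) " + countA(n)
                + "  b) " + countB(n)
                + "  c) " + countC(n));
        }
    }
}
```

Each ten-fold increase in n multiplies the count for (a) by roughly 100, adds only three or four to the count for (b), and leaves the count for (c) unchanged.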

32 Using Data Files Applications that involve searching and sorting typically work with data collections that are too large to enter by hand. For this reason, section 12.4 includes a brief discussion of data files, which make it possible to work with stored data. A file is the generic name for any named collection of data maintained on the various types of permanent storage media attached to a computer. In most cases, a file is stored on a hard disk, but it can also be stored on a removable medium, such as a CD or flash memory drive. Files can contain information of many different types. When you compile a Java program, for example, the compiler stores its output in a set of class files, each of which contains the binary data associated with a class. The most common type of file, however, is a text file, which contains character data of the sort you find in a string.

33 Text Files vs. Strings Although text files and strings both contain character data, it is important to keep in mind the following differences between them:
1. The information stored in a file is permanent. The value of a string variable persists only as long as the variable does. Local variables disappear when the method returns, and instance variables disappear when the object goes away, which typically does not occur until the program exits. Information stored in a file exists until the file is deleted.
2. Files are usually read sequentially. When you read data from a file, you usually start at the beginning and read the characters in order, either individually or in groups that are most commonly individual lines. Once you have read one set of characters, you then move on to the next set of characters until you reach the end of the file.

34 Reading Text Files When you want to read data from a text file as part of a Java program, you need to take the following steps:
1. Construct a new BufferedReader object that is tied to the data in the file. This phase of the process is called opening the file.
2. Call the methods provided by the BufferedReader class to read data from the file in sequential order.
3. Break the association between the reader and the file by calling the reader’s close method, a step called closing the file.
Java’s BufferedReader class makes it possible to read data from files in several different ways.
– You can read individual characters by calling the read method.
– You can read complete lines by calling the readLine method.
– You can read individual tokens by using the Scanner class.
Each of these strategies is described in a subsequent slide.

35 Standard Reader Subclasses The java.io package defines several different subclasses of the generic Reader class that are useful in different contexts. To read text files, you need to use the following subclasses:
– The FileReader class, which allows you to create a simple reader by supplying the name of the file.
– The BufferedReader class, which makes all operations more efficient and enables the strategy of reading individual lines.
The standard idiom for opening a text file is to call the constructors for each of these classes in a single statement:
    BufferedReader rd = new BufferedReader(new FileReader(filename));
The FileReader constructor takes the file name and creates a simple reader. That reader is then passed along to the BufferedReader constructor.

36 Reading Characters from a File Once you have created the BufferedReader object as shown on the preceding slide, you can then read individual characters from the file by calling the read method. The read method conceptually returns the next character in a file, although it does so as an int. If there are any characters remaining in the file, calling read returns the Unicode value of the next one. If all the characters have been exhausted, read returns -1. The following code fragment, for example, counts the number of letters in a BufferedReader named rd:
    int nLetters = 0;
    while (true) {
        int ch = rd.read();
        if (ch == -1) break;
        if (Character.isLetter(ch)) nLetters++;
    }

37 Reading Lines from a File You can also read entire lines from a text file by calling the readLine method, which returns the next line of data from the file as a string after discarding any end-of-line characters. If no lines remain in the file, readLine returns null. The following code fragment uses the readLine method to determine the length of the longest line in the reader rd:
    int maxLength = 0;
    while (true) {
        String line = rd.readLine();
        if (line == null) break;
        maxLength = Math.max(maxLength, line.length());
    }
Using the readLine method makes programs more portable because it eliminates the need to think about the end-of-line characters, which differ from system to system.

38 Exception Handling Unfortunately, the process of reading data from a file is not quite as simple as the previous slides suggest. When you work with the classes in the java.io package, you must ordinarily indicate what happens if an operation fails. In the case of opening a file, for example, you need to specify what the program should do if the requested file does not exist. Java’s library classes often respond to such conditions by throwing an exception, which is one of the strategies Java methods can use to report an unexpected condition. If the FileReader constructor, for example, cannot find the requested file, it throws an IOException to signal that fact. When Java throws an exception, it stops whatever it is doing and looks back through its execution history to see if any method has indicated an interest in “catching” that exception by including a try statement as described on the next slide.

39 The try Statement Java uses the try statement to indicate an interest in catching an exception. In its simplest form, the try statement syntax is
    try {
        code in which an exception might occur
    } catch (type identifier) {
        code to respond to the exception
    }
where type is the name of some exception class and identifier is the name of a variable used to hold the exception itself. The range of statements in which the exception can be caught includes not only the statements explicitly enclosed in the try body but also any methods those statements call. If the exception occurs inside some other method, any subsequent stack frames are removed until control returns to the try statement itself.

40 The try Statement The design of the java.io package forces you to use try statements to catch any exceptions that might occur. For example, if you open a file without checking for exceptions, the Java compiler will report an error in the program. To take account of these conditions, you need to enclose calls to constructors and methods in the various java.io classes inside try statements that check for IOExceptions. The ReverseFile program on the next few slides illustrates the use of the try statement in two different contexts:
– Inside the openFileReader method, the program uses a try statement to detect whether the file exists. If it doesn’t, the catch clause prints a message to the user explaining the failure and then asks the user for a new file name.
– Inside the readLineArray method, the code uses a try statement to detect whether an I/O error has occurred.
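Before turning to the full program, the two uses of try described above can be seen in miniature. The sketch below is an illustration under stated assumptions and is not the ReverseFile code: the class TrySketch and the methods openOrNull and longestLineLength are hypothetical names, failures are reported through return values (null and -1) purely to keep the example short, and main reads from a StringReader as a stand-in for a FileReader so that it runs without any file on disk.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.StringReader;

class TrySketch {

    /** Opens the named file, or returns null if it cannot be opened. */
    static BufferedReader openOrNull(String filename) {
        try {
            return new BufferedReader(new FileReader(filename));
        } catch (IOException ex) {
            // A missing file is reported as a subclass of IOException.
            return null;
        }
    }

    /** Returns the length of the longest line, or -1 if an I/O error occurs. */
    static int longestLineLength(BufferedReader rd) {
        int maxLength = 0;
        try {
            while (true) {
                String line = rd.readLine();
                if (line == null) break;
                maxLength = Math.max(maxLength, line.length());
            }
            rd.close();
        } catch (IOException ex) {
            return -1;
        }
        return maxLength;
    }

    public static void main(String[] args) {
        // An in-memory reader stands in for a file here.
        BufferedReader rd = new BufferedReader(new StringReader("short\na longer line\nmid"));
        System.out.println("Longest line: " + longestLineLength(rd));
        System.out.println("Missing file opened? " + (openOrNull("/no/such/dir/missing.txt") != null));
    }
}
```

Without the try statements, neither method would compile, because the FileReader constructor, readLine, and close all declare exceptions that the compiler requires the caller to handle.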

41 The ReverseFile Program (page 1 of 3)

import acm.program.*;
import acm.util.*;
import java.io.*;
import java.util.*;

/** This program prints the lines from a file in reverse order */
public class ReverseFile extends ConsoleProgram {

   public void run() {
      println("This program reverses the lines in a file.");
      BufferedReader rd = openFileReader("Enter input file: ");
      String[] lines = readLineArray(rd);
      for (int i = lines.length - 1; i >= 0; i--) {
         println(lines[i]);
      }
   }

/*
 * Implementation note: The readLineArray method on the next slide
 * uses an ArrayList internally because doing so makes it possible
 * for the list of lines to grow dynamically. The code converts
 * the ArrayList to an array before returning it to the client.
 */

42 The ReverseFile Program (page 2 of 3)

/*
 * Reads all available lines from the specified reader and returns
 * an array containing those lines. This method closes the reader
 * at the end of the file.
 */
   private String[] readLineArray(BufferedReader rd) {
      ArrayList<String> lineList = new ArrayList<String>();
      try {
         while (true) {
            String line = rd.readLine();
            if (line == null) break;
            lineList.add(line);
         }
         rd.close();
      } catch (IOException ex) {
         throw new ErrorException(ex);
      }
      String[] result = new String[lineList.size()];
      for (int i = 0; i < result.length; i++) {
         result[i] = lineList.get(i);
      }
      return result;
   }

43 The ReverseFile Program (page 3 of 3)

/*
 * Requests the name of an input file from the user and then opens
 * that file to obtain a BufferedReader. If the file does not
 * exist, the user is given a chance to reenter the file name.
 */
   private BufferedReader openFileReader(String prompt) {
      BufferedReader rd = null;
      while (rd == null) {
         try {
            String name = readLine(prompt);
            rd = new BufferedReader(new FileReader(name));
         } catch (IOException ex) {
            println("Can't open that file.");
         }
      }
      return rd;
   }
}
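
The ReverseFile program depends on the ACM libraries. For readers who want to try the same line-reading logic without them, here is a rough sketch that uses only the standard library. The class name ReverseLines and the method reversed are assumptions made for illustration, and UncheckedIOException stands in for the ErrorException used in the text.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;

class ReverseLines {

    /* Reads every line from the source into an array and closes the reader at the end. */
    static String[] readLineArray(Reader source) {
        ArrayList<String> lineList = new ArrayList<String>();
        try {
            BufferedReader rd = new BufferedReader(source);
            while (true) {
                String line = rd.readLine();
                if (line == null) break;
                lineList.add(line);
            }
            rd.close();
        } catch (IOException ex) {
            // Stand-in for the ErrorException wrapper in the ACM version
            throw new UncheckedIOException(ex);
        }
        return lineList.toArray(new String[0]);
    }

    /* Joins the lines in reverse order, each followed by a newline. */
    static String reversed(String[] lines) {
        StringBuilder sb = new StringBuilder();
        for (int i = lines.length - 1; i >= 0; i--) {
            sb.append(lines[i]).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A StringReader replaces the file so the sketch runs without any input file
        String[] lines = readLineArray(new StringReader("one\ntwo\nthree"));
        System.out.print(reversed(lines));
    }
}
```

Accepting a Reader rather than a file name keeps the sketch easy to test: a FileReader can be passed in the same way as the StringReader shown in main.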

44 Selecting Files Interactively The Java libraries also make it possible to select an input file interactively using a dialog box. To do so, you need to use the JFileChooser class from the javax.swing package. The JFileChooser constructor is usually called like this: JFileChooser chooser = new JFileChooser(); You can then ask the user to select a file by calling int result = chooser.showOpenDialog(this); where this indicates the program issuing this call. The return value will be JFileChooser.APPROVE_OPTION or JFileChooser.CANCEL_OPTION depending on whether the user clicks the Open or Cancel button. The caller can obtain the selected file by calling chooser.getSelectedFile().
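
The calls on this slide can be combined roughly as follows. This is only a sketch: the class name ChooseFile and the helper selectedOrNull are hypothetical, the dialog is opened with a null parent instead of this because the sketch is not an ACM program, and the headless check is an added precaution for environments without a display.

```java
import java.awt.GraphicsEnvironment;
import java.io.File;
import javax.swing.JFileChooser;

class ChooseFile {

    /* Returns the selected file only if the user approved the dialog; otherwise null. */
    static File selectedOrNull(int result, File selected) {
        return (result == JFileChooser.APPROVE_OPTION) ? selected : null;
    }

    public static void main(String[] args) {
        if (GraphicsEnvironment.isHeadless()) {
            System.out.println("No display available for a dialog box.");
            return;
        }
        JFileChooser chooser = new JFileChooser();
        int result = chooser.showOpenDialog(null);   // null parent: no enclosing window
        File file = selectedOrNull(result, chooser.getSelectedFile());
        if (file == null) {
            System.out.println("No file selected.");
        } else {
            System.out.println("Selected: " + file.getPath());
        }
    }
}
```

Checking for APPROVE_OPTION rather than testing against CANCEL_OPTION means that any other return value is treated as "no file selected."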

45 Using the Scanner Class Although it is not used in the examples in the text, it is also possible to divide a text file into individual tokens by using the Scanner class from the java.util package. The most useful methods in the Scanner class are shown in the following table:

   new Scanner(reader)   Creates a new Scanner object from the reader.
   next()                Returns the next whitespace-delimited token as a string.
   nextInt()             Reads the next integer and returns it as an int.
   nextDouble()          Reads the next number and returns it as a double.
   nextBoolean()         Reads the next Boolean value (true or false).
   hasNext()             Returns true if the scanner has any more tokens.
   hasNextInt()          Returns true if the next token scans as an integer.
   hasNextDouble()       Returns true if the next token scans as a number.
   hasNextBoolean()      Returns true if the next token is either true or false.
   close()               Closes the scanner and the underlying reader.
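
A short sketch of how several of these methods fit together: the method below adds up the integer tokens in a piece of text and skips everything else. The class name ScannerDemo, the method sumIntegers, and the sample text are assumptions for illustration. A StringReader is used in place of a file reader so the example runs on its own.

```java
import java.io.StringReader;
import java.util.Scanner;

class ScannerDemo {

    /* Sums the integer tokens in the text, skipping any token that is not an integer. */
    static int sumIntegers(String text) {
        Scanner scanner = new Scanner(new StringReader(text));
        int total = 0;
        while (scanner.hasNext()) {
            if (scanner.hasNextInt()) {
                total += scanner.nextInt();
            } else {
                scanner.next();   // discard a non-integer token
            }
        }
        scanner.close();
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumIntegers("10 apples 20 pears 12"));
    }
}
```

Testing with hasNextInt before calling nextInt avoids the exception that nextInt would otherwise throw on a token such as "apples".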

46 Using Files for Output The java.io package also makes it possible to create new text files by using the appropriate subclasses of the generic Writer class. When you write data to a file, the most common approach is to create a PrintWriter like this: PrintWriter wr = new PrintWriter(new FileWriter(filename)); As in the reader example, this nested constructor first creates a FileWriter for the specified file name and then passes the result along to the PrintWriter constructor. Once you have a PrintWriter object, you can write data to that writer using the println and print methods you have used all along with console programs.
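
The following sketch shows one way the PrintWriter pattern might be wrapped in a method. The class name WriteDemo, the method writeLines, and the file name writedemo.txt are illustrative assumptions. Because the FileWriter constructor can throw an IOException, the call sits inside a try statement; PrintWriter itself does not throw on write errors, so the sketch asks checkError to report them.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

class WriteDemo {

    /* Writes each string on its own line of the named file; returns true on success. */
    static boolean writeLines(String filename, String[] lines) {
        try {
            PrintWriter wr = new PrintWriter(new FileWriter(filename));
            for (int i = 0; i < lines.length; i++) {
                wr.println(lines[i]);
            }
            boolean failed = wr.checkError();   // flushes, then reports any write error
            wr.close();
            return !failed;
        } catch (IOException ex) {
            return false;   // the file could not be opened for writing
        }
    }

    public static void main(String[] args) {
        // Hypothetical output file placed in the system's temporary directory
        String name = new File(System.getProperty("java.io.tmpdir"), "writedemo.txt").getPath();
        String[] lines = { "alpha", "beta" };
        System.out.println(writeLines(name, lines) ? "Wrote " + name : "Can't write " + name);
    }
}
```

Closing the writer matters here: output is buffered, so text that has not been flushed or closed may never reach the file.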

47 The End

