Tutorial 22
Screen Scraping Application
Introducing String Processing
© 2009 Pearson Education, Inc. All rights reserved.

Outline
22.1 Test-Driving the Screen Scraping Application
22.2 Fundamentals of Strings
22.3 Analyzing the Screen Scraping Application
22.4 Locating Substrings in Strings
22.5 Extracting Substrings from Strings
22.6 Replacing Substrings in Strings
22.7 Other String Methods

Objectives
In this tutorial you will learn to:
■ Manipulate String objects.
■ Use properties and methods of class String.
■ Search for substrings within Strings.
■ Extract substrings within Strings.
■ Replace substrings within Strings.

Introduction
■ HTML (HyperText Markup Language) is a technology for describing web pages.
■ Extracting desired information from HTML is called screen scraping.

22.1 Test-Driving the Screen Scraping Application
Application Requirements
An online European auction house wants to expand its business to include bidders from the United States. However, all of the auction house’s web pages currently display their prices in euros, not dollars. The auction house wants to generate separate web pages for American bidders that display the prices of auction items in dollars. These new web pages will be generated by using screen-scraping techniques on the existing web pages.

22.1 Test-Driving the Screen Scraping Application (Cont.)
Application Requirements
You have been asked to build a prototype application that tests the screen-scraping functionality. The application must search a sample string of HTML and extract information about the price of a specified auction item. For testing purposes, a ComboBox should be provided that contains the auction items listed in the HTML. The selected item’s amount must then be converted to dollars. Assume the exchange rate is one euro to 1.58 dollars (that is, one euro is equivalent to $1.58).

Test-Driving the Screen Scraping Application
■ Run the completed application (Fig. 22.1).
Figure 22.1 | Screen Scraping application’s Form. Callout: Label containing HTML.

Test-Driving the Screen Scraping Application (Cont.)
■ Select an item name from the ComboBox, as shown in Fig. 22.2.
Figure 22.2 | Selecting an item name from the ComboBox. Callout: ComboBox’s drop-down list.

Test-Driving the Screen Scraping Application (Cont.)
■ Click the Search Button to display the price for the selected item (Fig. 22.3).
Figure 22.3 | Searching for the item’s price. Callouts: Extracted price (converted to dollars); Price located in HTML string (specified in euros).

22.2 Fundamentals of Strings
■ A string is a series of characters treated as a single unit, for example "This is a string!"
■ These characters can be uppercase letters, lowercase letters, digits and various special characters, such as +, -, *, /, $ and others.
■ String property Length returns the length of the String.

22.2 Fundamentals of Strings (Cont.)
■ String property Chars returns the character located at a specific index in a String: string1.Chars(0)
■ Any String method or operator that appears to modify a String actually returns a new String that contains the results.
■ Strings are immutable objects; that is, characters in Strings cannot be changed after the Strings are created.
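The properties and the immutability rule above can be tried in a small console program. This is only a sketch: the module name StringBasics and the use of ToUpperInvariant as the "modifying" method are assumptions chosen for illustration, not code from the tutorial.

```vb
Imports System

Module StringBasics
    Sub Main()
        Dim string1 As String = "This is a string!"

        ' Length returns the number of characters in the String.
        Console.WriteLine(string1.Length)   ' 17

        ' Chars returns the character at a zero-based index.
        Console.WriteLine(string1.Chars(0)) ' T

        ' ToUpperInvariant does not change string1; it returns a new String.
        Dim upper As String = string1.ToUpperInvariant()
        Console.WriteLine(string1)          ' This is a string!
        Console.WriteLine(upper)            ' THIS IS A STRING!
    End Sub
End Module
```

Because the original String is never altered, the result of such a method must be assigned to a variable if it is needed later.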

22.2 Fundamentals of Strings (Cont.)
■ Figure 22.4 lists several String methods.
Figure 22.4 | String methods introduced in earlier tutorials.

22.3 Analyzing the Screen Scraping Application
When the Form loads:
    Display the HTML that contains the items’ prices in a Label
When the user clicks the Search Button:
    Search the HTML for the item the user selected from the ComboBox
    Extract the item’s price
    Convert the item’s price from euros to dollars
    Display the item’s price in a Label

Action/Control/Event (ACE) Table for the Screen Scraping Application
■ Use an ACE table to convert pseudocode into Visual Basic (Fig. 22.5).
Figure 22.5 | ACE table for Screen Scraping application.

Locating the Selected Item’s Price
■ Double-click the Search Button on the template application’s Form to generate an event handler (Fig. 22.6).
Figure 22.6 | searchButton_Click event handler.

Locating the Selected Item’s Price (Cont.)
■ Add lines 17–21 of Fig. 22.7 to the searchButton_Click event handler.
Figure 22.7 | searchButton_Click event-handler declarations.

Locating the Selected Item’s Price (Cont.)
■ String method IndexOf (Fig. 22.8) locates the first occurrence of the specified item in the HTML string.
– If IndexOf finds the specified item name, it returns the index at which the substring begins in the String.
– If IndexOf does not find the substring, it returns -1.
Figure 22.8 | Locating the desired item name. Callout: Search for the SelectedItem in the String html.
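The two outcomes of IndexOf can be seen in a short console sketch. The sample markup, the item names and the helper FindItem are hypothetical; the tutorial's actual HTML string is not reproduced in this transcript.

```vb
Imports System

Module LocateItem
    ' Returns the index at which itemName begins in html, or -1 if absent.
    Function FindItem(html As String, itemName As String) As Integer
        Return html.IndexOf(itemName)
    End Function

    Sub Main()
        ' Hypothetical sample markup, not the tutorial's actual HTML.
        Dim html As String = "<TR><TD>Antique Rocking Chair</TD></TR>"

        ' "<TR><TD>" occupies indices 0-7, so the name begins at index 8.
        Console.WriteLine(FindItem(html, "Antique Rocking Chair")) ' 8

        ' An item that does not appear yields -1.
        Console.WriteLine(FindItem(html, "Silver Teapot"))         ' -1
    End Sub
End Module
```

Checking for -1 before using the result avoids passing an invalid index to later calls.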

Locating the Selected Item’s Price (Cont.)
■ This version of method IndexOf (Fig. 22.9) takes two arguments: the substring to find and the index in the String at which to begin searching.
■ In this case, the substring to find (indicating the beginning of the price) is "&euro;".
Figure 22.9 | Locating the desired item price. Callout: Locate the beginning of the price in html.

Locating the Selected Item’s Price (Cont.)
■ A tag directly follows every price in the HTML string, so the index of the first tag after priceBegin marks the end of the current price (Fig. 22.10).
Figure 22.10 | Locating the end of the item’s price. Callout: Locate the end of the price in html.
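The two-argument searches for the start and end of a price might look like the following sketch. It assumes the price marker is the HTML entity "&euro;" and that the tag following each price is "</TD>". The tag name does not survive in this transcript, so it and the sample markup are assumptions.

```vb
Imports System

Module LocatePrice
    Sub Main()
        ' Hypothetical sample markup; the "</TD>" end tag is an assumption.
        Dim html As String = "<TD>Chair</TD><TD>&euro;82.67</TD>" & _
                             "<TD>Teapot</TD><TD>&euro;64.55</TD>"

        Dim itemLocation As Integer = html.IndexOf("Teapot")

        ' Begin searching at itemLocation so earlier prices are skipped.
        Dim priceBegin As Integer = html.IndexOf("&euro;", itemLocation)

        ' The first end tag at or after priceBegin marks the end of the price.
        Dim priceEnd As Integer = html.IndexOf("</TD>", priceBegin)

        Console.WriteLine(itemLocation) ' 38
        Console.WriteLine(priceBegin)   ' 53
        Console.WriteLine(priceEnd)     ' 64
    End Sub
End Module
```

Starting the second search at itemLocation is what ties the price to the selected item; a one-argument search would always find the first price in the string.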

22.4 Locating Substrings in Strings
■ Method LastIndexOf locates the last occurrence of a substring in a String.
■ If method LastIndexOf finds the substring, it returns the starting index of the specified substring in the String; otherwise, LastIndexOf returns -1.
■ Figure 22.11 shows examples of the three versions of LastIndexOf.
Figure 22.11 | LastIndexOf examples.
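Since the figure's examples are not included in this transcript, here is an illustrative sketch. It assumes the three versions are the overloads that take a substring, a substring plus a starting index, and a substring plus a starting index and a character count; the sample string is made up.

```vb
Imports System

Module LastIndexOfDemo
    Sub Main()
        Dim letters As String = "abcabcabc"

        ' Last occurrence anywhere in the String.
        Console.WriteLine(letters.LastIndexOf("abc"))       ' 6

        ' Search backward starting at index 5.
        Console.WriteLine(letters.LastIndexOf("abc", 5))    ' 3

        ' Search backward from index 8, examining only 3 characters.
        Console.WriteLine(letters.LastIndexOf("abc", 8, 3)) ' 6

        ' No occurrence: the result is -1.
        Console.WriteLine(letters.LastIndexOf("xyz"))       ' -1
    End Sub
End Module
```

Note that the starting index in the second and third versions is where the backward search begins, not a lower bound on the result.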

Retrieving the Desired Item’s Price
■ The price is extracted from html with String method Substring, which takes two arguments (Fig. 22.12).
■ The first argument (priceBegin) specifies the starting index.
■ The second argument (priceEnd - priceBegin) specifies the length of the substring to be copied.
Figure 22.12 | Retrieving the desired price. Callout: Extract price from html.
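A minimal sketch of the extraction step follows. It assumes the price begins with the HTML entity "&euro;" and ends just before a "</TD>" tag; the markup is hypothetical.

```vb
Imports System

Module ExtractPrice
    Sub Main()
        ' Hypothetical sample markup; the "</TD>" end tag is an assumption.
        Dim html As String = "<TD>&euro;82.67</TD>"

        Dim priceBegin As Integer = html.IndexOf("&euro;")          ' 4
        Dim priceEnd As Integer = html.IndexOf("</TD>", priceBegin) ' 15

        ' Copy (priceEnd - priceBegin) characters starting at priceBegin.
        Dim price As String = html.Substring(priceBegin, priceEnd - priceBegin)

        Console.WriteLine(price) ' &euro;82.67
    End Sub
End Module
```

The second argument is a length, not an end index, which is why the difference priceEnd - priceBegin is passed rather than priceEnd itself.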

Converting the Price to Dollars
■ String method Replace (Fig. 22.13) is used to return a new String object in which every occurrence of substring "&euro;" is replaced with the empty String.
■ String method Format formats the price as currency for display in resultLabel.
Figure 22.13 | Converting the price to dollars. Callout: Replace "&euro;" with "" and convert the amount to dollars.
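The conversion can be sketched as a small function. The helper name ToDollarText and the explicit cultures are assumptions made so that the example behaves the same on any machine; the tutorial's own code is not shown in this transcript and may rely on the current culture instead. The 1.58 rate comes from the application requirements.

```vb
Imports System
Imports System.Globalization

Module ConvertPrice
    ' Exchange rate from the application requirements: 1 euro = 1.58 dollars.
    Const DollarsPerEuro As Decimal = 1.58D

    Function ToDollarText(price As String) As String
        ' Replace returns a new String; price itself is unchanged.
        Dim amount As String = price.Replace("&euro;", String.Empty)

        ' Parse with the invariant culture so "." is the decimal separator.
        Dim euros As Decimal = Decimal.Parse(amount, CultureInfo.InvariantCulture)
        Dim dollars As Decimal = euros * DollarsPerEuro

        ' "C" formats a number as currency; en-US is named explicitly here.
        Return String.Format(CultureInfo.GetCultureInfo("en-US"), "{0:C}", dollars)
    End Function

    Sub Main()
        Console.WriteLine(ToDollarText("&euro;100.00")) ' $158.00
    End Sub
End Module
```

In the Windows Forms application the returned text would be assigned to resultLabel's Text property rather than written to the console.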

Displaying the HTML String
■ Double-click the Form to generate an empty Load event handler (Fig. 22.14).
Figure 22.14 | Load event for the Form.

Displaying the HTML String (Cont.)
■ String method Replace (Fig. 22.15) replaces every occurrence of "&euro;" with "&&euro;".
– For the text to display correctly in a Label, you must prefix each ampersand with an additional ampersand.
Figure 22.15 | Displaying the HTML string in a Label. Callout: Replace all occurrences of "&euro;" with "&&euro;".
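The extra ampersand is needed because a Windows Forms Label treats a single "&" as a mnemonic prefix by default, while "&&" is displayed as one literal "&". The replacement itself is plain string processing and can be sketched without a Form; the helper name EscapeForLabel and the sample markup are hypothetical.

```vb
Imports System

Module LabelText
    ' Doubles the ampersand in each "&euro;" so a Label shows it literally.
    Function EscapeForLabel(html As String) As String
        Return html.Replace("&euro;", "&&euro;")
    End Function

    Sub Main()
        Console.WriteLine(EscapeForLabel("<TD>&euro;82.67</TD>"))
        ' <TD>&&euro;82.67</TD>
    End Sub
End Module
```

This handles only the "&euro;" entity, as the slide describes; any other ampersand in the HTML would need the same treatment.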

22.7 Other String Methods
■ Figure 22.16 lists some of the methods for manipulating Strings.
Figure 22.16 | Description of some other String methods. (Part 1 of 2.)

22.7 Other String Methods (Cont.)
Figure 22.16 | Description of some other String methods. (Part 2 of 2.)

Outline (1 of 4)
■ Figure 22.17 presents the source code for the Screen Scraping application.

Outline (2 of 4)
Callout: Search for the SelectedItem in the String html.

Outline (3 of 4)
Callouts: Locate the beginning of the price in html; Locate the end of the price in html; Extract the price from html; Replace "&euro;" with the empty String.

Outline (4 of 4)
Callout: Replace "&euro;" with "&&euro;".
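The listing in Fig. 22.17 is not part of this transcript. The console sketch below strings together the steps named in the callouts so the overall flow can be followed. It is not the book's code: the function name PriceInDollars, the sample markup, the "</TD>" end tag, the -1 guards and the explicit cultures are all assumptions.

```vb
Imports System
Imports System.Globalization

Module ScreenScrapingSketch
    Function PriceInDollars(html As String, itemName As String) As String
        ' Search for the selected item in the String html.
        Dim itemLocation As Integer = html.IndexOf(itemName)
        If itemLocation = -1 Then Return "Item not found"

        ' Locate the beginning and the end of the price.
        Dim priceBegin As Integer = html.IndexOf("&euro;", itemLocation)
        If priceBegin = -1 Then Return "Price not found"

        Dim priceEnd As Integer = html.IndexOf("</TD>", priceBegin)
        If priceEnd = -1 Then Return "Price not found"

        ' Extract the price, then remove the "&euro;" marker.
        Dim price As String = html.Substring(priceBegin, priceEnd - priceBegin)
        price = price.Replace("&euro;", String.Empty)

        ' Convert euros to dollars at the stated rate of 1.58.
        Dim dollars As Decimal = _
            Decimal.Parse(price, CultureInfo.InvariantCulture) * 1.58D
        Return String.Format(CultureInfo.GetCultureInfo("en-US"), "{0:C}", dollars)
    End Function

    Sub Main()
        ' Hypothetical sample markup, not the tutorial's actual HTML.
        Dim html As String = "<TD>Chair</TD><TD>&euro;50.00</TD>" & _
                             "<TD>Teapot</TD><TD>&euro;100.00</TD>"

        Console.WriteLine(PriceInDollars(html, "Teapot")) ' $158.00
        Console.WriteLine(PriceInDollars(html, "Clock"))  ' Item not found
    End Sub
End Module
```

In the Form-based application, the item name would come from the ComboBox's SelectedItem and the result would be shown in resultLabel.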

