Presentation is loading. Please wait.

Presentation is loading. Please wait.

Topic 10 - Strings.

Similar presentations


Presentation on theme: "Topic 10 - Strings."— Presentation transcript:

1 Topic 10 - Strings

2 Characters and Strings
Thus far, we have not used the char data type too much, as most applications deal with a group of characters, a data structure known as a string. The C programming language implements the string as an array of characters (chars). CISC105 – Topic 10

3 String Declarations A string, as it is simply an array of characters, is declared as: char string1[20]; This declares a string named string1 which can hold strings from length 0 to length 19. Notice that this fact means we “lose,” or cannot use one of the 20 characters declared. We will see why this is shortly. CISC105 – Topic 10

4 String Initialization
Strings can be initialized using a string constant, in this form: char string1[20] = “Hello there”; So, what does this string look like in memory? H e l l o t h e r e \0 ? ? ? ? ? ? ? ? CISC105 – Topic 10

5 Strings in Memory H e l l o t h e r e \0 ? ? ? ? ? ? ? ? As can be seen, the first characters (array elements) are the initialization string. Then comes a \0 character. This is the null character. This is a special character that means “end of string”. The array elements that follow the null character are not initialized, therefore their contents are indeterminate and unknown. CISC105 – Topic 10

6 Strings in Memory All strings must end with a null character. When a string is enclosed in double quotes, i.e. “Hello there”, a null character is automatically inserted. The double quotes mean “the string is whatever is inside the quotes followed by a null character.” CISC105 – Topic 10

7 Strings in Memory The requirement of this null character to indicate the end of the string shows why a string declaration of char string1[20]; can only hold strings of length 0 to 19. The one “lost” character of the twenty allocated for string1 is needed for the null character. CISC105 – Topic 10

8 String Input/Output The standard C input and output functions, printf and scanf can directly work with strings using the placeholder “%s”. For example, what is the output of the following program fragment? Hello there. char string1[20] = “Hello there.”; printf(“%s”,string1); CISC105 – Topic 10

9 String Input/Output The printf function, as well as other functions that work with strings, depend on the strings they receive as arguments to be terminated with the null character. Therefore, it is important that whenever a programmer-written function constructs an array, it inserts the null character at the end of the array. CISC105 – Topic 10

10 String Input/Output We can also use the “%s” placeholder with the scanf statement. For instance, to read a string from the keyboard and place it in string1: char string1[20]; scanf(“%s”,string1); CISC105 – Topic 10

11 String Input/Output char string1[20]; scanf(“%s”,string1); Notice that we do not use the “&” operator on string1. Remember that the name of an array is the memory address of its first element. Thus, since string1 is a memory address, it does not need the “address of” operator. CISC105 – Topic 10

12 String Input/Output Using the scanf function with the “%s” placeholder stores the user input string in the character array (string) corresponding to the placeholder. This input function does NOT check to make sure the string it read from the keyboard is short enough to fit in the specified character array. Thus, it is possible to overrun the character array using a scanf statement. Thus, make sure the array that you are scanfing into is large enough to handle all possible inputs. CISC105 – Topic 10

13 String Input/Output Notice that as the scanf function treats a space as the end of a data input, we cannot input strings with spaces in them using the “%s” placeholder. So…we need a way to read in a line of text into a character array (string). This function is the gets function. CISC105 – Topic 10

14 String Input/Output The function gets works as follows: gets(char []);
Here, we simply put the name of the character array we wish to store the user input string in as the only parameter passed to gets. This function performs like scanf in that it waits for the user to input some text and press <enter>. Once the user presses <enter>, the entire line is stored in the specified character array. CISC105 – Topic 10

15 String Input/Output char string1[20]; char string2[20]; gets(string1); Here, if we were to type in “Hello there” string1 would be set to “Hello there”. CISC105 – Topic 10

16 String Operations and Assignment
Previously, we have seen the assignment operator “=” used to store data in a variable. We can use the “=” operator during string initialization (setting an initial value when the string is declared). This is the ONLY time we can use the assignment operator to set a string. CISC105 – Topic 10

17 String Operations and Assignment
We have seen before that the name of an array is the memory address of the first element of that array (in this case, the first character in the string). Thus, any statement such as: char string1[20]; string1 = “Hello”; would attempt to set the address of the first element of the array, which cannot be changed by using such an assignment statement. CISC105 – Topic 10

18 String Operations and Assignment
C provides a number of routines to manipulate strings, through library functions. All of the string manipulation functions are defined in string.h. Each of the string manipulation functions takes, and returns, variables of type char *. Keep in mind that a string is represented by the memory address of its first element (as is always true with arrays). CISC105 – Topic 10

19 String Operations and Assignments
The first, and most basic, string manipulation library function is the string-copy function, strcpy(). This function takes two parameters, the destination string and the source string. Its prototype looks like: strcpy(char *dest, const char *source); CISC105 – Topic 10

20 The strcpy Function This function makes a copy of the source string in the destination string. This is a simple example of the strcpy() function: char string1[20]; strcpy(string1,”Hello there”); This code fragment sets string1 to “Hello there” (with the last ‘e’ followed by a null character, as indicated with the double quotes). CISC105 – Topic 10

21 The strcpy Function char string1[20]; string1 ? string1 ? ? ? ? ? ? ?
strcpy(string1,”Hello there”); string1 H e l l o t h e r e \0 ? ? ? ? ? ? ? ? CISC105 – Topic 10

22 The strcpy Function It is important to note that the strcpy function does not perform any bounds checking. Therefore, just as like using the “%s” placeholder with scanf, we can overflow the destination array. It is important to ensure the destination array is large enough to hold the string that the strcpy function will place in it. CISC105 – Topic 10

23 The strcpy Function char string1[20]; string1 ? string1 i ? ? ? ? ? ?
strcpy(string1,”This is a really long string”); string1 T h i s i s a r e a l l y l o n CISC105 – Topic 10

24 The strcpy Function When such an overflow occurs, the same problems that were discussed for array overstepping can occur. Most of the time, the array will overflow into unallocated memory. However, if the memory cells immediately following the array are allocated and then the strcpy function oversteps the array bounds, it can overwrite the variables which are allocated to these memory cells. CISC105 – Topic 10

25 The strncpy Function This problem can be resolved using the strncpy function. This function is a “counted” version of the strcpy function. It takes three arguments, the destination string, the source destination, and then the number of characters to copy. Its prototype looks like: strncpy( char *dest, const char *source, size_t size); size_t is an unsigned integer. Thus, it cannot be negative CISC105 – Topic 10

26 The strncpy Function Thus, in order to prevent copying past the bounds of the array in the previous example, we could write: char string1[20]; strncpy(string1, ”This is a really long string”, 20); This would only copy the first twenty characters (into index 0 to 19) and thus would not overwrite any memory past that allocated for the array. CISC105 – Topic 10

27 The strncpy Function string1 T h i s i s a r e a l l y l o n Notice that even though we no longer overwrite memory past that allocated for the string, the string is still not valid, as it does not end with a null character. CISC105 – Topic 10

28 The strncpy Function string1 T h i s i s a r e a l l y l o \0 char string1[20]; strncpy(string1, ”This is a really long string”, 20); string1[19] = ‘\0’; Thus, we need to set the last allocated string character to the null character manually. This can be done using a simple character assignment statement. CISC105 – Topic 10

29 Substrings H e l l o t h e r string e \0 ? ? ? ? ? ? ? ? Keep in mind that string evaluates to the address of the first character, or string = &string[0]. Using this, we can see how to use a substring. For example, to reference the substring “there”, we could simply use &string[6] instead of string. CISC105 – Topic 10

30 Substrings H e l l o t h e r string e \0 ? ? ? ? ? ? ? ? So, what would be the output of the following code fragment, assuming string as shown above? printf(“%s\n”,string); printf(“%s\n”,&string[0]); printf(“%s\n”,&string[6]); printf(“%s\n”,&string[2]); Hello there there llo there CISC105 – Topic 10

31 Substrings string H e l l o t h e r e J o h n \0 ? ? ?
So, we can combine this and the strncpy function to extract a substring. Therefore, to extract “there” from string and put it in string2, we could write: strncpy(string2,&string[6],5); string2[5] = ‘\0’; Note that we need to manually insert the null character into string2, as the data strncpy copies into string2 does not end with one. CISC105 – Topic 10

32 String Lengths Notice that the size of the character array is an upper limit on the size of the string it can contain. Thus, we need some way to find the size of the string contained in a character array (the actual number of characters present before the null character). CISC105 – Topic 10

33 String Lengths This can be determined using the strlen function. This function takes one parameter, a char * (or string name) and returns one value, the length in characters of the string, a size_t value, which is an unsigned integer. CISC105 – Topic 10

34 String Lengths What would be the output of the following program fragment? H e l l o t h e r string e J o h n \0 ? ? ? Size = 16 int string_size; string_size = strlen(string); printf(“Size = %d\n”,string_size); CISC105 – Topic 10

35 String Lengths What would be the output of the following program fragment? char string[40]; int string_size; strcpy(string, ”I think this is a nice string.”); string_size = strlen(string); printf(“Size = %d\n”,string_size); printf(“%s\n”,&string[18]); Size = 30 nice string. CISC105 – Topic 10

36 String Lengths Note that the strlen function determines the number of characters between the memory address provided (normally the string name, thus the address of the first character) and the first-encountered null character. If a string does not end in a null character, the function will not work correctly. CISC105 – Topic 10

37 String Lengths Additionally, if a string is overstepped (a longer string is stored than characters allocated), the string length, as reported by strlen, could be greater than the size of the character array declared to hold the string. Note that this is NOT valid. Strings should not overflow their array size; such an overflow is not correct programming and could result in incorrect program operation. CISC105 – Topic 10

38 String Concatenation Sometimes, a program may need to append one string onto another string. This is known as concatenation, or concatenating the arrays. This is accomplished using the C library functions strcat and strncat. The strncat function is simply a “counted” version of strcat, just as strncpy is a “counted” version of strcpy. CISC105 – Topic 10

39 String Concatenation This function, strcat, takes two parameters, the destination string and the source string. This function takes the second string and appends it to the first string, overwriting the null character at the end of the first string with the first character of the second string. The last character copied onto the end of the first string is the null character of the second string. CISC105 – Topic 10

40 String Concatenation \0 string1 string2 char string1[10] = “Hello”;
J i m string1 string2 J i m \0 char string1[10] = “Hello”; char string2[10] = “Jim”; strcat(string1,string2); CISC105 – Topic 10

41 String Concatenation Bear in mind that the strcat (and strncat) functions assume ample space in the destination array to hold what is copied into it. No bounds checking is present…so make sure the first array (the one that is being copied to) is big enough to hold the contents of both strings! CISC105 – Topic 10

42 String Comparison Frequently, we may wish to compare strings.
For integers, floating-point numbers, and other such data types, we can use simple comparison operators such as: int x=9, y=27; if (x < y) { … } if (y < x) { … } if (y == x) { … } CISC105 – Topic 10

43 String Comparison If we were to do this with strings…
char string1[20] = “Hello.”; char string2[20] = “A dog.”; if (string1 < string2) { … } This is a valid comparison; however it does not compare the strings “Hello.” and “A dog.” As the name of an array is the memory address of its first element, this comparison statement tests to see if the memory address string1 begins at is less than the memory address string2 begins at. CISC105 – Topic 10

44 String Comparison Thus, C provides the strcmp function to compare strings. This function takes two strings and returns an integer value: A negative integer if str1 is less than str2. Zero if str1 equals str2. A positive integer if str1 is greater than str2. So, when is a string less than, equal to, or greater than another string? CISC105 – Topic 10

45 String Comparison The simplest case is that of equality. If the two strings are the same length (same number of characters) and each character of one string is equal to the corresponding character in the other string, the strings are considered equal. CISC105 – Topic 10

46 String Comparison There are two rules for less than/greater than and strings: If the first n characters of str1 and str2 match and str1[n] and str2[n] are the first non-matching characters, str1 is less than str2 if str1[n] < str2[n]. If str1 is shorter in length than str2 and all of the characters of str1 match the corresponding characters of str2, str1 is less than str2. CISC105 – Topic 10

47 String Comparison So what would be the result of the following comparison function calls? strcmp(“hello”,“jackson”); strcmp(“red balloon”,“red car”); strcmp(“blue”,“blue”); strcmp(“aardvark”,“aardvarks”); CISC105 – Topic 10

48 String Comparison Notice that the strcmp function uses character comparisons to compare strings; the elements of the first string (characters) are compared with elements of the second string. As such, the rules for comparing char datatypes come into play. CISC105 – Topic 10

49 String Comparison Recalling, every character (symbol) is actually a number, as defined by the ASCII code. As such, these numbers are what is actually compared when characters are being compared (which occurs when strings are being compared). This is fairly transparent as “A” is less than “B” which is less than “C”, etc… However, it is important to realize that “Z” is less than “a”. CISC105 – Topic 10

50 String Comparison The uppercase letters, in the ASCII code, have lower codes than the lowercase letters. Thus, if we were to call: strcmp(“Zebra”,“aardvark”); This function would return a negative value as ‘Z’ is less then ‘a’, thus “Zebra” is less than “aardvark”. CISC105 – Topic 10

51 String Comparison Thus, it is important to keep in mind that in the ASCII code, which governs the values of characters and thus their comparisons, uppercase letters have smaller values than lowercase letters. CISC105 – Topic 10

52 String Comparison So what would be the result of the following comparison function calls? strcmp(“hello”,“Jackson”); strcmp(“red Balloon”,“red car”); strcmp(“blue”,“blUe”); strcmp(“AARDVARK”,“AARDVARKs”); CISC105 – Topic 10

53 String Input/Output Functions
We have already seen the printf and scanf functions as the primary method of input and output in the C language. We can now introduce sprintf and sscanf which use a string instead of the principle I/O devices (the keyboard and the display screen). CISC105 – Topic 10

54 String Input/Output Functions
We can use the sprintf function to create a string. This function works the same as printf except that it takes one additional parameter, the string to store the result in. The format of sprintf is: sprintf(string,format string, print list); Notice that the format is the same as for printf with the addition of the string. CISC105 – Topic 10

55 String Input/Output Functions
So, the sprintf function works the exact same as the printf function except that instead of displaying the resulting text, it stores it in the specified string. Note that this function, like all others that operate on strings, does not do bounds checking. CISC105 – Topic 10

56 String Input/Output Functions
This program fragment int month = 12; int day = 25; int year = 2000; char string1[20]; sprintf(string1, “%d/%d/%d”, month,day,year); would store the string “12/25/2000” in the character array string1. CISC105 – Topic 10

57 String Input/Output Functions
Hello Jeff. Today is Monday, the 27 th of November. How about that? What would be the output of the following code fragment? char string1[20] = “Today is ”; char string2[20] = “How about that?”; char string3[40]; int day = 27; sprintf(string3,”Hello Jeff. %s Monday, the %d th of\nNovember. %s”, string1, day, string2); printf(“%s”,string3); CISC105 – Topic 10

58 String Input/Output Functions
Just as sprintf is simply a version of printf that outputs to a string and not the display, sscanf is a version of scanf that takes input from a string and not the keyboard. The format of sscanf is: sscanf(string,format string, scan list); Notice that the format is the same as for scanf with the addition of the string. CISC105 – Topic 10

59 String Input/Output Functions
This program fragment char string1[20] = “12/25/2000”; int month; int day; int year; sscanf(string1, “%d/%d/%d”, &month,&day,&year); would store the value 12 in month, the value 25 in day, and the value 2000 in year. CISC105 – Topic 10

60 String Input/Output Functions
What would be the output of the following code fragment? Word 1 = Hello Word 2 = is number1 = 24 number2 = char string1[30] = “Hello 24 is ”; char word1[10], word2[10]; int number1; float number2; sscanf(string1,”%s %d %s %f”, word1, &number1, word2, &number2); printf(“Word 1 = %s\nWord 2 = %s\n number1 = %d\nnumber2= %f\n”, word1,word2,number1,number2); CISC105 – Topic 10


Download ppt "Topic 10 - Strings."

Similar presentations


Ads by Google