Insight Through Computing 15. Strings: Operations, Subscripting, Concatenation, Search, Numeric-String Conversions. Built-Ins: int2str, num2str, str2double.

1 Insight Through Computing
15. Strings
Operations, Subscripting, Concatenation, Search, Numeric-String Conversions
Built-Ins: int2str, num2str, str2double

2 Previous Dealings
    N = input('Enter Degree: ')
    title('The Sine Function')
    disp( sprintf('N = %2d',N) )

3 A String is an Array of Characters
    'Aa7*>@ x!'
    A | a | 7 | * | > | @ |   | x | !
This string has length 9.

4 Why are Strings Important?
1. Numerical data is often encoded as strings
2. Genomic calculation/search

5 Numerical Data is Often Encoded in Strings
For example, a file containing Ithaca weather data begins with the string
    W07629N4226
Longitude: 76° 29' West
Latitude: 42° 26' North

6 What We Would Like to Do
    W07629N4226
Get hold of the substring '07629'.
Convert it to floating-point format so that it can be used in numerical calculations.

7 Format Issues
9 as an IEEE floating point number: 0100000blablahblah01001111000100010010
9 as a character: 01000otherblabla
Different representation.
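The difference between the two representations can be checked directly in MATLAB. This is a minimal sketch; the variable names are illustrative and not from the slides.

```matlab
c = '9';                 % the character 9
n = 9;                   % the number 9 (a double)

codeOfC = double(c)      % 57, the ASCII code stored for the character '9'
sameValue = (c == n)     % false: the comparison uses the code 57, not the value 9
v = str2double(c)        % 9, the numeric value the character stands for
```

Comparing the character with the number fails because MATLAB compares the stored code, which is why an explicit conversion such as str2double is needed before doing arithmetic.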

8 Genomic Computations
Looking for patterns in a DNA sequence:
    'ATTCTGACCTCGATC'
Pattern sought: ACCT

9 Genomic Computations
Quantifying differences:
    ATTCTGACCTCGATC
    ATTGCTGACCTCGAT
Remove?

10 Working With Strings

11 Strings Can Be Assigned to Variables
Direct assignment:
    S = 'N = 2'
Using sprintf:
    N = 2;
    S = sprintf('N = %1d',N)
Either way, S holds the string 'N = 2'.
sprintf produces a formatted string using fprintf rules.

12 Strings Have a Length
    s = 'abc';
    n = length(s);   % n = 3
    s = '';          % the empty string
    n = length(s)    % n = 0
    s = ' ';         % single blank
    n = length(s)    % n = 1

13 Concatenation
This:
    S = 'abc';
    T = 'xy'
    R = [S T]
is the same as this:
    R = 'abcxy'

14 Repeated Concatenation
This:
    s = '';
    for k=1:5
        s = [s 'z'];
    end
is the same as this:
    s = 'zzzzz'

15 Replacing and Appending Characters
    s = 'abc';
    s(2) = 'x'    % s = 'axc'
    t = 'abc'
    t(4) = 'd'    % t = 'abcd'
    v = ''
    v(5) = 'x'    % v = '    x'
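One detail of assigning past the end of a string is worth inspecting: when MATLAB grows a character array this way, the gap is filled with null characters (code 0), which only display like blanks. A small sketch to check this, with illustrative variable names that are not from the slides:

```matlab
v = '';
v(5) = 'x';                      % grow the empty string to length 5

n = length(v)                    % 5
codes = double(v)                % codes of the four padding characters, then 120 for 'x'
paddedWithBlanks = all(v(1:4) == ' ')   % false when the padding is not the blank character

% If genuine blanks are wanted, build the string by concatenation instead.
w = [blanks(4) 'x'];
wIsBlankPadded = all(w(1:4) == ' ')     % true
```

The distinction matters when the result is later compared with strcmp or tested with isspace.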

16 Extracting Substrings
    s = 'abcdef';
    x = s(3)            % x = 'c'
    x = s(2:4)          % x = 'bcd'
    x = s(length(s))    % x = 'f'

17 Colon Notation
    s( first : last )
The value before the colon is the starting location; the value after the colon is the ending location.

18 Replacing Substrings
    s = 'abcde';
    s(2:4) = 'xyz'     % s = 'axyze'
    s = 'abcde'
    s(2:4) = 'wxyz'    % Error: 3 positions cannot hold 4 characters
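When the replacement has a different length than the substring it replaces, one option is to rebuild the string by concatenation rather than by subscripted assignment. This is a sketch of that workaround, not something shown on the slides:

```matlab
s = 'abcde';

% Replace the 3-character substring s(2:4) with the 4-character 'wxyz'
% by concatenating the part before it, the new text, and the part after it.
s = [s(1) 'wxyz' s(5:end)]    % s = 'awxyze'
```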

19 Question Time
    s = 'abcde';
    for k=1:3
        s = [ s(4:5) s(1:3)];
    end
What is the final value of s?
A. abcde   B. bcdea   C. eabcd   D. deabc
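One way to check an answer to the question above is to print s after every pass of the loop. The fprintf line is an addition for tracing and is not part of the original question.

```matlab
s = 'abcde';
for k = 1:3
    s = [s(4:5) s(1:3)];                     % move the last two characters to the front
    fprintf('After pass %d: s = %s\n', k, s);
end
```

Each pass rotates the string two places to the right, so the passes produce deabc, bcdea, and eabcd in turn, which corresponds to choice C.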

20 Problem: DNA Strand
x is a string made up of the characters 'A', 'C', 'T', and 'G'.
Construct a string y obtained from x by replacing each A by T, each T by A, each C by G, and each G by C.
    x: ACGTTGCAGTTCCATATG
    y: TGCAACGTCAAGGTATAC

21
    function y = Strand(x)
    % x is a string consisting of
    % the characters A, C, T, and G.
    % y is a string obtained by
    % replacing A by T, T by A,
    % C by G and G by C.

22 Comparing Strings
Built-in function strcmp
strcmp(s1,s2) is true if the strings s1 and s2 are identical.

23 How y is Built Up
    x: ACGTTGCAGTTCCATATG
    y: TGCAACGTCAAGGTATAC
Start:           y: ''
After 1 pass:    y: T
After 2 passes:  y: TG
After 3 passes:  y: TGC

24
    y = '';                      % start with the empty string
    for k=1:length(x)
        if strcmp(x(k),'A')
            y = [y 'T'];
        elseif strcmp(x(k),'T')
            y = [y 'A'];
        elseif strcmp(x(k),'C')
            y = [y 'G'];
        else
            y = [y 'C'];
        end
    end
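The loop above can be run on the example strand from the problem statement and cross-checked against a second approach. The lookup-table version below is an assumption of this sketch, not a method from the slides; it relies on the fact that characters can be converted to their ASCII codes and used as subscripts.

```matlab
x = 'ACGTTGCAGTTCCATATG';

% Loop version, following the slide
y = '';
for k = 1:length(x)
    if strcmp(x(k),'A')
        y = [y 'T'];
    elseif strcmp(x(k),'T')
        y = [y 'A'];
    elseif strcmp(x(k),'C')
        y = [y 'G'];
    else
        y = [y 'C'];
    end
end

% Lookup-table version (illustrative alternative)
map = blanks(128);                 % one slot per ASCII code 1..128
map(double('ACGT')) = 'TGCA';      % store each character's partner at its code
y2 = map(double(x));               % translate the whole strand at once

y                                  % TGCAACGTCAAGGTATAC
agree = strcmp(y, y2)
```

Note that the loop version maps any character other than A, T, or C to C, so it assumes x contains only the four expected letters.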

25 A DNA Search Problem
Suppose S and T are strings, e.g.,
    S: 'ACCT'
    T: 'ATGACCTGA'
We'd like to know if S is a substring of T and, if so, where the first occurrence is.

26
    function k = FindCopy(S,T)
    % S and T are strings.
    % If S is not a substring of T,
    % then k=0.
    % Otherwise, k is the smallest
    % integer so that S is identical
    % to T(k:k+length(S)-1).

27 A DNA Search Problem
    S: 'ACCT'
    T: 'ATGACCTGA'
    strcmp(S,T(1:4))  ->  False

28 A DNA Search Problem
    S: 'ACCT'
    T: 'ATGACCTGA'
    strcmp(S,T(2:5))  ->  False

29 A DNA Search Problem
    S: 'ACCT'
    T: 'ATGACCTGA'
    strcmp(S,T(3:6))  ->  False

30 A DNA Search Problem
    S: 'ACCT'
    T: 'ATGACCTGA'
    strcmp(S,T(4:7))  ->  True

31 Pseudocode
    First = 1;
    Last = length(S);
    while S is not identical to T(First:Last)
        First = First + 1;
        Last = Last + 1;
    end

32 Subscript Error
    S: 'ACCT'
    T: 'ATGACTGA'
    strcmp(S,T(6:9))
There's a problem if S is not a substring of T. Here T has only 8 characters, so T(6:9) reaches past its end.

33 Pseudocode
    First = 1;
    Last = length(S);
    while Last<=length(T) && ...
          ~strcmp(S,T(First:Last))
        First = First + 1;
        Last = Last + 1;
    end

34 Post-Loop Processing
Loop ends when this is false:
    Last<=length(T) && ...
    ~strcmp(S,T(First:Last))

35 Post-Loop Processing
The loop ends for one of two reasons.
    if Last>length(T)
        % No match found
        k=0;
    else
        % There was a match
        k=First;
    end
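Putting the guarded loop and the post-loop processing together gives a complete search. The sketch below runs it as a script on the two T strings used in the slides (one containing S, one not) and cross-checks the result against the built-in strfind. The cell array, the results vector, and the cross-check are additions for illustration.

```matlab
S = 'ACCT';
targets = {'ATGACCTGA', 'ATGACTGA'};   % first contains S, second does not
results = zeros(1, length(targets));
builtin = zeros(1, length(targets));

for i = 1:length(targets)
    T = targets{i};

    % Slide the window T(First:Last) along T until a match or the end of T
    First = 1;
    Last = length(S);
    while Last <= length(T) && ~strcmp(S, T(First:Last))
        First = First + 1;
        Last = Last + 1;
    end

    % Decide why the loop stopped
    if Last > length(T)
        k = 0;          % no match found
    else
        k = First;      % there was a match
    end
    results(i) = k;

    % Cross-check with strfind, which returns all starting positions
    idx = strfind(T, S);
    if ~isempty(idx)
        builtin(i) = idx(1);
    end
end

results     % [4 0]
```

The order of the two tests in the while condition matters: because && short-circuits, T(First:Last) is never evaluated once Last exceeds length(T).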

36 Numeric/String Conversion

37 String-to-Numeric Conversion
An example...
Convention: W07629N4226
Longitude: 76° 29' West
Latitude: 42° 26' North

38 String-to-Numeric Conversion
    s = 'W07629N4226'
    s1 = s(2:4);
    x1 = str2double(s1);
    s2 = s(5:6);
    x2 = str2double(s2);
    Longitude = x1 + x2/60
There are 60 minutes in a degree.
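The latitude can be extracted by the same pattern. In the sketch below the character positions for the latitude (8:9 for degrees, 10:11 for minutes) are read off this one example string and are an assumption about the file format; the variable names are illustrative.

```matlab
s = 'W07629N4226';

% Longitude, as on the slide
lonDeg = str2double(s(2:4));       % 76
lonMin = str2double(s(5:6));       % 29
Longitude = lonDeg + lonMin/60;    % degrees plus minutes/60

% Latitude, by the same pattern (positions assumed from this example)
latDeg = str2double(s(8:9));       % 42
latMin = str2double(s(10:11));     % 26
Latitude = latDeg + latMin/60;

% s(1) and s(7) carry the hemisphere letters
fprintf('%c %.4f, %c %.4f\n', s(1), Longitude, s(7), Latitude);
```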

39 Numeric-to-String Conversion
    x = 1234;
    s = int2str(x);            % s = '1234'
    x = pi;
    s = num2str(x,'%5.3f');    % s = '3.142'
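A few more calls illustrate how the three built-ins behave at the edges. The inputs below are chosen for illustration and are not from the slides.

```matlab
a = int2str(2.7)             % '3': int2str rounds to the nearest integer first
b = num2str(pi)              % '3.1416': the default format keeps about 4 decimals
c = num2str(pi, '%.5f')      % '3.14159': an explicit fprintf-style format
x = str2double(c)            % 3.14159, back to a number
bad = str2double('12a')      % NaN: the text does not represent a number
```

Because str2double returns NaN instead of raising an error, checking the result with isnan is a simple way to detect malformed input.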

40 Problem
Given a date in the format 'mm/dd', specify the next day in the same format.

41 y = Tomorrow(x)
    x        y
    02/28    03/01
    07/13    07/14
    12/31    01/01

42 Get the Day and Month
    month = str2double(x(1:2));
    day = str2double(x(4:5));
Thus, if x = '02/28' then month is assigned the numerical value 2 and day is assigned the numerical value 28.

43
    L = [31 28 31 30 31 30 31 31 30 31 30 31];
    if day+1<=L(month)
        % Tomorrow is in the same month
        newDay = day+1;
        newMonth = month;
    % (the else branch follows on the next slide)

44
    L = [31 28 31 30 31 30 31 31 30 31 30 31];
    % (the if branch from the previous slide goes here)
    else
        % Tomorrow is in the next month
        newDay = 1;
        if month <12
            newMonth = month+1;
        else
            newMonth = 1;
        end
    end

45 The New Day String
Compute newDay (numerical) and convert...
    d = int2str(newDay);
    if length(d)==1
        d = ['0' d];
    end

46 The New Month String
Compute newMonth (numerical) and convert...
    m = int2str(newMonth);
    if length(m)==1
        m = ['0' m];
    end

47 The Final Concatenation
    y = [m '/' d];
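The pieces from the last several slides can be assembled and run on the three example dates. The sketch below does this as a script; the cell arrays and the loop over test dates are additions for illustration, and the sprintf line at the end shows an alternative way to get the zero padding, which is an assumption of this sketch and not the slides' approach.

```matlab
L = [31 28 31 30 31 30 31 31 30 31 30 31];   % days per month (no leap years)
dates = {'02/28', '07/13', '12/31'};
next = cell(size(dates));

for i = 1:length(dates)
    x = dates{i};

    % Get the day and month as numbers
    month = str2double(x(1:2));
    day = str2double(x(4:5));

    % Compute tomorrow's day and month
    if day+1 <= L(month)
        newDay = day+1;          % same month
        newMonth = month;
    else
        newDay = 1;              % first day of the next month
        if month < 12
            newMonth = month+1;
        else
            newMonth = 1;        % December rolls over to January
        end
    end

    % Convert back to two-character strings
    d = int2str(newDay);
    if length(d)==1
        d = ['0' d];
    end
    m = int2str(newMonth);
    if length(m)==1
        m = ['0' m];
    end

    next{i} = [m '/' d];
    fprintf('%s -> %s\n', x, next{i});
end

% Alternative formatting with zero padding done by sprintf
alt = sprintf('%02d/%02d', newMonth, newDay);
```

The table L has 28 days for February, so this version does not handle leap years.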

48 Some other useful string functions
    str = 'Cs 1112';
    length(str)              % 7
    isletter(str)            % [1 1 0 0 0 0 0]
    isspace(str)             % [0 0 1 0 0 0 0]
    lower(str)               % 'cs 1112'
    upper(str)               % 'CS 1112'
    ischar(str)              % Is str a char array? True (1)
    strcmp(str(1:2),'cs')    % Compare strings str(1:2) & 'cs'. False (0)
    strcmp(str(1:3),'CS')    % False (0)

49 ASCII characters (American Standard Code for Information Interchange)
    ascii code   Character        ascii code   Character
        :            :                :            :
       65           'A'              48           '0'
       66           'B'              49           '1'
       67           'C'              50           '2'
        :            :                :            :
       90           'Z'              57           '9'
        :            :                :            :

50 Character vs ASCII code
    str = 'Age 19'        % a 1-d array of characters
    code = double(str)    % convert chars to ascii values
    str1 = char(code)     % convert ascii values to chars
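Running the three statements above shows the actual codes and confirms that the conversion round-trips. The final comparison is an addition for illustration.

```matlab
str = 'Age 19';
code = double(str)          % [65 103 101 32 49 57]
str1 = char(code)           % 'Age 19'
roundTrip = strcmp(str, str1)
```

The blank has code 32, and the digit characters '1' and '9' have codes 49 and 57, not the values 1 and 9.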

51 Arithmetic and relational ops on characters
    'c'-'a'                     gives 2
    '6'-'5'                     gives 1
    letter1='e'; letter2='f';
    letter1-letter2             gives -1
    'c'>'a'                     gives true
    letter1==letter2            gives false
    'A' + 2                     gives 67
    char('A'+2)                 gives 'C'
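Character arithmetic also gives a direct way to turn digit characters into numbers, because the codes for '0' through '9' are consecutive. A short sketch with illustrative variable names, using a substring of the weather-data string from earlier:

```matlab
digitChar = '7';
digitVal = digitChar - '0'             % 7: distance of '7' from '0'

ds = '4226' - '0'                      % [4 2 2 6]: applied to every character
value = ds * [1000; 100; 10; 1]        % 4226: weight each digit by its place value
```

For general input str2double is the more robust choice; this approach assumes every character is a digit.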

52 Example: toUpper
Write a function toUpper(cha) to convert character cha to upper case if cha is a lower case letter. Return the converted letter. If cha is not a lower case letter, simply return the character cha.
Hint: Think about the distance between a letter and the base letter 'a' (or 'A'). E.g.,
    a b c d e f g h ...
    A B C D E F G H ...
    distance = 'g'-'a' = 6 = 'G'-'A'
Of course, do not use the MATLAB function upper!

53
    function up = toUpper(cha)
    % up is the upper case of character cha.
    % If cha is not a letter then up is just cha.
    up = cha;
cha is lower case if it is between 'a' and 'z'.

54
    function up = toUpper(cha)
    % up is the upper case of character cha.
    % If cha is not a letter then up is just cha.
    up = cha;
    if ( cha >= 'a' && cha <= 'z' )
        % Find distance of cha from 'a'
    end

55
    function up = toUpper(cha)
    % up is the upper case of character cha.
    % If cha is not a letter then up is just cha.
    up = cha;
    if ( cha >= 'a' && cha <= 'z' )
        % Find distance of cha from 'a'
        offset = cha - 'a';
        % Go same distance from 'A'
    end

56
    function up = toUpper(cha)
    % up is the upper case of character cha.
    % If cha is not a letter then up is just cha.
    up = cha;
    if ( cha >= 'a' && cha <= 'z' )
        % Find distance of cha from 'a'
        offset = cha - 'a';
        % Go same distance from 'A'
        up = char('A' + offset);
    end
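The same offset idea can be applied to every character of a string. The sketch below inlines the body of toUpper in a loop so that it runs as a script; the test string is made up, and the built-in upper is used only to check the result, not to compute it.

```matlab
str = 'Cs 1112, go!';
out = str;                              % characters that are not lower case stay as they are

for k = 1:length(str)
    cha = str(k);
    if ( cha >= 'a' && cha <= 'z' )
        offset = cha - 'a';             % distance of cha from 'a'
        out(k) = char('A' + offset);    % same distance from 'A'
    end
end

out                                     % 'CS 1112, GO!'
agrees = strcmp(out, upper(str))
```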

