Regular expressions are regular (Marek Pawelec)

1 Regular expressions are regular (Marek Pawelec)

2 Outline: 1. Regex vocabulary 2. Segmentation rules 3. Regex tagger 4. Regex text filter 5. Auto-translatables

4 Wildcards... Wildcards used in regular search: * – any text string; ? – any single character. Regular expressions are similar ...but somewhat different.

5 Regular expressions: . – any character (letter, symbol, digit...); [ ] – a set or range of characters, matching exactly one of them; [123] – digit 1 or 2 or 3; [1-3] – any digit from 1 to 3; [A-Za-z] – any unaccented Latin letter; [^A] – any character except A; | – or; 1|2|3 – 1 or 2 or 3
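
The character classes on this slide can be tried out in a few lines. This is a minimal sketch using Python's `re` module, which is an assumption for illustration only: memoQ itself uses the .NET regex engine, though these basic constructs behave the same way in both. The sample text is made up.

```python
import re

text = "A1 b2 C3 d4"

# [1-3] matches one digit from 1 to 3 (4 is outside the range)
digits = re.findall(r"[1-3]", text)

# [A-Za-z] matches one unaccented Latin letter
letters = re.findall(r"[A-Za-z]", text)

# [^A] matches one character that is anything except A
not_a = re.findall(r"[^A]", "ABA")

# 1|2|3 is the alternation spelling of [123]
alternation = re.findall(r"1|2|3", text)

print(digits, letters, not_a, alternation)
```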

6 Ranges: Both [ ] and | mean "or". What is the difference? [ ] chooses between single characters, | between whole expressions: [USDEUR] matches one character, U or S or D or E or U or R; USD|EUR matches USD or EUR
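
The difference between a character class and an alternation shows up as soon as both are run on the same text. A small sketch, assuming Python's `re` module rather than memoQ's own engine, with an invented sample sentence:

```python
import re

sample = "Price: 100 USD or 90 EUR"

# [USDEUR] is a set of single characters, so every match is one letter long
single_chars = re.findall(r"[USDEUR]", sample)

# USD|EUR is a choice between two whole three-letter strings
currencies = re.findall(r"USD|EUR", sample)

print(single_chars)
print(currencies)
```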

7 Special symbols: \ – modifier (escape character): . means any character, but \. means a literal dot; \\ matches a backslash; \d – digit [0-9]; \s – white space; \w – any word character [A-Za-z0-9_]; \u#### – Unicode character, e.g. \u2212: − (minus sign)
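
A short sketch of the escapes listed here, assuming Python's `re` module (memoQ uses the .NET engine). One caveat worth labelling: in Python, as in .NET, `\w` and `\d` also match non-ASCII letters and digits by default, so `[A-Za-z0-9_]` and `[0-9]` are approximations. The sample strings are invented.

```python
import re

# . matches any character, \. only a literal dot
any_char = re.findall(r"\d.\d", "1.5 and 1x5")
literal_dot = re.findall(r"\d\.\d", "1.5 and 1x5")

# \\ matches a single backslash
backslashes = re.findall(r"\\", r"c:\util\putty.exe")

# \s+ as a separator, \w+ as a run of word characters
words = re.split(r"\s+", "one  two\tthree")
word_runs = re.findall(r"\w+", "a_1 b-2")

# \u2212 is the minus sign, which is not the same character as the hyphen
minus = re.findall(r"\u2212", "5 \u2212 3 - 1")

print(any_char, literal_dot, backslashes, words, word_runs, minus)
```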

8 Quantifiers: ? – 0 or 1: \d? means zero or one digit; * – 0 or more: \d* means zero or more digits; + – 1 or more: \d+ means at least one digit. These are greedy (match as much as possible). Lazy variants: *? – zero or more, as few as possible; +? – one or more, as few as possible
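
Greedy and lazy quantifiers are easiest to tell apart on text with two candidate end points. A minimal sketch, assuming Python's `re` module and a made-up tagged string:

```python
import re

tagged = "<b>one</b> <b>two</b>"

# Greedy .+ runs to the LAST </b> it can reach
greedy = re.findall(r"<b>.+</b>", tagged)

# Lazy .+? stops at the FIRST </b> it can reach
lazy = re.findall(r"<b>.+?</b>", tagged)

print(greedy)
print(lazy)
```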

9 Quantifiers cont. {num} – value or range: \d{4} = exactly 4 digits, \d{2,4} = 2, 3 or 4 digits, \d{1,4} = from 1 to 4 digits (write the minimum out: \d{,4} is not reliable across regex engines), \d{4,} = 4 or more digits
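
The counted quantifiers can be checked with whole-string matches. This sketch assumes Python's `re` module; the helper `full` is a hypothetical name introduced here. Note that Python reads `{,4}` as "0 to 4", while other engines may treat an omitted minimum differently, so writing the minimum explicitly is the safer habit.

```python
import re

def full(pattern: str, s: str) -> bool:
    """True when the whole string s matches pattern."""
    return re.fullmatch(pattern, s) is not None

exactly_four = [full(r"\d{4}", s) for s in ("202", "2024", "20245")]
two_to_four = [full(r"\d{2,4}", s) for s in ("1", "12", "1234", "12345")]
four_or_more = [full(r"\d{4,}", s) for s in ("123", "1234", "123456")]

# In Python, an omitted minimum means zero, so the empty string matches
python_empty_min = full(r"\d{,4}", "")

print(exactly_four, two_to_four, four_or_more, python_empty_min)
```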

10 Groups ( ) – creates a group ($num recalls it) (?: ) – passive group (not numbered)
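
A sketch of numbered and passive groups, assuming Python's `re` module. Python writes the recall in a replacement as `\1` where the slide (and .NET, which memoQ uses) writes `$1`; the behaviour is otherwise the same. The sample strings are invented.

```python
import re

# Two numbered groups, recalled in swapped order in the replacement
swapped = re.sub(r"(\d+)-(\d+)", r"\2-\1", "pages 10-20")

# (?: ) groups the alternatives without taking a number,
# so the surname is group 1, not group 2
m = re.match(r"(?:Mr|Ms)\. (\w+)", "Ms. Smith")
surname = m.group(1)
group_count = m.re.groups

print(swapped, surname, group_count)
```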

11 Assertions: (?= ) – look ahead assertion: memo(?=Q) will match memo in memoQ, but not in memory; (?! ) – negative look ahead assertion: memo(?!Q) will match memo in memory, but not in memoQ; (?<! ) – negative look back assertion
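
The memoQ/memory example can be reproduced directly. A minimal sketch, assuming Python's `re` module; the "unhappy happy" string for the negative look back case is an invented illustration, not from the slides.

```python
import re

s = "memoQ memory"

# memo(?=Q): memo only when Q follows (Q itself is not part of the match)
ahead = [m.start() for m in re.finditer(r"memo(?=Q)", s)]

# memo(?!Q): memo only when Q does not follow
not_ahead = [m.start() for m in re.finditer(r"memo(?!Q)", s)]

# (?<!un)happy: happy only when it is not preceded by "un"
not_behind = [m.start() for m in re.finditer(r"(?<!un)happy", "unhappy happy")]

print(ahead, not_ahead, not_behind)
```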

12 #lists# A list contains variables: #currency# (EUR|USD|GBP|HUF) #cap# (A|B|C|D) = [ABCD]
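
One way to picture a #list# is as a named alternation that gets substituted into the pattern before matching. The sketch below imitates that idea in Python; `expand_lists` and the `LISTS` dictionary are hypothetical names, and this is not a description of how memoQ implements lists internally.

```python
import re

# List contents taken from the slide
LISTS = {
    "currency": ["EUR", "USD", "GBP", "HUF"],
    "cap": ["A", "B", "C", "D"],
}

def expand_lists(pattern: str, lists: dict) -> str:
    """Replace each #name# with a non-capturing alternation of its items."""
    def substitute(m: re.Match) -> str:
        items = lists[m.group(1)]
        return "(?:" + "|".join(re.escape(item) for item in items) + ")"
    return re.sub(r"#(\w+)#", substitute, pattern)

expanded = expand_lists(r"\d+\s#currency#", LISTS)
amounts = re.findall(expanded, "10 EUR, 20 PLN, 30 HUF")

print(expanded)
print(amounts)
```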

13 Regular expressions in memoQ: Segmentation rules, Regexp tagger, Regexp text filter, Auto-translatables

14 Segmentation rules

19 #end##!#[\s]+#cap# #end##!#[\s]+[\d] #end##!#[\s]+#lpar#[\s]*#cap# #end##!#[\s]+#lpar#[\s]*[\d] #end#[\s]*#rpar##!#[\s]+#cap# #end#[\s]*#rpar##!#[\s]+[\d]

25 #end##!#[\s]+#cap# = [:\!\?\.]#!#\s+[A-Z]
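
The expanded rule [:\!\?\.]#!#\s+[A-Z] reads as: break after an end-of-sentence character when white space and a capital letter follow. The sketch below imitates that in Python, with the #!# break marker modelled as "the position just before the look ahead". The function name `segment` and the sample text are assumptions; this is not memoQ's segmentation engine.

```python
import re

# End character, then (without consuming it) white space and a capital letter
BREAK = re.compile(r"[:!?.](?=\s+[A-Z])")

def segment(text: str) -> list:
    """Split text at every position where the break rule applies."""
    segments, start = [], 0
    for m in BREAK.finditer(text):
        segments.append(text[start:m.end()].strip())
        start = m.end()
    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments

result = segment("It works. Is it fast? Yes! see below.")
print(result)
```

The "!" before the lower-case "see" does not trigger a break, because the rule requires a capital letter after the white space.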

27 #end##!#[\s]+#cap# Unless: #abbr_long##!#[\s]+#cap# [\s]#abbr_short##!#[\s]+#cap# \s#cap#\.#!#[\s]+#cap#
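
The "Unless" rules cancel a break when the full stop belongs to an abbreviation or to a single capital initial. A rough Python sketch of that idea follows. The abbreviation tuple is a made-up stand-in for the #abbr_long# and #abbr_short# lists, whose contents the slides do not show, and `segment` and `is_exception` are hypothetical names.

```python
import re

BREAK = re.compile(r"[:!?.](?=\s+[A-Z])")
ABBREVIATIONS = ("Mr.", "Dr.", "Prof.")       # assumed list contents
INITIAL = re.compile(r"(?:^|\s)[A-Z]\.$")     # single capital plus dot, e.g. " J."

def is_exception(prefix: str) -> bool:
    """True when the text up to the candidate break ends in an exception."""
    return prefix.endswith(ABBREVIATIONS) or INITIAL.search(prefix) is not None

def segment(text: str) -> list:
    segments, start = [], 0
    for m in BREAK.finditer(text):
        if is_exception(text[:m.end()]):
            continue  # an "Unless" rule applies, so no break here
        segments.append(text[start:m.end()].strip())
        start = m.end()
    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments

result = segment("Dr. J. Smith arrived. He left.")
print(result)
```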

30 Regex tagger

42 / N \d{4} - \d{4} [A-Z]\d{3} - \d{4}

46 ERR_GRP_NO_SAMPLE: [A-Z]+ plus _[A-Z]+ repeated with ( )+, i.e. [A-Z]+(_[A-Z]+)+
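
The fragments on this slide appear to build up to the pattern [A-Z]+(_[A-Z]+)+ for codes such as ERR_GRP_NO_SAMPLE; that reading is an assumption. The sketch below shows what such a pattern would protect, using Python's `re` module and curly braces as a stand-in for tags. The function `tag_codes` and the sample sentence are hypothetical.

```python
import re

# Upper-case word followed by one or more _WORD parts
CODE = re.compile(r"[A-Z]+(?:_[A-Z]+)+")

def tag_codes(text: str) -> str:
    """Wrap every matching code in braces to mark it as a tag."""
    return CODE.sub(lambda m: "{" + m.group(0) + "}", text)

tagged = tag_codes("Error ERR_GRP_NO_SAMPLE occurred in GROUP.")
print(tagged)
```

Plain upper-case words such as GROUP are left alone because the pattern requires at least one underscore-joined part.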

49 Tip: Regex tagger without regex

53 Regexp text filter

56 *Popup "Putty" "c:\util\putty.exe" \s*\*(.*)
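
To see what the expression \s*\*(.*) captures from the sample line, it can be run outside memoQ. A minimal sketch with Python's `re` module; how the text filter then uses the captured group is not shown in the transcript and is not assumed here.

```python
import re

line = r'*Popup "Putty" "c:\util\putty.exe"'

# Optional leading white space, a literal asterisk, then the rest of the line
m = re.match(r"\s*\*(.*)", line)
captured = m.group(1)

print(captured)
```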

60 *Popup.icon="$IconDir$\Fav_Star.ico" "Quick" "!DynamicFolder:$QuickLaunch$*.lnk" \w+(\s+\w+)* " \w = [A-Za-z0-9_]
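
One plausible reading of this slide is that \w+(\s+\w+)* is placed between double quotes so that only quoted plain-word labels are picked up, while quoted paths and commands are skipped. That reading is an assumption. Under it, a Python sketch on the sample line gives:

```python
import re

line = r'*Popup.icon="$IconDir$\Fav_Star.ico" "Quick" "!DynamicFolder:$QuickLaunch$*.lnk"'

# A quoted run of words separated by white space; $, \, ., ! and : are not
# word characters, so the icon path and the folder command do not match
labels = re.findall(r'"(\w+(?:\s+\w+)*)"', line)

print(labels)
```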

62 Auto-translatables

66 Rule for EN/DE/FR to HU number format conversion: (?

74 12 345,67 12,345,67

76 Red elements are not necessary: (?

77 The same rule for EN to HU only: (?

79 Day of the week, Month Day number (st, nd, rd, th) Year, converted to: day of the week day number. month year

82 Pattern: (#day#),?\s(#month#)\s(\d{1,2})(?:st|nd|rd|th)?\s(\d{4}) Replacement: $1 $3. $2 $4

83 (#day#),?\s(#month#)\s(\d{1,2})(?:st|nd|rd|th)?\s(\d{4}) with #day#: Friday = piątek ($1), #month#: May = maja ($2), 11th gives 11 ($3), year ($4); result: $1 $3. $2 $4
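
The date rule can be imitated outside memoQ to see how the four groups are reordered. This is a sketch in Python under stated assumptions: the two dictionaries stand in for the #day# and #month# translation lists and contain only the pair shown on the slide, the year 2012 in the sample is invented, and `convert_date` is a hypothetical name.

```python
import re

DAYS = {"Friday": "piątek"}    # stand-in for the #day# list
MONTHS = {"May": "maja"}       # stand-in for the #month# list

PATTERN = re.compile(
    r"(%s),?\s(%s)\s(\d{1,2})(?:st|nd|rd|th)?\s(\d{4})"
    % ("|".join(DAYS), "|".join(MONTHS))
)

def convert_date(text: str) -> str:
    """Rewrite 'Day, Month 11th 2012' as 'day 11. month 2012' ($1 $3. $2 $4)."""
    def replace(m: re.Match) -> str:
        return f"{DAYS[m.group(1)]} {m.group(3)}. {MONTHS[m.group(2)]} {m.group(4)}"
    return PATTERN.sub(replace, text)

converted = convert_date("Friday, May 11th 2012")
print(converted)
```

The ordinal suffix sits in a passive group, so it is matched but never numbered, and it drops out of the result.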

84 http://www.cheatography.com/davechild/cheat-sheets/regular-expressions/ http://www.regular-expressions.info/tutorial.html
