Section for Week 2, Tuesday, 1/19/99

REGULAR EXPRESSIONS
-------------------

A Regular Expression (RE or regexp for short) is a pattern to be
matched against a string. Much of the power and flexibility of Perl
comes from REs, so learning them is a key part of learning the language.

You may remember the `grep' utility from the Unix warmup handout (B).
Using it, you could find all the lines in a file containing a given
string (e.g., you could use "grep oodle /usr/dict/words" to find all
English words containing "oodle": doodle, noodle, oodles, poodle,
toodle). This is a simple example of pattern matching, which is the
key use of REs. In Perl, the same functionality would be achieved by:

    while (<>) {          # Iterate over all input lines
        if (/oodle/) {    # Shorthand for if ($_ =~ /oodle/)
            print;        # Shorthand for print $_;
        }
    }

PATTERNS:

Character Class: Signified by square brackets:
    [AGH]  will match any one of A, G, or H.
    [A-Z]  will match any uppercase letter, [0-9] any digit.
    [^0-9] will match any non-digit (a leading ^ means negation).
  Predefined classes: \d == [0-9], \w == [a-zA-Z0-9_],
    \s == [ \r\n\f\t]; use caps for the negation of these (\D, \W, \S).

Multipliers:
    *  --> zero or more of the immediately preceding character
    +  --> one or more of the immediately preceding character
    ?  --> zero or one of the immediately preceding character
    /x{5,10}/ --> between 5 and 10 x's (can omit second parameter,
                  not first)
  A multiplier also applies to a character class or a parenthesized
  group if that is what immediately precedes it.

Memory: use parentheses to "remember" what part of a pattern matched:
    /(.)\1/ will match any two consecutive identical non-newline
    characters.
  \1 refers back to the text matched by the first pair of parentheses,
  \2 to the second pair, and so on.

Alternation:
    /Ami|ami/ will match either "Ami" or "ami" (read | as "or").
    /(song|Black)bird/ will match "songbird" or "Blackbird".

Anchoring:
    \b --> word boundary (between \w and \W) (negation == \B)
    /\bred/ will match "hey redrobbin" but not "Fred".
    /^Bird/ will only match "Bird" at the beginning of the string.
    /bird$/ will only match "bird" at the end of the string (or just
            before a newline that ends the string).

Case: Append an 'i' after the closing slash to ignore case:
    /ami/i will match Ami, AMI, ami, aMi, etc.

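To try out the patterns above, here is a minimal sketch. The helper names matches() and first_double() and some of the test strings are made up for this illustration; they are not built into Perl and are not part of the handout.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $string matches $pattern, else 0.
sub matches {
    my ($string, $pattern) = @_;
    return $string =~ $pattern ? 1 : 0;
}

# Hypothetical helper: returns the first character that appears twice
# in a row (the text remembered by the parentheses), or undef if none.
sub first_double {
    my ($string) = @_;
    return $string =~ /(.)\1/ ? $1 : undef;
}

# Each entry: [ label, string to test, pattern ].
my @checks = (
    [ 'character class [AGH]', 'Hello',         qr/[AGH]/            ],
    [ 'negated class [^0-9]',  '12345',         qr/[^0-9]/           ],
    [ 'multiplier x{5,10}',    'xxxx',          qr/x{5,10}/          ],
    [ 'memory (.)\1',          'poodle',        qr/(.)\1/            ],
    [ 'alternation',           'Blackbird',     qr/(song|Black)bird/ ],
    [ 'anchor \bred',          'hey redrobbin', qr/\bred/            ],
    [ 'anchor \bred',          'Fred',          qr/\bred/            ],
    [ 'ignore case /ami/i',    'aMi',           qr/ami/i             ],
);

for my $check (@checks) {
    my ($label, $string, $pattern) = @$check;
    printf "%-24s %-17s %s\n", $label, "'$string'",
        matches($string, $pattern) ? 'match' : 'no match';
}
```

Note that "Fred" fails /\bred/ because the "r" follows "F", and both are \w characters, so there is no word boundary between them. Likewise "xxxx" fails /x{5,10}/ because it has only four x's.
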
OPERATORS:

Substitute: s/old/new/ will replace the first occurrence of old with
    new. Append 'g' after the final slash (s/old/new/g) to make the
    substitution global, i.e., to replace every occurrence.

Transliteration: You should read about this (tr///) operator.

Other fun stuff about REs: see the perlre(1) man-page or ch2 of
_Programming Perl_.
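
The two operators can be sketched as follows. This is only an illustration: the sub names (replace_first, replace_all, upcase, count_vowels) and the sample strings are hypothetical, chosen here to show s/// with and without 'g' and two common uses of tr///.

```perl
use strict;
use warnings;

# s/old/new/ changes only the first occurrence.
sub replace_first {
    my ($text) = @_;
    $text =~ s/old/new/;
    return $text;
}

# s/old/new/g changes every occurrence.
sub replace_all {
    my ($text) = @_;
    $text =~ s/old/new/g;
    return $text;
}

# tr/a-z/A-Z/ maps each lowercase letter to its uppercase counterpart.
sub upcase {
    my ($text) = @_;
    $text =~ tr/a-z/A-Z/;
    return $text;
}

# With an empty replacement list, tr leaves the string unchanged and
# returns how many characters it matched.
sub count_vowels {
    my ($text) = @_;
    my $count = ($text =~ tr/aeiou//);
    return $count;
}

print replace_first('old hat, old shoes'), "\n";
print replace_all('old hat, old shoes'), "\n";
print upcase('poodle'), "\n";
print count_vowels('noodle'), "\n";
```

Unlike s///, tr/// does not take a regular expression: it works character by character, so classes such as \d and multipliers such as * have no special meaning inside it.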