GNU software that deals with regular expressions provides a number of
additional regexp operators. These operators are described in this
section, and are specific to gawk; they are not available in other
awk implementations.
Most of the additional operators are for dealing with word matching. For our purposes, a word is a sequence of one or more letters, digits, or underscores (`_').
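`\<' matches the empty string at the beginning of a word. For example, /\<away/ matches `away', but not `stowaway'.
`\>' matches the empty string at the end of a word. For example, /stow\>/ matches `stow', but not `stowaway'.
`\B' matches the empty string within a word. For example, /\Brat\B/ matches `crate', but it does not match `dirty rat'. `\B' is essentially the opposite of `\y', the word boundary operator.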
There are two other operators that work on buffers. In Emacs, a
buffer is, naturally, an Emacs buffer. For other programs, the
regexp library routines that
gawk uses consider the entire
string to be matched as the buffer.
In awk, since `^' and `$' always work in terms
of the beginning and end of strings, these operators don't add any
new capabilities. They are provided for compatibility with other GNU
software.
In other GNU software, the word boundary operator is `\b'. However,
that conflicts with the awk language's definition of `\b' as
backspace, so gawk uses a different letter.
An alternative method would have been to require two backslashes in the GNU operators, but this was deemed to be too confusing, and the current method of using `\y' for the GNU `\b' appears to be the lesser of two evils.
The various command line options
(see section Command Line Options)
control how gawk interprets characters in regexps.
With no options, gawk provides all the facilities of POSIX regexps and the GNU regexp operators described above. However, interval expressions are not supported.
In compatibility mode, traditional awk regexps are matched. The GNU operators are not special, interval expressions are not available, and neither are the POSIX character classes (`[[:alnum:]]' and so on). Characters described by octal and hexadecimal escape sequences are treated literally, even if they represent regexp metacharacters.