      1  1.1  agc <h1>TRE Regexp Syntax</h1>
      2  1.1  agc 
      3  1.1  agc <p>
      4  1.1  agc This document describes the POSIX 1003.2 extended RE (ERE) syntax and
      5  1.1  agc the basic RE (BRE) syntax as implemented by TRE, and the TRE extensions
      6  1.1  agc to the ERE syntax.  A simple Extended Backus-Naur Form (EBNF) style
      7  1.1  agc notation is used to describe the grammar.
      8  1.1  agc </p>
      9  1.1  agc 
     10  1.1  agc <h2>ERE Syntax</h2>
     11  1.1  agc 
     12  1.1  agc <h3>Alternation operator</h3>
     13  1.1  agc <a name="alternation"></a>
     14  1.1  agc <a name="extended-regexp"></a>
     15  1.1  agc 
     16  1.1  agc <table bgcolor="#e0e0f0" cellpadding="10">
     17  1.1  agc <tr><td>
     18  1.1  agc <pre>
     19  1.1  agc <i>extended-regexp</i> ::= <a href="#branch"><i>branch</i></a>
     20  1.1  agc                 |   <i>extended-regexp</i> <b>"|"</b> <a href="#branch"><i>branch</i></a>
     21  1.1  agc </pre>
     22  1.1  agc </td></tr>
     23  1.1  agc </table>
     24  1.1  agc <p>
     25  1.1  agc An extended regexp (ERE) is one or more <i>branches</i>, separated by
     26  1.1  agc <tt>|</tt>.  An ERE matches anything that matches one or more of the
     27  1.1  agc branches.
     28  1.1  agc </p>
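<p>
As a rough illustration of alternation, the sketch below uses the POSIX
<code>&lt;regex.h&gt;</code> interface as a stand-in for TRE (an assumption;
TRE itself is not linked here).  The helper name <code>ere_search</code> is
hypothetical.
</p>

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the ERE `pattern` matches somewhere
 * in `text`, 0 if it does not, and -1 if the pattern fails to compile. */
static int ere_search(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && text != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

<p>
With this helper, <code>ere_search("cat|dog", "hotdog")</code> reports a
match because the second branch matches, while
<code>ere_search("cat|dog", "bird")</code> does not.
</p>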
     29  1.1  agc 
     30  1.1  agc <h3>Catenation of REs</h3>
     31  1.1  agc <a name="catenation"></a>
     32  1.1  agc <a name="branch"></a>
     33  1.1  agc 
     34  1.1  agc <table bgcolor="#e0e0f0" cellpadding="10">
     35  1.1  agc <tr><td>
     36  1.1  agc <pre>
     37  1.1  agc <i>branch</i> ::= <i>piece</i>
     38  1.1  agc        |   <i>branch</i> <i>piece</i>
     39  1.1  agc </pre>
     40  1.1  agc </td></tr>
     41  1.1  agc </table>
     42  1.1  agc <p>
A branch is one or more <i>pieces</i> concatenated.  It matches a
string consisting of a match for the first piece, followed by a match
for the second piece, and so on.
     46  1.1  agc </p>
     47  1.1  agc 
     48  1.1  agc 
     49  1.1  agc <table bgcolor="#e0e0f0" cellpadding="10">
     50  1.1  agc <tr><td>
     51  1.1  agc <pre>
     52  1.1  agc <i>piece</i> ::= <i>atom</i>
     53  1.1  agc       |   <i>atom</i> <a href="#repeat-operator"><i>repeat-operator</i></a>
     54  1.1  agc       |   <i>atom</i> <a href="#approx-settings"><i>approx-settings</i></a>
     55  1.1  agc </pre>
     56  1.1  agc </td></tr>
     57  1.1  agc </table>
     58  1.1  agc <p>
     59  1.1  agc A piece is an <i>atom</i> possibly followed by a repeat operator or an
     60  1.1  agc expression controlling approximate matching parameters for the <i>atom</i>.
     61  1.1  agc </p>
     62  1.1  agc 
     63  1.1  agc 
     64  1.1  agc <table bgcolor="#e0e0f0" cellpadding="10">
     65  1.1  agc <tr><td>
     66  1.1  agc <pre>
     67  1.1  agc <i>atom</i> ::= <b>"("</b> <i>extended-regexp</i> <b>")"</b>
     68  1.1  agc      |   <a href="#bracket-expression"><i>bracket-expression</i></a>
     69  1.1  agc      |   <b>"."</b>
     70  1.1  agc      |   <a href="#assertion"><i>assertion</i></a>
     71  1.1  agc      |   <a href="#literal"><i>literal</i></a>
     72  1.1  agc      |   <a href="#backref"><i>back-reference</i></a>
     73  1.1  agc      |   <b>"(?#"</b> <i>comment-text</i> <b>")"</b>
     74  1.1  agc      |   <b>"(?"</b> <a href="#options"><i>options</i></a> <b>")"</b> <i>extended-regexp</i>
     75  1.1  agc      |   <b>"(?"</b> <a href="#options"><i>options</i></a> <b>":"</b> <i>extended-regexp</i> <b>")"</b>
     76  1.1  agc </pre>
     77  1.1  agc </td></tr>
     78  1.1  agc </table>
     79  1.1  agc <p>
An atom is either an ERE enclosed in parentheses, a bracket
expression, a <tt>.</tt> (period), an assertion, a literal, a back
reference, a comment, or an ERE with an options specification.
     82  1.1  agc </p>
     83  1.1  agc 
     84  1.1  agc <p>
     85  1.1  agc The dot (<tt>.</tt>) matches any single character.
     86  1.1  agc If the <code>REG_NEWLINE</code> compilation flag (see <a
     87  1.1  agc href="api.html">API manual</a>) is specified, the newline
     88  1.1  agc character is not matched.
     89  1.1  agc </p>
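<p>
The effect of <code>REG_NEWLINE</code> on the dot can be sketched with the
POSIX <code>&lt;regex.h&gt;</code> interface, used here as a stand-in for
TRE (an assumption).  The helper name <code>ere_search_flags</code> is
hypothetical.
</p>

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: 1 if the ERE `pattern` matches somewhere in `text`,
 * 0 if it does not, -1 if the pattern does not compile.
 * `extra_cflags` is OR-ed into the regcomp flags (for example REG_NEWLINE). */
static int ere_search_flags(const char *pattern, const char *text,
                            int extra_cflags)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && text != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB | extra_cflags) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

<p>
Matching <code>a.b</code> against the three-character string
<code>"a\nb"</code> succeeds with no extra flags, and fails once
<code>REG_NEWLINE</code> is passed, because the dot then refuses the newline.
</p>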
     90  1.1  agc 
     91  1.1  agc <p>
<i>comment-text</i> can contain any characters except a closing parenthesis <tt>)</tt>.  The text in the comment is
completely ignored by the regex parser and is used solely for readability purposes.
     94  1.1  agc </p>
     95  1.1  agc 
     96  1.1  agc <h3>Repeat operators</h3>
     97  1.1  agc <a name="repeat-operator"></a>
     98  1.1  agc 
     99  1.1  agc <table bgcolor="#e0e0f0" cellpadding="10">
    100  1.1  agc <tr><td>
    101  1.1  agc <pre>
    102  1.1  agc <i>repeat-operator</i> ::= <b>"*"</b>
    103  1.1  agc                 |   <b>"+"</b>
    104  1.1  agc                 |   <b>"?"</b>
    105  1.1  agc                 |   <i>bound</i>
    106  1.1  agc                 |   <b>"*?"</b>
    107  1.1  agc                 |   <b>"+?"</b>
    108  1.1  agc                 |   <b>"??"</b>
    109  1.1  agc                 |   <i>bound</i> <b>?</b>
    110  1.1  agc </pre>
    111  1.1  agc </td></tr>
    112  1.1  agc </table>
    113  1.1  agc 
    114  1.1  agc <p>
    115  1.1  agc An atom followed by <tt>*</tt> matches a sequence of 0 or more matches
    116  1.1  agc of the atom.  <tt>+</tt> is similar to <tt>*</tt>, matching a sequence
    117  1.1  agc of 1 or more matches of the atom.  An atom followed by <tt>?</tt>
    118  1.1  agc matches a sequence of 0 or 1 matches of the atom.
    119  1.1  agc </p>
    120  1.1  agc 
    121  1.1  agc <p>
A <i>bound</i> is one of the following, where <i>m</i> and <i>n</i>
    123  1.1  agc are unsigned decimal integers between <tt>0</tt> and
    124  1.1  agc <tt>RE_DUP_MAX</tt>:
    125  1.1  agc </p>
    126  1.1  agc 
    127  1.1  agc <ol>
    128  1.1  agc <li><tt>{</tt><i>m</i><tt>,</tt><i>n</i><tt>}</tt></li>
    129  1.1  agc <li><tt>{</tt><i>m</i><tt>,}</tt></li>
    130  1.1  agc <li><tt>{</tt><i>m</i><tt>}</tt></li>
    131  1.1  agc </ol>
    132  1.1  agc 
    133  1.1  agc <p>
    134  1.1  agc An atom followed by [1] matches a sequence of <i>m</i> through <i>n</i>
    135  1.1  agc (inclusive) matches of the atom.  An atom followed by [2]
    136  1.1  agc matches a sequence of <i>m</i> or more matches of the atom.  An atom
    137  1.1  agc followed by [3] matches a sequence of exactly <i>m</i> matches of the
    138  1.1  agc atom.
    139  1.1  agc </p>
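<p>
The three bound forms can be tried out with a small sketch that reports
the length of the first match.  It uses the POSIX
<code>&lt;regex.h&gt;</code> interface as a stand-in for TRE (an assumption),
and the helper name <code>ere_match_len</code> is hypothetical.
</p>

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns the length of the first match of the ERE
 * `pattern` in `text`, -1 if there is no match, and -2 if the pattern
 * does not compile. */
static int ere_match_len(const char *pattern, const char *text)
{
    regex_t re;
    regmatch_t m[1];
    int rc;

    assert(pattern != NULL && text != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -2;
    rc = regexec(&re, text, 1, m, 0);
    regfree(&re);
    if (rc != 0)
        return -1;
    /* m[0] holds the byte offsets of the whole match. */
    return (int)(m[0].rm_eo - m[0].rm_so);
}
```

<p>
Against the text <code>aaaa</code>, the pattern <code>a{2,3}</code> yields a
match of length 3, <code>a{2,}</code> a match of length 4, and
<code>a{2}</code> a match of length 2.
</p>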
    140  1.1  agc 
    141  1.1  agc 
    142  1.1  agc <p>
    143  1.1  agc Adding a <tt>?</tt> to a repeat operator makes the subexpression minimal, or
    144  1.1  agc non-greedy.  Normally a repeated expression is greedy, that is, it matches as
    145  1.1  agc many characters as possible.  A non-greedy subexpression matches as few
    146  1.1  agc characters as possible.  Note that this does not (always) mean the same thing
    147  1.1  agc as matching as many or few repetitions as possible.  Also note
    148  1.1  agc that <strong>minimal repetitions are not currently supported for approximate
    149  1.1  agc matching</strong>.
    150  1.1  agc </p>
    151  1.1  agc 
    152  1.1  agc <h3>Approximate matching settings</h3>
    153  1.1  agc <a name="approx-settings"></a>
    154  1.1  agc 
    155  1.1  agc <table bgcolor="#e0e0f0" cellpadding="10">
    156  1.1  agc <tr><td>
    157  1.1  agc <pre>
    158  1.1  agc <i>approx-settings</i> ::= <b>"{"</b> <i>count-limits</i>* <b>","</b>? <i>cost-equation</i>? <b>"}"</b>
    159  1.1  agc 
    160  1.1  agc <i>count-limits</i> ::= <b>"+"</b> <i>number</i>?
    161  1.1  agc              |   <b>"-"</b> <i>number</i>?
    162  1.1  agc              |   <b>"#"</b> <i>number</i>?
    163  1.1  agc              |   <b>"~"</b> <i>number</i>?
    164  1.1  agc 
    165  1.1  agc <i>cost-equation</i> ::= ( <i>cost-term</i> "+"? " "? )+ <b>"&lt;"</b> <i>number</i>
    166  1.1  agc 
    167  1.1  agc <i>cost-term</i> ::= <i>number</i> <b>"i"</b>
    168  1.1  agc           |   <i>number</i> <b>"d"</b>
    169  1.1  agc           |   <i>number</i> <b>"s"</b>
    170  1.1  agc 
    171  1.1  agc </pre>
    172  1.1  agc </td></tr>
    173  1.1  agc </table>
    174  1.1  agc 
    175  1.1  agc <p>
    176  1.1  agc The approximate matching settings for a subpattern can be changed
    177  1.1  agc by appending <i>approx-settings</i> to the subpattern.  Limits for
    178  1.1  agc the number of errors can be set and an expression for specifying and
    179  1.1  agc limiting the costs can be given.
    180  1.1  agc </p>
    181  1.1  agc 
    182  1.1  agc <p>
    183  1.1  agc The <i>count-limits</i> can be used to set limits for the number of
    184  1.1  agc insertions (<tt>+</tt>), deletions (<tt>-</tt>), substitutions
    185  1.1  agc (<tt>#</tt>), and total number of errors (<tt>~</tt>).  If the
    186  1.1  agc <i>number</i> part is omitted, the specified error count will be
    187  1.1  agc unlimited.
    188  1.1  agc </p>
    189  1.1  agc 
    190  1.1  agc <p>
    191  1.1  agc The <i>cost-equation</i> can be thought of as a mathematical equation,
    192  1.1  agc where <tt>i</tt>, <tt>d</tt>, and <tt>s</tt> stand for the number of
    193  1.1  agc insertions, deletions, and substitutions, respectively.  The equation
    194  1.1  agc can have a multiplier for each of <tt>i</tt>, <tt>d</tt>, and
    195  1.1  agc <tt>s</tt>.  The multiplier is the cost of the error, and the number
    196  1.1  agc after <tt>&lt;</tt> is the maximum allowed cost of a match.  Spaces
    197  1.1  agc and pluses can be inserted to make the equation readable.  In fact, when
    198  1.1  agc specifying only a cost equation, adding a space after the opening <tt>{</tt>
    199  1.1  agc is <strong>required</strong>.
    200  1.1  agc </p>
    201  1.1  agc 
<p>
Examples:
</p>
<dl>
<dt><tt>{~}</tt></dt>
<dd>Sets the maximum number of errors to unlimited.</dd>
<dt><tt>{~3}</tt></dt>
<dd>Sets the maximum number of errors to three.</dd>
<dt><tt>{+2~5}</tt></dt>
<dd>Sets the maximum number of errors to five, and the maximum number
of insertions to two.</dd>
<dt><tt>{&lt;3}</tt></dt>
<dd>Sets the maximum cost to three.</dd>
<dt><tt>{ 2i + 1d + 2s &lt; 5 }</tt></dt>
<dd>Sets the cost of an insertion to two, a deletion to one, a
substitution to two, and the maximum cost to five.</dd>
</dl>
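<p>
The arithmetic behind a cost equation can be sketched as a plain weighted
sum.  This is only an illustration of the formula, not TRE's implementation;
the function name <code>match_cost</code> and its parameters are
hypothetical, and how the resulting cost is compared against the limit is
left to the matcher.
</p>

```c
#include <assert.h>

/* Hypothetical illustration of a cost equation such as
 * "{ 2i + 1d + 2s < 5 }": each error count is multiplied by its
 * per-error cost and the products are summed. */
static int match_cost(int cost_ins, int cost_del, int cost_sub,
                      int n_ins, int n_del, int n_sub)
{
    assert(n_ins >= 0 && n_del >= 0 && n_sub >= 0);
    return cost_ins * n_ins + cost_del * n_del + cost_sub * n_sub;
}
```

<p>
With the costs from <tt>{ 2i + 1d + 2s &lt; 5 }</tt>, a candidate match with
one insertion and one deletion has cost 2 + 1 = 3, and one with one
insertion and one substitution has cost 2 + 2 = 4.
</p>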
    218  1.1  agc 
    219  1.1  agc 
    220  1.1  agc <h3>Bracket expressions</h3>
    221  1.1  agc <a name="bracket-expression"></a>
    222  1.1  agc 
    223  1.1  agc <table bgcolor="#e0e0f0" cellpadding="10">
    224  1.1  agc <tr><td>
    225  1.1  agc <pre>
    226  1.1  agc <i>bracket-expression</i> ::= <b>"["</b> <i>item</i>+ <b>"]"</b>
    227  1.1  agc                    |   <b>"[^"</b> <i>item</i>+ <b>"]"</b>
    228  1.1  agc </pre>
    229  1.1  agc </td></tr>
    230  1.1  agc </table>
    231  1.1  agc 
    232  1.1  agc <p>
    233  1.1  agc A bracket expression specifies a set of characters by enclosing a
    234  1.1  agc nonempty list of items in brackets.  Normally anything matching any
    235  1.1  agc item in the list is matched.  If the list begins with <tt>^</tt> the
    236  1.1  agc meaning is negated; any character matching no item in the list is
    237  1.1  agc matched.
    238  1.1  agc </p>
    239  1.1  agc 
    240  1.1  agc <p>
    241  1.1  agc An item is any of the following:
    242  1.1  agc </p>
    243  1.1  agc <ul>
    244  1.1  agc <li>A single character, matching that character.</li>
    245  1.1  agc <li>Two characters separated by <tt>-</tt>.  This is shorthand for the
full range of characters between those two (inclusive) in the
    247  1.1  agc collating sequence.  For example, <tt>[0-9]</tt> in ASCII matches any
    248  1.1  agc decimal digit.</li>
    249  1.1  agc <li>A collating element enclosed in <tt>[.</tt> and <tt>.]</tt>,
    250  1.1  agc matching the collating element.  This can be used to include a literal
    251  1.1  agc <tt>-</tt> or a multi-character collating element in the list.</li>
    252  1.1  agc <li>A collating element enclosed in <tt>[=</tt> and <tt>=]</tt> (an
    253  1.1  agc equivalence class), matching all collating elements with the same
    254  1.1  agc primary collation weight as that element, including the element
    255  1.1  agc itself.</li>
    256  1.1  agc <li>The name of a character class enclosed in <tt>[:</tt> and
    257  1.1  agc <tt>:]</tt>, matching any character belonging to the class.  The set
    258  1.1  agc of valid names depends on the <code>LC_CTYPE</code> category of the
    259  1.1  agc current locale, but the following names are valid in all locales:
    260  1.1  agc <ul>
    261  1.1  agc <li><tt>alnum</tt> - alphanumeric characters</li>
    262  1.1  agc <li><tt>alpha</tt> - alphabetic characters</li>
    263  1.1  agc <li><tt>blank</tt> - blank characters</li>
    264  1.1  agc <li><tt>cntrl</tt> - control characters</li>
    265  1.1  agc <li><tt>digit</tt> - decimal digits (0 through 9)</li>
    266  1.1  agc <li><tt>graph</tt> - all printable characters except space</li>
    267  1.1  agc <li><tt>lower</tt> - lower-case letters</li>
    268  1.1  agc <li><tt>print</tt> - printable characters including space</li>
    269  1.1  agc <li><tt>punct</tt> - printable characters not space or alphanumeric</li>
    270  1.1  agc <li><tt>space</tt> - white-space characters</li>
    271  1.1  agc <li><tt>upper</tt> - upper case letters</li>
    272  1.1  agc <li><tt>xdigit</tt> - hexadecimal digits</li>
    273  1.1  agc </ul>
    274  1.1  agc </ul>
    275  1.1  agc <p>
    276  1.1  agc To include a literal <tt>-</tt> in the list, make it either the first
    277  1.1  agc or last item, the second endpoint of a range, or enclose it in
    278  1.1  agc <tt>[.</tt> and <tt>.]</tt> to make it a collating element.  To
    279  1.1  agc include a literal <tt>]</tt> in the list, make it either the first
    280  1.1  agc item, the second endpoint of a range, or enclose it in <tt>[.</tt> and
    281  1.1  agc <tt>.]</tt>.  To use a literal <tt>-</tt> as the first
    282  1.1  agc endpoint of a range, enclose it in <tt>[.</tt> and <tt>.]</tt>.
    283  1.1  agc </p>
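<p>
A few of the bracket expression rules above can be exercised with the
following sketch.  It uses the POSIX <code>&lt;regex.h&gt;</code> interface
as a stand-in for TRE (an assumption), and the helper name
<code>ere_search</code> is hypothetical.
</p>

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the ERE `pattern` matches somewhere
 * in `text`, 0 if it does not, and -1 if the pattern fails to compile. */
static int ere_search(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && text != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

<p>
For example, <code>^[[:digit:]]+$</code> matches <code>2024</code> but not
<code>20a4</code>; <code>^[]a]+$</code> matches <code>]a]</code> because the
<tt>]</tt> is the first item; and <code>^[a-]+$</code> matches
<code>a-a</code> because the <tt>-</tt> is the last item.
</p>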
    284  1.1  agc 
    285  1.1  agc 
    286  1.1  agc <h3>Assertions</h3>
    287  1.1  agc <a name="assertion"></a>
    288  1.1  agc 
    289  1.1  agc <table bgcolor="#e0e0f0" cellpadding="10">
    290  1.1  agc <tr><td>
    291  1.1  agc <pre>
    292  1.1  agc <i>assertion</i> ::= <b>"^"</b>
    293  1.1  agc           |   <b>"$"</b>
    294  1.1  agc           |   <b>"\"</b> <i>assertion-character</i>
    295  1.1  agc </pre>
    296  1.1  agc </td></tr>
    297  1.1  agc </table>
    298  1.1  agc 
    299  1.1  agc <p>
    300  1.1  agc The expressions <tt>^</tt> and <tt>$</tt> are called "left anchor" and
    301  1.1  agc "right anchor", respectively.  The left anchor matches the empty
    302  1.1  agc string at the beginning of the string.  The right anchor matches the
    303  1.1  agc empty string at the end of the string.  The behaviour of both anchors
    304  1.1  agc can be varied by specifying certain execution and compilation flags;
    305  1.1  agc see the <a href="api.html">API manual</a>.
    306  1.1  agc </p>
    307  1.1  agc 
    308  1.1  agc <p>
    309  1.1  agc An assertion-character can be any of the following:
    310  1.1  agc </p>
    311  1.1  agc 
    312  1.1  agc <ul>
<li><tt>&lt;</tt> - Beginning of word</li>
<li><tt>&gt;</tt> - End of word</li>
<li><tt>b</tt> - Word boundary</li>
<li><tt>B</tt> - Non-word boundary</li>
    317  1.1  agc <li><tt>d</tt> - Digit character (equivalent to <tt>[[:digit:]]</tt>)</li>
    318  1.1  agc <li><tt>D</tt> - Non-digit character (equivalent to <tt>[^[:digit:]]</tt>)</li>
    319  1.1  agc <li><tt>s</tt> - Space character (equivalent to <tt>[[:space:]]</tt>)</li>
    320  1.1  agc <li><tt>S</tt> - Non-space character (equivalent to <tt>[^[:space:]]</tt>)</li>
    321  1.1  agc <li><tt>w</tt> - Word character (equivalent to <tt>[[:alnum:]_]</tt>)</li>
    322  1.1  agc <li><tt>W</tt> - Non-word character (equivalent to <tt>[^[:alnum:]_]</tt>)</li>
    323  1.1  agc </ul>
    324  1.1  agc 
    325  1.1  agc 
    326  1.1  agc <h3>Literals</h3>
    327  1.1  agc <a name="literal"></a>
    328  1.1  agc 
    329  1.1  agc <table bgcolor="#e0e0f0" cellpadding="10">
    330  1.1  agc <tr><td>
    331  1.1  agc <pre>
<i>literal</i> ::= <i>ordinary-character</i>
        |   <b>"\x"</b> [<b>"0"</b>-<b>"9"</b> <b>"a"</b>-<b>"f"</b> <b>"A"</b>-<b>"F"</b>]{0,2}
        |   <b>"\x{"</b> [<b>"0"</b>-<b>"9"</b> <b>"a"</b>-<b>"f"</b> <b>"A"</b>-<b>"F"</b>]* <b>"}"</b>
        |   <b>"\"</b> <i>character</i>
    336  1.1  agc </pre>
    337  1.1  agc </td></tr>
    338  1.1  agc </table>
    339  1.1  agc <p>
A literal is either an ordinary character (a character that has no
other significance in the context), an 8-bit hexadecimal-encoded
character (e.g. <tt>\x1B</tt>), a wide hexadecimal-encoded character
(e.g. <tt>\x{263a}</tt>), or an escaped character.  An escaped
    344  1.1  agc character is a <tt>\</tt> followed by any character, and matches that
    345  1.1  agc character.  Escaping can be used to match characters which have a
    346  1.1  agc special meaning in regexp syntax.  A <tt>\</tt> cannot be the last
    347  1.1  agc character of an ERE.  Escaping also allows you to include a few
    348  1.1  agc non-printable characters in the regular expression.  These special
    349  1.1  agc escape sequences include:
    350  1.1  agc </p>
    351  1.1  agc 
    352  1.1  agc <ul>
<li><tt>\a</tt> - Bell character (ASCII code 7)</li>
<li><tt>\e</tt> - Escape character (ASCII code 27)</li>
<li><tt>\f</tt> - Form-feed character (ASCII code 12)</li>
<li><tt>\n</tt> - New-line/line-feed character (ASCII code 10)</li>
<li><tt>\r</tt> - Carriage return character (ASCII code 13)</li>
<li><tt>\t</tt> - Horizontal tab character (ASCII code 9)</li>
    359  1.1  agc </ul>
    360  1.1  agc 
    361  1.1  agc <p>
    362  1.1  agc An ordinary character is just a single character with no other
significance, and matches that character.  A <tt>{</tt> followed by
anything other than a digit is considered an ordinary character.
    365  1.1  agc </p>
    366  1.1  agc 
    367  1.1  agc 
    368  1.1  agc <h3>Back references</h3>
    369  1.1  agc <a name="backref"></a>
    370  1.1  agc 
    371  1.1  agc <table bgcolor="#e0e0f0" cellpadding="10">
    372  1.1  agc <tr><td>
    373  1.1  agc <pre>
    374  1.1  agc <i>back-reference</i> ::= <b>"\"</b> [<b>"1"</b>-<b>"9"</b>]
    375  1.1  agc </pre>
    376  1.1  agc </td></tr>
    377  1.1  agc </table>
    378  1.1  agc <p>
    379  1.1  agc A back reference is a backslash followed by a single non-zero decimal
    380  1.1  agc digit <i>d</i>.  It matches the same sequence of characters
    381  1.1  agc matched by the <i>d</i>th parenthesized subexpression.
    382  1.1  agc </p>
    383  1.1  agc 
    384  1.1  agc <p>
    385  1.1  agc Back references are not defined for POSIX EREs (for BREs they are),
    386  1.1  agc but many matchers, including TRE, implement back references for both
    387  1.1  agc EREs and BREs.
    388  1.1  agc </p>
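<p>
A back reference in an ERE can be sketched as follows.  The example uses the
POSIX <code>&lt;regex.h&gt;</code> interface as a stand-in for TRE (an
assumption); since POSIX does not define back references for EREs, whether a
given system <code>regcomp</code> accepts the pattern depends on the
implementation.  The helper name <code>ere_search</code> is hypothetical.
</p>

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the ERE `pattern` matches somewhere
 * in `text`, 0 if it does not, and -1 if the pattern fails to compile
 * (which may happen on matchers without ERE back reference support). */
static int ere_search(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && text != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

<p>
On a matcher that supports it, the pattern <code>^(a+)b\1$</code> (written
<code>"^(a+)b\\1$"</code> in C source) matches <code>aabaa</code>, where
<code>\1</code> repeats the <code>aa</code> captured by the first group, but
does not match <code>aaba</code>.
</p>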
    389  1.1  agc 
    390  1.1  agc <h3>Options</h3>
    391  1.1  agc <a name="options"></a>
    392  1.1  agc <table bgcolor="#e0e0f0" cellpadding="10">
    393  1.1  agc <tr><td>
    394  1.1  agc <pre>
    395  1.1  agc <i>options</i> ::= [<b>"i" "n" "r" "U"</b>]* (<b>"-"</b> [<b>"i" "n" "r" "U"</b>]*)?
    396  1.1  agc </pre>
    397  1.1  agc </td></tr>
    398  1.1  agc </table>
    399  1.1  agc 
<p>
Options allow compile-time options to be turned on or off for particular parts of the
regular expression.  The options correspond to compile-time flags passed to
the <code>regcomp</code> API function.  An option specified in the first section is
turned on; an option specified in the second section (after the <tt>-</tt>) is
turned off.
</p>
<ul>
<li><tt>i</tt> - Case insensitive.</li>
<li><tt>n</tt> - Forces special handling of the newline character.  See the <code>REG_NEWLINE</code> flag in
the <a href="tre-api.html">API Manual</a>.</li>
<li><tt>r</tt> - Causes the regex to be matched in a right associative manner rather than the normal
left associative manner.</li>
<li><tt>U</tt> - Forces repetition operators to be non-greedy unless a <tt>?</tt> is appended.</li>
</ul>
    413  1.1  agc <h2>BRE Syntax</h2>
    414  1.1  agc 
    415  1.1  agc <p>
    416  1.1  agc The obsolete basic regexp (BRE) syntax differs from the ERE syntax as
    417  1.1  agc follows:
    418  1.1  agc </p>
    419  1.1  agc 
    420  1.1  agc <ul>
<li><tt>|</tt> is an ordinary character, and there is no equivalent
for its functionality.  <tt>+</tt> and <tt>?</tt> are also ordinary
characters.</li>
    424  1.1  agc <li>The delimiters for bounds are <tt>\{</tt> and <tt>\}</tt>, with
    425  1.1  agc <tt>{</tt> and <tt>}</tt> by themselves ordinary characters.</li>
    426  1.1  agc <li>The parentheses for nested subexpressions are <tt>\(</tt> and
    427  1.1  agc <tt>\)</tt>, with <tt>(</tt> and <tt>)</tt> by themselves ordinary
    428  1.1  agc characters.</li>
    429  1.1  agc <li><tt>^</tt> is an ordinary character except at the beginning of the
    430  1.1  agc RE or the beginning of a parenthesized subexpression.  Similarly,
    431  1.1  agc <tt>$</tt> is an ordinary character except at the end of the
    432  1.1  agc RE or the end of a parenthesized subexpression.</li>
    433  1.1  agc </ul>
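<p>
The BRE differences listed above can be checked with a short sketch that
compiles patterns without <code>REG_EXTENDED</code>.  It uses the POSIX
<code>&lt;regex.h&gt;</code> interface as a stand-in for TRE (an assumption),
and the helper name <code>bre_search</code> is hypothetical.
</p>

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the BRE `pattern` matches somewhere
 * in `text`, 0 if it does not, and -1 if the pattern fails to compile.
 * Omitting REG_EXTENDED selects the basic (BRE) syntax. */
static int bre_search(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && text != NULL);
    if (regcomp(&re, pattern, REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

<p>
In BRE syntax <code>a|b</code> matches only the literal text
<code>a|b</code>, <code>a+</code> matches only the literal text
<code>a+</code>, and bounds are written with escaped braces, as in
<code>a\{2\}</code>.
</p>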