- [/
- Copyright 2006-2007 John Maddock.
- Distributed under the Boost Software License, Version 1.0.
- (See accompanying file LICENSE_1_0.txt or copy at
- http://www.boost.org/LICENSE_1_0.txt).
- ]
- [section:syntax_option_type syntax_option_type]
- [section:syntax_option_type_synopsis syntax_option_type Synopsis]
- Type [syntax_option_type] is an implementation specific bitmask type
- that controls how a regular expression string is to be interpreted.
- For convenience, note that all the constants listed here are also
- duplicated within the scope of class template [basic_regex].
- namespace std{ namespace regex_constants{
- typedef implementation-specific-bitmask-type syntax_option_type;
- // these flags are standardized:
- static const syntax_option_type normal;
- static const syntax_option_type ECMAScript = normal;
- static const syntax_option_type JavaScript = normal;
- static const syntax_option_type JScript = normal;
- static const syntax_option_type perl = normal;
- static const syntax_option_type basic;
- static const syntax_option_type sed = basic;
- static const syntax_option_type extended;
- static const syntax_option_type awk;
- static const syntax_option_type grep;
- static const syntax_option_type egrep;
- static const syntax_option_type icase;
- static const syntax_option_type nosubs;
- static const syntax_option_type optimize;
- static const syntax_option_type collate;
-
- //
- // The remaining options are specific to Boost.Regex:
- //
-
- // Options common to both Perl and POSIX regular expressions:
- static const syntax_option_type newline_alt;
- static const syntax_option_type no_except;
- static const syntax_option_type save_subexpression_location;
-
- // Perl specific options:
- static const syntax_option_type no_mod_m;
- static const syntax_option_type no_mod_s;
- static const syntax_option_type mod_s;
- static const syntax_option_type mod_x;
- static const syntax_option_type no_empty_expressions;
-
- // POSIX extended specific options:
- static const syntax_option_type no_escape_in_lists;
- static const syntax_option_type no_bk_refs;
-
- // POSIX basic specific options:
- static const syntax_option_type no_escape_in_lists;
- static const syntax_option_type no_char_classes;
- static const syntax_option_type no_intervals;
- static const syntax_option_type bk_plus_qm;
- static const syntax_option_type bk_vbar;
- } // namespace regex_constants
- } // namespace std
- [endsect]
- [section:syntax_option_type_overview Overview of syntax_option_type]
- The type [syntax_option_type] is an implementation specific bitmask type
- (see C++ standard 17.3.2.1.2). Setting its elements has the effects listed
- in the tables in the sections below. A valid value of type [syntax_option_type] will always
- have exactly one of the elements `normal`, `basic`, `extended`,
- `awk`, `grep`, `egrep`, `sed`, `literal` or `perl` set.
- Note that for convenience all the constants listed here are duplicated within
- the scope of class template [basic_regex], so you can use any of:
- boost::regex_constants::constant_name
- or
- boost::regex::constant_name
- or
- boost::wregex::constant_name
- in an interchangeable manner.
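As a minimal sketch of how these constants combine, the example below ORs one grammar flag with one modifier, and checks that the class-scope and namespace-scope spellings name the same value. It assumes `std::regex`, which provides the standardized subset of these flags, rather than `boost::regex`, so that it builds without Boost. The helper names `matches_ignoring_case` and `scopes_agree` are illustrative and not part of either library.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Compile `pattern` with exactly one grammar flag (ECMAScript) combined
// with one modifier (icase), then test whether it matches all of `text`.
bool matches_ignoring_case(const std::string& text, const std::string& pattern) {
    const std::regex::flag_type flags = std::regex::ECMAScript | std::regex::icase;
    const std::regex re(pattern, flags);
    return std::regex_match(text, re);
}

// The constant nested in the regex class and the one at namespace scope
// are expected to be the same bitmask element.
bool scopes_agree() {
    return std::regex::icase == std::regex_constants::icase;
}
```

With Boost.Regex the same pattern of use would be spelled with `boost::regex::icase` or `boost::regex_constants::icase` in place of the `std` names.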
- [endsect]
- [section:syntax_option_type_perl Options for Perl Regular Expressions]
- One of the following must always be set for perl regular expressions:
- [table
- [[Element][Standardized][Effect when set]]
- [[ECMAScript][Yes][Specifies that the grammar recognized by the regular
- expression engine uses its normal semantics: that is the same as
- that given in the ECMA-262, ECMAScript Language Specification,
- Chapter 15 part 10, RegExp (Regular Expression) Objects (FWD.1).
-
- This is functionally identical to the
- [link boost_regex.syntax.perl_syntax Perl regular expression syntax].
- Boost.Regex also recognizes all of the perl-compatible `(?...)`
- extensions in this mode.]]
- [[perl][No][As above.]]
- [[normal][No][As above.]]
- [[JavaScript][No][As above.]]
- [[JScript][No][As above.]]
- ]
- The following options may also be set when using perl-style regular expressions:
- [table
- [[Element][Standardized][Effect when set]]
- [[icase][Yes][Specifies that matching of regular expressions against a
- character container sequence shall be performed without regard to case.]]
- [[nosubs][Yes][Specifies that when a regular expression is matched against
- a character container sequence, then no sub-expression matches are
- to be stored in the supplied [match_results] structure.]]
- [[optimize][Yes][Specifies that the regular expression engine should pay
- more attention to the speed with which regular expressions are matched,
- and less to the speed with which regular expression objects are
- constructed. Otherwise it has no detectable effect on the program output.
- This currently has no effect for Boost.Regex.]]
- [[collate][Yes][Specifies that character ranges of the form `[a-b]` should be
- locale sensitive.]]
- [[newline_alt][No][Specifies that the \\n character has the same effect as
- the alternation operator |. Allows newline separated lists to be
- used as a list of alternatives.]]
- [[no_except][No][Prevents [basic_regex] from throwing an exception when an
- invalid expression is encountered.]]
- [[no_mod_m][No][Normally Boost.Regex behaves as if the Perl m-modifier is on,
- so the assertions ^ and $ match after and before embedded
- newlines respectively. Setting this flag is equivalent to prefixing
- the expression with (?-m).]]
- [[no_mod_s][No][Normally whether Boost.Regex will match "." against a
- newline character is determined by the match flag `match_not_dot_newline`.
- Specifying this flag is equivalent to prefixing the expression with `(?-s)`
- and therefore causes "." not to match a newline character regardless of
- whether `match_not_dot_newline` is set in the match flags.]]
- [[mod_s][No][Normally whether Boost.Regex will match "." against a newline
- character is determined by the match flag `match_not_dot_newline`.
- Specifying this flag is equivalent to prefixing the expression with `(?s)`
- and therefore causes "." to match a newline character regardless of
- whether `match_not_dot_newline` is set in the match flags.]]
- [[mod_x][No][Turns on the perl x-modifier: causes unescaped whitespace
- in the expression to be ignored.]]
- [[no_empty_expressions][No][When set then empty expressions/alternatives are prohibited.]]
- [[save_subexpression_location][No][When set then the locations of individual
- sub-expressions within the ['original regular expression string] can be accessed
- via the [link boost_regex.basic_regex.subexpression `subexpression()`] member function of `basic_regex`.]]
- ]
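The effect of `nosubs` described in the table above can be sketched as follows. This is an illustration under stated assumptions: it uses `std::regex`, which shares the standardized `nosubs` flag, so that it builds without Boost, and the helper name `stored_matches` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Return how many entries a successful full match stores in match_results:
// the whole match plus one entry per stored sub-expression.
// Returns 0 when the text does not match at all.
std::size_t stored_matches(const std::string& text,
                           const std::string& pattern,
                           std::regex::flag_type flags) {
    const std::regex re(pattern, flags);
    std::smatch what;
    return std::regex_match(text, what, re) ? what.size() : 0;
}
```

Calling it with `std::regex::ECMAScript` and then with `std::regex::ECMAScript | std::regex::nosubs` on the same pattern shows the difference: with `nosubs` only the entry for the whole match remains.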
- [endsect]
- [section:syntax_option_type_extended Options for POSIX Extended Regular Expressions]
- Exactly one of the following must always be set for
- [link boost_regex.syntax.basic_extended POSIX extended
- regular expressions]:
- [table
- [[Element][Standardized][Effect when set]]
- [[extended][Yes][Specifies that the grammar recognized by the regular
- expression engine is the same as that used by POSIX extended regular
- expressions in IEEE Std 1003.1-2001, Portable Operating System Interface
- (POSIX), Base Definitions and Headers, Section 9, Regular Expressions (FWD.1).
-
- Refer to the [link boost_regex.syntax.basic_extended POSIX extended
- regular expression guide] for more information.
- In addition, some Perl-style escape sequences are supported
- (the POSIX standard specifies that only "special" characters may be
- escaped; all other escape sequences result in undefined behavior).]]
- [[egrep][Yes][Specifies that the grammar recognized by the regular expression
- engine is the same as that used by POSIX utility grep when given the
- -E option in IEEE Std 1003.1-2001, Portable Operating System
- Interface (POSIX), Shells and Utilities, Section 4, Utilities, grep (FWD.1).
- That is to say, the same as [link boost_regex.syntax.basic_extended
- POSIX extended syntax], but with the newline character acting as an
- alternation character in addition to "|".]]
- [[awk][Yes][Specifies that the grammar recognized by the regular
- expression engine is the same as that used by POSIX utility awk
- in IEEE Std 1003.1-2001, Portable Operating System Interface (POSIX),
- Shells and Utilities, Section 4, awk (FWD.1).
- That is to say: the same as [link boost_regex.syntax.basic_extended
- POSIX extended syntax], but with escape sequences in character
- classes permitted.
- In addition, some Perl-style escape sequences are supported (the awk
- syntax only requires \\a \\b \\t \\v \\f \\n and \\r to be
- recognised; all other Perl-style escape sequences invoke undefined
- behavior according to the POSIX standard, but are in fact
- recognised by Boost.Regex).]]
- ]
- The following options may also be set when using POSIX extended regular expressions:
- [table
- [[Element][Standardized][Effect when set]]
- [[icase][Yes][Specifies that matching of regular expressions against a
- character container sequence shall be performed without regard to case.]]
- [[nosubs][Yes][Specifies that when a regular expression is matched against a
- character container sequence, then no sub-expression matches are
- to be stored in the supplied [match_results] structure.]]
- [[optimize][Yes][Specifies that the regular expression engine should pay
- more attention to the speed with which regular expressions are matched,
- and less to the speed with which regular expression objects are
- constructed. Otherwise it has no detectable effect on the program output.
- This currently has no effect for Boost.Regex.]]
- [[collate][Yes][Specifies that character ranges of the form `[a-b]` should be
- locale sensitive. This bit is on by default for POSIX-Extended
- regular expressions, but can be unset to force ranges to be compared
- by code point only.]]
- [[newline_alt][No][Specifies that the \\n character has the same effect as
- the alternation operator |. Allows newline separated lists to be used
- as a list of alternatives.]]
- [[no_escape_in_lists][No][When set this makes the escape character ordinary
- inside lists, so that `[\b]` would match either '\\' or 'b'. This bit
- is on by default for POSIX-Extended regular expressions, but can be
- unset to force escapes to be recognised inside lists.]]
- [[no_bk_refs][No][When set then backreferences are disabled. This bit is on
- by default for POSIX-Extended regular expressions, but can be unset
- to turn support for backreferences on.]]
- [[no_except][No][Prevents [basic_regex] from throwing an exception when
- an invalid expression is encountered.]]
- [[save_subexpression_location][No][When set then the locations of individual
- sub-expressions within the ['original regular expression string] can be accessed
- via the [link boost_regex.basic_regex.subexpression `subexpression()`] member function of `basic_regex`.]]
- ]
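As a rough illustration of selecting the POSIX extended grammar, the sketch below compiles a pattern with the standardized `extended` flag. It assumes `std::regex` rather than `boost::regex` so that it builds without Boost, which means the Boost-specific options in the table above (for example `no_bk_refs`) are not exercised. The helper name `extended_match` is hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Full-match `text` against `pattern` interpreted as a POSIX extended
// regular expression: unescaped ( ) group, { } bound repeats, | alternates.
bool extended_match(const std::string& text, const std::string& pattern) {
    const std::regex re(pattern, std::regex::extended);
    return std::regex_match(text, re);
}
```

For instance, the pattern `(ab){2}|c` accepts either two repetitions of `ab` or the single character `c`.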
- [endsect]
- [section:syntax_option_type_basic Options for POSIX Basic Regular Expressions]
- Exactly one of the following must always be set for POSIX basic regular expressions:
- [table
- [[Element][Standardized][Effect when set]]
- [[basic][Yes][Specifies that the grammar recognized by the regular expression
- engine is the same as that used by
- [link boost_regex.syntax.basic_syntax POSIX basic regular expressions] in IEEE Std 1003.1-2001, Portable
- Operating System Interface (POSIX), Base Definitions and Headers,
- Section 9, Regular Expressions (FWD.1).]]
- [[sed][No][As above.]]
- [[grep][Yes][Specifies that the grammar recognized by the regular
- expression engine is the same as that used by
- POSIX utility `grep` in IEEE Std 1003.1-2001, Portable Operating
- System Interface (POSIX), Shells and Utilities, Section 4,
- Utilities, grep (FWD.1).
- That is to say, the same as [link boost_regex.syntax.basic_syntax
- POSIX basic syntax], but with the newline character acting as an
- alternation character; the expression is treated as a newline
- separated list of alternatives.]]
- [[emacs][No][Specifies that the grammar recognised is the superset of the
- [link boost_regex.syntax.basic_syntax POSIX-Basic syntax] used by
- the emacs program.]]
- ]
- The following options may also be set when using POSIX basic regular expressions:
- [table
- [[Element][Standardized][Effect when set]]
- [[icase][Yes][Specifies that matching of regular expressions against a
- character container sequence shall be performed without regard to case.]]
- [[nosubs][Yes][Specifies that when a regular expression is matched against
- a character container sequence, then no sub-expression matches are
- to be stored in the supplied [match_results] structure.]]
- [[optimize][Yes][Specifies that the regular expression engine should pay
- more attention to the speed with which regular expressions are
- matched, and less to the speed with which regular expression objects
- are constructed. Otherwise it has no detectable effect on the program output.
- This currently has no effect for Boost.Regex.]]
- [[collate][Yes][Specifies that character ranges of the form `[a-b]` should
- be locale sensitive. This bit is on by default for
- [link boost_regex.syntax.basic_syntax POSIX-Basic regular expressions],
- but can be unset to force ranges to be compared by code point only.]]
- [[newline_alt][No][Specifies that the \\n character has the same effect as the
- alternation operator |. Allows newline separated lists to be used
- as a list of alternatives. This bit is already set if you use the
- `grep` option.]]
- [[no_char_classes][No][When set then character classes such as `[[:alnum:]]`
- are not allowed.]]
- [[no_escape_in_lists][No][When set this makes the escape character ordinary
- inside lists, so that `[\b]` would match either '\\' or 'b'. This bit
- is on by default for [link boost_regex.syntax.basic_syntax POSIX-basic
- regular expressions], but can be unset to force escapes to be recognised
- inside lists.]]
- [[no_intervals][No][When set then bounded repeats such as a{2,3} are not permitted.]]
- [[bk_plus_qm][No][When set then `\?` acts as a zero-or-one repeat operator,
- and `\+` acts as a one-or-more repeat operator.]]
- [[bk_vbar][No][When set then `\|` acts as the alternation operator.]]
- [[no_except][No][Prevents [basic_regex] from throwing an exception when an
- invalid expression is encountered.]]
- [[save_subexpression_location][No][When set then the locations of individual
- sub-expressions within the ['original regular expression string] can be accessed
- via the [link boost_regex.basic_regex.subexpression `subexpression()`] member function of `basic_regex`.]]
- ]
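The difference between the basic grammar and the default grammar can be sketched by compiling the same pattern under each. The example assumes `std::regex`, which shares the standardized `basic` and `ECMAScript` flags, so that it builds without Boost; Boost-specific options such as `bk_plus_qm` and `bk_vbar` have no `std` counterpart and are not shown. The helper name `full_match` is hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Full-match `text` against `pattern` under the given grammar flag.
bool full_match(const std::string& text,
                const std::string& pattern,
                std::regex::flag_type grammar) {
    const std::regex re(pattern, grammar);
    return std::regex_match(text, re);
}
```

Under `basic`, an unescaped `+` is an ordinary character and groups are written `\(` ... `\)`, so `a+` matches the two-character text `a+` rather than a run of `a` characters, whereas under `ECMAScript` the same pattern matches `aa`.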
- [endsect]
- [section:syntax_option_type_literal Options for Literal Strings]
- The following must always be set to interpret the expression as a string literal:
- [table
- [[Element][Standardized][Effect when set]]
- [[literal][Yes][Treat the string as a literal (no special characters).]]
- ]
- The following options may also be combined with the literal flag:
- [table
- [[Element][Standardized][Effect when set]]
- [[icase][Yes][Specifies that matching of regular expressions against a
- character container sequence shall be performed without regard to case.]]
- [[optimize][Yes][Specifies that the regular expression engine should pay
- more attention to the speed with which regular expressions are matched,
- and less to the speed with which regular expression objects are constructed.
- Otherwise it has no detectable effect on the program output. This
- currently has no effect for Boost.Regex.]]
- ]
- [endsect]
- [endsect]