[/
  Copyright 2006-2007 John Maddock.
  Distributed under the Boost Software License, Version 1.0.
  (See accompanying file LICENSE_1_0.txt or copy at
  http://www.boost.org/LICENSE_1_0.txt).
]

[section:syntax_option_type syntax_option_type]

[section:syntax_option_type_synopsis syntax_option_type Synopsis]

Type [syntax_option_type] is an implementation-specific bitmask type
that controls how a regular expression string is to be interpreted.
For convenience, all the constants listed here are also
duplicated within the scope of class template [basic_regex].

   namespace boost{ namespace regex_constants{

   typedef implementation-specific-bitmask-type syntax_option_type;

   // these flags are standardized:
   static const syntax_option_type normal;
   static const syntax_option_type ECMAScript = normal;
   static const syntax_option_type JavaScript = normal;
   static const syntax_option_type JScript = normal;
   static const syntax_option_type perl = normal;
   static const syntax_option_type basic;
   static const syntax_option_type sed = basic;
   static const syntax_option_type extended;
   static const syntax_option_type awk;
   static const syntax_option_type grep;
   static const syntax_option_type egrep;
   static const syntax_option_type icase;
   static const syntax_option_type nosubs;
   static const syntax_option_type optimize;
   static const syntax_option_type collate;

   //
   // The remaining options are specific to Boost.Regex:
   //

   // Options common to both Perl and POSIX regular expressions:
   static const syntax_option_type newline_alt;
   static const syntax_option_type no_except;
   static const syntax_option_type save_subexpression_location;

   // Perl specific options:
   static const syntax_option_type no_mod_m;
   static const syntax_option_type no_mod_s;
   static const syntax_option_type mod_s;
   static const syntax_option_type mod_x;
   static const syntax_option_type no_empty_expressions;

   // POSIX extended specific options:
   static const syntax_option_type no_escape_in_lists;
   static const syntax_option_type no_bk_refs;

   // POSIX basic specific options:
   static const syntax_option_type no_escape_in_lists;
   static const syntax_option_type no_char_classes;
   static const syntax_option_type no_intervals;
   static const syntax_option_type bk_plus_qm;
   static const syntax_option_type bk_vbar;

   } // namespace regex_constants
   } // namespace boost

[endsect]

[section:syntax_option_type_overview Overview of syntax_option_type]

The type [syntax_option_type] is an implementation-specific bitmask type
(see C++ standard 17.3.2.1.2). Setting its elements has the effects listed
in the tables in the following sections. A valid value of type
[syntax_option_type] will always have exactly one of the elements
`normal`, `basic`, `extended`, `awk`, `grep`, `egrep`, `sed`, `literal`
or `perl` set.

Note that for convenience all the constants listed here are duplicated within
the scope of class template [basic_regex], so you can use any of:

   boost::regex_constants::constant_name

or

   boost::regex::constant_name

or

   boost::wregex::constant_name

in an interchangeable manner.

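
The interchangeable spellings can be illustrated with a minimal sketch. So that it stays self-contained it uses the standard library's `std::regex` in place of Boost.Regex, which is an assumption of this example: with Boost.Regex the corresponding names would be `boost::regex_constants::icase`, `boost::regex::icase` and `boost::wregex::icase`. The helper names `full_match` and `interchangeable_names_example` are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: compiles `pattern` with `flags` and tests whether
// the whole of `text` matches. std::regex stands in for boost::regex here.
bool full_match(const std::string& text, const std::string& pattern,
                std::regex::flag_type flags)
{
    const std::regex re(pattern, flags);
    return std::regex_match(text, re);
}

// Hypothetical driver: the same flag reached through three scopes.
void interchangeable_names_example()
{
    // All three spellings denote the same bitmask element.
    assert(std::regex_constants::icase == std::regex::icase);
    assert(std::regex::icase == std::wregex::icase);

    // Mixing scopes within one flag expression is therefore harmless.
    assert(full_match("HELLO", "hello",
                      std::regex_constants::ECMAScript | std::wregex::icase));

    // Without icase the comparison is case sensitive.
    assert(!full_match("HELLO", "hello", std::regex::ECMAScript));
}
```

Which scope to use is a matter of taste; the class-scope spelling tends to be shortest when a `regex` typedef is already in use.
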

[endsect]

[section:syntax_option_type_perl Options for Perl Regular Expressions]

One of the following must always be set for Perl regular expressions:

[table
[[Element][Standardized][Effect when set]]
[[ECMAScript][Yes][Specifies that the grammar recognized by the regular
expression engine uses its normal semantics: that is the same as
that given in the ECMA-262, ECMAScript Language Specification,
Chapter 15 part 10, RegExp (Regular Expression) Objects (FWD.1).

This is functionally identical to the
[link boost_regex.syntax.perl_syntax Perl regular expression syntax].

Boost.Regex also recognizes all of the Perl-compatible `(?...)`
extensions in this mode.]]
[[perl][No][As above.]]
[[normal][No][As above.]]
[[JavaScript][No][As above.]]
[[JScript][No][As above.]]
]

The following options may also be set when using Perl-style regular expressions:

[table
[[Element][Standardized][Effect when set]]
[[icase][Yes][Specifies that matching of regular expressions against a
character container sequence shall be performed without regard to case.]]
[[nosubs][Yes][Specifies that when a regular expression is matched against
a character container sequence, then no sub-expression matches are
to be stored in the supplied [match_results] structure.]]
[[optimize][Yes][Specifies that the regular expression engine should pay
more attention to the speed with which regular expressions are matched,
and less to the speed with which regular expression objects are
constructed. Otherwise it has no detectable effect on the program output.
This currently has no effect for Boost.Regex.]]
[[collate][Yes][Specifies that character ranges of the form `[a-b]` should be
locale sensitive.]]
[[newline_alt][No][Specifies that the \\n character has the same effect as
the alternation operator |. Allows newline separated lists to be
used as a list of alternatives.]]
[[no_except][No][Prevents [basic_regex] from throwing an exception when an
invalid expression is encountered.]]
[[no_mod_m][No][Normally Boost.Regex behaves as if the Perl m-modifier is on,
so the assertions ^ and $ match after and before embedded
newlines respectively. Setting this flag is equivalent to prefixing
the expression with `(?-m)`.]]
[[no_mod_s][No][Normally whether Boost.Regex will match "." against a
newline character is determined by the match flag `match_not_dot_newline`.
Specifying this flag is equivalent to prefixing the expression with `(?-s)`
and therefore causes "." not to match a newline character regardless of
whether `match_not_dot_newline` is set in the match flags.]]
[[mod_s][No][Normally whether Boost.Regex will match "." against a newline
character is determined by the match flag `match_not_dot_newline`.
Specifying this flag is equivalent to prefixing the expression with `(?s)`
and therefore causes "." to match a newline character regardless of
whether `match_not_dot_newline` is set in the match flags.]]
[[mod_x][No][Turns on the Perl x-modifier: causes unescaped whitespace
in the expression to be ignored.]]
[[no_empty_expressions][No][When set, empty expressions and empty alternatives are prohibited.]]
[[save_subexpression_location][No][When set, the locations of individual
sub-expressions within the ['original regular expression string] can be accessed
via the [link boost_regex.basic_regex.subexpression `subexpression()`] member function of `basic_regex`.]]
]

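
As a rough illustration of how the standardized flags in the table above combine, the sketch below ORs `icase` and `nosubs` onto the `ECMAScript` grammar. It uses the standard library's `std::regex` rather than Boost.Regex so that it is self-contained; the assumption is that `boost::regex` accepts the same standardized flags with the same meaning. The Boost-specific flags (`no_mod_m`, `mod_s` and so on) have no standard counterpart and are not shown. The names `stored_matches` and `perl_style_options_example` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: number of entries a successful full match stores in
// match_results (entry 0 is the whole match, the rest are sub-expressions).
// Returns 0 when the text does not match at all.
std::size_t stored_matches(const std::string& text, const std::string& pattern,
                           std::regex::flag_type flags)
{
    const std::regex re(pattern, flags);
    std::smatch what;
    return std::regex_match(text, what, re) ? what.size() : 0;
}

void perl_style_options_example()
{
    using namespace std::regex_constants;

    // One capturing group: the whole match plus one sub-expression.
    assert(stored_matches("2024-01", "(\\d+)-\\d+", ECMAScript) == 2);

    // nosubs: only the whole match is recorded.
    assert(stored_matches("2024-01", "(\\d+)-\\d+", ECMAScript | nosubs) == 1);

    // Option flags are combined with the grammar flag using bitwise OR.
    assert(stored_matches("ABC", "(a)bc", ECMAScript | icase | nosubs) == 1);

    // Without icase the same text does not match.
    assert(stored_matches("ABC", "(a)bc", ECMAScript) == 0);
}
```

Exactly one grammar flag appears in each flag expression, with any number of option flags ORed onto it.
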

[endsect]

[section:syntax_option_type_extended Options for POSIX Extended Regular Expressions]

Exactly one of the following must always be set for
[link boost_regex.syntax.basic_extended POSIX extended
regular expressions]:

[table
[[Element][Standardized][Effect when set]]
[[extended][Yes][Specifies that the grammar recognized by the regular
expression engine is the same as that used by POSIX extended regular
expressions in IEEE Std 1003.1-2001, Portable Operating System Interface
(POSIX), Base Definitions and Headers, Section 9, Regular Expressions (FWD.1).

Refer to the [link boost_regex.syntax.basic_extended POSIX extended
regular expression guide] for more information.

In addition, some Perl-style escape sequences are supported
(the POSIX standard specifies that only "special" characters may be
escaped; all other escape sequences result in undefined behavior).]]
[[egrep][Yes][Specifies that the grammar recognized by the regular expression
engine is the same as that used by POSIX utility grep when given the
-E option in IEEE Std 1003.1-2001, Portable Operating System
Interface (POSIX), Shells and Utilities, Section 4, Utilities, grep (FWD.1).

That is to say, the same as [link boost_regex.syntax.basic_extended
POSIX extended syntax], but with the newline character acting as an
alternation character in addition to "|".]]
[[awk][Yes][Specifies that the grammar recognized by the regular
expression engine is the same as that used by POSIX utility awk
in IEEE Std 1003.1-2001, Portable Operating System Interface (POSIX),
Shells and Utilities, Section 4, awk (FWD.1).

That is to say: the same as [link boost_regex.syntax.basic_extended
POSIX extended syntax], but with escape sequences in character
classes permitted.

In addition, some Perl-style escape sequences are supported. The awk
syntax only requires \\a \\b \\t \\v \\f \\n and \\r to be
recognised; all other Perl-style escape sequences invoke undefined
behavior according to the POSIX standard, but are in fact
recognised by Boost.Regex.]]
]

The following options may also be set when using POSIX extended regular expressions:

[table
[[Element][Standardized][Effect when set]]
[[icase][Yes][Specifies that matching of regular expressions against a
character container sequence shall be performed without regard to case.]]
[[nosubs][Yes][Specifies that when a regular expression is matched against a
character container sequence, then no sub-expression matches are
to be stored in the supplied [match_results] structure.]]
[[optimize][Yes][Specifies that the regular expression engine should pay
more attention to the speed with which regular expressions are matched,
and less to the speed with which regular expression objects are
constructed. Otherwise it has no detectable effect on the program output.
This currently has no effect for Boost.Regex.]]
[[collate][Yes][Specifies that character ranges of the form `[a-b]` should be
locale sensitive. This bit is on by default for POSIX-Extended
regular expressions, but can be unset to force ranges to be compared
by code point only.]]
[[newline_alt][No][Specifies that the \\n character has the same effect as
the alternation operator |. Allows newline separated lists to be used
as a list of alternatives.]]
[[no_escape_in_lists][No][When set this makes the escape character ordinary
inside lists, so that `[\b]` would match either '\\' or 'b'. This bit
is on by default for POSIX-Extended regular expressions, but can be
unset to force escapes to be recognised inside lists.]]
[[no_bk_refs][No][When set, backreferences are disabled. This bit is on
by default for POSIX-Extended regular expressions, but can be unset
to turn support for backreferences on.]]
[[no_except][No][Prevents [basic_regex] from throwing an exception when
an invalid expression is encountered.]]
[[save_subexpression_location][No][When set, the locations of individual
sub-expressions within the ['original regular expression string] can be accessed
via the [link boost_regex.basic_regex.subexpression `subexpression()`] member function of `basic_regex`.]]
]

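
A small sketch of selecting the POSIX extended grammar and adding an option flag follows. It is written against the standard library's `std::regex`, whose `extended` and `icase` flags are the standardized ones listed above; that Boost.Regex treats these particular patterns identically is an assumption of the example. The helper names `full_match` and `posix_extended_example` are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: compiles `pattern` with `flags` and tests whether
// the whole of `text` matches.
bool full_match(const std::string& text, const std::string& pattern,
                std::regex::flag_type flags)
{
    const std::regex re(pattern, flags);
    return std::regex_match(text, re);
}

void posix_extended_example()
{
    using namespace std::regex_constants;

    // In the extended grammar, unescaped ( ) | and + are operators.
    assert(full_match("abcd", "(ab|cd)+", extended));
    assert(!full_match("abce", "(ab|cd)+", extended));

    // Bounded repeats are written without backslashes.
    assert(full_match("aaa", "a{2,3}", extended));
    assert(!full_match("aaaa", "a{2,3}", extended));

    // Option flags are ORed onto the grammar flag.
    assert(full_match("ABcd", "(ab|cd)+", extended | icase));
    assert(!full_match("ABcd", "(ab|cd)+", extended));
}
```
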

[endsect]

[section:syntax_option_type_basic Options for POSIX Basic Regular Expressions]

Exactly one of the following must always be set for POSIX basic regular expressions:

[table
[[Element][Standardized][Effect when set]]
[[basic][Yes][Specifies that the grammar recognized by the regular expression
engine is the same as that used by
[link boost_regex.syntax.basic_syntax POSIX basic regular expressions] in IEEE Std 1003.1-2001, Portable
Operating System Interface (POSIX), Base Definitions and Headers,
Section 9, Regular Expressions (FWD.1).]]
[[sed][No][As above.]]
[[grep][Yes][Specifies that the grammar recognized by the regular
expression engine is the same as that used by
POSIX utility `grep` in IEEE Std 1003.1-2001, Portable Operating
System Interface (POSIX), Shells and Utilities, Section 4,
Utilities, grep (FWD.1).

That is to say, the same as [link boost_regex.syntax.basic_syntax
POSIX basic syntax], but with the newline character acting as an
alternation character; the expression is treated as a newline
separated list of alternatives.]]
[[emacs][No][Specifies that the grammar recognised is the superset of the
[link boost_regex.syntax.basic_syntax POSIX-Basic syntax] used by
the emacs program.]]
]

The following options may also be set when using POSIX basic regular expressions:

[table
[[Element][Standardized][Effect when set]]
[[icase][Yes][Specifies that matching of regular expressions against a
character container sequence shall be performed without regard to case.]]
[[nosubs][Yes][Specifies that when a regular expression is matched against
a character container sequence, then no sub-expression matches are
to be stored in the supplied [match_results] structure.]]
[[optimize][Yes][Specifies that the regular expression engine should pay
more attention to the speed with which regular expressions are
matched, and less to the speed with which regular expression objects
are constructed. Otherwise it has no detectable effect on the program output.
This currently has no effect for Boost.Regex.]]
[[collate][Yes][Specifies that character ranges of the form `[a-b]` should
be locale sensitive. This bit is on by default for
[link boost_regex.syntax.basic_syntax POSIX-Basic regular expressions],
but can be unset to force ranges to be compared by code point only.]]
[[newline_alt][No][Specifies that the \\n character has the same effect as the
alternation operator |. Allows newline separated lists to be used
as a list of alternatives. This bit is already set if you use the
`grep` option.]]
[[no_char_classes][No][When set, character classes such as `[[:alnum:]]`
are not allowed.]]
[[no_escape_in_lists][No][When set this makes the escape character ordinary
inside lists, so that `[\b]` would match either '\\' or 'b'. This bit
is on by default for [link boost_regex.syntax.basic_syntax POSIX-basic
regular expressions], but can be unset to force escapes to be recognised
inside lists.]]
[[no_intervals][No][When set, bounded repeats such as `a{2,3}` are not permitted.]]
[[bk_plus_qm][No][When set, `\?` acts as a zero-or-one repeat operator,
and `\+` acts as a one-or-more repeat operator.]]
[[bk_vbar][No][When set, `\|` acts as the alternation operator.]]
[[no_except][No][Prevents [basic_regex] from throwing an exception when an
invalid expression is encountered.]]
[[save_subexpression_location][No][When set, the locations of individual
sub-expressions within the ['original regular expression string] can be accessed
via the [link boost_regex.basic_regex.subexpression `subexpression()`] member function of `basic_regex`.]]
]

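
To contrast the POSIX basic grammar with the extended one, the following sketch shows that an unescaped `|` is an ordinary character in basic mode, while grouping and bounded repeats are written with backslashes. It uses the standard library's `std::regex` with the standardized `basic` flag; that Boost.Regex gives the same results for these patterns is an assumption, and the Boost-specific flags (`bk_plus_qm`, `bk_vbar` and so on) are not available there and are not shown. The helper names `full_match` and `posix_basic_example` are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: compiles `pattern` with `flags` and tests whether
// the whole of `text` matches.
bool full_match(const std::string& text, const std::string& pattern,
                std::regex::flag_type flags)
{
    const std::regex re(pattern, flags);
    return std::regex_match(text, re);
}

void posix_basic_example()
{
    using namespace std::regex_constants;

    // An unescaped | is an ordinary character in the basic grammar.
    assert(full_match("a|b", "a|b", basic));
    assert(!full_match("a", "a|b", basic));

    // Grouping is written \( ... \) and bounded repeats \{m,n\}.
    assert(full_match("abab", "\\(ab\\)*", basic));
    assert(full_match("aa", "a\\{2,3\\}", basic));
    assert(!full_match("a", "a\\{2,3\\}", basic));

    // Option flags are ORed onto the grammar flag.
    assert(full_match("ABAB", "\\(ab\\)*", basic | icase));
}
```
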

[endsect]

[section:syntax_option_type_literal Options for Literal Strings]

The following must always be set to interpret the expression as a string literal:

[table
[[Element][Standardized][Effect when set]]
[[literal][Yes][Treat the string as a literal (no special characters).]]
]

The following options may also be combined with the literal flag:

[table
[[Element][Standardized][Effect when set]]
[[icase][Yes][Specifies that matching of regular expressions against a
character container sequence shall be performed without regard to case.]]
[[optimize][Yes][Specifies that the regular expression engine should pay
more attention to the speed with which regular expressions are matched,
and less to the speed with which regular expression objects are constructed.
Otherwise it has no detectable effect on the program output. This
currently has no effect for Boost.Regex.]]
]

[endsect]

[endsect]