  1. <html>
  2. <head>
  3. <meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
  4. <title>POSIX Basic Regular Expression Syntax</title>
  5. <link rel="stylesheet" href="../../../../../../doc/src/boostbook.css" type="text/css">
  6. <meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
  7. <link rel="home" href="../../index.html" title="Boost.Regex 5.1.4">
  8. <link rel="up" href="../syntax.html" title="Regular Expression Syntax">
  9. <link rel="prev" href="basic_extended.html" title="POSIX Extended Regular Expression Syntax">
  10. <link rel="next" href="character_classes.html" title="Character Class Names">
  11. </head>
  12. <body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
  13. <table cellpadding="2" width="100%"><tr>
  14. <td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../../boost.png"></td>
  15. <td align="center"><a href="../../../../../../index.html">Home</a></td>
  16. <td align="center"><a href="../../../../../../libs/libraries.htm">Libraries</a></td>
  17. <td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
  18. <td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
  19. <td align="center"><a href="../../../../../../more/index.htm">More</a></td>
  20. </tr></table>
  21. <hr>
  22. <div class="spirit-nav">
  23. <a accesskey="p" href="basic_extended.html"><img src="../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../syntax.html"><img src="../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../index.html"><img src="../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="character_classes.html"><img src="../../../../../../doc/src/images/next.png" alt="Next"></a>
  24. </div>
  25. <div class="section">
  26. <div class="titlepage"><div><div><h3 class="title">
  27. <a name="boost_regex.syntax.basic_syntax"></a><a class="link" href="basic_syntax.html" title="POSIX Basic Regular Expression Syntax">POSIX Basic Regular
  28. Expression Syntax</a>
  29. </h3></div></div></div>
  30. <h4>
  31. <a name="boost_regex.syntax.basic_syntax.h0"></a>
  32. <span class="phrase"><a name="boost_regex.syntax.basic_syntax.synopsis"></a></span><a class="link" href="basic_syntax.html#boost_regex.syntax.basic_syntax.synopsis">Synopsis</a>
  33. </h4>
  34. <p>
  35. The POSIX-Basic regular expression syntax is used by the Unix utility <code class="computeroutput"><span class="identifier">sed</span></code>, and variations are used by <code class="computeroutput"><span class="identifier">grep</span></code> and <code class="computeroutput"><span class="identifier">emacs</span></code>.
  36. You can construct POSIX basic regular expressions in Boost.Regex by passing
  37. the flag <code class="computeroutput"><span class="identifier">basic</span></code> to the regex
  38. constructor (see <a class="link" href="../ref/syntax_option_type.html" title="syntax_option_type"><code class="computeroutput"><span class="identifier">syntax_option_type</span></code></a>), for example:
  39. </p>
  40. <pre class="programlisting"><span class="comment">// e1 is a case sensitive POSIX-Basic expression:</span>
  41. <span class="identifier">boost</span><span class="special">::</span><span class="identifier">regex</span> <span class="identifier">e1</span><span class="special">(</span><span class="identifier">my_expression</span><span class="special">,</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">regex</span><span class="special">::</span><span class="identifier">basic</span><span class="special">);</span>
<span class="comment">// e2 is a case insensitive POSIX-Basic expression:</span>
  43. <span class="identifier">boost</span><span class="special">::</span><span class="identifier">regex</span> <span class="identifier">e2</span><span class="special">(</span><span class="identifier">my_expression</span><span class="special">,</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">regex</span><span class="special">::</span><span class="identifier">basic</span><span class="special">|</span><span class="identifier">boost</span><span class="special">::</span><span class="identifier">regex</span><span class="special">::</span><span class="identifier">icase</span><span class="special">);</span>
  44. </pre>
  45. <a name="boost_regex.posix_basic"></a><h4>
  46. <a name="boost_regex.syntax.basic_syntax.h1"></a>
  47. <span class="phrase"><a name="boost_regex.syntax.basic_syntax.posix_basic_syntax"></a></span><a class="link" href="basic_syntax.html#boost_regex.syntax.basic_syntax.posix_basic_syntax">POSIX
  48. Basic Syntax</a>
  49. </h4>
  50. <p>
In POSIX-Basic regular expressions, all characters match themselves except
for the following special characters:
  53. </p>
  54. <pre class="programlisting">.[\*^$</pre>
  55. <h5>
  56. <a name="boost_regex.syntax.basic_syntax.h2"></a>
  57. <span class="phrase"><a name="boost_regex.syntax.basic_syntax.wildcard"></a></span><a class="link" href="basic_syntax.html#boost_regex.syntax.basic_syntax.wildcard">Wildcard:</a>
  58. </h5>
  59. <p>
  60. The single character '.' when used outside of a character set will match
  61. any single character except:
  62. </p>
  63. <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
  64. <li class="listitem">
The NUL character when the flag <code class="computeroutput"><span class="identifier">match_not_dot_null</span></code>
  66. is passed to the matching algorithms.
  67. </li>
  68. <li class="listitem">
  69. The newline character when the flag <code class="computeroutput"><span class="identifier">match_not_dot_newline</span></code>
  70. is passed to the matching algorithms.
  71. </li>
  72. </ul></div>
  73. <h5>
  74. <a name="boost_regex.syntax.basic_syntax.h3"></a>
  75. <span class="phrase"><a name="boost_regex.syntax.basic_syntax.anchors"></a></span><a class="link" href="basic_syntax.html#boost_regex.syntax.basic_syntax.anchors">Anchors:</a>
  76. </h5>
  77. <p>
  78. A '^' character shall match the start of a line when used as the first character
  79. of an expression, or the first character of a sub-expression.
  80. </p>
  81. <p>
  82. A '$' character shall match the end of a line when used as the last character
  83. of an expression, or the last character of a sub-expression.
  84. </p>
  85. <h5>
  86. <a name="boost_regex.syntax.basic_syntax.h4"></a>
  87. <span class="phrase"><a name="boost_regex.syntax.basic_syntax.marked_sub_expressions"></a></span><a class="link" href="basic_syntax.html#boost_regex.syntax.basic_syntax.marked_sub_expressions">Marked sub-expressions:</a>
  88. </h5>
  89. <p>
  90. A section beginning <code class="computeroutput"><span class="special">\(</span></code> and ending
  91. <code class="computeroutput"><span class="special">\)</span></code> acts as a marked sub-expression.
  92. Whatever matched the sub-expression is split out in a separate field by the
matching algorithms. Marked sub-expressions can also be repeated, or referred to
by a back-reference.
  95. </p>
  96. <h5>
  97. <a name="boost_regex.syntax.basic_syntax.h5"></a>
  98. <span class="phrase"><a name="boost_regex.syntax.basic_syntax.repeats"></a></span><a class="link" href="basic_syntax.html#boost_regex.syntax.basic_syntax.repeats">Repeats:</a>
  99. </h5>
  100. <p>
  101. Any atom (a single character, a marked sub-expression, or a character class)
  102. can be repeated with the * operator.
  103. </p>
  104. <p>
  105. For example <code class="computeroutput"><span class="identifier">a</span><span class="special">*</span></code>
will match the letter 'a' repeated zero or more times (an atom
  107. repeated zero times matches an empty string), so the expression <code class="computeroutput"><span class="identifier">a</span><span class="special">*</span><span class="identifier">b</span></code>
  108. will match any of the following:
  109. </p>
  110. <pre class="programlisting">b
  111. ab
  112. aaaaaaaab
  113. </pre>
  114. <p>
  115. An atom can also be repeated with a bounded repeat:
  116. </p>
  117. <p>
  118. <code class="computeroutput"><span class="identifier">a</span><span class="special">\{</span><span class="identifier">n</span><span class="special">\}</span></code> Matches
  119. 'a' repeated exactly n times.
  120. </p>
  121. <p>
  122. <code class="computeroutput"><span class="identifier">a</span><span class="special">\{</span><span class="identifier">n</span><span class="special">,\}</span></code> Matches
  123. 'a' repeated n or more times.
  124. </p>
  125. <p>
  126. <code class="computeroutput"><span class="identifier">a</span><span class="special">\{</span><span class="identifier">n</span><span class="special">,</span> <span class="identifier">m</span><span class="special">\}</span></code> Matches 'a' repeated between n and m times
  127. inclusive.
  128. </p>
  129. <p>
  130. For example:
  131. </p>
<pre class="programlisting">^a\{2,3\}$</pre>
  133. <p>
  134. Will match either of:
  135. </p>
  136. <pre class="programlisting">aa
  137. aaa
  138. </pre>
  139. <p>
  140. But neither of:
  141. </p>
  142. <pre class="programlisting">a
  143. aaaa
  144. </pre>
  145. <p>
It is an error to use a repeat operator if the preceding construct cannot
be repeated, for example:
  148. </p>
<pre class="programlisting">a\(*\)</pre>
  150. <p>
  151. Will raise an error, as there is nothing for the * operator to be applied
  152. to.
  153. </p>
  154. <h5>
  155. <a name="boost_regex.syntax.basic_syntax.h6"></a>
  156. <span class="phrase"><a name="boost_regex.syntax.basic_syntax.back_references"></a></span><a class="link" href="basic_syntax.html#boost_regex.syntax.basic_syntax.back_references">Back
  157. references:</a>
  158. </h5>
  159. <p>
  160. An escape character followed by a digit <span class="emphasis"><em>n</em></span>, where <span class="emphasis"><em>n</em></span>
  161. is in the range 1-9, matches the same string that was matched by sub-expression
  162. <span class="emphasis"><em>n</em></span>. For example the expression:
  163. </p>
  164. <pre class="programlisting">^\(a*\)[^a]*\1$</pre>
  165. <p>
  166. Will match the string:
  167. </p>
  168. <pre class="programlisting">aaabbaaa</pre>
  169. <p>
  170. But not the string:
  171. </p>
  172. <pre class="programlisting">aaabba</pre>
  173. <h5>
  174. <a name="boost_regex.syntax.basic_syntax.h7"></a>
  175. <span class="phrase"><a name="boost_regex.syntax.basic_syntax.character_sets"></a></span><a class="link" href="basic_syntax.html#boost_regex.syntax.basic_syntax.character_sets">Character
  176. sets:</a>
  177. </h5>
  178. <p>
A character set is a bracket-expression starting with [ and ending with ].
It defines a set of characters, and matches any single character that is
a member of that set.
  182. </p>
  183. <p>
  184. A bracket expression may contain any combination of the following:
  185. </p>
  186. <h6>
  187. <a name="boost_regex.syntax.basic_syntax.h8"></a>
  188. <span class="phrase"><a name="boost_regex.syntax.basic_syntax.single_characters"></a></span><a class="link" href="basic_syntax.html#boost_regex.syntax.basic_syntax.single_characters">Single
  189. characters:</a>
  190. </h6>
  191. <p>
  192. For example <code class="computeroutput"><span class="special">[</span><span class="identifier">abc</span><span class="special">]</span></code>, will match any of the characters 'a', 'b',
  193. or 'c'.
  194. </p>
  195. <h6>
  196. <a name="boost_regex.syntax.basic_syntax.h9"></a>
  197. <span class="phrase"><a name="boost_regex.syntax.basic_syntax.character_ranges"></a></span><a class="link" href="basic_syntax.html#boost_regex.syntax.basic_syntax.character_ranges">Character
  198. ranges:</a>
  199. </h6>
  200. <p>
  201. For example <code class="computeroutput"><span class="special">[</span><span class="identifier">a</span><span class="special">-</span><span class="identifier">c</span><span class="special">]</span></code>
  202. will match any single character in the range 'a' to 'c'. By default, for
  203. POSIX-Basic regular expressions, a character <span class="emphasis"><em>x</em></span> is within
  204. the range <span class="emphasis"><em>y</em></span> to <span class="emphasis"><em>z</em></span>, if it collates
  205. within that range; this results in locale specific behavior. This behavior
  206. can be turned off by unsetting the <code class="computeroutput"><span class="identifier">collate</span></code>
  207. option flag when constructing the regular expression - in which case whether
  208. a character appears within a range is determined by comparing the code points
  209. of the characters only.
  210. </p>
  211. <h6>
  212. <a name="boost_regex.syntax.basic_syntax.h10"></a>
  213. <span class="phrase"><a name="boost_regex.syntax.basic_syntax.negation"></a></span><a class="link" href="basic_syntax.html#boost_regex.syntax.basic_syntax.negation">Negation:</a>
  214. </h6>
  215. <p>
  216. If the bracket-expression begins with the ^ character, then it matches the
  217. complement of the characters it contains, for example <code class="computeroutput"><span class="special">[^</span><span class="identifier">a</span><span class="special">-</span><span class="identifier">c</span><span class="special">]</span></code> matches any character that is not in the
  218. range a-c.
  219. </p>
  220. <h6>
  221. <a name="boost_regex.syntax.basic_syntax.h11"></a>
  222. <span class="phrase"><a name="boost_regex.syntax.basic_syntax.character_classes"></a></span><a class="link" href="basic_syntax.html#boost_regex.syntax.basic_syntax.character_classes">Character
  223. classes:</a>
  224. </h6>
  225. <p>
  226. An expression of the form <code class="computeroutput"><span class="special">[[:</span><span class="identifier">name</span><span class="special">:]]</span></code>
  227. matches the named character class "name", for example <code class="computeroutput"><span class="special">[[:</span><span class="identifier">lower</span><span class="special">:]]</span></code> matches any lower case character. See
  228. <a class="link" href="character_classes.html" title="Character Class Names">character class names</a>.
  229. </p>
  230. <h6>
  231. <a name="boost_regex.syntax.basic_syntax.h12"></a>
  232. <span class="phrase"><a name="boost_regex.syntax.basic_syntax.collating_elements"></a></span><a class="link" href="basic_syntax.html#boost_regex.syntax.basic_syntax.collating_elements">Collating
  233. Elements:</a>
  234. </h6>
  235. <p>
An expression of the form <code class="computeroutput"><span class="special">[[.</span><span class="identifier">col</span><span class="special">.]]</span></code> matches
  237. the collating element <span class="emphasis"><em>col</em></span>. A collating element is any
  238. single character, or any sequence of characters that collates as a single
  239. unit. Collating elements may also be used as the end point of a range, for
  240. example: <code class="computeroutput"><span class="special">[[.</span><span class="identifier">ae</span><span class="special">.]-</span><span class="identifier">c</span><span class="special">]</span></code>
  241. matches the character sequence "ae", plus any single character
  242. in the range "ae"-c, assuming that "ae" is treated as
  243. a single collating element in the current locale.
  244. </p>
  245. <p>
  246. Collating elements may be used in place of escapes (which are not normally
allowed inside character sets), for example <code class="computeroutput"><span class="special">[[.^.]</span><span class="identifier">abc</span><span class="special">]</span></code> would
match any one of the characters 'a', 'b', 'c' or '^'.
  249. </p>
  250. <p>
  251. As an extension, a collating element may also be specified via its symbolic
  252. name, for example:
  253. </p>
  254. <pre class="programlisting">[[.NUL.]]</pre>
  255. <p>
  256. matches a 'NUL' character. See <a class="link" href="collating_names.html" title="Collating Names">collating
  257. element names</a>.
  258. </p>
  259. <h6>
  260. <a name="boost_regex.syntax.basic_syntax.h13"></a>
  261. <span class="phrase"><a name="boost_regex.syntax.basic_syntax.equivalence_classes"></a></span><a class="link" href="basic_syntax.html#boost_regex.syntax.basic_syntax.equivalence_classes">Equivalence
  262. classes:</a>
  263. </h6>
  264. <p>
  265. An expression of the form <code class="computeroutput"><span class="special">[[=</span><span class="identifier">col</span><span class="special">=]]</span></code>,
  266. matches any character or collating element whose primary sort key is the
same as that for collating element <span class="emphasis"><em>col</em></span>; as with collating
elements, the name <span class="emphasis"><em>col</em></span> may be a <a class="link" href="collating_names.html" title="Collating Names">collating
symbolic name</a>. A primary sort key is one that ignores case, accentuation,
  270. or locale-specific tailorings; so for example <code class="computeroutput"><span class="special">[[=</span><span class="identifier">a</span><span class="special">=]]</span></code> matches
  271. any of the characters: a, &#192;, &#193;, &#194;, &#195;, &#196;, &#197;, A, &#224;, &#225;, &#226;, &#227;, &#228; and &#229;. Unfortunately implementation
  272. of this is reliant on the platform's collation and localisation support;
  273. this feature can not be relied upon to work portably across all platforms,
  274. or even all locales on one platform.
  275. </p>
  276. <h6>
  277. <a name="boost_regex.syntax.basic_syntax.h14"></a>
  278. <span class="phrase"><a name="boost_regex.syntax.basic_syntax.combinations"></a></span><a class="link" href="basic_syntax.html#boost_regex.syntax.basic_syntax.combinations">Combinations:</a>
  279. </h6>
  280. <p>
  281. All of the above can be combined in one character set declaration, for example:
<code class="computeroutput"><span class="special">[[:</span><span class="identifier">digit</span><span class="special">:]</span><span class="identifier">a</span><span class="special">-</span><span class="identifier">c</span><span class="special">[.</span><span class="identifier">NUL</span><span class="special">.]]</span></code>.
  283. </p>
  284. <h5>
  285. <a name="boost_regex.syntax.basic_syntax.h15"></a>
  286. <span class="phrase"><a name="boost_regex.syntax.basic_syntax.escapes"></a></span><a class="link" href="basic_syntax.html#boost_regex.syntax.basic_syntax.escapes">Escapes</a>
  287. </h5>
  288. <p>
  289. With the exception of the escape sequences \{, \}, \(, and \), which are
  290. documented above, an escape followed by any character matches that character.
  291. This can be used to make the special characters
  292. </p>
  293. <pre class="programlisting">.[\*^$</pre>
  294. <p>
  295. "ordinary". Note that the escape character loses its special meaning
  296. inside a character set, so <code class="computeroutput"><span class="special">[\^]</span></code>
  297. will match either a literal '\' or a '^'.
  298. </p>
  299. <h4>
  300. <a name="boost_regex.syntax.basic_syntax.h16"></a>
  301. <span class="phrase"><a name="boost_regex.syntax.basic_syntax.what_gets_matched"></a></span><a class="link" href="basic_syntax.html#boost_regex.syntax.basic_syntax.what_gets_matched">What
  302. Gets Matched</a>
  303. </h4>
  304. <p>
When there is more than one way to match a regular expression, the "best"
  306. possible match is obtained using the <a class="link" href="leftmost_longest_rule.html" title="The Leftmost Longest Rule">leftmost-longest
  307. rule</a>.
  308. </p>
  309. <h4>
  310. <a name="boost_regex.syntax.basic_syntax.h17"></a>
  311. <span class="phrase"><a name="boost_regex.syntax.basic_syntax.variations"></a></span><a class="link" href="basic_syntax.html#boost_regex.syntax.basic_syntax.variations">Variations</a>
  312. </h4>
  313. <a name="boost_regex.grep_syntax"></a><h5>
  314. <a name="boost_regex.syntax.basic_syntax.h18"></a>
  315. <span class="phrase"><a name="boost_regex.syntax.basic_syntax.grep"></a></span><a class="link" href="basic_syntax.html#boost_regex.syntax.basic_syntax.grep">Grep</a>
  316. </h5>
  317. <p>
  318. When an expression is compiled with the flag <code class="computeroutput"><span class="identifier">grep</span></code>
set, then the expression is treated as a newline-separated list of <a class="link" href="basic_syntax.html#boost_regex.posix_basic">POSIX-Basic expressions</a>; a match
  320. is found if any of the expressions in the list match, for example:
  321. </p>
  322. <pre class="programlisting"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">regex</span> <span class="identifier">e</span><span class="special">(</span><span class="string">"abc\ndef"</span><span class="special">,</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">regex</span><span class="special">::</span><span class="identifier">grep</span><span class="special">);</span>
  323. </pre>
  324. <p>
  325. will match either of the <a class="link" href="basic_syntax.html#boost_regex.posix_basic">POSIX-Basic
  326. expressions</a> "abc" or "def".
  327. </p>
  328. <p>
  329. As its name suggests, this behavior is consistent with the Unix utility grep.
  330. </p>
  331. <h5>
  332. <a name="boost_regex.syntax.basic_syntax.h19"></a>
  333. <span class="phrase"><a name="boost_regex.syntax.basic_syntax.emacs"></a></span><a class="link" href="basic_syntax.html#boost_regex.syntax.basic_syntax.emacs">emacs</a>
  334. </h5>
  335. <p>
  336. In addition to the <a class="link" href="basic_syntax.html#boost_regex.posix_basic">POSIX-Basic features</a>
  337. the following characters are also special:
  338. </p>
  339. <div class="informaltable"><table class="table">
  340. <colgroup>
  341. <col>
  342. <col>
  343. </colgroup>
  344. <thead><tr>
  345. <th>
  346. <p>
  347. Character
  348. </p>
  349. </th>
  350. <th>
  351. <p>
  352. Description
  353. </p>
  354. </th>
  355. </tr></thead>
  356. <tbody>
  357. <tr>
  358. <td>
  359. <p>
  360. +
  361. </p>
  362. </td>
  363. <td>
  364. <p>
  365. repeats the preceding atom one or more times.
  366. </p>
  367. </td>
  368. </tr>
  369. <tr>
  370. <td>
  371. <p>
  372. ?
  373. </p>
  374. </td>
  375. <td>
  376. <p>
  377. repeats the preceding atom zero or one times.
  378. </p>
  379. </td>
  380. </tr>
  381. <tr>
  382. <td>
<p>
*?
</p>
  386. </td>
  387. <td>
  388. <p>
  389. A non-greedy version of *.
  390. </p>
  391. </td>
  392. </tr>
  393. <tr>
  394. <td>
  395. <p>
  396. +?
  397. </p>
  398. </td>
  399. <td>
  400. <p>
  401. A non-greedy version of +.
  402. </p>
  403. </td>
  404. </tr>
  405. <tr>
  406. <td>
  407. <p>
  408. ??
  409. </p>
  410. </td>
  411. <td>
  412. <p>
  413. A non-greedy version of ?.
  414. </p>
  415. </td>
  416. </tr>
  417. </tbody>
  418. </table></div>
  419. <p>
  420. And the following escape sequences are also recognised:
  421. </p>
  422. <div class="informaltable"><table class="table">
  423. <colgroup>
  424. <col>
  425. <col>
  426. </colgroup>
  427. <thead><tr>
  428. <th>
  429. <p>
  430. Escape
  431. </p>
  432. </th>
  433. <th>
  434. <p>
  435. Description
  436. </p>
  437. </th>
  438. </tr></thead>
  439. <tbody>
  440. <tr>
  441. <td>
  442. <p>
  443. \|
  444. </p>
  445. </td>
  446. <td>
  447. <p>
  448. specifies an alternative.
  449. </p>
  450. </td>
  451. </tr>
  452. <tr>
  453. <td>
  454. <p>
\(?: ... \)
  456. </p>
  457. </td>
  458. <td>
  459. <p>
is a non-marking grouping construct: it allows you to lexically group
something without creating an extra marked sub-expression.
  462. </p>
  463. </td>
  464. </tr>
  465. <tr>
  466. <td>
  467. <p>
  468. \w
  469. </p>
  470. </td>
  471. <td>
  472. <p>
  473. matches any word character.
  474. </p>
  475. </td>
  476. </tr>
  477. <tr>
  478. <td>
  479. <p>
  480. \W
  481. </p>
  482. </td>
  483. <td>
  484. <p>
  485. matches any non-word character.
  486. </p>
  487. </td>
  488. </tr>
  489. <tr>
  490. <td>
  491. <p>
  492. \sx
  493. </p>
  494. </td>
  495. <td>
  496. <p>
matches any character in the syntax group x; the following emacs
  498. groupings are supported: 's', ' ', '_', 'w', '.', ')', '(', '"',
  499. '\'', '&gt;' and '&lt;'. Refer to the emacs docs for details.
  500. </p>
  501. </td>
  502. </tr>
  503. <tr>
  504. <td>
  505. <p>
  506. \Sx
  507. </p>
  508. </td>
  509. <td>
  510. <p>
  511. matches any character not in the syntax grouping x.
  512. </p>
  513. </td>
  514. </tr>
  515. <tr>
  516. <td>
  517. <p>
  518. \c and \C
  519. </p>
  520. </td>
  521. <td>
  522. <p>
  523. These are not supported.
  524. </p>
  525. </td>
  526. </tr>
  527. <tr>
  528. <td>
  529. <p>
  530. \`
  531. </p>
  532. </td>
  533. <td>
  534. <p>
  535. matches zero characters only at the start of a buffer (or string
  536. being matched).
  537. </p>
  538. </td>
  539. </tr>
  540. <tr>
  541. <td>
  542. <p>
  543. \'
  544. </p>
  545. </td>
  546. <td>
  547. <p>
  548. matches zero characters only at the end of a buffer (or string
  549. being matched).
  550. </p>
  551. </td>
  552. </tr>
  553. <tr>
  554. <td>
  555. <p>
  556. \b
  557. </p>
  558. </td>
  559. <td>
  560. <p>
  561. matches zero characters at a word boundary.
  562. </p>
  563. </td>
  564. </tr>
  565. <tr>
  566. <td>
  567. <p>
  568. \B
  569. </p>
  570. </td>
  571. <td>
  572. <p>
  573. matches zero characters, not at a word boundary.
  574. </p>
  575. </td>
  576. </tr>
  577. <tr>
  578. <td>
  579. <p>
  580. \&lt;
  581. </p>
  582. </td>
  583. <td>
  584. <p>
  585. matches zero characters only at the start of a word.
  586. </p>
  587. </td>
  588. </tr>
  589. <tr>
  590. <td>
  591. <p>
  592. \&gt;
  593. </p>
  594. </td>
  595. <td>
  596. <p>
  597. matches zero characters only at the end of a word.
  598. </p>
  599. </td>
  600. </tr>
  601. </tbody>
  602. </table></div>
  603. <p>
  604. Finally, you should note that emacs style regular expressions are matched
  605. according to the <a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.what_gets_matched">Perl
  606. "depth first search" rules</a>. Emacs expressions are matched
  607. this way because they contain Perl-like extensions, that do not interact
  608. well with the <a class="link" href="leftmost_longest_rule.html" title="The Leftmost Longest Rule">POSIX-style
  609. leftmost-longest rule</a>.
  610. </p>
  611. <h4>
  612. <a name="boost_regex.syntax.basic_syntax.h20"></a>
  613. <span class="phrase"><a name="boost_regex.syntax.basic_syntax.options"></a></span><a class="link" href="basic_syntax.html#boost_regex.syntax.basic_syntax.options">Options</a>
  614. </h4>
  615. <p>
  616. There are a <a class="link" href="../ref/syntax_option_type/syntax_option_type_basic.html" title="Options for POSIX Basic Regular Expressions">variety
  617. of flags</a> that may be combined with the <code class="computeroutput"><span class="identifier">basic</span></code>
  618. and <code class="computeroutput"><span class="identifier">grep</span></code> options when constructing
  619. the regular expression, in particular note that the <a class="link" href="../ref/syntax_option_type/syntax_option_type_basic.html" title="Options for POSIX Basic Regular Expressions"><code class="computeroutput"><span class="identifier">newline_alt</span></code>, <code class="computeroutput"><span class="identifier">no_char_classes</span></code>,
<code class="computeroutput"><span class="identifier">no_intervals</span></code>, <code class="computeroutput"><span class="identifier">bk_plus_qm</span></code>
and <code class="computeroutput"><span class="identifier">bk_vbar</span></code></a> options
  622. all alter the syntax, while the <a class="link" href="../ref/syntax_option_type/syntax_option_type_basic.html" title="Options for POSIX Basic Regular Expressions"><code class="computeroutput"><span class="identifier">collate</span></code> and <code class="computeroutput"><span class="identifier">icase</span></code>
  623. options</a> modify how the case and locale sensitivity are to be applied.
  624. </p>
  625. <h4>
  626. <a name="boost_regex.syntax.basic_syntax.h21"></a>
  627. <span class="phrase"><a name="boost_regex.syntax.basic_syntax.references"></a></span><a class="link" href="basic_syntax.html#boost_regex.syntax.basic_syntax.references">References</a>
  628. </h4>
  629. <p>
  630. <a href="http://www.opengroup.org/onlinepubs/000095399/basedefs/xbd_chap09.html" target="_top">IEEE
Std 1003.1-2001, Portable Operating System Interface (POSIX), Base Definitions
  632. and Headers, Section 9, Regular Expressions (FWD.1).</a>
  633. </p>
  634. <p>
  635. <a href="http://www.opengroup.org/onlinepubs/000095399/utilities/grep.html" target="_top">IEEE
Std 1003.1-2001, Portable Operating System Interface (POSIX), Shells and
  637. Utilities, Section 4, Utilities, grep (FWD.1).</a>
  638. </p>
  639. <p>
  640. <a href="http://www.gnu.org/software/emacs/" target="_top">Emacs Version 21.3.</a>
  641. </p>
  642. </div>
  643. <table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
  644. <td align="left"></td>
  645. <td align="right"><div class="copyright-footer">Copyright &#169; 1998-2013 John Maddock<p>
  646. Distributed under the Boost Software License, Version 1.0. (See accompanying
  647. file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
  648. </p>
  649. </div></td>
  650. </tr></table>
  655. </body>
  656. </html>