rule.html 21 KB

  1. <html>
  2. <head>
  3. <title>The Rule</title>
  4. <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
  5. <link rel="stylesheet" href="theme/style.css" type="text/css">
  6. </head>
  7. <body>
  8. <table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2">
  9. <tr>
  10. <td width="10">
  11. </td>
  12. <td width="85%">
  13. <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>The Rule</b></font>
  14. </td>
  15. <td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" width="112" height="48" align="right" border="0"></a></td>
  16. </tr>
  17. </table>
  18. <br>
  19. <table border="0">
  20. <tr>
  21. <td width="10"></td>
  22. <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
  23. <td width="30"><a href="numerics.html"><img src="theme/l_arr.gif" border="0"></a></td>
  24. <td width="30"><a href="epsilon.html"><img src="theme/r_arr.gif" border="0"></a></td>
  25. </tr>
  26. </table>
  27. <p>The <b>rule</b> is a polymorphic parser that acts as a named place-holder capturing
  28. the behavior of an EBNF expression assigned to it. Naming an EBNF expression
  29. allows it to be referenced later. The <tt>rule</tt> is a template class parameterized
  30. by the type of the scanner (<tt>ScannerT</tt>), the rule's <a href="indepth_the_parser_context.html">context</a>
  31. and its <a href="#tag">tag</a>. Default template parameters are provided to
  32. make it easy to use the rule.</p>
  33. <pre><code><font color="#000000"><span class=identifier> </span><span class=keyword>template</span><span class=special>&lt;
  34. </span><span class=keyword>typename </span><span class=identifier>ScannerT </span><span class=special>= </span><span class=identifier>scanner</span><span class=special>&lt;&gt;,
  35. </span><span class=keyword>typename </span><span class=identifier>ContextT </span><span class=special>= </span><span class=identifier>parser_context</span><span class=special>&lt;&gt;</span><span class=identifier>,
  36. </span><span class="keyword">typename</span><span class=identifier> TagT </span><span class="special">=</span><span class=identifier> parser_address_tag</span><span class=special>&gt;
  37. </span><span class=keyword>class </span><span class=identifier>rule</span><span class=special>;</span></font></code></pre>
  38. <p>The defaults handle the most common case. <tt>ScannerT</tt>
  39. defaults to <tt>scanner&lt;&gt;</tt>, a plain vanilla scanner that acts on <tt>char
  40. const<span class="operators">*</span></tt> iterators and does nothing more
  41. than step through the null-terminated input
  42. one character at a time. The rule tag, <tt>TagT</tt>, typically used with <a href="trees.html">ASTs</a>,
  43. is used to identify a rule; it is explained <a href="#tag">here</a>. In trivial
  44. cases, declaring a rule as <tt>rule&lt;&gt;</tt> is enough. You need not be
  45. concerned at all with the <tt>ContextT</tt> template parameter unless you wish
  46. to tweak the low level behavior of the rule. Detailed information on the <tt>ContextT</tt>
  47. template parameter is provided <a href="indepth_the_parser_context.html">elsewhere</a>.
  48. </p>
  49. <h3><a name="order_of_parameters"></a>Order of parameters</h3>
  50. <p>As of v1.8.0, the <tt>ScannerT</tt>, <tt>ContextT</tt> and <tt>TagT</tt> parameters can
  51. be specified in any order. If a template parameter is omitted, its default
  52. is used. Examples:</p>
  53. <pre><span class=identifier> rule</span><span class=special>&lt;&gt; </span><span class=identifier>rx1</span><span class=special>;
  54. </span><span class=identifier>rule</span><span class=special>&lt;</span><span class=identifier>scanner</span><span class=special>&lt;&gt; </span><span class=special>&gt; </span><span class=identifier>rx2</span><span class=special>;
  55. </span> <span class=identifier>rule</span><span class=special>&lt;</span><span class=identifier>parser_context<code><font color="#000000"><span class=special>&lt;&gt;</span></font></code> </span><span class=special>&gt; </span><span class=identifier>rx3</span><span class=special>;
  56. </span><span class=identifier>rule</span><span class=special>&lt;</span><span class=identifier>parser_context<code><font color="#000000"><span class=special>&lt;&gt;</span></font></code></span><span class=special>, </span><span class=identifier>parser_address_tag</span><span class=special>&gt; </span><span class=identifier>rx4</span><span class=special>;
  57. </span> <span class=identifier>rule</span><span class=special>&lt;</span><span class=identifier>parser_address_tag</span><span class=special>&gt; </span><span class=identifier>rx5</span><span class=special>;
  58. </span> <span class=identifier>rule</span><span class=special>&lt;</span><span class=identifier>parser_address_tag</span><span class=special>, </span><span class=identifier>scanner</span><span class=special>&lt;&gt;, </span><span class=identifier>parser_context<code><font color="#000000"><span class=special>&lt;&gt;</span></font></code> </span><span class=special>&gt; </span><span class=identifier>rx6</span><span class=special>;
  59. </span><span class=identifier>rule</span><span class=special>&lt;</span><span class=identifier>parser_context<code><font color="#000000"><span class=special>&lt;&gt;</span></font></code></span><span class=special>, </span><span class=identifier>scanner</span><span class=special>&lt;&gt;, </span><span class=identifier>parser_address_tag</span><span class=special>&gt; </span><span class=identifier>rx7</span><span class=special>;</span></pre>
  60. <h3><a name="multiple_scanner_support" id="multiple_scanner_support"></a>Multiple scanners</h3>
  61. <p>As of v1.8.0, rules can use one or more scanner types. There are cases, for
  62. instance, where we need a rule that can work on the phrase and character levels.
  63. Rule/scanner mismatch has been a source of confusion and is the no. 1 <a href="faq.html#scanner_business">FAQ</a>.
  64. To address this issue, we now have multiple scanner support. Example:</p>
  65. <pre><span class=special> </span><span class=keyword>typedef </span><span class=identifier>scanner_list</span><span class=special>&lt;</span><span class=identifier>scanner</span><span class=special>&lt;&gt;, </span><span class=identifier>phrase_scanner_t</span><span class=special>&gt; </span><span class=identifier>scanners</span><span class=special>;
  66. </span><span class=identifier>rule</span><span class=special>&lt;</span><span class=identifier>scanners</span><span class=special>&gt; </span><span class=identifier>r </span><span class=special>= </span><span class=special>+</span><span class=identifier>anychar_p</span><span class=special>;
  67. </span><span class=identifier>assert</span><span class=special>(</span><span class=identifier>parse</span><span class=special>(</span><span class=string>&quot;abcdefghijk&quot;</span><span class=special>, </span><span class=identifier>r</span><span class=special>).</span><span class=identifier>full</span><span class=special>);
  68. </span><span class=identifier>assert</span><span class=special>(</span><span class=identifier>parse</span><span class=special>(</span><span class=string>&quot;a b c d e f g h i j k&quot;</span><span class=special>, </span><span class=identifier>r</span><span class=special>, </span><span class=identifier>space_p</span><span class=special>).</span><span class=identifier>full</span><span class=special>);</span></pre>
  69. <p>Notice how rule <tt>r</tt> is used at both the phrase and character levels.
  70. </p>
  71. <p>By default, support for multiple scanners is disabled. The macro
  72. <tt>BOOST_SPIRIT_RULE_SCANNERTYPE_LIMIT</tt> must be defined to the
  73. maximum number of scanners allowed in a <tt>scanner_list</tt>. The value must
  74. be greater than 1 to enable multiple scanners. Given the
  75. example above, to define a limit of two scanners for the list, the
  76. following line must be inserted into the source file before the
  77. inclusion of Spirit headers:
  78. </p>
  79. <pre><span class=special> </span><span class=preprocessor>#define </span><span class=identifier>BOOST_SPIRIT_RULE_SCANNERTYPE_LIMIT</span> <span class=literal>2</span></pre>
  80. <table width="80%" border="0" align="center">
  81. <tr>
  82. <td class="note_box"><img src="theme/bulb.gif" width="13" height="18"> See
  83. the techniques section for an <a href="techniques.html#multiple_scanner_support">example</a>
  84. of a <a href="grammar.html">grammar</a> using a multiple scanner enabled
  85. rule, <a href="scanner.html#lexeme_scanner">lexeme_scanner</a> and <a href="scanner.html#as_lower_scanner">as_lower_scanner.</a></td>
  86. </tr>
  87. </table>
  88. <h3>Rule Declarations</h3>
  89. <p>The rule class models EBNF's production rule. Example:</p>
  90. <pre><code><font color="#000000"> <span class=identifier>rule</span><span class=special>&lt;&gt; </span><span class=identifier>a_rule </span><span class=special>= </span><span class=special>*(</span><span class=identifier>a </span><span class=special>| </span><span class=identifier>b</span><span class=special>) </span><span class=special>& </span><span class=special>+(</span><span class=identifier>c </span><span class=special>| </span><span class=identifier>d </span><span class=special>| </span><span class=identifier>e</span><span class=special>);</span></font></code></pre>
  91. <p>The type and behavior of the right-hand side (rhs) EBNF expression, which may be
  92. arbitrarily complex, is encoded in the rule named <tt>a_rule</tt>. <tt>a_rule</tt> may now be
  93. referenced elsewhere in the grammar:</p>
  94. <pre><code><font color="#000000"> <span class=identifier>rule</span><span class=special>&lt;&gt; </span><span class=identifier>another_rule </span><span class=special>= </span><span class=identifier>f </span><span class=special>&gt;&gt; </span><span class=identifier>g </span><span class=special>&gt;&gt; </span><span class=identifier>h </span><span class=special>&gt;&gt; </span><span class=identifier>a_rule</span><span class=special>;</span></font></code></pre>
  95. <table width="80%" border="0" align="center">
  96. <tr>
  97. <td class="note_box"><img src="theme/alert.gif" width="16" height="16"> <b>Referencing
  98. rules <br>
  99. </b><br>
  100. When a rule is referenced anywhere in the right hand side of an EBNF expression,
  101. the rule is held by the expression by reference. It is the responsibility
  102. of the client to ensure that the referenced rule stays in scope and does
  103. not get destructed while it is being referenced. </td>
  104. </tr>
  105. </table>
  106. <pre><span class=special> </span><span class=identifier>a </span><span class=special>= </span><span class=identifier>int_p</span><span class=special>;
  107. </span><span class=identifier>b </span><span class=special>= </span><span class=identifier>a</span><span class=special>;
  108. </span><span class=identifier>c </span><span class=special>= </span><span class=identifier>int_p </span><span class=special>&gt;&gt; </span><span class=identifier>b</span><span class=special>;</span></pre>
  109. <h3>Copying Rules</h3>
  110. <p>The rule is a weird C++ citizen, unlike any other C++ object. It does not have
  111. the proper copy and assignment semantics and cannot be stored and passed around
  112. by value. If you need to copy a rule you have to explicitly call its member
  113. function <tt>copy()</tt>:</p>
  114. <pre><span class=special> </span><span class=identifier>r</span><span class="special">.</span><span class=identifier>copy()</span><span class=special>;</span></pre>
  115. <p>However, be warned that copying a rule will not deep copy other referenced
  116. rules of the source rule being copied. This might lead to dangling references.
  117. Again, it is the responsibility of the client to ensure that all referenced
  118. rules stay in scope and do not get destructed while they are being referenced.
  119. Caveat emptor.</p>
  120. <p>If you copy a rule, then you'll want to store it somewhere. The
  121. problem is how. The storage can't be another rule:</p>
  122. <pre> <code><font color="#000000"><span class=identifier>rule</span><span class=special>&lt;&gt;</span></font></code> r2 <span class="special">=</span> <span class=identifier>r</span><span class="special">.</span><span class=identifier>copy()</span><span class=special>; </span><span class="comment">// BAD!</span></pre>
  123. <p>because rules are weird and do not have the expected C++ copy-constructor
  124. and assignment semantics! As a general rule: <strong>Don't put a copied rule
  125. into another rule! </strong>Instead, use the <a href="stored_rule.html">stored_rule</a>
  126. for that purpose.</p>
  127. <h3>Forward declarations</h3>
  128. <p>A <tt>rule</tt> may be declared before being defined to allow cyclic structures
  129. typically found in BNF declarations. Example:</p>
  130. <pre><code><font color="#000000"><span class=special> </span><span class=identifier>rule</span><span class=special>&lt;&gt; </span><span class=identifier>a</span><span class=special>, </span><span class=identifier>b</span><span class=special>, </span><span class=identifier>c</span><span class=special>;
  131. </span><span class=identifier>a </span><span class=special>= </span><span class=identifier>b </span><span class=special>| </span><span class=identifier>a</span><span class=special>;
  132. </span><span class=identifier>b </span><span class=special>= </span><span class=identifier>c </span><span class=special>| </span><span class=identifier>a</span><span class=special>;</span></font></code></pre>
  133. <h3>Recursion</h3>
  134. <p>The right-hand side of a rule may reference other rules, including itself.
  135. The limitation is that direct or indirect left recursion is not allowed (this
  136. is an unchecked run-time error that results in an infinite loop). This is typical
  137. of top-down parsers. Example:</p>
  138. <pre><code><font color="#000000"><span class=special> </span><span class=identifier>a </span><span class=special>= </span><span class=identifier>a </span><span class=special>| </span><span class=identifier>b</span><span class=special>; </span><span class=comment>// infinite loop!</span></font></code></pre>
  139. <table width="80%" border="0" align="center">
  140. <tr>
  141. <td class="note_box"><img src="theme/lens.gif" width="15" height="16"> <b>What
  142. is left recursion?<br>
  143. </b><br>
  144. Left recursion happens when you have a rule that calls itself before anything
  145. else. A top-down parser will go into an infinite loop when this happens.
  146. See the <a href="faq.html#left_recursion">FAQ</a> for details on how to
  147. eliminate left recursion.</td>
  148. </tr>
  149. </table>
  150. <h3>Undefined rules</h3>
  151. <p>An undefined rule matches nothing and is semantically equivalent to <tt>nothing_p</tt>.</p>
  152. <h3>Redeclarations</h3>
  153. <p>Like any other C++ assignment, a second assignment to a rule is destructive
  154. and will redefine it. The old definition is lost. Rules are dynamic. A rule
  155. can change its definition anytime:</p>
  156. <pre><code><font color="#000000"><span class=identifier> r </span><span class=special>= </span><span class=identifier>a_definition</span><span class=special>;
  157. </span><span class=identifier> r </span><span class=special>= </span><span class=identifier>another_definition</span><span class=special>;</span></font></code></pre>
  158. <p>Rule <tt>r</tt> loses the old definition when the second assignment is made.
  159. As mentioned, an undefined rule matches nothing and is semantically equivalent
  160. to <tt>nothing_p</tt>.</p>
  161. <h3>Dynamic Parsers</h3>
  162. <p>Hosting declarative EBNF in imperative C++ yields an interesting blend. We
  163. have the best of both worlds. We have the ability to conveniently modify the
  164. grammar at run time using imperative constructs such as <tt>if</tt> and <tt>else</tt>
  165. statements. Example:</p>
  166. <pre><code><font color="#000000"><span class=special> </span><span class=keyword>if </span><span class=special>(</span><span class=identifier>feature_is_available</span><span class=special>)
  167. </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>add_this_feature</span><span class=special>;</span></font></code></pre>
  168. <p>Rules are essentially dynamic parsers. A dynamic parser is characterized by
  169. its ability to modify its behavior at run time. Initially, an undefined rule
  170. matches nothing. At any time, the rule may be defined and redefined, thus, dynamically
  171. altering its behavior.</p>
  172. <h3>No start rule</h3>
  173. <p>Typically, parsers have what is called a start symbol, chosen to be the root
  174. of the grammar where parsing starts. The Spirit parser framework has no notion
  175. of a start symbol. Any rule can be a start symbol. This feature promotes step-wise
  176. creation of parsers. We can build parsers from the bottom up while fully testing
  177. each level or module up until we get to the top-most level.</p>
  178. <h3><a name="tag"></a>Parser Tags</h3>
  179. <p>Rules may be tagged for identification purposes. This is necessary, especially
  180. when dealing with <a href="trees.html">parse trees and ASTs</a> to see which
  181. rule created a specific AST/parse tree node. Each rule has an ID of type <tt>parser_id</tt>.
  182. This ID can be obtained through the rule's <tt>id()</tt> member function:</p>
  183. <pre><code><font color="#000000"><span class=identifier> my_rule</span><span class=special>.</span><span class=identifier>id</span><span class=special>(); </span><span class=comment>// get my_rule's id</span></font></code></pre>
  184. <p>The <tt>parser_id</tt> class is declared as:</p>
  185. <pre> <span class="keyword">class</span> <span class="identifier">parser_id</span><br> <span class="special">{</span><br> <span class="keyword">public</span><span class="special">:</span><br> parser_id<span class="special">();</span><br> <span class="keyword">explicit</span> parser_id<span class="special">(</span><span class="keyword">void const</span><span class="special">*</span> p<span class="special">);</span><br> parser_id<span class="special">(</span><span class="keyword">std::size_t</span> l<span class="special">);</span>
  186. <span class="keyword">bool</span> <span class="keyword">operator</span><span class="special">==(</span><span class="identifier">parser_id</span> <span class="keyword">const</span><span class="special">&amp;</span> x<span class="special">)</span> const<span class="special">;</span><br> <span class="keyword">bool</span> <span class="keyword">operator</span><span class="special">!=(</span><span class="identifier">parser_id</span> <span class="keyword">const</span><span class="special">&amp;</span> x<span class="special">)</span> const<span class="special">;</span>
  187. <span class="keyword">bool</span> <span class="keyword"> operator</span><span class="special">&lt;(</span><span class="identifier">parser_id</span> <span class="keyword">const</span><span class="special">&amp;</span> x<span class="special">)</span> const<span class="special">;</span>
  188. <span class="special"></span><span class="keyword">std::size_t</span><span class="identifier"> to_long</span><span class="special">()</span> <span class="keyword">const</span><span class="special">;
  189. };</span></pre>
  190. <h3>parser_address_tag</h3>
  191. <p>The rule's <tt>TagT</tt> template parameter supplies this ID. This defaults
  192. to <tt>parser_address_tag</tt>. The <tt>parser_address_tag</tt> uses the address
  193. of the rule as its ID. This is often not the most convenient, since it is not
  194. always possible to get the address of a rule to compare against. </p>
  195. <h3>parser_tag</h3>
  196. <p>It is possible to have specific constant integers to identify a rule. For this
  197. purpose, we can use the <tt>parser_tag&lt;N&gt;</tt>, where N is a constant
  198. integer:</p>
  199. <pre><code><font color="#000000"><span class=identifier> rule</span><span class=special>&lt;</span><span class=identifier>parser_tag</span><span class="special">&lt;</span><span class=identifier>123</span><span class="special">&gt; &gt; </span><span class="identifier">my_rule</span><span class="special">; </span><span class="comment">// set my_rule's id to 123</span></font></code></pre>
  200. <h3>dynamic_parser_tag</h3>
  201. <p>The <tt>parser_tag&lt;N&gt;</tt> can only specify a <strong>static ID</strong>,
  202. which is defined at compile time. If you need the ID to be <strong>dynamic</strong>
  203. (changeable at runtime), you can use the <tt>dynamic_parser_tag</tt> class as
  204. the <tt>TagT</tt> template parameter. This template parameter enables the <tt>set_id()</tt>
  205. function, which may be used to set the required id at runtime:</p>
  206. <pre><code><font color="#000000"><span class=identifier> rule</span><span class=special>&lt;</span><span class=identifier>dynamic_parser_tag</span><span class="special">&gt; </span><span class="identifier">my_dynrule</span><span class="special">;</span>
  207. my_dynrule.set_id(1234); <span class="comment">// set my_dynrule's id to 1234</span></font></code></pre>
  208. <p>If the <tt>set_id()</tt> function isn't called, the parser ID defaults to the
  209. address of the rule, just as it would with the <tt>parser_address_tag</tt> template
  210. parameter. </p>
  211. <table border="0">
  212. <tr>
  213. <td width="10"></td>
  214. <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
  215. <td width="30"><a href="numerics.html"><img src="theme/l_arr.gif" border="0"></a></td>
  216. <td width="30"><a href="epsilon.html"><img src="theme/r_arr.gif" border="0"></a></td>
  217. </tr>
  218. </table>
  219. <br>
  220. <hr size="1">
  221. <p class="copyright">Copyright &copy; 1998-2003 Joel de Guzman<br>
  222. <br>
  223. <font size="2">Use, modification and distribution is subject to the Boost Software
  224. License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
  225. http://www.boost.org/LICENSE_1_0.txt)</font></p>
  226. </body>
  227. </html>