- <html>
- <head>
- <title>The Rule</title>
- <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
- <link rel="stylesheet" href="theme/style.css" type="text/css">
- </head>
- <body>
- <table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2">
- <tr>
- <td width="10">
- </td>
- <td width="85%">
- <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>The Rule</b></font>
- </td>
- <td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" width="112" height="48" align="right" border="0"></a></td>
- </tr>
- </table>
- <br>
- <table border="0">
- <tr>
- <td width="10"></td>
- <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
- <td width="30"><a href="numerics.html"><img src="theme/l_arr.gif" border="0"></a></td>
- <td width="30"><a href="epsilon.html"><img src="theme/r_arr.gif" border="0"></a></td>
- </tr>
- </table>
- <p>The <b>rule</b> is a polymorphic parser that acts as a named place-holder capturing
- the behavior of an EBNF expression assigned to it. Naming an EBNF expression
- allows it to be referenced later. The <tt>rule</tt> is a template class parameterized
- by the type of the scanner (<tt>ScannerT</tt>), the rule's <a href="indepth_the_parser_context.html">context</a>
- and its <a href="#tag">tag</a>. Default template parameters are provided to
- make it easy to use the rule.</p>
- <pre><code><font color="#000000"><span class=identifier> </span><span class=keyword>template</span><span class=special><
- </span><span class=keyword>typename </span><span class=identifier>ScannerT </span><span class=special>= </span><span class=identifier>scanner</span><span class=special><>,
- </span><span class=keyword>typename </span><span class=identifier>ContextT </span><span class=special>= </span><span class=identifier>parser_context</span><span class=special><></span><span class=identifier>,
- </span><span class="keyword">typename</span><span class=identifier> TagT </span><span class="special">=</span><span class=identifier> parser_address_tag</span><span class=special>>
- </span><span class=keyword>class </span><span class=identifier>rule</span><span class=special>;</span></font></code></pre>
- <p>The defaults handle the most common case. <tt>ScannerT</tt>
- defaults to <tt>scanner<></tt>, a plain vanilla scanner that acts on <tt>char
- const<span class="operators">*</span></tt> iterators and does nothing more
- than step through the null-terminated input
- one character at a time. The rule tag, <tt>TagT</tt>, typically used with <a href="trees.html">ASTs</a>,
- identifies a rule; it is explained <a href="#tag">here</a>. In trivial
- cases, declaring a rule as <tt>rule<></tt> is enough. You need not be
- concerned at all with the <tt>ContextT</tt> template parameter unless you wish
- to tweak the low level behavior of the rule. Detailed information on the <tt>ContextT</tt>
- template parameter is provided <a href="indepth_the_parser_context.html">elsewhere</a>.
- </p>
- <h3><a name="order_of_parameters"></a>Order of parameters</h3>
- <p>As of v1.8.0, the <tt>ScannerT</tt>, <tt>ContextT</tt> and <tt>TagT</tt> can
- be specified in any order. If a template parameter is missing, it will assume
- the defaults. Examples:</p>
- <pre><span class=identifier> rule</span><span class=special><> </span><span class=identifier>rx1</span><span class=special>;
- </span><span class=identifier>rule</span><span class=special><</span><span class=identifier>scanner</span><span class=special><> </span><span class=special>> </span><span class=identifier>rx2</span><span class=special>;
- </span> <span class=identifier>rule</span><span class=special><</span><span class=identifier>parser_context<code><font color="#000000"><span class=special><></span></font></code> </span><span class=special>> </span><span class=identifier>rx3</span><span class=special>;
- </span><span class=identifier>rule</span><span class=special><</span><span class=identifier>parser_context<code><font color="#000000"><span class=special><></span></font></code></span><span class=special>, </span><span class=identifier>parser_address_tag</span><span class=special>> </span><span class=identifier>rx4</span><span class=special>;
- </span> <span class=identifier>rule</span><span class=special><</span><span class=identifier>parser_address_tag</span><span class=special>> </span><span class=identifier>rx5</span><span class=special>;
- </span> <span class=identifier>rule</span><span class=special><</span><span class=identifier>parser_address_tag</span><span class=special>, </span><span class=identifier>scanner</span><span class=special><>, </span><span class=identifier>parser_context<code><font color="#000000"><span class=special><></span></font></code> </span><span class=special>> </span><span class=identifier>rx6</span><span class=special>;
- </span><span class=identifier>rule</span><span class=special><</span><span class=identifier>parser_context<code><font color="#000000"><span class=special><></span></font></code></span><span class=special>, </span><span class=identifier>scanner</span><span class=special><>, </span><span class=identifier>parser_address_tag</span><span class=special>> </span><span class=identifier>rx7</span><span class=special>;</span></pre>
- <h3><a name="multiple_scanner_support" id="multiple_scanner_support"></a>Multiple scanners</h3>
- <p>As of v1.8.0, rules can use one or more scanner types. There are cases, for
- instance, where we need a rule that can work on the phrase and character levels.
- Rule/scanner mismatch has been a source of confusion and is the no. 1 <a href="faq.html#scanner_business">FAQ</a>.
- To address this issue, we now have multiple scanner support. Example:</p>
- <pre><span class=special> </span><span class=keyword>typedef </span><span class=identifier>scanner_list</span><span class=special><</span><span class=identifier>scanner</span><span class=special><>, </span><span class=identifier>phrase_scanner_t</span><span class=special>> </span><span class=identifier>scanners</span><span class=special>;
- </span><span class=identifier>rule</span><span class=special><</span><span class=identifier>scanners</span><span class=special>> </span><span class=identifier>r </span><span class=special>= </span><span class=special>+</span><span class=identifier>anychar_p</span><span class=special>;
- </span><span class=identifier>assert</span><span class=special>(</span><span class=identifier>parse</span><span class=special>(</span><span class=string>"abcdefghijk"</span><span class=special>, </span><span class=identifier>r</span><span class=special>).</span><span class=identifier>full</span><span class=special>);
- </span><span class=identifier>assert</span><span class=special>(</span><span class=identifier>parse</span><span class=special>(</span><span class=string>"a b c d e f g h i j k"</span><span class=special>, </span><span class=identifier>r</span><span class=special>, </span><span class=identifier>space_p</span><span class=special>).</span><span class=identifier>full</span><span class=special>);</span></pre>
- <p>Notice how rule <tt>r</tt> is used in both the phrase and character levels.
- </p>
- <p>By default, support for multiple scanners is disabled. The macro
- <tt>BOOST_SPIRIT_RULE_SCANNERTYPE_LIMIT</tt> must be defined to the
- maximum number of scanners allowed in a <tt>scanner_list</tt>. The value must
- be greater than 1 to enable multiple scanners. Given the
- example above, to define a limit of two scanners for the list, the
- following line must be inserted into the source file before the
- inclusion of Spirit headers:
- </p>
- <pre><span class=special> </span><span class=preprocessor>#define </span><span class=identifier>BOOST_SPIRIT_RULE_SCANNERTYPE_LIMIT</span> <span class=literal>2</span></pre>
- <table width="80%" border="0" align="center">
- <tr>
- <td class="note_box"><img src="theme/bulb.gif" width="13" height="18"> See
- the techniques section for an <a href="techniques.html#multiple_scanner_support">example</a>
- of a <a href="grammar.html">grammar</a> using a multiple scanner enabled
- rule, <a href="scanner.html#lexeme_scanner">lexeme_scanner</a> and <a href="scanner.html#as_lower_scanner">as_lower_scanner.</a></td>
- </tr>
- </table>
- <h3>Rule Declarations</h3>
- <p>The rule class models EBNF's production rule. Example:</p>
- <pre><code><font color="#000000"> <span class=identifier>rule</span><span class=special><> </span><span class=identifier>a_rule </span><span class=special>= </span><span class=special>*(</span><span class=identifier>a </span><span class=special>| </span><span class=identifier>b</span><span class=special>) </span><span class=special>& </span><span class=special>+(</span><span class=identifier>c </span><span class=special>| </span><span class=identifier>d </span><span class=special>| </span><span class=identifier>e</span><span class=special>);</span></font></code></pre>
- <p>The type and behavior of the right-hand (rhs) EBNF expression, which may be
- arbitrarily complex, is encoded in the rule named a_rule. a_rule may now be
- referenced elsewhere in the grammar:</p>
- <pre><code><font color="#000000"> <span class=identifier>rule</span><span class=special><> </span><span class=identifier>another_rule </span><span class=special>= </span><span class=identifier>f </span><span class=special>>> </span><span class=identifier>g </span><span class=special>>> </span><span class=identifier>h </span><span class=special>>> </span><span class=identifier>a_rule</span><span class=special>;</span></font></code></pre>
- <table width="80%" border="0" align="center">
- <tr>
- <td class="note_box"><img src="theme/alert.gif" width="16" height="16"> <b>Referencing
- rules <br>
- </b><br>
- When a rule is referenced anywhere in the right hand side of an EBNF expression,
- the rule is held by the expression by reference. It is the responsibility
- of the client to ensure that the referenced rule stays in scope and is
- not destroyed while it is being referenced. </td>
- </tr>
- </table>
- <pre><span class=special> </span><span class=identifier>a </span><span class=special>= </span><span class=identifier>int_p</span><span class=special>;
- </span><span class=identifier>b </span><span class=special>= </span><span class=identifier>a</span><span class=special>;
- </span><span class=identifier>c </span><span class=special>= </span><span class=identifier>int_p </span><span class=special>>> </span><span class=identifier>b</span><span class=special>;</span></pre>
- <h3>Copying Rules</h3>
- <p>The rule is a weird C++ citizen, unlike any other C++ object. It does not have
- the proper copy and assignment semantics and cannot be stored and passed around
- by value. If you need to copy a rule you have to explicitly call its member
- function <tt>copy()</tt>:</p>
- <pre><span class=special> </span><span class=identifier>r</span><span class="special">.</span><span class=identifier>copy()</span><span class=special>;</span></pre>
- <p>However, be warned that copying a rule will not deep copy the other rules
- referenced by the source rule. This might lead to dangling references.
- Again, it is the responsibility of the client to ensure that all referenced
- rules stay in scope and are not destroyed while they are being referenced.
- Caveat emptor.</p>
- <p>If you copy a rule, you will want to store the copy somewhere. The
- question is where. The storage can't be another rule:</p>
- <pre> <code><font color="#000000"><span class=identifier>rule</span><span class=special><></span></font></code> r2 <span class="special">=</span> <span class=identifier>r</span><span class="special">.</span><span class=identifier>copy()</span><span class=special>; </span><span class="comment">// BAD!</span></pre>
- <p>because rules are weird and do not have the expected C++ copy-constructor
- and assignment semantics! As a general rule: <strong>Don't put a copied rule
- into another rule! </strong>Instead, use the <a href="stored_rule.html">stored_rule</a>
- for that purpose.</p>
- <h3>Forward declarations</h3>
- <p>A <tt>rule</tt> may be declared before being defined to allow cyclic structures
- typically found in BNF declarations. Example:</p>
- <pre><code><font color="#000000"><span class=special> </span><span class=identifier>rule</span><span class=special><> </span><span class=identifier>a</span><span class=special>, </span><span class=identifier>b</span><span class=special>, </span><span class=identifier>c</span><span class=special>;
- </span><span class=identifier>a </span><span class=special>= </span><span class=identifier>b </span><span class=special>| </span><span class=identifier>a</span><span class=special>;
- </span><span class=identifier>b </span><span class=special>= </span><span class=identifier>c </span><span class=special>| </span><span class=identifier>a</span><span class=special>;</span></font></code></pre>
- <h3>Recursion</h3>
- <p>The right-hand side of a rule may reference other rules, including itself.
- The limitation is that direct or indirect left recursion is not allowed (this
- is an unchecked run-time error that results in an infinite loop). This is typical
- of top-down parsers. Example:</p>
- <pre><code><font color="#000000"><span class=special> </span><span class=identifier>a </span><span class=special>= </span><span class=identifier>a </span><span class=special>| </span><span class=identifier>b</span><span class=special>; </span><span class=comment>// infinite loop!</span></font></code></pre>
- <table width="80%" border="0" align="center">
- <tr>
- <td class="note_box"><img src="theme/lens.gif" width="15" height="16"> <b>What
- is left recursion?<br>
- </b><br>
- Left recursion happens when you have a rule that calls itself before anything
- else. A top-down parser will go into an infinite loop when this happens.
- See the <a href="faq.html#left_recursion">FAQ</a> for details on how to
- eliminate left recursion.</td>
- </tr>
- </table>
- <h3>Undefined rules</h3>
- <p>An undefined rule matches nothing and is semantically equivalent to <tt>nothing_p</tt>.</p>
- <h3>Redeclarations</h3>
- <p>Like any other C++ assignment, a second assignment to a rule is destructive
- and will redefine it. The old definition is lost. Rules are dynamic. A rule
- can change its definition anytime:</p>
- <pre><code><font color="#000000"><span class=identifier> r </span><span class=special>= </span><span class=identifier>a_definition</span><span class=special>;
- </span><span class=identifier> r </span><span class=special>= </span><span class=identifier>another_definition</span><span class=special>;</span></font></code></pre>
- <p>Rule <tt>r</tt> loses the old definition when the second assignment is made.
- As mentioned, an undefined rule matches nothing and is semantically equivalent
- to <tt>nothing_p</tt>.</p>
- <h3>Dynamic Parsers</h3>
- <p>Hosting declarative EBNF in imperative C++ yields an interesting blend that
- gives us the best of both worlds. We can conveniently modify the
- grammar at run time using imperative constructs such as <tt>if</tt> and <tt>else</tt>
- statements. Example:</p>
- <pre><code><font color="#000000"><span class=special> </span><span class=keyword>if </span><span class=special>(</span><span class=identifier>feature_is_available</span><span class=special>)
- </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>add_this_feature</span><span class=special>;</span></font></code></pre>
- <p>Rules are essentially dynamic parsers. A dynamic parser is characterized by
- its ability to modify its behavior at run time. Initially, an undefined rule
- matches nothing. At any time, the rule may be defined and redefined, thus, dynamically
- altering its behavior.</p>
- <h3>No start rule</h3>
- <p>Typically, parsers have what is called a start symbol, chosen to be the root
- of the grammar where parsing starts. The Spirit parser framework has no notion
- of a start symbol. Any rule can be a start symbol. This feature promotes step-wise
- creation of parsers. We can build parsers from the bottom up while fully testing
- each level or module until we reach the top-most level.</p>
- <h3><a name="tag"></a>Parser Tags</h3>
- <p>Rules may be tagged for identification purposes. This is necessary, especially
- when dealing with <a href="trees.html">parse trees and ASTs</a> to see which
- rule created a specific AST/parse tree node. Each rule has an ID of type <tt>parser_id</tt>.
- This ID can be obtained through the rule's <tt>id()</tt> member function:</p>
- <pre><code><font color="#000000"><span class=identifier> my_rule</span><span class=special>.</span><span class=identifier>id</span><span class=special>(); </span><span class=comment>// get my_rule's id</span></font></code></pre>
- <p>The <tt>parser_id</tt> class is declared as:</p>
- <pre>    <span class="keyword">class</span> <span class="identifier">parser_id</span>
-     <span class="special">{</span>
-     <span class="keyword">public</span><span class="special">:</span>
-         parser_id<span class="special">();</span>
-         <span class="keyword">explicit</span> parser_id<span class="special">(</span><span class="keyword">void const</span><span class="special">*</span> p<span class="special">);</span>
-         parser_id<span class="special">(</span><span class="keyword">std::size_t</span> l<span class="special">);</span>
-
-         <span class="keyword">bool</span> <span class="keyword">operator</span><span class="special">==(</span><span class="identifier">parser_id</span> <span class="keyword">const</span><span class="special">&</span> x<span class="special">)</span> const<span class="special">;</span>
-         <span class="keyword">bool</span> <span class="keyword">operator</span><span class="special">!=(</span><span class="identifier">parser_id</span> <span class="keyword">const</span><span class="special">&</span> x<span class="special">)</span> const<span class="special">;</span>
-         <span class="keyword">bool</span> <span class="keyword">operator</span><span class="special"><(</span><span class="identifier">parser_id</span> <span class="keyword">const</span><span class="special">&</span> x<span class="special">)</span> const<span class="special">;</span>
-         <span class="keyword">std::size_t</span><span class="identifier"> to_long</span><span class="special">()</span> <span class="keyword">const</span><span class="special">;
-     };</span></pre>
- <h3>parser_address_tag</h3>
- <p>The rule's <tt>TagT</tt> template parameter supplies this ID. This defaults
- to <tt>parser_address_tag</tt>. The <tt>parser_address_tag</tt> uses the address
- of the rule as its ID. This is often not the most convenient, since it is not
- always possible to get the address of a rule to compare against. </p>
- <h3>parser_tag</h3>
- <p>It is possible to have specific constant integers to identify a rule. For this
- purpose, we can use the <tt>parser_tag<N></tt>, where N is a constant
- integer:</p>
- <pre><code><font color="#000000"><span class=identifier> rule</span><span class=special><</span><span class=identifier>parser_tag</span><span class="special"><</span><span class=identifier>123</span><span class="special">> > </span><span class="identifier">my_rule</span><span class="special">; </span><span class="comment">// set my_rule's id to 123</span></font></code></pre>
- <h3>dynamic_parser_tag</h3>
- <p>The <tt>parser_tag<N></tt> can only specify a <strong>static ID</strong>,
- which is defined at compile time. If you need the ID to be <strong>dynamic</strong>
- (changeable at runtime), you can use the <tt>dynamic_parser_tag</tt> class as
- the <tt>TagT</tt> template parameter. This template parameter enables the <tt>set_id()</tt>
- function, which may be used to set the required id at runtime:</p>
- <pre><code><font color="#000000"><span class=identifier> rule</span><span class=special><</span><span class=identifier>dynamic_parser_tag</span><span class="special">> </span><span class="identifier">my_dynrule</span><span class="special">;</span>
- my_dynrule.set_id(1234); <span class="comment">// set my_dynrule's id to 1234</span></font></code></pre>
- <p>If the <tt>set_id()</tt> function isn't called, the ID defaults to the
- address of the rule, just as it would with the <tt>parser_address_tag</tt> template
- parameter. </p>
- <table border="0">
- <tr>
- <td width="10"></td>
- <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
- <td width="30"><a href="numerics.html"><img src="theme/l_arr.gif" border="0"></a></td>
- <td width="30"><a href="epsilon.html"><img src="theme/r_arr.gif" border="0"></a></td>
- </tr>
- </table>
- <br>
- <hr size="1">
- <p class="copyright">Copyright © 1998-2003 Joel de Guzman<br>
- <br>
- <font size="2">Use, modification and distribution is subject to the Boost Software
- License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
- http://www.boost.org/LICENSE_1_0.txt)</font></p>
- </body>
- </html>