123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289 |
- <html>
- <head>
- <title>Subrules</title>
- <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
- <link rel="stylesheet" href="theme/style.css" type="text/css">
- </head>
- <body>
- <table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2">
- <tr>
- <td width="10">
- </td>
- <td width="85%">
- <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>Subrules</b></font>
- </td>
- <td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" width="112" height="48" align="right" border="0"></a></td>
- </tr>
- </table>
- <br>
- <table border="0">
- <tr>
- <td width="10"></td>
- <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
- <td width="30"><a href="grammar.html"><img src="theme/l_arr.gif" border="0"></a></td>
- <td width="30"><a href="semantic_actions.html"><img src="theme/r_arr.gif" border="0"></a></td>
- </tr>
- </table>
- <p>Spirit is implemented using expression templates. This is a very powerful technique.
- Along with its power comes some complications. We almost take for granted that
- when we write <tt>i | j >> k</tt> where <tt>i</tt>, <tt>j</tt> and <tt>k</tt>
- are all integers the result is still an integer. Yet, with expression templates,
- the same expression <tt>i | j >> k</tt> where <tt>i</tt>, <tt>j</tt> and
- <tt>k</tt> are of type <tt>T</tt>, the result is a complex composite type [see
- <a href="basic_concepts.html">Basic Concepts</a>]. Spirit expressions, which
- are combinations of primitives and composites yield an infinite set of new types.
- One problem is that C++ offers no easy facility to deduce the type of an arbitrarily
- complex expression that yields a complex type. Thus, while it is easy to write:</p>
- <pre><code><font color="#000000"><span class=identifier> </span><span class=keyword>int </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>i </span><span class=special>| </span><span class=identifier>j </span><span class=special>>> </span><span class=identifier>k</span><span class=special>; </span><span class=comment>// where i, j, and k are ints</span></font></code></pre>
- <p>Expression templates yield an endless supply of types. Without the <a href="rule.html">rule</a>,
- there is no easy way to do this in C++ if <tt>i</tt>, <tt>j</tt> and <tt>k</tt>
- are Spirit parsers:</p>
- <pre><code><font color="#000000"><span class=comment> </span><span class=special><</span><span class=identifier>what_type???</span><span class=special>> </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>i </span><span class=special>| </span><span class=identifier>j </span><span class=special>>> </span><span class=identifier>k</span><span class=special>; </span><span class=comment>// where i, j, and k are Spirit parsers</span></font></code></pre>
- <p>If <tt>i</tt>, <tt>j</tt> and <tt>k</tt> are all <tt>chlit<></tt> objects,
- the type that we want is:</p>
- <pre><code><font color="#000000"><span class=comment> </span><span class=keyword>typedef
- </span><span class=identifier>alternative</span><span class=special><
- </span><span class=identifier>chlit</span><span class=special><></span><span class=comment> // i
- </span><span class=special>,</span> <span class=identifier>sequence</span><span class=special><
- </span><span class=identifier>chlit</span><span class=special><> </span><span class=comment>// j
- </span><span class=special> ,</span><span class=comment> </span><span class=identifier>chlit</span><span class=special><> </span><span class=comment>// k
- </span><span class=special>>
- >
- </span><span class=identifier>rule_t</span><span class=special>;
- </span><span class=identifier>rule_t r </span><span class=special>= </span><span class=identifier>i </span><span class=special>| </span><span class=identifier>j </span><span class=special>>> </span><span class=identifier>k</span><span class=special>; </span><span class=comment>// where i, j, and k are chlit<> objects</span></font></code></pre>
- <p>We deliberately formatted the type declaration nicely to make it understandable.
- Try that with a more complex expression. While it can be done, explicitly spelling
- out the type of a Spirit expression template is tedious and error prone. The
- right hand side (rhs) has to mirror the type of the left hand side (lhs). (<img src="theme/lens.gif" width="15" height="16">
- Yet, if you still wish to do it, see this <a href="techniques.html#no_rules">link</a>
- for a technique). </p>
- <table width="80%" border="0" align="center">
- <tr>
- <td class="note_box"><p><img src="theme/lens.gif" width="15" height="16"><b>
- typeof and auto</b> <br>
- <br>
- Some compilers already support the <tt>typeof</tt> keyword. This can be
- used to free us from having to explicitly type the type (pun intentional).
- Using the <tt>typeof</tt>, we can rewrite the Spirit expression above
- as:<br>
- <br>
- <span class="keyword"><code>typeof</code><code></code></span><code><span class=special>(</span><span class=identifier>i
- </span><span class=special>| </span><span class=identifier>j </span><span class=special>>>
- </span><span class=identifier>k</span><span class=special>) </span><span class=identifier>r
- </span><span class=special>= </span><span class=identifier>i </span><span class=special>|
- </span><span class=identifier>j </span><span class=special>>> </span><span class=identifier>k</span><span class=special>;</span></code><br>
- <br>
- While this is better than having to explicitly declare a complex type,
- it is redundant, error prone and still an eye sore. The expression is
- typed twice. The only way to simplify this is to introduce a macro (See
- this <a href="techniques.html#typeof">link</a> for more information).<br>
- <br>
- <a href="http://www.boost-consulting.com">David Abrahams</a> proposed
- in comp.std.c++ to reuse the <tt>auto</tt> keyword for type deduced variables.
- This has been extensibly discussed in <a href="http://www.boost.org">boost.org</a>. Example:
- <br>
- <br>
- <span class=keyword><code>auto </code></span><code><span class=identifier>r
- </span><span class=special>= </span><span class=identifier>i </span><span class=special>|
- </span><span class=identifier>j </span><span class=special>>> </span><span class=identifier>k</span><span class=special>;</span></code><br>
- <br>
- Once such a C++ extension is accepted into the standard, this would be
- a neat solution and a nice fit for our purpose. It's not a complete solution
- though since there are still situations where we do not know the rhs beforehand;
- for instance when pre-declaring cyclic dependent rules.</p>
- </td>
- </tr>
- </table>
- <p>Fortunately, rules come to the rescue. Rules can capture the type of the expression
- assigned to it. Thus:</p>
- <pre><code><font color="#000000"> <span class=identifier>rule</span><span class=special><> </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>i </span><span class=special>| </span><span class=identifier>j </span><span class=special>>> </span><span class=identifier>k</span><span class=special>; </span><span class=comment>// where i, j, and k are chlit<> objects</span></font></code></pre>
- <p>It might not be apparent but behind the scenes, plain rules are actually implemented
- using a pointer to a runtime polymorphic abstract class that holds the dynamic
- type of the parser assigned to it. When a Spirit expression is assigned to a
- rule, its type is encapsulated in a concrete subclass of the abstract class.
- A virtual parse function delegates the parsing to the encapsulated object.</p>
- <p>Rules have drawbacks though:</p>
- <p><img src="theme/bullet.gif" width="12" height="12"> It is coupled to a specific
- scanner type. The rule is tied to a specific scanner [see <a href="faq.html#scanner_business">The
- Scanner Business</a>].<br>
- <img src="theme/bullet.gif" width="12" height="12"> The rule's parse member
- function has a virtual function call overhead that cannot be inlined.</p>
- <h2>Static rules: subrules</h2>
- <p>The subrule is a fully static version of the rule. The subrule does not have
- the drawbacks listed above. </p>
- <p><img src="theme/bullet.gif" width="12" height="12"> The subrule is not tied
- to a specific scanner so just about any scanner type may be used<br>
- <img src="theme/bullet.gif" width="12" height="12"> The subrule also allows
- aggressive inlining since there are no virtual function calls</p>
- <pre><code><font color="#000000"><span class=identifier> </span><span class=keyword>template</span><span class=special><</span><span class=keyword>int </span></font><span class="identifier">ID</span><font color="#000000"><span class=special>, </span><span class=keyword>typename </span><span class=identifier>ContextT </span><span class=special>= </span><span class=identifier>parser_context</span><span class=special><></span> <span class=special>>
- </span><span class=keyword>class </span><span class=identifier>subrule</span><span class=special>;</span></font></code></pre>
- <p>The first template parameter gives the subrule an identification tag. Like
- the <a href="rule.html">rule</a>, there is a ContextT template parameter that
- defaults to <code><tt>parser_context</tt></code>. You need not be concerned
- at all with the <tt>ContextT</tt> template parameter unless you wish to tweak
- the low level behavior of the subrule. Detailed information on the <tt>ContextT</tt>
- template parameter is provided <a href="indepth_the_parser_context.html">elsewhere</a>.
- </p>
- <p>Presented above is the public API. There may actually be more template parameters
- after <tt>ContextT</tt>. Everything after the <tt>ContextT</tt> parameter should
- not be of concern to the client and are strictly for internal use only.</p>
- <p>Apart from a few minor differences, the subrule follows the usage and syntax
- of the rule closely. Here's the calculator grammar using subrules:</p>
- <pre><code><font color="#000000"><span class=comment> </span><span class=keyword>struct </span><span class=identifier>calculator </span><span class=special>: </span><span class=keyword>public </span><span class=identifier>grammar</span><span class=special><</span><span class=identifier>calculator</span><span class=special>>
- </span><span class=special>{
- </span><span class=keyword>template </span><span class=special><</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>>
- </span><span class=keyword>struct </span><span class=identifier>definition
- </span><span class=special>{
- </span><span class=identifier>definition</span><span class=special>(</span><span class=identifier>calculator </span><span class=keyword>const</span><span class=special>& </span><span class=identifier>self</span><span class=special>)
- </span><span class=special>{
- </span><span class=identifier>first </span><span class=special>=
- </span><span class=special>(
- </span><span class=identifier>expression </span><span class=special>= </span><span class=identifier>term </span><span class=special>>> </span><span class=special>*((</span><span class=literal>'+' </span><span class=special>>> </span><span class=identifier>term</span><span class=special>) </span><span class=special>| </span><span class=special>(</span><span class=literal>'-' </span><span class=special>>> </span><span class=identifier>term</span><span class=special>)),
- </span><span class=identifier>term </span><span class=special>= </span><span class=identifier>factor </span><span class=special>>> </span><span class=special>*((</span><span class=literal>'*' </span><span class=special>>> </span><span class=identifier>factor</span><span class=special>) </span><span class=special>| </span><span class=special>(</span><span class=literal>'/' </span><span class=special>>> </span><span class=identifier>factor</span><span class=special>)),
- </span><span class=identifier>factor </span><span class=special>= </span><span class=identifier>integer </span><span class=special>| </span><span class=identifier>group</span><span class=special>,
- </span><span class=identifier>group </span><span class=special>= </span><span class=literal>'(' </span><span class=special>>> </span><span class=identifier>expression </span><span class=special>>> </span><span class=literal>')'
- </span><span class=special>);
- </span><span class=special>}
- </span><span class=identifier>subrule</span><span class=special><</span><span class=number>0</span><span class=special>> </span><span class=identifier>expression</span><span class=special>;
- </span><span class=identifier>subrule</span><span class=special><</span><span class=number>1</span><span class=special>> </span><span class=identifier>term</span><span class=special>;
- </span><span class=identifier>subrule</span><span class=special><</span><span class=number>2</span><span class=special>> </span><span class=identifier>factor</span><span class=special>;
- </span><span class=identifier>subrule</span><span class=special><</span><span class=number>3</span><span class=special>> </span><span class=identifier>group</span><span class=special>;
- </span><span class=identifier>rule</span><span class=special><</span><span class=identifier>ScannerT</span><span class=special>> </span><span class=identifier>first</span><span class=special>;
- </span><span class=identifier>rule</span><span class=special><</span><span class=identifier>ScannerT</span><span class=special>> </span><span class=keyword>const</span><span class=special>&
- </span><span class=identifier>start</span><span class=special>() </span><span class=keyword>const </span><span class=special>{ </span><span class=keyword>return </span><span class=identifier>first</span><span class=special>; </span><span class=special>}
- </span><span class=special>};
- </span><span class=special>};</span></font></code></pre>
- <p><img src="theme/lens.gif" width="15" height="16"> A fully working example with
- <a href="semantic_actions.html">semantic actions</a> can be <a href="../example/fundamental/subrule_calc.cpp">viewed
- here</a>. This is part of the Spirit distribution. </p>
- <table border="0" align="left">
- <tr>
- <td width="199"><img src="theme/subrule1.png" width="234" height="224"></td>
- <td width="2"></td>
- </tr>
- </table>
- <p>The subrule as an efficient version of the rule. Compiler optimizations such
- as aggressive inlining help reduce the code size and increase performance significantly.
- </p>
- <p>The subrule is not a panacea however. Subrules push the C++ compiler hard to
- its knees. For example, current compilers have a limit on recursion depth that
- may not be exceeded. Don't even think about writing a full pascal grammar using
- subrules alone. A grammar using subrules is a single C++ expression. Current
- C++ compilers cannot handle very complex expressions very well. Finally, a plain
- rule is still needed to act as place holder for subrules.</p>
- <p>The code above is a good example of the recommended way to use subrules. Notice
- the hierarchy. We have a grammar that encapsulates the whole calculator. The
- start rule is a plain rule that holds the set of subrules. The subrules in turn
- defines the actual details of the grammar.</p>
- <table width="80%" border="0" align="center">
- <tr>
- <td class="note_box"><img src="theme/lens.gif" width="15" height="16"><b>
- Template instantiation depth</b> <br> <br>
- Spirit pushes the C++ compiler hard. Current C++ compilers cannot handle
- very complex heavily nested expressions very well. One restricting factor
- is the typical compiler's limit on template recursion depth. Some, but not
- all, compilers allow this limit to be configured.<br>
- <br>
- g++'s maximum can be set using a compiler flag: -ftemplate-depth. Set this
- appropriately if you have a relatively complex grammar.<br>
- <br>
- Microsoft Visual C++ can take greater than 1000 for both template class
- and function instantiation depths. However, the linker chokes with deep
- template function instantiation unless inline recursion depth is set using
- these pragmas:<br>
- <br>
- <span class="preprocessor">#pragma</span> inline_depth<span class="special">(</span>255<span class="special">)</span><br>
- <span class="preprocessor">#pragma</span> inline_recursion<span class="special">(</span>on<span class="special">)<br>
- <br>
- </span>Perhaps this limitations no longer applies to more current versions
- of these compilers. Be sure to check your compiler documentation.</td>
- </tr>
- </table>
- <p>This setup gives a good balance. The subrules do all the work. Each grammar
- will have only one rule: <tt>first</tt>. The rule is used just to hold the subrules
- and make them visible to the grammar. </p>
- <h3>The subrule definition</h3>
- <p>Like the rule, the expression after assignment operator <tt>=</tt> defines
- the subrule:</p>
- <pre> <span class=identifier>identifier </span><span class=special>= </span><span class=identifier>expression</span></pre>
- <p>Unlike rules, subrules may be defined only once. Redefining a subrule is illegal
- and will result to a compile time assertion.</p>
- <h3>Separators [ , ]</h3>
- <p>While rules are terminated by the semicollon <tt>';'</tt>. Subrules are not
- terminated but are separated by the comma: <tt>','</tt>. Like Pascal statements,
- the last subrule in a group may not have a trailing comma.</p>
- <pre><span class=identifier> </span><span class=identifier>a </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'a'</span><span class=special>),
- </span><span class=identifier>b </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'b'</span><span class=special>),
- </span><span class=identifier>c </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'c'</span><span class=special>), </span><span class=comment>// BAD, trailing comma</span><code><font color="#000000"><font color="#800000"><i></i></font></font></code><code><font color="#000000"><font color="#800000"><i></i></font></font><i></i></code></pre>
- <p>
- <pre><code><span class=comment> </span><span class=identifier>a </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'a'</span><span class=special>),
- </span><span class=identifier>b </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'b'</span><span class=special>),
- </span><span class=identifier>c </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'c'</span><span class=special>) </span><span class=comment>// OK</span></code></pre>
- <h3> The start subrule</h3>
- <p>Unlike rules, parsing proceeds from the start subrule. The first (topmost)
- subrule in a group of subrules is called the <b>start subrule</b>. In our example
- above, <tt>expression</tt> is the start subrule. When a group of subrules is
- called forth, the start subrule <tt>expression</tt> is called first.</p>
- <h3>IDs</h3>
- <p>Each subrule has a corresponding ID; an integral constant that uniquely specifies
- the subrule. Our example above has four subrules. They are declared as:</p>
- <pre><code><span class=comment> </span><span class=identifier>subrule</span><span class=special><</span><span class=number>0</span><span class=special>> </span><span class=identifier>expression</span><span class=special>;
- </span><span class=identifier>subrule</span><span class=special><</span><span class=number>1</span><span class=special>> </span><span class=identifier>term</span><span class=special>;
- </span><span class=identifier>subrule</span><span class=special><</span><span class=number>2</span><span class=special>> </span><span class=identifier>factor</span><span class=special>;
- </span><span class=identifier>subrule</span><span class=special><</span><span class=number>3</span><span class=special>> </span><span class=identifier>group</span><span class=special>;</span></code></pre>
- <h3> Aliases</h3>
- <p>It is possible to have subrules with similar IDs. A subrule with a similar
- ID to will be an alias of the other. Both subrules may be used interchangeably.</p>
- <pre><code><span class=special> </span><span class=identifier>subrule</span><span class=special><</span><span class=number>0</span><span class=special>> </span><span class=identifier>a</span><span class=special>;
- </span><span class=identifier>subrule</span><span class=special><</span><span class=number>0</span><span class=special>> </span><span class=identifier>alias</span><span class=special>; </span><span class=comment>// alias of a</span></code></pre>
- <h3>Groups: scope and nesting</h3>
- <p>The scope of a subrule and its definition is the enclosing group, typically
- (and by convention) enclosed inside the parentheses. IDs outside a scope are
- not directly visible. Inner subrule groups can be nested by enclosing each sub-group
- inside another set of parentheses. Each group is unique and acts independently.
- Consequently, while it may not be advisable to do so, a subrule in a group may
- share the same ID as a subrule in another group since both groups are independent
- of each other.</p>
- <pre><code><span class=comment> </span><span class=identifier>subrule</span><span class=special><</span><span class=number>0</span><span class=special>> </span><span class=identifier>a</span><span class=special>;
- </span><span class=identifier>subrule</span><span class=special><</span><span class=number>1</span><span class=special>> </span><span class=identifier>b</span><span class=special>;
- </span><span class=identifier>subrule</span><span class=special><</span><span class=number>0</span><span class=special>> </span><span class=identifier>c</span><span class=special>;
- </span><span class=identifier>subrule</span><span class=special><</span><span class=number>1</span><span class=special>> </span><span class=identifier>d</span><span class=special>;
- </span><span class=special>( </span><span class=comment>// outer subrule group, scope of a and b
- </span><span class=identifier>a </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'a'</span><span class=special>),
- </span><span class=identifier>b </span><span class=special>=
- </span><span class=special>( </span><span class=comment>// inner subrule group, scope of b and c
- </span><span class=identifier>c </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'c'</span><span class=special>),
- </span><span class=identifier>d </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'d'</span><span class=special>)
- </span><span class=special>)
- </span><span class=special>)</span></code></pre>
- <p>Subrule IDs need to be unique only within a group. A grammar is an implicit
- group. Furthermore, even subrules in a grammar may have the same IDs without
- clashing if they are inside a group. Subrules may be explicitly grouped using
- the parentheses. Parenthesized groups have unique scopes. In the code above,
- the outer subrule group defines the subrules <tt>a</tt> and <tt>b</tt> while
- the inner subrule group defines the subrules <tt>c</tt> and <tt>d</tt>. Notice
- that the definition of <tt>b</tt> is the inner subrule.</p>
- <table border="0">
- <tr>
- <td width="10"></td>
- <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
- <td width="30"><a href="grammar.html"><img src="theme/l_arr.gif" border="0"></a></td>
- <td width="30"><a href="semantic_actions.html"><img src="theme/r_arr.gif" border="0"></a></td>
- </tr>
- </table>
- <br>
- <hr size="1">
- <p class="copyright">Copyright © 1998-2003 Joel de Guzman<br>
- <br>
- <font size="2">Use, modification and distribution is subject to the Boost Software
- License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
- http://www.boost.org/LICENSE_1_0.txt)</font></p>
- <p> </p>
- <p><code><font color="#000000"><font color="#0000ff"></font></font></code></p>
- </body>
- </html>
|