- <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
- <html>
- <head>
- <title>Confix Parsers</title>
- <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
- <link rel="stylesheet" href="theme/style.css" type="text/css">
- </head>
- <body>
- <table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2">
- <tr>
- <td width="10"> <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b> </b></font></td>
- <td width="85%"> <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>Confix Parsers</b></font></td>
- <td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" width="112" height="48" align="right" border="0"></a></td>
- </tr>
- </table>
- <br>
- <table border="0">
- <tr>
- <td width="10"></td>
- <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
- <td width="30"><a href="character_sets.html"><img src="theme/l_arr.gif" border="0"></a></td>
- <td width="30"><a href="list_parsers.html"><img src="theme/r_arr.gif" border="0"></a></td>
- </tr>
- </table>
- <p><a name="confix_parser"></a><b>Confix Parsers</b></p>
- <p>Confix Parsers recognize a sequence of three independent elements: an
- opening, an expression and a closing. A simple example is a C comment:
- </p>
- <pre><code class="comment"> /* This is a C comment */</code></pre>
- <p>which could be parsed through the following rule definition:</p>
- <pre><span class=identifier> </span><span class=identifier>rule</span><span class=special><> </span><span class=identifier>c_comment_rule
- </span><span class=special>= </span><span class=identifier>confix_p</span><span class=special>(</span><span class=literal>"/*"</span><span class=special>, </span><span class=special>*</span><span class=identifier>anychar_p</span><span class=special>, </span><span class=literal>"*/"</span><span class=special>)
- </span><span class=special>;</span></pre>
- <p>Use the <tt>confix_p</tt> parser generator
- to create the required Confix Parser. Each of the
- three parameters to <tt>confix_p</tt> can be a single
- character, a string (as for the opening and closing above) or, if more complex
- parsing logic is required, an auxiliary parser. Each parameter is automatically
- converted to the corresponding parser type needed for successful parsing.</p>
- <p>The generated parser is equivalent to the following rule: </p>
- <pre><code> <span class=identifier>open </span><span class=special>>> (</span><span class=identifier>expr </span><span class=special>- </span><span class=identifier>close</span><span class=special>) >> </span><span class=identifier>close</span></code></pre>
- <p>If the expr parser is an <tt>action_parser_category</tt> type parser (a parser
- with an attached semantic action), special handling is required. This is the case
- if the user writes something like:</p>
- <pre><code><span class=identifier> confix_p</span><span class=special>(</span><span class=identifier>open</span><span class=special>, </span><span class=identifier>expr</span><span class=special>[</span><span class=identifier>func</span><span class=special>], </span><span class=identifier>close</span><span class=special>)</span></code></pre>
- <p>where <code>expr</code> is the parser matching the expr of the confix sequence
- and <code>func</code> is a functor to be called after matching the <code>expr</code>.
- Without special handling, the resulting code would parse the sequence as follows:</p>
- <pre><code> <span class=identifier>open </span><span class=special>>> (</span><span class=identifier>expr</span><span class=special>[</span><span class=identifier>func</span><span class=special>] - </span><span class=identifier>close</span><span class=special>) >> </span><span class=identifier>close</span></code></pre>
- <p>which in most cases is not what the user expects. (If this <u>is</u> what you
- expect, use the <tt>direct()</tt> generator function of <tt>confix_p</tt>,
- which inhibits the parser refactoring.) To make
- the confix parser behave as expected:</p>
- <pre><code><span class=identifier> open </span><span class=special>>> (</span><span class=identifier>expr </span><span class=special>- </span><span class=identifier>close</span><span class=special>)[</span><span class=identifier>func</span><span class=special>] >> </span><span class=identifier>close</span></code></pre>
- <p>the actor attached to the <code>expr</code> parser has to be re-attached to
- the <code>(expr - close)</code> parser construct, which makes the resulting
- confix parser 'do the right thing'. This refactoring is done with the help of
- the <a href="refactoring.html">Refactoring Parsers</a>. Special
- care must also be taken if the expr parser is a <tt>unary_parser_category</tt> type
- parser, as in </p>
- <pre><code><span class=identifier> confix_p</span><span class=special>(</span><span class=identifier>open</span><span class=special>, *</span><span class=identifier>anychar_p</span><span class=special>, </span><span class=identifier>close</span><span class=special>)</span></code></pre>
- <p>which without any refactoring would result in </p>
- <pre><code> <span class=identifier>open</span> <span class=special>>> (*</span><span class=identifier>anychar_p </span><span class=special>- </span><span class=identifier>close</span><span class=special>) >> </span><span class=identifier>close</span></code></pre>
- <p>and will not give the expected result (<code>*anychar_p</code> will eat up all the input up
- to the end of the input stream, leaving nothing for the final <code>close</code> to match).
- So we have to refactor this into:</p>
- <pre><code><span class=identifier> open </span><span class=special>>> *(</span><span class=identifier>anychar_p </span><span class=special>- </span><span class=identifier>close</span><span class=special>) >> </span><span class=identifier>close</span></code></pre>
- <p>which gives the correct result. </p>
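- <p>The difference between the two forms can be sketched in plain C++. The following is a hand-written model of the matching behaviour only, not Spirit code and not how Spirit implements it; the function names <code>match_unrefactored</code> and <code>match_refactored</code> are hypothetical, and both functions report whether the whole input is matched.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Model of:  open >> (*anychar_p - close) >> close
// The greedy star is not stopped by the difference, so it runs to the
// end of the input and the trailing `close` has nothing left to match.
bool match_unrefactored(const std::string& in,
                        const std::string& open,
                        const std::string& close)
{
    assert(!open.empty() && !close.empty());
    if (in.compare(0, open.size(), open) != 0)
        return false;                       // `open` did not match
    const std::size_t pos = in.size();      // *anychar_p consumed everything
    return in.compare(pos, close.size(), close) == 0;  // never true here
}

// Model of:  open >> *(anychar_p - close) >> close
// One character is taken at a time, and only while `close` does not
// match at the current position.
bool match_refactored(const std::string& in,
                      const std::string& open,
                      const std::string& close)
{
    assert(!open.empty() && !close.empty());
    if (in.compare(0, open.size(), open) != 0)
        return false;
    std::size_t pos = open.size();
    while (pos < in.size() && in.compare(pos, close.size(), close) != 0)
        ++pos;                              // anychar_p - close
    return in.compare(pos, close.size(), close) == 0
        && pos + close.size() == in.size(); // `close`, then end of input
}
```

- <p>In this model, <code>match_refactored("/* a */", "/*", "*/")</code> succeeds, while <code>match_unrefactored</code> fails on the same input because the body has already consumed the closing <code>*/</code>.</p>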
- <p>The case where the expr parser combines the two problems mentioned above
- (i.e. the expr parser is a unary parser with an attached action) is handled
- accordingly, so: </p>
- <pre><code><span class=identifier> confix_p</span><span class=special>(</span><span class=identifier>open</span><span class=special>, (*</span><span class=identifier>anychar_p</span><span class=special>)[</span><span class=identifier>func</span><span class=special>], </span>close<span class=special>)</span></code></pre>
- <p>will be parsed as expected: </p>
- <pre><code> <span class=identifier>open</span> <span class=special>>> (*(</span><span class=identifier>anychar_p </span><span class=special>- </span><span class=identifier>close</span><span class=special>))[</span><span class=identifier>func</span><span class=special>] >> </span>close</code></pre>
- <p>This refactoring is also implemented with the help of the <a href="refactoring.html">Refactoring
- Parsers</a>.</p>
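- <p>To illustrate what the refactored form hands to the action, here is a plain C++ model of <code>open >> (*(anychar_p - close))[func] >> close</code>. It is a sketch of the behaviour only, not Spirit code; <code>parse_confix</code> and <code>body_action</code> are hypothetical names, and the action is assumed to receive the matched body as a string.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical action type: called with the text between open and close.
using body_action = std::function<void(const std::string&)>;

// Returns true if `in` starts with open, a body, and close.
bool parse_confix(const std::string& in,
                  const std::string& open,
                  const std::string& close,
                  const body_action& func)
{
    assert(!open.empty() && !close.empty());
    if (in.compare(0, open.size(), open) != 0)
        return false;                        // `open` did not match
    const std::size_t begin = open.size();
    std::size_t end = begin;
    // *(anychar_p - close): stop as soon as `close` matches.
    while (end < in.size() && in.compare(end, close.size(), close) != 0)
        ++end;
    // (...)[func]: in this model the action sees the whole starred body,
    // and only the body, never the delimiters.
    func(in.substr(begin, end - begin));
    return in.compare(end, close.size(), close) == 0;  // final `close`
}
```

- <p>Called with the input <code>&lt;b&gt;bold&lt;/b&gt;</code> and the delimiters <code>&lt;b&gt;</code> and <code>&lt;/b&gt;</code>, the action in this model receives the string <code>bold</code>.</p>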
- <table width="90%" border="0" align="center">
- <tr>
- <td colspan="2" class="table_title"><b>Summary of Confix Parser refactorings</b></td>
- </tr>
- <tr class="table_title">
- <td width="40%"><b>You write it as:</b></td>
- <td width="60%"><b>It is refactored to:</b></td>
- </tr>
- <tr>
- <td width="40%" class="table_cells"><code>confix_p<span class="special">(</span><span class=identifier>open</span><span class="special">,</span>
- expr<span class="special">,</span> close<span class="special">)</span></code></td>
- <td width="60%" class="table_cells"> <p><code>open <span class=special>>>
- (</span>expr <span class=special>-</span> close<span class=special>)</span><font color="#0000FF">
- </font><span class=special>>></span> close</code></p>
- </td>
- </tr>
- <tr>
- <td width="40%" class="table_cells"><code>confix_p<span class="special">(</span><span class=identifier>open</span><span class="special">,</span>
- expr<span class="special">[</span>func<span class="special">],</span> close<span class="special">)</span></code></td>
- <td width="60%" class="table_cells"> <p><code>open <span class=special>>>
- (</span>expr <span class=special>-</span> close<span class="special">)[</span>func<span class="special">]
- <font color="#0000FF" class="special">>></font></span> close</code></p>
- </td>
- </tr>
- <tr>
- <td width="40%" class="table_cells" height="9"><code>confix_p<span class="special">(</span><span class=identifier>open</span><span class="special">,
- *</span>expr<span class="special">,</span> close<span class="special">)</span></code></td>
- <td width="60%" class="table_cells" height="9"> <p><code>open <font color="#0000FF"><span class="special">>></span></font>
- <span class="special"><font color="#0000FF" class="special">*</font>(</span>expr
- <font color="#0000FF" class="special">-</font> close<span class="special">)
- <font color="#0000FF" class="special">>></font></span> close</code></p>
- </td>
- </tr>
- <tr>
- <td width="40%" class="table_cells"><code>confix_p<span class="special">(</span><span class=identifier>open</span><span class="special">,
- (*</span>expr<span class="special">)[</span>func<span class="special">],
- close</span><span class="special">)</span></code></td>
- <td width="60%" class="table_cells"> <p><code>open <font color="#0000FF"><span class="special">>></span></font><span class="special">
- (<font color="#0000FF" class="special">*</font>(</span>expr <font color="#0000FF" class="special">-</font>
- close<span class="special">))[</span>func<span class="special">] <font color="#0000FF" class="special">>></font></span>
- close</code></p>
- </td>
- </tr>
- </table>
- <p><a name="comment_parsers"></a><b>Comment Parsers</b></p>
- <p>The Comment Parser generator template <tt>comment_p</tt>
- is a helper for generating a correct <a href="#confix_parser">Confix Parser</a>
- from auxiliary parameters. The generated parser is able to parse comment constructs of the following form:
- </p>
- <pre><code> StartCommentToken <span class="special">>></span> Comment text <span class="special">>></span> EndCommentToken</code></pre>
- <p>The following types are supported as parameters: parsers, single
- characters and strings (see as_parser). If it
- is used with one parameter, a comment starting with the given first parser
- parameter and extending to the end of the line is matched. For instance, the following
- parser matches C++ style comments:</p>
-
- <pre><code><span class=identifier> comment_p</span><span class=special>(</span><span class=string>"//"</span><span class=special>)</span></code></pre>
- <p>If it is used with two parameters, a comment starting with the first parser
- parameter and ending with the second parser parameter is matched. For instance, a C style
- comment parser could be constructed as:</p>
- <pre><code> <span class=identifier>comment_p</span><span class=special>(</span><span class=string>"/*"</span><span class=special>, </span><span class=string>"*/"</span><span class=special>)</span></code></pre>
- <p>The <tt>comment_p</tt> parser generator generates parsers for matching
- non-nested comments (such as C/C++ comments). Sometimes it is necessary to parse
- nested comments, as allowed for instance in Pascal:</p>
- <pre><code class="comment"> { This is a { nested } PASCAL-comment }</code></pre>
- <p>Such nested comments can be
- parsed by parsers generated with the <tt>comment_nest_p</tt> generator
- template functor. The following example shows a parser that can be used for
- parsing the two different (nestable) Pascal comment styles:</p>
- <pre><code> <span class=identifier>rule</span><span class=special><> </span><span class=identifier>pascal_comment
- </span><span class=special>= </span><span class=identifier>comment_nest_p</span><span class=special>(</span><span class=string>"(*"</span><span class=special>, </span><span class=string>"*)"</span><span class=special>)
- | </span><span class=identifier>comment_nest_p</span><span class=special>(</span><span class=literal>'{'</span><span class=special>, </span><span class=literal>'}'</span><span class=special>)
- ;</span></code></pre>
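- <p>What "nestable" means for matching can be sketched in plain C++ with a depth counter. This is only a model of the expected behaviour, not Spirit code and not necessarily how <tt>comment_nest_p</tt> is implemented; the function name <code>match_nested_comment</code> is hypothetical.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the number of characters consumed by one (possibly nested)
// comment at the start of `in`, or std::string::npos if none matches.
std::size_t match_nested_comment(const std::string& in,
                                 const std::string& open,
                                 const std::string& close)
{
    assert(!open.empty() && !close.empty());
    if (in.compare(0, open.size(), open) != 0)
        return std::string::npos;            // no opening delimiter
    std::size_t pos = open.size();
    std::size_t depth = 1;                   // one comment is open
    while (pos < in.size()) {
        if (in.compare(pos, open.size(), open) == 0) {
            ++depth;                         // a nested comment starts
            pos += open.size();
        } else if (in.compare(pos, close.size(), close) == 0) {
            pos += close.size();
            if (--depth == 0)
                return pos;                  // outermost comment closed
        } else {
            ++pos;                           // ordinary comment text
        }
    }
    return std::string::npos;                // unterminated comment
}
```

- <p>A non-nesting matcher would stop at the first closing delimiter; the counter is what lets the outer comment extend to its own closing delimiter.</p>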
- <p>Note that a comment is parsed implicitly as if the whole <tt>comment_p(...)</tt>
- statement were embedded in a <tt>lexeme_d[]</tt> directive, i.e. no token skipping
- occurs while a comment is being parsed, even if you have defined a skip parser
- for your whole parsing process.</p>
- <p> <img height="16" width="15" src="theme/lens.gif"> <a href="../example/fundamental/comments.cpp">comments.cpp</a> demonstrates various comment parsing schemes: </p>
- <ol>
- <li>Parsing of different comment styles
- <ul>
- <li>parsing C/C++-style comment</li>
- <li>parsing C++-style comment</li>
- <li>parsing PASCAL-style comment</li>
- </ul>
- </li>
- <li>Parsing tagged data with the help of the confix_parser</li>
- <li>Parsing tagged data with the help of the confix_parser but the semantic<br>
- action is directly attached to the body sequence parser</li>
- </ol>
- <p>This is part of the Spirit distribution.</p>
- <table border="0">
- <tr>
- <td width="10"></td>
- <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
- <td width="30"><a href="character_sets.html"><img src="theme/l_arr.gif" border="0"></a></td>
- <td width="30"><a href="list_parsers.html"><img src="theme/r_arr.gif" border="0"></a></td>
- </tr>
- </table>
- <br>
- <hr size="1">
- <p class="copyright">Copyright © 2001-2002 Hartmut Kaiser<br>
- <br>
- <font size="2">Use, modification and distribution is subject to the Boost Software
- License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
- http://www.boost.org/LICENSE_1_0.txt) </font> </p>
- </body>
- </html>