  1. <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
  2. <html>
  3. <head>
  4. <title>Confix Parsers</title>
  5. <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
  6. <link rel="stylesheet" href="theme/style.css" type="text/css">
  7. </head>
  8. <body>
  9. <table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2">
  10. <tr>
  11. <td width="10"> <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>&nbsp;</b></font></td>
  12. <td width="85%"> <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>Confix Parsers</b></font></td>
  13. <td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" width="112" height="48" align="right" border="0"></a></td>
  14. </tr>
  15. </table>
  16. <br>
  17. <table border="0">
  18. <tr>
  19. <td width="10"></td>
  20. <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
  21. <td width="30"><a href="character_sets.html"><img src="theme/l_arr.gif" border="0"></a></td>
  22. <td width="30"><a href="list_parsers.html"><img src="theme/r_arr.gif" border="0"></a></td>
  23. </tr>
  24. </table>
  25. <p><a name="confix_parser"></a><b>Confix Parsers</b></p>
  26. <p>Confix Parsers recognize a sequence of three independent elements: an
  27. opening, an expression and a closing. A simple example is a C comment:
  28. </p>
  29. <pre><code class="comment"> /* This is a C comment */</code></pre>
  30. <p>which could be parsed with the following rule definition:
  31. </p>
  32. <pre><span class=identifier> </span><span class=identifier>rule</span><span class=special>&lt;&gt; </span><span class=identifier>c_comment_rule
  33. </span><span class=special>= </span><span class=identifier>confix_p</span><span class=special>(</span><span class=literal>"/*"</span><span class=special>, </span><span class=special>*</span><span class=identifier>anychar_p</span><span class=special>, </span><span class=literal>"*/"</span><span class=special>)
  34. </span><span class=special>;</span></pre>
  35. <p>The <tt>confix_p</tt> parser generator
  36. generates the required Confix Parser. The
  37. three parameters to <tt>confix_p</tt> can be single
  38. characters, strings (as above) or, if more complex parsing logic is required,
  39. auxiliary parsers, each of which is automatically converted to the corresponding
  40. parser type needed for successful parsing.</p>
  41. <p>The generated parser is equivalent to the following rule: </p>
  42. <pre><code> <span class=identifier>open </span><span class=special>&gt;&gt; (</span><span class=identifier>expr </span><span class=special>- </span><span class=identifier>close</span><span class=special>) &gt;&gt; </span><span class=identifier>close</span></code></pre>
  43. <p>If the expr parser is an <tt>action_parser_category</tt> type parser (a parser
  44. with an attached semantic action), special handling is required. This happens
  45. if the user writes something like:</p>
  46. <pre><code><span class=identifier> confix_p</span><span class=special>(</span><span class=identifier>open</span><span class=special>, </span><span class=identifier>expr</span><span class=special>[</span><span class=identifier>func</span><span class=special>], </span><span class=identifier>close</span><span class=special>)</span></code></pre>
  47. <p>where <code>expr</code> is the parser matching the body of the confix sequence
  48. and <code>func</code> is a functor to be called after matching the <code>expr</code>.
  49. Without special handling, the resulting code would parse the sequence as follows:</p>
  50. <pre><code> <span class=identifier>open </span><span class=special>&gt;&gt; (</span><span class=identifier>expr</span><span class=special>[</span><span class=identifier>func</span><span class=special>] - </span><span class=identifier>close</span><span class=special>) &gt;&gt; </span><span class=identifier>close</span></code></pre>
  51. <p>which in most cases is not what the user expects. (If this <u>is</u> what you
  52. expect, use the <tt>confix_p</tt> generator
  53. function <tt>direct()</tt>, which inhibits the parser refactoring.) To make
  54. the confix parser behave as expected:</p>
  55. <pre><code><span class=identifier> open </span><span class=special>&gt;&gt; (</span><span class=identifier>expr </span><span class=special>- </span><span class=identifier>close</span><span class=special>)[</span><span class=identifier>func</span><span class=special>] &gt;&gt; </span><span class=identifier>close</span></code></pre>
  56. <p>the actor attached to the <code>expr</code> parser has to be re-attached to
  57. the <code>(expr - close)</code> parser construct, which makes the resulting
  58. confix parser 'do the right thing'. This refactoring is done with the help of
  59. the <a href="refactoring.html">Refactoring Parsers</a>. Additionally, special
  60. care must be taken if the expr parser is a <tt>unary_parser_category</tt> type
  61. parser, as in </p>
  62. <pre><code><span class=identifier> confix_p</span><span class=special>(</span><span class=identifier>open</span><span class=special>, *</span><span class=identifier>anychar_p</span><span class=special>, </span><span class=identifier>close</span><span class=special>)</span></code></pre>
  63. <p>which without any refactoring would result in </p>
  64. <pre><code> <span class=identifier>open</span> <span class=special>&gt;&gt; (*</span><span class=identifier>anychar_p </span><span class=special>- </span><span class=identifier>close</span><span class=special>) &gt;&gt; </span><span class=identifier>close</span></code></pre>
  65. <p>and will not give the expected result (<tt>*anychar_p</tt> consumes all the input up
  66. to the end of the input stream, leaving nothing for <tt>close</tt> to match). So we have to refactor this into:</p>
  67. <pre><code><span class=identifier> open </span><span class=special>&gt;&gt; *(</span><span class=identifier>anychar_p </span><span class=special>- </span><span class=identifier>close</span><span class=special>) &gt;&gt; </span><span class=identifier>close</span></code></pre>
  68. <p>which gives the correct result. </p>
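<p>The difference between the two forms can be illustrated without Spirit. The following is a minimal hand-written sketch, not Spirit's implementation: the helper names <tt>starts_with_at</tt>, <tt>match_unrefactored</tt> and <tt>match_refactored</tt> are hypothetical, and the sketch assumes that <tt>open</tt> and <tt>close</tt> are plain strings and that the body parser is <tt>anychar_p</tt>.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns true if `token` occurs in `s` starting exactly at position `pos`.
// Precondition: pos <= s.size().
bool starts_with_at(const std::string& s, std::size_t pos, const std::string& token) {
    return s.compare(pos, token.size(), token) == 0;
}

// Simplified emulation of: open >> (*anychar_p - close) >> close
// The greedy *anychar_p consumes the rest of the input, so the trailing
// `close` has nothing left to match and the whole sequence fails.
bool match_unrefactored(const std::string& input,
                        const std::string& open,
                        const std::string& close) {
    if (!starts_with_at(input, 0, open)) return false;
    const std::size_t pos = input.size();      // *anychar_p ran to the end
    return starts_with_at(input, pos, close);  // false for any non-empty close
}

// Emulation of: open >> *(anychar_p - close) >> close
// The body advances one character at a time and stops as soon as `close`
// is seen, so the trailing `close` can still match.
bool match_refactored(const std::string& input,
                      const std::string& open,
                      const std::string& close) {
    if (!starts_with_at(input, 0, open)) return false;
    std::size_t pos = open.size();
    while (pos < input.size() && !starts_with_at(input, pos, close)) ++pos;
    return starts_with_at(input, pos, close);
}
```

<p>With the C comment from the beginning of this page, <tt>match_refactored(text, "/*", "*/")</tt> succeeds while <tt>match_unrefactored(text, "/*", "*/")</tt> does not.</p>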
  69. <p>The case where the expr parser combines the two problems mentioned above
  70. (i.e. the expr parser is a unary parser with an attached action) is handled
  71. as well, so: </p>
  72. <pre><code><span class=identifier> confix_p</span><span class=special>(</span><span class=identifier>open</span><span class=special>, (*</span><span class=identifier>anychar_p</span><span class=special>)[</span><span class=identifier>func</span><span class=special>], </span>close<span class=special>)</span></code></pre>
  73. <p>will be parsed as expected: </p>
  74. <pre><code> <span class=identifier>open</span> <span class=special>&gt;&gt; (*(</span><span class=identifier>anychar_p </span><span class=special>- </span><span class=identifier>close</span><span class=special>))[</span><span class=identifier>func</span><span class=special>] &gt;&gt; </span>close</code></pre>
  75. <p>This refactoring is likewise implemented with the help of the <a href="refactoring.html">Refactoring
  76. Parsers</a>.</p>
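<p>The effect of re-attaching the action can also be sketched in plain C++. The function below is a hypothetical illustration, not part of Spirit: it emulates <tt>open &gt;&gt; (*(anychar_p - close))[func] &gt;&gt; close</tt> for string delimiters and passes the matched body to <tt>func</tt> as a <tt>std::string</tt>, whereas a real Spirit action receives an iterator pair. As a further simplification, <tt>func</tt> is only called when the whole construct matches.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical emulation of: open >> (*(anychar_p - close))[func] >> close
// `func` receives only the body, without the opening and closing delimiters.
bool parse_confix_with_action(const std::string& input,
                              const std::string& open,
                              const std::string& close,
                              const std::function<void(const std::string&)>& func) {
    // The opening delimiter must be found at the very start of the input.
    if (input.compare(0, open.size(), open) != 0) return false;

    // The body extends up to, but does not include, the first `close`.
    const std::size_t body_begin = open.size();
    const std::size_t body_end = input.find(close, body_begin);
    if (body_end == std::string::npos) return false;  // no closing delimiter

    // Simplification: the action fires only after `close` has been found.
    func(input.substr(body_begin, body_end - body_begin));
    return true;
}
```

<p>Called with the input <tt>&lt;b&gt;bold&lt;/b&gt;</tt> and the delimiters <tt>&lt;b&gt;</tt> and <tt>&lt;/b&gt;</tt>, the action receives the body text only.</p>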
  77. <table width="90%" border="0" align="center">
  78. <tr>
  79. <td colspan="2" class="table_title"><b>Summary of Confix Parser refactorings</b></td>
  80. </tr>
  81. <tr class="table_title">
  82. <td width="40%"><b>You write it as:</b></td>
  83. <td width="60%"><b>It
  84. is refactored to:</b></td>
  85. </tr>
  86. <tr>
  87. <td width="40%" class="table_cells"><code>confix_p<span class="special">(</span><span class=identifier>open</span><span class="special">,</span>
  88. expr<span class="special">,</span> close<span class="special">)</span></code></td>
  89. <td width="60%" class="table_cells"> <p><code>open <span class=special>&gt;&gt;
  90. (</span>expr <span class=special>-</span> close<span class=special>)</span><font color="#0000FF">
  91. </font><span class=special>&gt;&gt;</span> close</code></p>
  92. </td>
  93. </tr>
  94. <tr>
  95. <td width="40%" class="table_cells"><code>confix_p<span class="special">(</span><span class=identifier>open</span><span class="special">,</span>
  96. expr<span class="special">[</span>func<span class="special">],</span> close<span class="special">)</span></code></td>
  97. <td width="60%" class="table_cells"> <p><code>open <span class=special>&gt;&gt;
  98. (</span>expr <span class=special>-</span> close<span class="special">)[</span>func<span class="special">]
  99. <font color="#0000FF" class="special">&gt;&gt;</font></span> close</code></p>
  100. </td>
  101. </tr>
  102. <tr>
  103. <td width="40%" class="table_cells" height="9"><code>confix_p<span class="special">(</span><span class=identifier>open</span><span class="special">,
  104. *</span>expr<span class="special">,</span> close<span class="special">)</span></code></td>
  105. <td width="60%" class="table_cells" height="9"> <p><code>open <font color="#0000FF"><span class="special">&gt;&gt;</span></font>
  106. <span class="special"><font color="#0000FF" class="special">*</font>(</span>expr
  107. <font color="#0000FF" class="special">-</font> close<span class="special">)
  108. <font color="#0000FF" class="special">&gt;&gt;</font></span> close</code></p>
  109. </td>
  110. </tr>
  111. <tr>
  112. <td width="40%" class="table_cells"><code>confix_p<span class="special">(</span><span class=identifier>open</span><span class="special">,
  113. (*</span>expr<span class="special">)[</span>func<span class="special">],
  114. close</span><span class="special">)</span></code></td>
  115. <td width="60%" class="table_cells"> <p><code>open <font color="#0000FF"><span class="special">&gt;&gt;</span></font><span class="special">
  116. (<font color="#0000FF" class="special">*</font>(</span>expr <font color="#0000FF" class="special">-</font>
  117. close<span class="special">))[</span>func<span class="special">] <font color="#0000FF" class="special">&gt;&gt;</font></span>
  118. close</code></p>
  119. </td>
  120. </tr>
  121. </table>
  122. <p><a name="comment_parsers"></a><b>Comment Parsers</b></p>
  123. <p>The Comment Parser generator template <tt>comment_p</tt>
  124. is a helper for generating a correct <a href="#confix_parser">Confix Parser</a>
  125. from auxiliary parameters. The generated parser is able to parse comment constructs of the following form:
  126. </p>
  127. <pre><code> StartCommentToken <span class="special">&gt;&gt;</span> Comment text <span class="special">&gt;&gt;</span> EndCommentToken</code></pre>
  128. <p>The following types are supported as parameters: parsers, single
  129. characters and strings (see as_parser). If <tt>comment_p</tt>
  130. is used with one parameter, a comment starting with the given
  131. parameter and extending to the end of the line is matched. For instance, the following
  132. parser matches C++ style comments:</p>
  133. <pre><code><span class=identifier> comment_p</span><span class=special>(</span><span class=string>"//"</span><span class=special>)</span></code></pre>
  134. <p>If it is used with two parameters, a comment starting with the first
  135. parameter and ending with the second parameter is matched. For instance, a C style
  136. comment parser could be constructed as:</p>
  137. <pre><code> <span class=identifier>comment_p</span><span class=special>(</span><span class=string>"/*"</span><span class=special>, </span><span class=string>"*/"</span><span class=special>)</span></code></pre>
  138. <p>The <tt>comment_p</tt> parser generator generates parsers for matching
  139. non-nested comments (such as C/C++ comments). Sometimes it is necessary to parse
  140. nested comments, as allowed for instance in Pascal.</p>
  141. <pre><code class="comment"> { This is a { nested } PASCAL-comment }</code></pre>
  142. <p>Such nested comments are
  143. parseable through parsers generated by the <tt>comment_nest_p</tt> generator
  144. template functor. The following example shows a parser, which can be used for
  145. parsing the two different (nestable) Pascal comment styles:</p>
  146. <pre><code> <span class=identifier>rule</span><span class=special>&lt;&gt; </span><span class=identifier>pascal_comment
  147. </span><span class=special>= </span><span class=identifier>comment_nest_p</span><span class=special>(</span><span class=string>"(*"</span><span class=special>, </span><span class=string>"*)"</span><span class=special>)
  148. | </span><span class=identifier>comment_nest_p</span><span class=special>(</span><span class=literal>'{'</span><span class=special>, </span><span class=literal>'}'</span><span class=special>)
  149. ;</span></code></pre>
  150. <p>Note that a comment is parsed implicitly as if the whole <tt>comment_p(...)</tt>
  151. statement were embedded into a <tt>lexeme_d[]</tt> directive, i.e. during parsing
  152. of a comment no token skipping will occur, even if you've defined a skip parser
  153. for your whole parsing process.</p>
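<p>To make the idea of nested comment matching concrete, here is a minimal sketch in plain C++ that tracks the nesting depth by hand. It is an illustration only and does not show how <tt>comment_nest_p</tt> is implemented; the function name <tt>match_nested_comment</tt> is hypothetical, and the sketch assumes that <tt>open</tt> and <tt>close</tt> are distinct, non-empty strings.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the length of the nested comment found at the start of `input`,
// or std::string::npos if the input does not start with a complete comment.
// Assumes `open` and `close` are distinct, non-empty strings.
std::size_t match_nested_comment(const std::string& input,
                                 const std::string& open,
                                 const std::string& close) {
    if (open.empty() || close.empty()) return std::string::npos;
    if (input.compare(0, open.size(), open) != 0) return std::string::npos;

    std::size_t depth = 1;           // one comment is open so far
    std::size_t pos = open.size();
    while (pos < input.size()) {
        if (input.compare(pos, open.size(), open) == 0) {
            ++depth;                 // a nested comment starts here
            pos += open.size();
        } else if (input.compare(pos, close.size(), close) == 0) {
            pos += close.size();
            if (--depth == 0) return pos;  // outermost comment is closed
        } else {
            ++pos;                   // ordinary comment text
        }
    }
    return std::string::npos;        // ran out of input while still nested
}
```

<p>No characters are skipped inside the comment, which corresponds to the lexeme-like behavior described above: every character between the delimiters counts as comment text.</p>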
  154. <p> <img height="16" width="15" src="theme/lens.gif"> <a href="../example/fundamental/comments.cpp">comments.cpp</a> demonstrates various comment parsing schemes: </p>
  155. <ol>
  156. <li>Parsing of different comment styles
  157. <ul>
  158. <li>parsing C/C++-style comment</li>
  159. <li>parsing C++-style comment</li>
  160. <li>parsing PASCAL-style comment</li>
  161. </ul></li>
  162. <li>Parsing tagged data with the help of the confix_parser</li>
  163. <li>Parsing tagged data with the help of the confix_parser but the semantic<br>
  164. action is directly attached to the body sequence parser</li>
  165. </ol>
  166. <p>This is part of the Spirit distribution.</p>
  167. <table border="0">
  168. <tr>
  169. <td width="10"></td>
  170. <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
  171. <td width="30"><a href="character_sets.html"><img src="theme/l_arr.gif" border="0"></a></td>
  172. <td width="30"><a href="list_parsers.html"><img src="theme/r_arr.gif" border="0"></a></td>
  173. </tr>
  174. </table>
  175. <br>
  176. <hr size="1">
  177. <p class="copyright">Copyright &copy; 2001-2002 Hartmut Kaiser<br>
  178. <br>
  179. <font size="2">Use, modification and distribution is subject to the Boost Software
  180. License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
  181. http://www.boost.org/LICENSE_1_0.txt) </font> </p>
  182. </body>
  183. </html>