indepth_the_parser.html 22 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287
  1. <html>
  2. <head>
  3. <title>In-depth: The Parser</title>
  4. <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
  5. <link rel="stylesheet" href="theme/style.css" type="text/css">
  6. </head>
  7. <body>
  8. <table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2">
  9. <tr>
  10. <td width="10">
  11. </td>
  12. <td width="85%">
  13. <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>In-depth: The Parser</b></font>
  14. </td>
  15. <td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" width="112" height="48" align="right" border="0"></a></td>
  16. </tr>
  17. </table>
  18. <br>
  19. <table border="0">
  20. <tr>
  21. <td width="10"></td>
  22. <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
  23. <td width="30"><a href="semantic_actions.html"><img src="theme/l_arr.gif" border="0"></a></td>
  24. <td width="30"><a href="indepth_the_scanner.html"><img src="theme/r_arr.gif" border="0"></a></td>
  25. </tr>
  26. </table>
  27. <p>What makes Spirit tick? Now on to some details... The parser class is the most
  28. fundamental entity in the framework. A parser accepts a scanner comprised of
  29. a first-last iterator pair and returns a match object as its result. The iterators
  30. delimit the data currently being parsed. The match object evaluates to true
  31. if the parse succeeds, in which case the input is advanced accordingly. Each
  32. parser can represent a specific pattern or algorithm, or it can be a more complex
  33. parser formed as a composition of other parsers.</p>
  34. <p>All parsers inherit from the base template class, parser:</p>
  35. <pre>
  36. <span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>DerivedT</span><span class=special>&gt;
  37. </span><span class=keyword>struct </span><span class=identifier>parser
  38. </span><span class=special>{
  39. </span><span class=comment>/*...*/
  40. </span><span class=identifier>DerivedT</span><span class=special>&amp; </span><span class=identifier>derived</span><span class=special>();
  41. </span><span class=identifier>DerivedT </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>derived</span><span class=special>() </span><span class=keyword>const</span><span class=special>;
  42. </span><span class=special>};</span></pre>
  43. <p>This class is a protocol base class for all parsers. The parser class does
  44. not really know how to parse anything but instead relies on the template parameter
  45. <tt>DerivedT</tt> to do the actual parsing. This technique is known as the <a href="references.html#curious_recurring">&quot;Curiously
  46. Recurring Template Pattern&quot;</a> in template meta-programming circles. This
  47. inheritance strategy gives us the power of polymorphism without the virtual
  48. function overhead. In essence this is a way to implement <a href="references.html#generic_patterns">compile
  49. time polymorphism</a>.</p>
  50. <h2> parser_category_t</h2>
  51. <p> Each derived parser has a typedef <tt>parser_category_t</tt> that defines
  52. its category. By default, if one is not specified, it will inherit from the
  53. base parser class which typedefs its parser_category_t as <tt>plain_parser_category</tt>.
  54. Some template classes are provided to distinguish different types of parsers.
  55. The following categories are the most generic. More specific types may inherit
  56. from these.</p>
  57. <table width="90%" border="0" align="center">
  58. <tr>
  59. <td colspan="2" class="table_title">Parser categories</td>
  60. </tr>
  61. <tr>
  62. <td class="table_cells" width="33%"><tt>plain_parser_category</tt></td>
  63. <td class="table_cells" width="67%">Your plain vanilla parser</td>
  64. </tr>
  65. <tr>
  66. <td class="table_cells" width="33%"><tt>binary_parser_category</tt></td>
  67. <td class="table_cells" width="67%">A parser that has subject a and b (e.g.
  68. alternative)</td>
  69. </tr>
  70. <tr>
  71. <td class="table_cells" width="33%"><tt>unary_parser_category</tt></td>
  72. <td class="table_cells" width="67%">A parser that has single subject (e.g.
  73. kleene star)</td>
  74. </tr>
  75. <tr>
  76. <td class="table_cells" width="33%"><tt>action_parser_category</tt></td>
  77. <td class="table_cells" width="67%">A parser with an attached semantic action</td>
  78. </tr>
  79. </table>
  80. <pre><span class=identifier> </span><span class=keyword>struct </span><span class=identifier>plain_parser_category </span><span class=special>{};
  81. </span><span class=keyword>struct </span><span class=identifier>binary_parser_category </span><span class=special>: </span><span class=identifier>plain_parser_category </span><span class=special>{};
  82. </span><span class=keyword>struct </span><span class=identifier>unary_parser_category </span><span class=special>: </span><span class=identifier>plain_parser_category </span><span class=special>{};
  83. </span><span class=keyword>struct </span><span class=identifier>action_parser_category </span><span class=special>: </span><span class=identifier>unary_parser_category </span><span class=special>{};</span></pre>
  84. <h2>embed_t</h2>
  85. <p>Each parser has a typedef <tt>embed_t</tt>. This typedef specifies how a parser
  86. is embedded in a composite. By default, if one is not specified, the parser
  87. will be embedded by value. That is, a copy of the parser is placed as a member
  88. variable of the composite. Most parsers are embedded by value. In certain situations
  89. however, this is not desirable or possible. One particular example is the <a href="rule.html">rule</a>.
  90. The rule, unlike other parsers is embedded by reference.</p>
  91. <h2><a name="match"></a>The match</h2>
  92. <p>The match holds the result of a parser. A match object evaluates to true when
  93. a succesful match is found, otherwise false. The length of the match is the
  94. number of characters (or tokens) that is successfully matched. This can be queried
  95. through its <tt>length()</tt> member function. A negative value means that the
  96. match is unsucessful. </p>
  97. <p> Each parser may have an associated attribute. This attribute is also returned
  98. back to the client on a successful parse through the match object. We can get
  99. this attribute via the match's <tt>value()</tt> member function. Be warned though
  100. that the match's attribute may be invalid, in which case, getting the attribute
  101. will result in an exception. The member function <tt>has_valid_attribute()</tt>
  102. can be queried to know if it is safe to get the match's attribute. The attribute
  103. may be set anytime through the member function <tt>value(v)</tt>where <tt>v</tt>
  104. is the new attribute value.<br>
  105. <br>
  106. A match attribute is valid:</p>
  107. <ul>
  108. <li> on a successful match</li>
  109. <li>when its value is set through the <tt>value(val)</tt> member function</li>
  110. <li> if it is assigned or copied from a compatible match object (e.g. <tt>match&lt;double&gt;</tt>
  111. from <tt>match&lt;int&gt;</tt>) with a valid attribute. A match object <tt>A</tt>
  112. is compatible with another match object <tt>B</tt> if the attribute type of
  113. <tt>A</tt> can be assigned from the attribute type of <tt></tt> <tt>B</tt>
  114. (i.e. <tt>a = b;</tt> must compile).</li>
  115. </ul>
  116. <p>The match attribute is undefined:</p>
  117. <ul>
  118. <li>on an unsuccessful match </li>
  119. <li>when an attempt to copy or assign from another match object with an incompatible
  120. attribute type (e.g. <tt>match&lt;std::string&gt;</tt> from <tt>match&lt;int&gt;</tt>).</li>
  121. </ul>
  122. <h3>The match class:</h3>
  123. <pre><span class=keyword> template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>T</span><span class=special>&gt;
  124. </span><span class=keyword> class </span><span class=identifier>match
  125. </span><span class=keyword> </span><span class=special>{
  126. </span><span class=keyword> public</span><span class=special>:
  127. </span><span class=keyword> </span><span class=comment>/*...*/
  128. </span><span class=special> </span><span class=keyword> typedef</span><span class="identifier"> T attr_t</span><span class="special">;<br>
  129. </span><span class=keyword> </span><span class="special"> </span><span class=keyword>operator safe_bool</span><span class=special>() </span><span class=keyword>const</span>; <span class="comment">// convertible to a bool</span>
  130. <span class=keyword> int </span><span class=identifier>length</span><span class=special>() </span><span class=keyword>const</span>;
  131. <span class="keyword">bool</span> has_valid_attribute<span class="special">()</span> <span class="keyword">const</span><span class="special">;</span>
  132. <span class=keyword> </span> <span class=identifier>void</span><span class=special> </span><span class=identifier>value</span><span class=special>(</span><span class="identifier">T </span><span class="keyword">const</span><span class=special>&amp;) </span><span class=keyword>const;
  133. </span><span class=identifier>T </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>value</span><span class=special>();
  134. </span><span class=keyword> </span><span class=special>};</span></pre>
  135. <h2>match_result</h2>
  136. <p>It has been mentioned repeatedly that the parser returns a match object as
  137. its result. This is a simplification. Actually, for the sake of genericity,
  138. parsers are really not hard-coded to return a match object. More accurately,
  139. a parser returns an object that adheres to a conceptual interface, of which
  140. the match is an example. Nevertheless, we shall call the result type of a parser
  141. a match object regardless if it is actually a match class, a derivative or a
  142. totally unrelated type.</p>
  143. <table width="80%" border="0" align="center">
  144. <tr>
  145. <td class="note_box"><img src="theme/lens.gif" width="15" height="16"> <b>Meta-functions</b><br>
  146. <br>
  147. What are meta-functions? We all know how functions look like. In simplest
  148. terms, a function accepts some arguments and returns a result. Here is the
  149. function we all love so much:<br>
  150. <br>
  151. <code><span class="keyword">int</span> identity_func<span class="special">(</span><span class="keyword">int</span>
  152. arg<span class="special">)</span><br>
  153. <span class="special">{</span> <span class="keyword">return</span> arg<span class="special">;
  154. }</span> <span class="comment">// return the argument arg</span><br>
  155. </code><br>
  156. Meta-functions are essentially the same. These beasts also accept arguments
  157. and return a result. However, while functions work at runtime on values,
  158. meta-functions work at compile time on types (or constants, but we shall
  159. deal only with types). The meta-function is a template class (or struct).
  160. The template parameters are the arguments to the meta-function and a typedef
  161. within the class is the meta-function's return type. Here is the corresponding
  162. meta-function:<code><br>
  163. <br>
  164. <span class="keyword">template</span> <span class="special">&lt;</span><span class="keyword">typename</span>
  165. ArgT<span class="special">&gt;</span><br>
  166. <span class="keyword">struct</span> identity_meta_func<br>
  167. <span class="special">{</span> <span class="keyword">typedef</span> ArgT
  168. type<span class="special">; } </span><span class="comment">// return the
  169. argument ArgT</span><br>
  170. <br>
  171. </code>The meta-function above is invoked as:<br>
  172. <br>
  173. <code><span class="keyword">typename</span> identity_meta_func<span class="special">&lt;</span>ArgT<span class="special">&gt;::</span>type</code><br>
  174. <br>
  175. By convention, meta-functions return the result through the typedef <tt>type</tt>.
  176. Take note that <tt>typename</tt> is only required within templates.</td>
  177. </tr>
  178. </table>
  179. <p>The actual match type used by the parser depends on two types: the parser's
  180. attribute type and the scanner type. <tt>match_result</tt> is the meta-function
  181. that returns the desired match type given an attribute type and a scanner type.
  182. </p>
  183. <p>Usage:</p>
  184. <pre> <span class=keyword>typename </span><span class=identifier>match_result</span><span class=special>&lt;</span><span class=identifier>ScannerT</span><span class=special>, </span><span class=identifier>T</span><span class=special>&gt;::</span><span class=identifier>type</span></pre>
  185. <p>The meta-function basically answers the question &quot;given a scanner type
  186. <tt>ScannerT</tt> and an attribute type <tt>T</tt>, what is the desired match
  187. type?&quot; [<img src="theme/note.gif" width="16" height="16"> <tt>typename</tt>
  188. is only required within templates ].</p>
  189. <h2>The parse member function</h2>
  190. <p>Concrete sub-classes inheriting from parser must have a corresponding member
  191. function <tt>parse(...)</tt> compatible with the conceptual Interface:<br>
  192. </p>
  193. <pre><span class=identifier> </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>&gt;
  194. </span><span class=identifier>RT
  195. </span><span class=identifier>parse</span><span class=special>(</span><span class=identifier>ScannerT</span><span class=special></span> const<span class=special>&amp; </span>scan<span class=identifier></span><span class=special>) </span><span class=keyword>const</span><span class=special>;</span></pre>
  196. <p>where <tt>RT</tt> is the desired return type of the parser. </p>
  197. <h2>The parser result</h2>
  198. <p>Concrete sub-classes inheriting from parser in most cases need to have a nested
  199. meta-function <tt>result</tt> that returns the result <tt>type</tt> of the parser's
  200. parse member function, given a scanner type. The meta-function has the form:</p>
  201. <pre><span class=keyword> template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>&gt;
  202. </span><span class=keyword>struct </span><span class=identifier>result
  203. </span><span class=special>{
  204. </span><span class=keyword>typedef </span>RT <span class=identifier></span><span class=identifier>type</span><span class=special>;
  205. </span><span class=special>};</span></pre>
  206. <p>where <tt>RT</tt> is the desired return type of the parser. This is usually,
  207. but not always, dependent on the template parameter <tt>ScannerT</tt>. For example,
  208. given an attribute type <tt>int</tt>, we can use the match_result metafunction:</p>
  209. <pre><span class=keyword> template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>&gt;
  210. </span><span class=keyword>struct </span><span class=identifier>result
  211. </span><span class=special>{
  212. </span><span class=keyword>typedef typename </span><span class=identifier>match_result</span><span class=special>&lt;</span><span class=identifier>ScannerT</span><span class=special>, </span><span class="keyword">int</span><span class=special>&gt;::</span><span class=identifier>type type</span><span class=special>;
  213. };</span></pre>
  214. <p>If a parser does not supply a result metafunction, a default is provided by
  215. the base parser class.<span class=special> </span>The default is declared as:</p>
  216. <pre><span class=keyword> template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>&gt;
  217. </span><span class=keyword>struct </span><span class=identifier>result
  218. </span><span class=special>{
  219. </span><span class=keyword>typedef typename </span><span class=identifier>match_result</span><span class=special>&lt;</span><span class=identifier>ScannerT</span><span class=special>, </span><span class="identifier">nil_t</span><span class=special>&gt;::</span><span class=identifier>type type</span><span class=special>;
  220. };</span></pre>
  221. <p>Without a result metafunction, notice that the parser's default attribute is
  222. <tt>nil_t</tt> (i.e. the parser has no attribute).</p>
  223. <h2><span class=special></span>parser_result</h2>
  224. <p>Given a a scanner type <tt>ScannerT</tt> and a parser type <tt>ParserT</tt>,
  225. what will be the actual result of the parser? The answer to this question is
  226. provided to by the <tt>parser_result</tt> meta-function.</p>
  227. <p>Usage:</p>
  228. <pre> <span class=keyword>typename </span><span class=identifier>parser_result</span><span class=special>&lt;</span><span class=identifier>ParserT, ScannerT</span><span class=special>&gt;::</span><span class=identifier>type</span></pre>
  229. <p>In general, the meta-function just forwards the invocation to the parser's
  230. result meta-function:</p>
  231. <pre><span class=identifier> </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>ParserT</span><span class=special>, </span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>&gt;
  232. </span><span class=keyword>struct </span><span class=identifier>parser_result
  233. </span><span class=special>{
  234. </span><span class=keyword>typedef </span><span class=keyword>typename </span><span class=identifier>ParserT</span><span class=special>::</span><span class=keyword>template </span><span class=identifier>result</span><span class=special>&lt;</span><span class=identifier>ScannerT</span><span class=special>&gt;::</span><span class=identifier>type </span><span class=identifier>type</span><span class=special>;
  235. </span><span class=special>};</span></pre>
  236. <p>This is similar to a global function calling a member function. Most of the
  237. time, the usage above is equivalent to:</p>
  238. <pre><span class=keyword> typename </span><span class=identifier>ParserT</span><span class=special>::</span><span class=keyword>template </span><span class=identifier>result</span><span class=special>&lt;</span><span class=identifier>ScannerT</span><span class=special>&gt;::</span><span class=identifier>type</span></pre>
  239. <p>Yet, this should not be relied upon to be true all the time because the parser_result
  240. metafunction might be specialized for specific parser and/or scanner types.</p>
  241. <p>The parser_result metafunction makes the signature of the required parse member
  242. function almost canonical:</p>
  243. <pre><span class=identifier> </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>&gt;
  244. </span><span class=keyword>typename </span><span class=identifier>parser_result</span><span class=special>&lt;</span><span class=identifier>self_t, ScannerT</span><span class=special>&gt;::</span><span class=identifier>type</span><br> <span class=identifier>parse</span><span class=special>(</span><span class=identifier>ScannerT</span><span class=special></span> const<span class=special>&amp; </span>scan<span class=identifier></span><span class=special>) </span><span class=keyword>const</span><span class=special>;</span></pre>
  245. <p>where<span class=special></span> <tt>self_t</tt> is a typedef to the parser.</p>
  246. <h2>parser class declaration</h2>
  247. <pre><span class=identifier> </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>DerivedT</span><span class=special>&gt;
  248. </span><span class=keyword>struct </span><span class=identifier>parser
  249. </span><span class=special>{
  250. </span><span class=keyword>typedef </span><span class=identifier>DerivedT embed_t</span><span class=special>;
  251. </span><span class=keyword>typedef </span><span class=identifier>DerivedT derived_t</span><span class=special>;
  252. </span><span class=keyword>typedef </span><span class=identifier>plain_parser_category parser_category_t</span><span class=special>;
  253. </span><span class=keyword>template </span><span class=special>&lt;</span><span class="keyword">typename</span> ScannerT<span class=special>&gt;
  254. </span><span class=keyword>struct </span><span class=identifier>result
  255. </span><span class=special>{
  256. </span><span class=keyword>typedef typename </span><span class=identifier>match_result</span><span class=special>&lt;</span><span class=identifier>ScannerT</span><span class=special>, </span><span class=identifier>nil_t</span><span class=special>&gt;::</span><span class=identifier>type type</span><span class=special>;
  257. };
  258. </span><span class=identifier>DerivedT</span><span class=special>&amp; </span><span class=identifier>derived</span><span class=special>();
  259. </span><span class=identifier>DerivedT </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>derived</span><span class=special>() </span><span class=keyword>const</span><span class=special>;
  260. </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>ActionT</span><span class=special>&gt;
  261. </span><span class=identifier>action</span><span class=special>&lt;</span><span class=identifier>DerivedT</span><span class=special>, </span><span class=identifier>ActionT</span><span class=special>&gt;
  262. </span><span class=keyword>operator</span><span class=special>[](</span><span class=identifier>ActionT </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>actor</span><span class=special>) </span><span class=keyword>const</span><span class=special>;
  263. };</span></pre>
  264. <table border="0">
  265. <tr>
  266. <td width="10"></td>
  267. <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
  268. <td width="30"><a href="semantic_actions.html"><img src="theme/l_arr.gif" border="0"></a></td>
  269. <td width="30"><a href="indepth_the_scanner.html"><img src="theme/r_arr.gif" border="0"></a></td>
  270. </tr>
  271. </table>
  272. <br>
  273. <hr size="1">
  274. <p class="copyright">Copyright &copy; 1998-2003 Joel de Guzman<br>
  275. <br>
  276. <font size="2">Use, modification and distribution is subject to the Boost Software
  277. License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
  278. http://www.boost.org/LICENSE_1_0.txt) </font> </p>
  279. </body>
  280. </html>