<html>
<head>
<title>In-depth The Scanner</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<link rel="stylesheet" href="theme/style.css" type="text/css">
</head>
<body>
<table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2">
  <tr>
    <td width="10">
    </td>
    <td width="85%"> <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>In-depth: The Scanner</b></font> </td>
    <td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" width="112" height="48" align="right" border="0"></a></td>
  </tr>
</table>
<br>
<table border="0">
  <tr>
    <td width="10"></td>
    <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
    <td width="30"><a href="indepth_the_parser.html"><img src="theme/l_arr.gif" border="0"></a></td>
    <td width="30"><a href="indepth_the_parser_context.html"><img src="theme/r_arr.gif" border="0"></a></td>
  </tr>
</table>
<h2>Basic Scanner API</h2>
<table width="90%" border="0" align="center">
  <tr>
    <td class="table_title" colspan="10"> class scanner </td>
  </tr>
  <tr>
    <td class="table_cells"><code><span class=identifier>value_t</span></code></td>
    <td class="table_cells">typedef: The value type of the scanner's iterator</td>
  </tr>
  <tr>
    <td class="table_cells"><code><span class=identifier>ref_t</span></code></td>
    <td class="table_cells">typedef: The reference type of the scanner's iterator</td>
  </tr>
  <tr>
    <td class="table_cells"><code><span class=keyword>bool </span><span class=identifier>at_end</span><span class=special>() </span><span class=keyword>const</span></code></td>
    <td class="table_cells">Returns true if the input is exhausted</td>
  </tr>
  <tr>
    <td class="table_cells"><code><span class=identifier>value_t </span><span class=keyword>operator</span><span class=special>*() </span><span class=keyword>const</span></code></td>
    <td class="table_cells">Dereference/get a <code><span class=identifier>value_t</span></code> from the input</td>
  </tr>
  <tr>
    <td class="table_cells"><code><span class=identifier>scanner </span><span class=keyword>const</span><span class=special>&amp; </span><span class=keyword>operator</span><span class=special>++()</span></code></td>
    <td class="table_cells">Move the scanner forward</td>
  </tr>
  <tr>
    <td class="table_cells"><code><span class=identifier>IteratorT&amp; first</span></code></td>
    <td class="table_cells">The iterator pointing to the current input position. Held by reference</td>
  </tr>
  <tr>
    <td class="table_cells"><code><span class=identifier>IteratorT </span><span class=keyword>const </span><span class=identifier>last</span></code></td>
    <td class="table_cells">The iterator pointing to the end of the input. Held by value</td>
  </tr>
</table>
<p>The basic behavior of the scanner is handled by policies: the public member
functions listed in the table above are implemented by the scanner policies.</p>
<p>Three sets of policies govern the behavior of the scanner. These policies
make it possible to extend Spirit non-intrusively: core functionality can be
extended without potentially destabilizing changes to the core code. A library
writer might provide her own policies that override the ones already in place,
fine-tuning the parsing process to fit her own needs. Layers above the core
might also want to take advantage of this policy-based mechanism. Abstract
syntax tree generation, debuggers and lexers come to mind.</p>
<p>The three sets of policies govern:</p>
<ul>
  <li>Iteration and filtering</li>
  <li>Recognition and matching</li>
  <li>Handling semantic actions</li>
</ul>
<a name="iteration_policy"></a>
<h2>iteration_policy</h2>
<p>Here are the default policies that govern iteration and filtering:</p>
<pre>
<code><span class=keyword>struct </span><span class=identifier>iteration_policy</span>
<span class=special>{</span>
    <span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>&gt;</span>
    <span class=keyword>void</span>
    <span class=identifier>advance</span><span class=special>(</span><span class=identifier>ScannerT </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>scan</span><span class=special>) </span><span class=keyword>const</span>
    <span class=special>{ ++</span><span class=identifier>scan</span><span class=special>.</span><span class=identifier>first</span><span class=special>; }</span>

    <span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>&gt;</span>
    <span class=keyword>bool </span><span class=identifier>at_end</span><span class=special>(</span><span class=identifier>ScannerT </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>scan</span><span class=special>) </span><span class=keyword>const</span>
    <span class=special>{ </span><span class=keyword>return </span><span class=identifier>scan</span><span class=special>.</span><span class=identifier>first </span><span class=special>== </span><span class=identifier>scan</span><span class=special>.</span><span class=identifier>last</span><span class=special>; }</span>

    <span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>T</span><span class=special>&gt;</span>
    <span class=identifier>T </span><span class=identifier>filter</span><span class=special>(</span><span class=identifier>T </span><span class=identifier>ch</span><span class=special>) </span><span class=keyword>const</span>
    <span class=special>{ </span><span class=keyword>return </span><span class=identifier>ch</span><span class=special>; }</span>

    <span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>&gt;</span>
    <span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>::</span><span class=identifier>ref_t</span>
    <span class=identifier>get</span><span class=special>(</span><span class=identifier>ScannerT </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>scan</span><span class=special>) </span><span class=keyword>const</span>
    <span class=special>{ </span><span class=keyword>return </span><span class=special>*</span><span class=identifier>scan</span><span class=special>.</span><span class=identifier>first</span><span class=special>; }</span>
<span class=special>};</span></code></pre>
<table width="90%" border="0" align="center">
  <tr>
    <td class="table_title" colspan="8"> Iteration and filtering policies </td>
  </tr>
  <tr>
    <td class="table_cells"><b>advance</b></td>
    <td class="table_cells">Move the iterator forward</td>
  </tr>
  <tr>
    <td class="table_cells"><b>at_end</b></td>
    <td class="table_cells">Return true if the input is exhausted</td>
  </tr>
  <tr>
    <td class="table_cells"><b>filter</b></td>
    <td class="table_cells">Filter a character read from the input</td>
  </tr>
  <tr>
    <td class="table_cells"><b>get</b></td>
    <td class="table_cells">Read a character from the input</td>
  </tr>
</table>
<p>The following code snippet demonstrates a simple policy that converts all
characters to lower case:</p>
<pre>
<code><span class=keyword>struct </span><span class=identifier>inhibit_case_iteration_policy </span><span class=special>: </span><span class=keyword>public </span><span class=identifier>iteration_policy</span>
<span class=special>{</span>
    <span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>CharT</span><span class=special>&gt;</span>
    <span class=identifier>CharT filter</span><span class=special>(</span><span class=identifier>CharT ch</span><span class=special>) </span><span class=keyword>const</span>
    <span class=special>{</span>
        <span class=keyword>return </span>std::<span class=identifier>tolower</span><span class=special>(</span><span class=identifier>ch</span><span class=special>);</span>
    <span class=special>}</span>
<span class=special>};</span></code></pre>
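<p>To see how a scanner might hand its public interface over to an iteration
policy, here is a minimal, self-contained sketch. It does not use Spirit itself:
<tt>mini_iteration_policy</tt>, <tt>lower_case_policy</tt>, <tt>mini_scanner</tt>
and <tt>read_all</tt> are hypothetical stand-ins written for this illustration,
and the scanner is fixed to <tt>char const*</tt> input to keep it short.</p>

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Simplified stand-in for the default iteration policy (illustration only).
struct mini_iteration_policy
{
    template <typename ScannerT>
    void advance(ScannerT const& scan) const { ++scan.first; }

    template <typename ScannerT>
    bool at_end(ScannerT const& scan) const { return scan.first == scan.last; }

    template <typename T>
    T filter(T ch) const { return ch; }

    template <typename ScannerT>
    typename ScannerT::ref_t get(ScannerT const& scan) const { return *scan.first; }
};

// Overrides only filter(); the other three functions are inherited.
struct lower_case_policy : mini_iteration_policy
{
    char filter(char ch) const
    {
        // Cast through unsigned char so std::tolower never sees a negative value.
        return static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));
    }
};

// Hypothetical scanner over char const* that forwards every call to its policy.
template <typename PolicyT>
struct mini_scanner : PolicyT
{
    typedef char value_t;
    typedef char const& ref_t;

    mini_scanner(char const*& first_, char const* last_)
        : first(first_), last(last_) {}

    bool at_end() const { return PolicyT::at_end(*this); }

    value_t operator*() const
    {
        assert(!at_end());  // reading past the end is a caller bug
        return PolicyT::filter(PolicyT::get(*this));
    }

    mini_scanner const& operator++() const
    {
        PolicyT::advance(*this);
        return *this;
    }

    char const*& first;      // current position, held by reference
    char const* const last;  // end of input, held by value
};

// Reads the whole input through a scanner parameterized by PolicyT.
template <typename PolicyT>
std::string read_all(std::string const& input)
{
    char const* first = input.data();
    char const* last = first + input.size();
    mini_scanner<PolicyT> scan(first, last);
    std::string out;
    for (; !scan.at_end(); ++scan)
        out += *scan;
    return out;
}
```

<p>In this sketch, <tt>read_all&lt;lower_case_policy&gt;("Hello World")</tt>
yields the lower-cased text, while <tt>read_all&lt;mini_iteration_policy&gt;</tt>
returns the input unchanged. The only difference between the two is the policy
type; the scanner code is identical.</p>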
<a name="match_policy"></a>
<h2>match_policy</h2>
<p>Here are the default policies that govern recognition and matching:</p>
<pre>
<code><span class=keyword>struct </span><span class=identifier>match_policy</span>
<span class=special>{</span>
    <span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>T</span><span class=special>&gt;</span>
    <span class=keyword>struct </span><span class=identifier>result</span>
    <span class=special>{</span>
        <span class=keyword>typedef </span><span class=identifier>match</span><span class=special>&lt;</span><span class=identifier>T</span><span class=special>&gt; </span><span class=identifier>type</span><span class=special>;</span>
    <span class=special>};</span>

    <span class=keyword>const </span><span class=identifier>match</span><span class=special>&lt;</span><span class=identifier>nil_t</span><span class=special>&gt;</span>
    <span class=identifier>no_match</span><span class=special>() </span><span class=keyword>const</span>
    <span class=special>{</span>
        <span class=keyword>return </span><span class=identifier>match</span><span class=special>&lt;</span><span class=identifier>nil_t</span><span class=special>&gt;();</span>
    <span class=special>}</span>

    <span class=keyword>const </span><span class=identifier>match</span><span class=special>&lt;</span><span class=identifier>nil_t</span><span class=special>&gt;</span>
    <span class=identifier>empty_match</span><span class=special>() </span><span class=keyword>const</span>
    <span class=special>{</span>
        <span class=keyword>return </span><span class=identifier>match</span><span class=special>&lt;</span><span class=identifier>nil_t</span><span class=special>&gt;(</span><span class=number>0</span><span class=special>, </span><span class=identifier>nil_t</span><span class=special>());</span>
    <span class=special>}</span>

    <span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>AttrT</span><span class=special>, </span><span class=keyword>typename </span><span class=identifier>IteratorT</span><span class=special>&gt;</span>
    <span class=identifier>match</span><span class=special>&lt;</span><span class=identifier>AttrT</span><span class=special>&gt;</span>
    <span class=identifier>create_match</span><span class=special>(</span>
        <span class=keyword>std::size_t </span><span class=identifier>length</span><span class=special>,</span>
        <span class=identifier>AttrT </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>val</span><span class=special>,</span>
        <span class=identifier>IteratorT </span><span class=keyword>const</span><span class=special>&amp; </span><span class=comment>/*first*/</span><span class=special>,</span>
        <span class=identifier>IteratorT </span><span class=keyword>const</span><span class=special>&amp; </span><span class=comment>/*last*/</span><span class=special>) </span><span class=keyword>const</span>
    <span class=special>{</span>
        <span class=keyword>return </span><span class=identifier>match</span><span class=special>&lt;</span><span class=identifier>AttrT</span><span class=special>&gt;(</span><span class=identifier>length</span><span class=special>, </span><span class=identifier>val</span><span class=special>);</span>
    <span class=special>}</span>

    <span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>MatchT</span><span class=special>, </span><span class=keyword>typename </span><span class=identifier>IteratorT</span><span class=special>&gt;</span>
    <span class=keyword>void</span>
    <span class=identifier>group_match</span><span class=special>(</span>
        <span class=identifier>MatchT</span><span class=special>&amp; </span><span class=comment>/*m*/</span><span class=special>,</span>
        <span class=identifier>parser_id </span><span class=keyword>const</span><span class=special>&amp; </span><span class=comment>/*id*/</span><span class=special>,</span>
        <span class=identifier>IteratorT </span><span class=keyword>const</span><span class=special>&amp; </span><span class=comment>/*first*/</span><span class=special>,</span>
        <span class=identifier>IteratorT </span><span class=keyword>const</span><span class=special>&amp; </span><span class=comment>/*last*/</span><span class=special>) </span><span class=keyword>const </span><span class=special>{}</span>

    <span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>Match1T</span><span class=special>, </span><span class=keyword>typename </span><span class=identifier>Match2T</span><span class=special>&gt;</span>
    <span class=keyword>void</span>
    <span class=identifier>concat_match</span><span class=special>(</span><span class=identifier>Match1T</span><span class=special>&amp; </span><span class=identifier>l</span><span class=special>, </span><span class=identifier>Match2T </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>r</span><span class=special>) </span><span class=keyword>const</span>
    <span class=special>{</span>
        <span class=identifier>l</span><span class=special>.</span><span class=identifier>concat</span><span class=special>(</span><span class=identifier>r</span><span class=special>);</span>
    <span class=special>}</span>
<span class=special>};</span></code></pre>
<table width="90%" border="0" align="center">
  <tr>
    <td class="table_title" colspan="12"> Recognition and matching </td>
  </tr>
  <tr>
    <td class="table_cells"><b>result</b></td>
    <td class="table_cells">A metafunction that returns a match type given an
      attribute type (see <a href="indepth_the_parser.html">In-depth: The Parser</a>)</td>
  </tr>
  <tr>
    <td class="table_cells"><b>no_match</b></td>
    <td class="table_cells">Create a failed match</td>
  </tr>
  <tr>
    <td class="table_cells"><b>empty_match</b></td>
    <td class="table_cells">Create an empty match. An empty match is a successful
      epsilon match (matching length == 0)</td>
  </tr>
  <tr>
    <td class="table_cells"><b>create_match</b></td>
    <td class="table_cells">Create a match given the matching length, an attribute
      and the iterator pair delimiting the matched portion of the input</td>
  </tr>
  <tr>
    <td class="table_cells"><b>group_match</b></td>
    <td class="table_cells">For non-terminals such as rules, this is called after
      a successful match has been made, to allow post-processing</td>
  </tr>
  <tr>
    <td class="table_cells"><b>concat_match</b></td>
    <td class="table_cells">Concatenate two match objects</td>
  </tr>
</table>
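<p>The difference between a failed match, an empty match and a concatenated
match can be sketched with a stripped-down match type. The names
<tt>mini_match</tt>, <tt>mini_match_policy</tt> and <tt>sequence</tt> below are
hypothetical and carry no attribute; they only illustrate the roles described in
the table, under the assumption that a failed match is encoded as a negative
length.</p>

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-in for an attribute-less match object (illustration only).
class mini_match
{
public:
    mini_match() : len(-1) {}                       // failed match
    explicit mini_match(std::size_t length)         // successful match
        : len(static_cast<std::ptrdiff_t>(length)) {}

    bool matched() const { return len >= 0; }
    std::ptrdiff_t length() const { return len; }

    // Extends this match by the length of a match that follows it.
    void concat(mini_match const& other)
    {
        assert(matched() && other.matched());
        len += other.len;
    }

private:
    std::ptrdiff_t len;  // -1 encodes failure, 0 is a successful epsilon match
};

// Hypothetical policy exposing the same roles as the table above.
struct mini_match_policy
{
    mini_match no_match() const { return mini_match(); }
    mini_match empty_match() const { return mini_match(0); }
    mini_match create_match(std::size_t length) const { return mini_match(length); }
    void concat_match(mini_match& l, mini_match const& r) const { l.concat(r); }
};

// Combines two already-computed matches the way a sequence (a then b) would:
// fail if either side failed, otherwise fold the right match into the left.
inline mini_match sequence(mini_match_policy const& policy,
                           mini_match l, mini_match const& r)
{
    if (!l.matched() || !r.matched())
        return policy.no_match();
    policy.concat_match(l, r);
    return l;
}
```

<p>Because every match is created and combined through the policy, a
replacement policy could, for example, record extra information at each of
these points without any change to the code that calls it.</p>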
<a name="action_policy"></a>
<h2>action_policy</h2>
<p>The action policy has only one function for handling semantic actions:</p>
<pre>
<code><span class=keyword>struct </span><span class=identifier>action_policy</span>
<span class=special>{</span>
    <span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>ActorT</span><span class=special>, </span><span class=keyword>typename </span><span class=identifier>AttrT</span><span class=special>, </span><span class=keyword>typename </span><span class=identifier>IteratorT</span><span class=special>&gt;</span>
    <span class=keyword>void</span>
    <span class=identifier>do_action</span><span class=special>(</span>
        <span class=identifier>ActorT </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>actor</span><span class=special>,</span>
        <span class=identifier>AttrT </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>val</span><span class=special>,</span>
        <span class=identifier>IteratorT </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>first</span><span class=special>,</span>
        <span class=identifier>IteratorT </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>last</span><span class=special>) </span><span class=keyword>const</span><span class=special>;</span>
<span class=special>};</span></code></pre>
<p>When the attribute <tt>val</tt> is of type <tt>nil_t</tt>, the default action
policy forwards to:</p>
<pre>
<code><span class=identifier>actor</span><span class=special>(</span><span class=identifier>first</span><span class=special>, </span><span class=identifier>last</span><span class=special>);</span></code></pre>
<p>Otherwise, it forwards to:</p>
<pre>
<code><span class=identifier>actor</span><span class=special>(</span><span class=identifier>val</span><span class=special>);</span></code></pre>
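<p>One way to obtain this two-way dispatch is plain function overloading on the
attribute type. The sketch below is an assumption about how such a policy could
be written, not a copy of Spirit's implementation; <tt>nil_tag</tt>,
<tt>mini_action_policy</tt>, <tt>range_collector</tt> and
<tt>value_collector</tt> are hypothetical names.</p>

```cpp
#include <string>

struct nil_tag {};  // hypothetical stand-in for the "no attribute" type

struct mini_action_policy
{
    // No attribute: hand the actor the matched iterator range.
    template <typename ActorT, typename IteratorT>
    void do_action(ActorT const& actor, nil_tag const&,
                   IteratorT const& first, IteratorT const& last) const
    {
        actor(first, last);
    }

    // Attribute present: hand the actor the attribute value only.
    template <typename ActorT, typename AttrT, typename IteratorT>
    void do_action(ActorT const& actor, AttrT const& val,
                   IteratorT const&, IteratorT const&) const
    {
        actor(val);
    }
};

// Actor that expects an iterator range and stores the matched text.
struct range_collector
{
    explicit range_collector(std::string& out_) : out(out_) {}
    void operator()(char const* first, char const* last) const
    {
        out.assign(first, last);
    }
    std::string& out;
};

// Actor that expects an attribute value and stores it.
struct value_collector
{
    explicit value_collector(int& out_) : out(out_) {}
    void operator()(int val) const { out = val; }
    int& out;
};
```

<p>Calling <tt>do_action</tt> with a <tt>nil_tag</tt> attribute selects the
first overload, since it is the more specialized of the two; any other attribute
type selects the second. Each actor therefore only needs to provide the one call
signature that suits the parser it is attached to.</p>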
<a name="scanner_policies_mixer"></a>
<h3>scanner_policies mixer</h3>
<p>The class <tt>scanner_policies</tt> combines the three scanner policy classes
above into one:</p>
<pre>
<code><span class=keyword>template </span><span class=special>&lt;</span>
    <span class=keyword>typename </span><span class=identifier>IterationPolicyT </span><span class=special>= </span><span class=identifier>iteration_policy</span><span class=special>,</span>
    <span class=keyword>typename </span><span class=identifier>MatchPolicyT </span><span class=special>= </span><span class=identifier>match_policy</span><span class=special>,</span>
    <span class=keyword>typename </span><span class=identifier>ActionPolicyT </span><span class=special>= </span><span class=identifier>action_policy</span><span class=special>&gt;</span>
<span class=keyword>struct </span><span class=identifier>scanner_policies</span><span class=special>;</span>
</code></pre>
<p>This <i>mixer</i> class inherits from all three policies. The
<tt>scanner_policies</tt> class is then used to parameterize the scanner:</p>
<pre>
<code><span class=keyword>template </span><span class=special>&lt;</span>
    <span class=keyword>typename </span><span class=identifier>IteratorT </span><span class=special>= </span><span class=keyword>char </span><span class=keyword>const</span><span class=special>*,</span>
    <span class=keyword>typename </span><span class=identifier>PoliciesT </span><span class=special>= </span><span class=identifier>scanner_policies</span><span class=special>&lt;&gt; </span><span class=special>&gt;</span>
<span class=keyword>class </span><span class=identifier>scanner</span><span class=special>;</span>
</code></pre>
<p>The scanner in turn inherits from <tt>PoliciesT</tt>.</p>
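<p>The inheritance chain described here (three policies, one mixer, one scanner)
can be sketched in a few lines. Everything in this example is hypothetical: the
policy classes, their member functions <tt>filter</tt>, <tt>accept</tt> and
<tt>fire</tt>, and the <tt>run</tt> function exist only to show how replacing
one template argument changes one aspect of the behavior.</p>

```cpp
#include <cctype>
#include <string>

// Three hypothetical policies, each contributing one member function.
struct default_iteration { char filter(char ch) const { return ch; } };
struct default_match     { bool accept(char ch) const { return ch != ' '; } };
struct default_action    { void fire(std::string& out, char ch) const { out += ch; } };

// Alternative iteration policy: upper-case every character.
struct upper_iteration
{
    char filter(char ch) const
    {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(ch)));
    }
};

// The mixer: inherits from all three policies.
template <
    typename IterationPolicyT = default_iteration,
    typename MatchPolicyT = default_match,
    typename ActionPolicyT = default_action>
struct mini_policies : IterationPolicyT, MatchPolicyT, ActionPolicyT
{
};

// The scanner inherits from the mixer and so gains all policy functions.
template <typename PoliciesT = mini_policies<> >
struct mini_scanner : PoliciesT
{
    // Filters each character, then fires the action for accepted ones.
    std::string run(std::string const& input) const
    {
        std::string out;
        for (std::string::size_type i = 0; i < input.size(); ++i)
        {
            char ch = this->filter(input[i]);
            if (this->accept(ch))
                this->fire(out, ch);
        }
        return out;
    }
};
```

<p>With the defaults, <tt>mini_scanner&lt;&gt;().run("a b")</tt> drops the space
and keeps the letters. Swapping in <tt>mini_policies&lt;upper_iteration&gt;</tt>
changes only the filtering step; matching and action handling stay at their
defaults.</p>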
<a name="rebinding_policies"></a>
<h3>Rebinding Policies</h3>
<p>The scanner can be rebound to a different set of policies at any time through
its member function <tt>change_policies(new_policies)</tt>. Given a new set of
policies, this member function creates a new scanner that uses them. The type of
the <i>rebound</i> scanner can be obtained from the metafunction:</p>
<pre>
<code><span class=identifier>rebind_scanner_policies</span><span class=special>&lt;</span><span class=identifier>ScannerT</span><span class=special>, </span><span class=identifier>PoliciesT</span><span class=special>&gt;::</span><span class=identifier>type</span></code></pre>
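<p>A reduced model of policy rebinding is sketched below. It is not Spirit's
code: <tt>mini_scanner</tt>, <tt>rebind_policies</tt>, <tt>as_is</tt> and
<tt>to_lower</tt> are hypothetical, and the policies consist of a single
<tt>filter</tt> function. The sketch assumes, as the scanner table above
suggests, that the current position is held by reference, so the rebound scanner
continues from the same input position.</p>

```cpp
#include <cctype>

// Two hypothetical one-function policies.
struct as_is { char filter(char ch) const { return ch; } };
struct to_lower
{
    char filter(char ch) const
    {
        return static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));
    }
};

template <typename IteratorT, typename PoliciesT>
struct mini_scanner : PoliciesT
{
    mini_scanner(IteratorT& first_, IteratorT const& last_,
                 PoliciesT const& policies = PoliciesT())
        : PoliciesT(policies), first(first_), last(last_) {}

    // Builds a scanner over the same iterators with a different policy set.
    template <typename NewPoliciesT>
    mini_scanner<IteratorT, NewPoliciesT>
    change_policies(NewPoliciesT const& policies) const
    {
        return mini_scanner<IteratorT, NewPoliciesT>(first, last, policies);
    }

    char operator*() const { return this->filter(*first); }

    IteratorT& first;      // shared with the scanner this one was rebound from
    IteratorT const last;
};

// Metafunction that computes the type returned by change_policies.
template <typename ScannerT, typename PoliciesT>
struct rebind_policies;

template <typename IteratorT, typename OldPoliciesT, typename NewPoliciesT>
struct rebind_policies<mini_scanner<IteratorT, OldPoliciesT>, NewPoliciesT>
{
    typedef mini_scanner<IteratorT, NewPoliciesT> type;
};
```

<p>The metafunction is what lets generic code name the rebound type without
knowing the iterator type or the old policies; it only needs the original
scanner type and the new policy type.</p>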
<a name="rebinding_iterators"></a>
<h3>Rebinding Iterators</h3>
<p>The scanner can also be rebound to a different iterator type at any time
through its member function <tt>change_iterator(first, last)</tt>. Given a new
pair of iterators, whose type may differ from that of the iterators held by the
scanner, this member function creates a new scanner over the new pair. The type
of the <i>rebound</i> scanner can be obtained from the metafunction:</p>
<pre>
<code><span class=identifier>rebind_scanner_iterator</span><span class=special>&lt;</span><span class=identifier>ScannerT</span><span class=special>, </span><span class=identifier>IteratorT</span><span class=special>&gt;::</span><span class=identifier>type</span></code></pre>
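<p>Iterator rebinding can be modelled in the same reduced way. In this sketch
the policies are left out entirely, and <tt>mini_scanner</tt>,
<tt>rebind_iterator</tt> and <tt>remaining</tt> are hypothetical names. Taking
the new <tt>first</tt> iterator by non-const reference is a simplification made
for this example, not a statement about Spirit's actual signature.</p>

```cpp
#include <cstddef>
#include <vector>

template <typename IteratorT>
struct mini_scanner
{
    mini_scanner(IteratorT& first_, IteratorT const& last_)
        : first(first_), last(last_) {}

    // Builds a scanner over a new iterator pair, possibly of another type.
    template <typename NewIteratorT>
    mini_scanner<NewIteratorT>
    change_iterator(NewIteratorT& first_, NewIteratorT const& last_) const
    {
        return mini_scanner<NewIteratorT>(first_, last_);
    }

    bool at_end() const { return first == last; }

    IteratorT& first;      // current position, held by reference
    IteratorT const last;  // end of input, held by value
};

// Metafunction that computes the type returned by change_iterator.
template <typename ScannerT, typename IteratorT>
struct rebind_iterator;

template <typename OldIteratorT, typename NewIteratorT>
struct rebind_iterator<mini_scanner<OldIteratorT>, NewIteratorT>
{
    typedef mini_scanner<NewIteratorT> type;
};

// Consumes the rest of the input and returns how many elements were left.
template <typename ScannerT>
std::size_t remaining(ScannerT scan)
{
    std::size_t count = 0;
    for (; !scan.at_end(); ++scan.first)
        ++count;
    return count;
}
```

<p>Here a scanner over <tt>char const*</tt> can produce a scanner over, say,
<tt>std::vector&lt;int&gt;::const_iterator</tt>. The original scanner is left
untouched, since the new one refers to a different pair of iterators.</p>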
<table border="0">
  <tr>
    <td width="10"></td>
    <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
    <td width="30"><a href="indepth_the_parser.html"><img src="theme/l_arr.gif" border="0"></a></td>
    <td width="30"><a href="indepth_the_parser_context.html"><img src="theme/r_arr.gif" border="0"></a></td>
  </tr>
</table>
<br>
<hr size="1">
<p class="copyright">Copyright &copy; 1998-2003 Joel de Guzman<br>
<br>
<font size="2">Use, modification and distribution is subject to the Boost Software
License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
http://www.boost.org/LICENSE_1_0.txt)</font></p>
<p class="copyright">&nbsp;</p>
</body>
</html>