<html>
<head>
<!-- Generated by the Spirit (http://spirit.sf.net) QuickDoc -->
<title>Distinct Parser</title>
<link rel="stylesheet" href="theme/style.css" type="text/css">
</head>
<body>
<table width="100%" height="48" border="0" background="theme/bkd2.gif" cellspacing="2">
<tr>
<td width="10">
</td>
<td width="85%">
<font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>Distinct Parser </b></font></td>
<td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" align="right" border="0"></a></td>
</tr>
</table>
<br>
<table border="0">
<tr>
<td width="10"></td>
<td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
<td width="30"><a href="scoped_lock.html"><img src="theme/l_arr.gif" border="0"></a></td>
<td width="30"><a href="symbols.html"><img src="theme/r_arr.gif" border="0"></a></td>
</tr>
</table>
<h3>Distinct Parsers</h3><p>
The distinct parsers are utility parsers that ensure the matched input is
not immediately followed by a forbidden pattern. They are typically used to
distinguish keywords from identifiers.</p>
<h3>distinct_parser</h3>
<p>
The basic use of the <tt>distinct_parser</tt> is as a replacement for the <tt>str_p</tt> parser. For
example, the <tt>declaration_rule</tt> defined as:</p>
<pre>
<code><span class=identifier>rule</span><span class=special>&lt;</span><span class="identifier">ScannerT</span><span class=special>&gt; </span><span class=identifier>declaration_rule </span><span class=special>= </span><span class=identifier>str_p</span><span class=special>(</span><span class=string>&quot;declare&quot;</span><span class=special>) &gt;&gt; </span><span class=identifier>lexeme_d</span><span class=special>[+</span><span class=identifier>alpha_p</span><span class=special>];
</span></code></pre>
<p>
would correctly match the input &quot;declare abc&quot;, but it would also match the input &quot;declareabc&quot;, which is usually not intended. To avoid this, we can
use <tt>distinct_parser</tt>:</p>
<pre>
<code><span class=comment>// keyword_p may be defined in the global scope
</span><span class=identifier>distinct_parser</span><span class=special>&lt;&gt; </span><span class=identifier>keyword_p</span><span class=special>(</span><span class=string>&quot;a-zA-Z0-9_&quot;</span><span class=special>);
</span><span class=identifier>rule</span><span class=special>&lt;</span><span class="identifier">ScannerT</span><span class=special>&gt; </span><span class=identifier>declaration_rule </span><span class=special>= </span><span class=identifier>keyword_p</span><span class=special>(</span><span class=string>&quot;declare&quot;</span><span class=special>) &gt;&gt; </span><span class=identifier>lexeme_d</span><span class=special>[+</span><span class=identifier>alpha_p</span><span class=special>];
</span></code></pre>
<p>
The <tt>keyword_p</tt> works in the same way as the <tt>str_p</tt> parser, but it matches only
when the matched input is not immediately followed by one of the characters
from the set passed to the constructor of <tt>keyword_p</tt>. In the example,
&quot;declare&quot; can't be immediately followed by a letter, a
digit, or an underscore.</p>
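<p>
The check described above can be sketched in plain C++, without Spirit. This is only an illustration of the idea, not the library's implementation: <tt>match_distinct</tt> and <tt>identifier_chars</tt> are hypothetical names, and the forbidden set is spelled out as an explicit character list instead of the range notation <tt>&quot;a-zA-Z0-9_&quot;</tt> that Spirit accepts.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Spirit): the characters that may not
// directly follow a keyword, written out instead of "a-zA-Z0-9_".
std::string identifier_chars()
{
    return "abcdefghijklmnopqrstuvwxyz"
           "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
           "0123456789_";
}

// Hypothetical helper (not part of Spirit): returns true when `keyword`
// occurs in `input` at position `pos` and the character right after it,
// if there is one, is not in `forbidden`.
bool match_distinct(const std::string& input, std::size_t pos,
                    const std::string& keyword, const std::string& forbidden)
{
    assert(!keyword.empty());

    // The keyword itself must be present at pos.
    if (pos > input.size() || input.compare(pos, keyword.size(), keyword) != 0)
        return false;

    // At the end of the input nothing follows, so the match stands.
    const std::size_t next = pos + keyword.size();
    if (next >= input.size())
        return true;

    // Otherwise the following character must not be a forbidden one.
    return forbidden.find(input[next]) == std::string::npos;
}
```

<p>
With this sketch, <tt>match_distinct(&quot;declare abc&quot;, 0, &quot;declare&quot;, identifier_chars())</tt> succeeds, while the same call on <tt>&quot;declareabc&quot;</tt> fails because the keyword is directly followed by a letter.</p>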
<p>
See the full <a href="../example/fundamental/distinct/distinct_parser.cpp">example here</a>.</p>
<h3>distinct_directive</h3><p>
For more sophisticated cases, for example when keywords are stored in a
symbol table, we can use <tt>distinct_directive</tt>.</p>
<pre>
<code><span class=identifier>distinct_directive</span><span class=special>&lt;&gt; </span><span class=identifier>keyword_d</span><span class=special>(</span><span class=string>&quot;a-zA-Z0-9_&quot;</span><span class=special>);
</span><span class=identifier>symbols</span><span class=special>&lt;&gt; </span><span class=identifier>keywords</span><span class=special>;
</span><span class=identifier>keywords </span><span class=special>= </span><span class=string>&quot;declare&quot;</span><span class=special>, </span><span class=string>&quot;begin&quot;</span><span class=special>, </span><span class=string>&quot;end&quot;</span><span class=special>;
</span><span class=identifier>rule</span><span class=special>&lt;</span><span class="identifier">ScannerT</span><span class=special>&gt; </span><span class=identifier>keyword </span><span class=special>= </span><span class=identifier>keyword_d</span><span class=special>[</span><span class=identifier>keywords</span><span class=special>];
</span></code></pre>
<h3>dynamic_distinct_parser and dynamic_distinct_directive</h3><p>
In some cases a set of forbidden follow-up characters is not sufficient.
For example, ASN.1 naming conventions allow identifiers to contain dashes,
but not double dashes (which mark the beginning of a comment).
Furthermore, identifiers can't end with a dash. So a matched keyword can't
be followed by an alphanumeric character or by exactly one dash, but it can be
followed by two dashes.</p>
<p>
This is where <tt>dynamic_distinct_parser</tt> and <tt>dynamic_distinct_directive</tt> come into play. The constructor of the <tt>dynamic_distinct_parser</tt> accepts a
parser that matches any input that <strong>must NOT</strong> follow the keyword.</p>
<pre>
<code><span class=comment>// Alphanumeric characters and a dash followed by a non-dash
// may not follow an ASN.1 identifier.
</span><span class=identifier>dynamic_distinct_parser</span><span class=special>&lt;&gt; </span><span class=identifier>keyword_p</span><span class=special>(</span><span class=identifier>alnum_p </span><span class=special>| (</span><span class=literal>'-' </span><span class=special>&gt;&gt; ~</span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'-'</span><span class=special>)));
</span><span class=identifier>rule</span><span class=special>&lt;</span><span class="identifier">ScannerT</span><span class=special>&gt; </span><span class=identifier>declaration_rule </span><span class=special>= </span><span class=identifier>keyword_p</span><span class=special>(</span><span class=string>&quot;declare&quot;</span><span class=special>) &gt;&gt; </span><span class=identifier>lexeme_d</span><span class=special>[+</span><span class=identifier>alpha_p</span><span class=special>];
</span></code></pre>
<p>
Since the <tt>dynamic_distinct_parser</tt> internally uses a rule, its type is
dependent on the scanner type. So, the <tt>keyword_p</tt> shouldn't be defined
globally, but rather within the grammar.</p>
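<p>
The ASN.1 rule can also be sketched in plain C++ to make the forbidden tail explicit. The helpers <tt>asn1_forbidden_tail</tt> and <tt>match_asn1_keyword</tt> are hypothetical names used only for illustration; they are meant to mirror the expression <tt>alnum_p | ('-' &gt;&gt; ~ch_p('-'))</tt> by hand and are not how Spirit implements it.</p>

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Spirit): true when the input at `pos`
// starts with something that must NOT follow an ASN.1 keyword, that is an
// alphanumeric character, or a dash followed by a non-dash character.
bool asn1_forbidden_tail(const std::string& input, std::size_t pos)
{
    if (pos >= input.size())
        return false; // nothing follows the keyword

    const unsigned char c = static_cast<unsigned char>(input[pos]);
    if (std::isalnum(c))
        return true;

    // A single dash continues the identifier; "--" starts a comment instead.
    return c == '-' && pos + 1 < input.size() && input[pos + 1] != '-';
}

// Hypothetical helper (not part of Spirit): matches `keyword` at `pos`
// only if it is not followed by a forbidden tail.
bool match_asn1_keyword(const std::string& input, std::size_t pos,
                        const std::string& keyword)
{
    assert(!keyword.empty());

    if (pos > input.size() || input.compare(pos, keyword.size(), keyword) != 0)
        return false;

    return !asn1_forbidden_tail(input, pos + keyword.size());
}
```

<p>
Under these assumptions, <tt>&quot;declare--comment&quot;</tt> matches the keyword <tt>&quot;declare&quot;</tt>, while <tt>&quot;declare-abc&quot;</tt> and <tt>&quot;declareabc&quot;</tt> do not.</p>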
<p>
See the full <a href="../example/fundamental/distinct/distinct_parser_dynamic.cpp">example here</a>.</p>
<h3>How it works</h3><p>
When <tt>keyword_p_1</tt> and <tt>keyword_p_2</tt> are defined as</p>
<pre>
<code><span class=identifier>distinct_parser</span><span class=special>&lt;&gt; </span><span class=identifier>keyword_p_1</span><span class=special>(</span><span class=identifier>forbidden_chars</span><span class=special>);
</span><span class=identifier>dynamic_distinct_parser</span><span class=special>&lt;&gt; </span><span class=identifier>keyword_p_2</span><span class=special>(</span><span class=identifier>forbidden_tail_parser</span><span class=special>);
</span></code></pre>
<p>
the parsers</p>
<pre>
<code><span class=identifier>keyword_p_1</span><span class=special>(</span><span class=identifier>str</span><span class=special>)
</span><span class=identifier>keyword_p_2</span><span class=special>(</span><span class=identifier>str</span><span class=special>)
</span></code></pre>
<p>
are equivalent to the rules</p>
<pre>
<code><span class=identifier>lexeme_d</span><span class=special>[</span><span class=identifier>chseq_p</span><span class=special>(</span><span class=identifier>str</span><span class=special>) &gt;&gt; ~</span><span class=identifier>epsilon_p</span><span class=special>(</span><span class=identifier>chset_p</span><span class=special>(</span><span class=identifier>forbidden_chars</span><span class=special>))]
</span><span class=identifier>lexeme_d</span><span class=special>[</span><span class=identifier>chseq_p</span><span class=special>(</span><span class=identifier>str</span><span class=special>) &gt;&gt; ~</span><span class=identifier>epsilon_p</span><span class=special>(</span><span class=identifier>forbidden_tail_parser</span><span class=special>)]
</span></code></pre>
<br>
<hr size="1">
<p class="copyright">Copyright &copy; 2003-2004
Vaclav Vesely<br><br>
<font size="2">Use, modification and distribution is subject to the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) </font> </p>
</body>
</html>