finite_state_filters.html 13 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184
  1. <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
  2. <HTML>
  3. <HEAD>
  4. <TITLE>Tutorial</TITLE>
  5. <LINK REL="stylesheet" HREF="../../../../boost.css">
  6. <LINK REL="stylesheet" HREF="../theme/iostreams.css">
  7. </HEAD>
  8. <BODY>
  9. <!-- Begin Banner -->
  10. <H1 CLASS="title">Tutorial</H1>
  11. <HR CLASS="banner">
  12. <!-- End Banner -->
  13. <!-- Begin Nav -->
  14. <DIV CLASS='nav'>
  15. <A HREF='dual_use_filters.html'><IMG BORDER=0 WIDTH=19 HEIGHT=19 SRC='../../../../doc/src/images/prev.png'></A>
  16. <A HREF='tutorial.html'><IMG BORDER=0 WIDTH=19 HEIGHT=19 SRC='../../../../doc/src/images/up.png'></A>
  17. <A><IMG BORDER=0 WIDTH=19 HEIGHT=19 SRC='../../../../doc/src/images/next_disabled.png'></A>
  18. </DIV>
  19. <!-- End Nav -->
  20. <A NAME="finite_state"></A>
  21. <H2>2.2.10. Finite State Filters</H2>
  22. <A NAME="finite_state_machine"></A>
  23. <H4>Finite State Machines</H4>
  24. <P>
  25. In this section I show how to construct <A HREF="../concepts/dual_use_filter.html">Dual-Use Filters</A> from finite state machines. For purposes of this section, a finite state machine consists of a collection of <SPAN>states</SPAN>, represented as <CODE>int</CODE>s, a distinguished <SPAN>initial state</SPAN>, a <SPAN>transition table</SPAN>, a collection of <SPAN>event handlers</SPAN> and a stack of characters. These finite state machines were inspired by the finite state machine examples that accompany the <A HREF="../../../mpl/doc/index.html" TARGET="_top"><CODE>Boost Metaprogamming library</CODE></A>. <I>See</I>, <I>e.g.</I>, <A HREF="../../../mpl/example/fsm/player1.cpp" TARGET="_top"><CODE>&lt;libs/mpl/example/player1.cpp&gt;</CODE></A>
  26. </P>
  27. <P>
  28. For us, an <SPAN>event</SPAN> is simply a character. A transition table is an <A HREF="../../../mpl/doc/refmanual/forward-sequence.html" TARGET="_top">MPL Forward Sequence</A> of <SPAN>rows</SPAN>. Each row is a 4-tuple consisting of a <SPAN>current state</SPAN>, a <SPAN>character class</SPAN>, a <SPAN>next state</SPAN> and an <SPAN>event handler</SPAN>. During the operation of a finite state machine, its transition table is consulted to determine how the machine's state should be updated and which event handler to call. When an event <CODE>e</CODE> occurs, the transition table is searched for the first row whose current state is equal to the current state of the machine, and whose character class matches <CODE>e</CODE>. The machine's state is then updated to the row's next state, and the event handler is called with <CODE>e</CODE> as an argument.
  29. </P>
  30. <P>
  31. A character class is a class type with a <CODE>static</CODE> member function <CODE>test</CODE> taking a character and a <CODE>std::locale</CODE> as arguments. <I>E.g.</I>,
  32. <PRE class="broken_ie"> <SPAN CLASS="keyword">struct</SPAN> is_any {
  33. <SPAN CLASS="keyword">static</SPAN> <SPAN CLASS="keyword">bool</SPAN> test(<SPAN CLASS="keyword">char</SPAN>, <SPAN CLASS="keyword">const</SPAN> std::locale&amp;);
  34. };</PRE>
  35. <P>
  36. There are several built in character classes. The character class <CODE>is_any</CODE> matches any character. For any character <CODE>c</CODE>, the character class <CODE>is&lt;c&gt;</CODE> only matches <CODE>c</CODE>. The character classes <CODE>alnum</CODE>, <CODE>is_alpha</CODE>, <CODE>is_cntrl</CODE>, <CODE>is_digit</CODE>, <CODE>is_graph</CODE>, <CODE>is_lower</CODE>, <CODE>is_print</CODE>, <CODE>is_punct</CODE>, <CODE>is_space</CODE>, <CODE>is_upper</CODE>, <CODE>is_xdigit</CODE> implement <CODE>locale</CODE>-sensitive character classification.
  37. </P>
  38. <P>
  39. Event handlers are member functions of a finite state machine which take a single character as argument and return <CODE>void</CODE>. The implementation of an event handler typically examines the given character and pushes zero or more characters of filtered data onto the stack. A special event handler <CODE>on_eof</CODE> takes no arguments and is called automatically after the last character in a sequence is processed.
  40. </P>
  41. <P>
  42. There are three built-in event handlers, <CODE>on_any</CODE>, <CODE>push</CODE> and <CODE>skip</CODE>. The handler <CODE>on_any</CODE> is called automatically when an event occurs which is not covered by any row in the transition table. The default behavior of <CODE>on_any</CODE> is to do nothing. The handler <CODE>push</CODE> pushes the given character onto the stack. The handler <CODE>skip</CODE> ignores the current character. The handlers <CODE>push</CODE> and <CODE>skip</CODE> are not defined by default; in order to make them available, the macro <CODE>BOOST_IOSTREAM_FSM</CODE> should be invoked within the class definition of a finite state machine, passing the state machine's name as the macro argument.
  43. </P>
  44. <P>
  45. Finite state machines must derive from the class template <CODE>boost::iostreams::finite_state_machine</CODE>, defined in the header <A HREF="../../example/finite_state_filter.hpp"><CODE>&lt;libs/iostreams/example/finite_state_filter.hpp&gt;</CODE></A>. The first template argument to <CODE>finite_state_machine</CODE> should be the derived class itself; the second template argument should be the character type.
  46. <P>
  47. A finite state machine's transition table should be declared as a member type named <CODE>transition_table</CODE>. Its event handlers should be implemented as member functions. A finite state machine may define a <CODE>static</CODE> integral constant named <CODE>initial_state</CODE>, which will be treated as the machine's initial state. Alternatively, the definition of the initial state can be omitted, in which case it will default to <CODE>0</CODE>.
  48. </P>
  49. <P>
  50. Given a finite state machine <CODE>my_fsm</CODE>, you can define a <A HREF="../concepts/dual_use_filter.html">Dual-Use Filter</A> as follows:
  51. </P>
  52. <PRE class="broken_ie"><SPAN CLASS="keyword">namespace</SPAN> io = boost::iostreams;
  53. <SPAN CLASS="keyword">typedef</SPAN> io::finite_state_filter&lt;my_fsm&gt; my_finite_state_filter;</PRE>
  54. <A NAME="dos2unix_fsm"></A>
  55. <H4><CODE>dos2unix_fsm</CODE></H4>
  56. <P>The following state machine can be used to convert line endings from <CODE>DOS</CODE> to <CODE>UNIX</CODE>. The constant <CODE>initial_state</CODE>, the class template <CODE>row</CODE> and the character classes <CODE>is</CODE> and <CODE>is_any</CODE> are members of the base class <CODE>boost::iostreams::finite_state_machine</CODE>.</P>
  57. <PRE class="broken_ie"><SPAN CLASS='preprocessor'>#include</SPAN> <A CLASS="header" HREF="../../../../boost/mpl/vector.hpp"><SPAN CLASS="literal">&lt;boost/mpl/vector&gt;</SPAN></A>
  58. <SPAN CLASS='preprocessor'>#include</SPAN> <A CLASS="header" HREF="../../example/finite_state_filter.hpp"><SPAN CLASS="literal">&lt;libs/iostreams/example/finite_state_filter.hpp&gt;</SPAN></A>
  59. <SPAN CLASS="keyword">namespace</SPAN> io = boost::iostreams;
  60. <SPAN CLASS="keyword">struct</SPAN> dos2unix_fsm : io::finite_state_machine&lt;dos2unix_fsm&gt; {
  61. BOOST_IOSTREAMS_FSM(dos2unix_fsm) <SPAN CLASS='comment'>// Define skip and push.</SPAN>
  62. <SPAN CLASS="keyword">typedef</SPAN> dos2unix_fsm self;
  63. <SPAN CLASS="keyword">typedef</SPAN> boost::mpl::vector&lt;
  64. row&lt;initial_state, is&lt;<SPAN CLASS="literal">'\r'</SPAN>&gt;, initial_state, &amp;self::skip&gt;,
  65. row&lt;initial_state, is_any, initial_state, &amp;self::push&gt;
  66. &gt; transition_table;
  67. };</PRE>
  68. <P>
  69. This machine has just a single state. Its transition table can be understood as follows: if the current character is <CODE>'\r'</CODE>, ignore it; otherwise, forward it unchanged.
  70. </P>
  71. <A NAME="unix2dos_fsm"></A>
  72. <H4><CODE>unix2dos_fsm</CODE></H4>
  73. <P>The following state machine can be used to convert line endings from <CODE>UNIX</CODE> to <CODE>DOS</CODE>:</P>
  74. <PRE class="broken_ie"><SPAN CLASS='preprocessor'>#include</SPAN> <A CLASS="header" HREF="../../../../boost/mpl/vector.hpp"><SPAN CLASS="literal">&lt;boost/mpl/vector&gt;</SPAN></A>
  75. <SPAN CLASS='preprocessor'>#include</SPAN> <A CLASS="header" HREF="../../example/finite_state_filter.hpp"><SPAN CLASS="literal">&lt;libs/iostreams/example/finite_state_filter.hpp&gt;</SPAN></A>
  76. <SPAN CLASS="keyword">namespace</SPAN> io = boost::iostreams;
  77. <SPAN CLASS="keyword">struct</SPAN> unix2dos_fsm : io::finite_state_machine&lt;unix2dos_fsm&gt; {
  78. BOOST_IOSTREAMS_FSM(unix2dos_fsm) <SPAN CLASS='comment'>// Define skip and push.</SPAN>
  79. <SPAN CLASS="keyword">typedef</SPAN> unix2dos_fsm self;
  80. <SPAN CLASS="keyword">void</SPAN> on_lf(<SPAN CLASS="keyword">char</SPAN>) { push(<SPAN CLASS="literal">'\r'</SPAN>); push(<SPAN CLASS="literal">'\n'</SPAN>); }
  81. <SPAN CLASS="keyword">typedef</SPAN> boost::mpl::vector&lt;
  82. row&lt;initial_state, is&lt;<SPAN CLASS="literal">'\n'</SPAN>&gt;, initial_state, &amp;self::on_lf&gt;,
  83. row&lt;initial_state, is_any, initial_state, &amp;self::push&gt;
  84. &gt; transition_table;
  85. };</PRE>
  86. <P>
  87. This machine also has just a single state. The event handler <CODE>on_lf</CODE> pushes a <CODE>DOS</CODE> line-ending sequence onto the stack. The transition table can be understood as follows: if the current character is <CODE>'\n'</CODE>, write a <CODE>DOS</CODE> line-ending sequence; otherwise, forward it unchanged.
  88. </P>
  89. <A NAME="uncommenting_fsm"></A>
  90. <H4><CODE>uncommenting_fsm</CODE></H4>
  91. <P>The following state machine removes C-style comments. Although it's not quite sophisticated enough for processing source code, it's a good illustration of a multi-state machine.</P>
  92. <PRE class="broken_ie"><SPAN CLASS='preprocessor'>#include</SPAN> <A CLASS="header" HREF="../../../../boost/mpl/vector.hpp"><SPAN CLASS="literal">&lt;boost/mpl/vector&gt;</SPAN></A>
  93. <SPAN CLASS='preprocessor'>#include</SPAN> <A CLASS="header" HREF="../../example/finite_state_filter.hpp"><SPAN CLASS="literal">&lt;libs/iostreams/example/finite_state_filter.hpp&gt;</SPAN></A>
  94. <SPAN CLASS="keyword">namespace</SPAN> io = boost::iostreams;
  95. <SPAN CLASS="keyword">struct</SPAN> uncommenting_fsm : io::finite_state_machine&lt;uncommenting_fsm&gt; {
  96. BOOST_IOSTREAMS_FSM(uncommenting_fsm) <SPAN CLASS='comment'>// Define skip and push.</SPAN>
  97. <SPAN CLASS="keyword">typedef</SPAN> uncommenting_fsm self;
  98. <SPAN CLASS="keyword">static</SPAN> <SPAN CLASS="keyword">const</SPAN> <SPAN CLASS="keyword"><SPAN CLASS="keyword">int</SPAN></SPAN> no_comment = initial_state;
  99. <SPAN CLASS="keyword">static</SPAN> <SPAN CLASS="keyword">const</SPAN> <SPAN CLASS="keyword"><SPAN CLASS="keyword">int</SPAN></SPAN> pre_comment = no_comment <SPAN CLASS='numeric_literal'>+ 1</SPAN>;
  100. <SPAN CLASS="keyword">static</SPAN> <SPAN CLASS="keyword">const</SPAN> <SPAN CLASS="keyword"><SPAN CLASS="keyword">int</SPAN></SPAN> comment = pre_comment <SPAN CLASS='numeric_literal'>+ 1</SPAN>;
  101. <SPAN CLASS="keyword">static</SPAN> <SPAN CLASS="keyword">const</SPAN> <SPAN CLASS="keyword"><SPAN CLASS="keyword">int</SPAN></SPAN> post_comment = comment <SPAN CLASS='numeric_literal'>+ 1</SPAN>;
  102. <SPAN CLASS="keyword">void</SPAN> push_slash(<SPAN CLASS="keyword">char</SPAN> c) { push('/'); push(c); }
  103. <SPAN CLASS="keyword">typedef</SPAN> boost::mpl::vector&lt;
  104. row&lt;no_comment, is&lt;<SPAN CLASS='literal'>'/'</SPAN>&gt;, pre_comment, &amp;self::skip&gt;,
  105. row&lt;no_comment, is_any, no_comment, &amp;self::push&gt;,
  106. row&lt;pre_comment, is&lt;<SPAN CLASS='literal'>'*'</SPAN>&gt;, comment, &amp;self::skip&gt;,
  107. row&lt;pre_comment, is&lt;<SPAN CLASS='literal'>'/'</SPAN>&gt;, pre_comment, &amp;self::push&gt;,
  108. row&lt;pre_comment, is_any, no_comment, &amp;self::push_slash&gt;,
  109. row&lt;comment, is&lt;<SPAN CLASS='literal'>'*'</SPAN>&gt;, post_comment, &amp;self::skip&gt;,
  110. row&lt;comment, is_any, comment, &amp;self::skip&gt;,
  111. row&lt;post_comment, is&lt;<SPAN CLASS='literal'>'/'</SPAN>&gt;, no_comment, &amp;self::skip&gt;,
  112. row&lt;post_comment, is&lt;<SPAN CLASS='literal'>'*'</SPAN>&gt;, post_comment, &amp;self::skip&gt;,
  113. row&lt;post_comment, is_any, comment, &amp;self::skip&gt;
  114. &gt; transition_table;
  115. };</PRE>
  116. <P>
  117. This machine has four states and one user-defined event handler. I'll leave it as an exercise to discover how it works.
  118. </P>
  119. <!-- Begin Nav -->
  120. <DIV CLASS='nav'>
  121. <A HREF='dual_use_filters.html'><IMG BORDER=0 WIDTH=19 HEIGHT=19 SRC='../../../../doc/src/images/prev.png'></A>
  122. <A HREF='tutorial.html'><IMG BORDER=0 WIDTH=19 HEIGHT=19 SRC='../../../../doc/src/images/up.png'></A>
  123. <A><IMG BORDER=0 WIDTH=19 HEIGHT=19 SRC='../../../../doc/src/images/next_disabled.png'></A>
  124. </DIV>
  125. <!-- End Nav -->
  126. <!-- Begin Footer -->
  127. <HR>
  128. <P CLASS="copyright">&copy; Copyright 2008 <a href="http://www.coderage.com/" target="_top">CodeRage, LLC</a><br/>&copy; Copyright 2004-2007 <a href="https://www.boost.org/users/people/jonathan_turkanis.html" target="_top">Jonathan Turkanis</a></P>
  129. <P CLASS="copyright">
  130. Use, modification, and distribution are subject to the Boost Software License, Version 2.0. (See accompanying file <A HREF="../../../../LICENSE_1_0.txt">LICENSE_1_0.txt</A> or copy at <A HREF="http://www.boost.org/LICENSE_1_0.txt">http://www.boost.org/LICENSE_1_0.txt</A>)
  131. </P>
  132. <!-- End Footer -->
  133. </BODY>