- [/==============================================================================
- Copyright (C) 2001-2015 Joel de Guzman
- Copyright (C) 2001-2011 Hartmut Kaiser
- Distributed under the Boost Software License, Version 1.0. (See accompanying
- file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
- ===============================================================================/]
- [section Introduction]
- Boost Spirit X3 is an object-oriented, recursive-descent parser for C++. It
- allows you to write grammars using a format similar to
- Extended Backus Naur Form (EBNF)[footnote
- [@http://www.cl.cam.ac.uk/%7Emgk25/iso-14977.pdf ISO-EBNF]] directly in C++.
- These inline grammar specifications can mix freely with other C++ code and,
- thanks to the generative power of C++ templates, are immediately executable.
- Conventional compiler-compilers or parser-generators have to perform an
- additional translation step from the source EBNF code to C or C++ code.
- Since the target input grammars are written entirely in C++, we do not need any
- separate tools to compile, preprocess or integrate them into the build process.
- __spirit__ allows seamless integration of the parsing process with other C++
- code. This often allows for simpler and more efficient code.
- The created parsers are fully attributed, which allows you to easily build and
- handle hierarchical data structures in memory. These data structures resemble
- the structure of the input data and can directly be used to generate
- arbitrarily-formatted output.
- A simple EBNF grammar snippet:
- group ::= '(' expression ')'
- factor ::= integer | group
- term ::= factor (('*' factor) | ('/' factor))*
- expression ::= term (('+' term) | ('-' term))*
- is approximated using facilities of Spirit's /X3/ as seen in this code snippet:
- group = '(' >> expression >> ')';
- factor = integer | group;
- term = factor >> *(('*' >> factor) | ('/' >> factor));
- expression = term >> *(('+' >> term) | ('-' >> term));
- Through the magic of expression templates, this is perfectly valid and
- executable C++ code. The production rule `expression` is, in fact, an object
- that has a member function `parse`, which does the work when given input text
- written in the language that this grammar describes. Yes, it's a calculator. We
- shall simplify for now by skipping the type declarations and the definition of
- the rule `integer` invoked by `factor`. Now, the production rule `expression` in
- our grammar specification, traditionally called the `start` symbol, can
- recognize inputs such as:
- 12345
- -12345
- +12345
- 1 + 2
- 1 * 2
- 1/2 + 3/4
- 1 + 2 + 3 + 4
- 1 * 2 * 3 * 4
- (1 + 2) * (3 + 4)
- (-1 + 2) * (3 + -4)
- 1 + ((6 * 200) - 20) / 6
- (1 + (2 + (3 + (4 + 5))))
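To make the term "recursive descent" concrete, the four rules can also be written by hand as mutually recursive functions, one per rule. The sketch below is plain standard C++ and is not Spirit code. The names `Recognizer` and `recognizes` are hypothetical, and two behaviours are assumptions rather than something the grammar above states: `integer` accepts an optional sign followed by digits, and whitespace between tokens is skipped.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hand-written recursive-descent recognizer for the four rules above.
// Illustrative only: this is NOT Spirit X3 code, and all names are hypothetical.
class Recognizer {
public:
    explicit Recognizer(const std::string& input) : in_(input), pos_(0) {}

    // Start symbol: succeed only if `expression` consumes the whole input.
    bool parse() {
        bool ok = expression();
        skip_space();
        return ok && pos_ == in_.size();
    }

private:
    // Assumption: whitespace between tokens is insignificant.
    void skip_space() {
        while (pos_ < in_.size() &&
               std::isspace(static_cast<unsigned char>(in_[pos_]))) ++pos_;
    }

    // Match a single literal character after skipping whitespace.
    bool lit(char c) {
        skip_space();
        if (pos_ < in_.size() && in_[pos_] == c) { ++pos_; return true; }
        return false;
    }

    // Assumption: integer is an optional sign followed by one or more digits.
    bool integer() {
        skip_space();
        std::size_t save = pos_;
        if (pos_ < in_.size() && (in_[pos_] == '+' || in_[pos_] == '-')) ++pos_;
        std::size_t first_digit = pos_;
        while (pos_ < in_.size() &&
               std::isdigit(static_cast<unsigned char>(in_[pos_]))) ++pos_;
        if (pos_ == first_digit) { pos_ = save; return false; }
        return true;
    }

    // group ::= '(' expression ')'
    bool group() {
        std::size_t save = pos_;
        if (lit('(') && expression() && lit(')')) return true;
        pos_ = save;  // backtrack on failure
        return false;
    }

    // factor ::= integer | group
    bool factor() { return integer() || group(); }

    // term ::= factor (('*' factor) | ('/' factor))*
    bool term() {
        if (!factor()) return false;
        for (;;) {
            std::size_t save = pos_;
            if ((lit('*') || lit('/')) && factor()) continue;
            pos_ = save;  // the repetition matched zero more times
            return true;
        }
    }

    // expression ::= term (('+' term) | ('-' term))*
    bool expression() {
        if (!term()) return false;
        for (;;) {
            std::size_t save = pos_;
            if ((lit('+') || lit('-')) && term()) continue;
            pos_ = save;
            return true;
        }
    }

    std::string in_;
    std::size_t pos_;
};

// Convenience wrapper: true if `s` is a complete sentence of the grammar.
bool recognizes(const std::string& s) { return Recognizer(s).parse(); }
```

Under these assumptions, `recognizes("(-1 + 2) * (3 + -4)")` succeeds, while an incomplete input such as `"1 +"` is rejected because the trailing `+` is left unconsumed.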
- Certainly we have modified the original EBNF syntax. This is done to
- conform to C++ syntax rules. Most notably we see the abundance of shift `>>`
- operators. Since there are no juxtaposition operators in C++, it is simply not
- possible to write something like:
- a b
- as seen in math syntax, for example, to mean multiplication or, in our case,
- as seen in EBNF syntax to mean sequencing (`b` should follow `a`). __x3__
- uses the shift `>>` operator instead for this purpose. We take the `>>`
- operator, with arrows pointing to the right, to mean "is followed by". Thus we
- write:
- a >> b
- The alternative operator `|` and the parentheses `()` remain as is. The
- assignment operator `=` is used in place of EBNF's `::=`. Last but not least,
- the Kleene star `*`, which is a postfix operator in EBNF, becomes a prefix
- operator. Instead of:
- a* //... in EBNF syntax,
- we write:
- *a //... in Spirit.
- since there is no postfix star operator, `*`, in C/C++. Finally, we terminate
- each rule with the ubiquitous semicolon, `;`.
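The operator mapping described above relies only on ordinary C++ operator overloading. As a rough illustration, the toy below overloads binary `>>`, binary `|` and unary prefix `*` for a small parser type. Everything in it (`Parser`, `ch`, `full_match`) is hypothetical and is not how X3 is implemented: X3 builds statically typed expression templates, whereas this sketch uses type-erased `std::function` objects for brevity.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Toy parser: tries to match at `pos` and advances `pos` on success.
// Hypothetical type for illustration; not part of Spirit X3.
struct Parser {
    std::function<bool(const std::string&, std::size_t&)> match;
};

// Match exactly one character.
Parser ch(char c) {
    return {[c](const std::string& s, std::size_t& pos) {
        if (pos < s.size() && s[pos] == c) { ++pos; return true; }
        return false;
    }};
}

// Sequence: a >> b  ("a is followed by b").
Parser operator>>(Parser a, Parser b) {
    return {[a, b](const std::string& s, std::size_t& pos) {
        std::size_t save = pos;
        if (a.match(s, pos) && b.match(s, pos)) return true;
        pos = save;  // restore the position if the sequence fails part-way
        return false;
    }};
}

// Alternative: a | b  (try a, then b).
Parser operator|(Parser a, Parser b) {
    return {[a, b](const std::string& s, std::size_t& pos) {
        return a.match(s, pos) || b.match(s, pos);
    }};
}

// Kleene star as a prefix operator: *a  (zero or more repetitions).
Parser operator*(Parser a) {
    return {[a](const std::string& s, std::size_t& pos) {
        for (;;) {
            std::size_t before = pos;
            // Stop on failure, or if `a` matched without consuming input.
            if (!a.match(s, pos) || pos == before) break;
        }
        return true;  // zero repetitions is still a match
    }};
}

// True if `p` matches all of `s`.
bool full_match(const Parser& p, const std::string& s) {
    std::size_t pos = 0;
    return p.match(s, pos) && pos == s.size();
}
```

With these overloads, an expression such as `ch('a') >> *(ch('b') | ch('c')) >> ch('d')` compiles as ordinary C++. Because unary `*` binds more tightly than `>>`, and `>>` more tightly than `|`, the built-in precedence of these operators happens to suit grammar notation.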
- [endsect]