- [/==============================================================================
- Copyright (C) 2001-2015 Hartmut Kaiser
- Copyright (C) 2001-2011 Joel de Guzman
- Distributed under the Boost Software License, Version 1.0. (See accompanying
- file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
- ===============================================================================/]
- [/////////////////////////////////////////////////////////////////////////////]
- [section:primitive_attributes Attributes of Primitive Components]
- Parsers in __spirit__ are fully attributed. __x3__ parsers always /expose/ an
- attribute specific to their type. This is called the /synthesized attribute/, as it
- is returned from a successful match and represents the matched input sequence. For
- instance, numeric parsers such as `int_` or `double_` return the `int` or
- `double` value converted from the matched input sequence. Other primitive parser
- components have similarly intuitive attribute types; for instance,
- `ascii::char_` has the attribute type `char`. Primitive parsers apply the
- normal C++ convertibility rules: you can use any C++ type to receive the parsed
- value as long as the attribute type of the parser is convertible to the type
- provided. The following example shows how a synthesized parser attribute (the
- `int` value) is extracted by calling the API function `x3::parse`:
- int value = 0;
- std::string str("123");
- std::string::iterator strbegin = str.begin();
- x3::parse(strbegin, str.end(), x3::int_, value); // value == 123
- For a full list of available parser primitives and their attribute types, please
- see the section __sec_x3_primitive__.
- [endsect]
- [/////////////////////////////////////////////////////////////////////////////]
- [section:compound_attributes Attributes of Compound Components]
- __x3__ implements well-defined attribute type propagation rules for all compound
- parsers, such as sequences, alternatives, Kleene star, etc. The main attribute
- propagation rule for sequences is, for instance:
- a: A, b: B --> (a >> b): tuple<A, B>
- which reads as:
- [:Given `a` and `b` are parsers, and `A` is the attribute type of `a`, and `B`
- is the attribute type of `b`, then the attribute type of `a >> b`
- will be `tuple<A, B>`.]
- [note The notation `tuple<A, B>` is used as a placeholder expression for any
- fusion sequence holding the types A and B, such as `boost::fusion::tuple<A, B>`
- or `std::pair<A, B>` (for more information see __fusion__).]
- As you can see, in order for a type to be compatible with the attribute type
- of a compound expression it has to
- * either be convertible to the attribute type,
- * or it has to expose certain functionalities, i.e. it needs to conform to a
- concept compatible with the component.
- Each compound component implements its own set of attribute propagation rules.
- For a full list of how the different compound parsers consume attributes,
- see the section __sec_x3_compound__.
- [heading The Attribute of Sequence Parsers]
- Sequences require an attribute type to expose the concept of a fusion sequence,
- where all elements of that fusion sequence have to be compatible with the
- corresponding element of the component sequence. For example, the expression:
- double_ >> double_
- is compatible with any fusion sequence holding two types, where both types have
- to be compatible with `double`. The first element of the fusion sequence has to
- be compatible with the attribute of the first `double_`, and the second element
- of the fusion sequence has to be compatible with the attribute of the second
- `double_`. Given an instance of `std::pair<double, double>`,
- we can directly use the expression above to parse the input and fill the
- attribute:
- // requires #include <boost/fusion/include/std_pair.hpp> to adapt std::pair as a fusion sequence
- // the following parses "1.0 2.0" into a pair of doubles
- std::string input("1.0 2.0");
- std::string::iterator strbegin = input.begin();
- std::pair<double, double> p;
- x3::phrase_parse(strbegin, input.end(),
- x3::double_ >> x3::double_, // parser grammar
- x3::space, // delimiter grammar
- p); // attribute to fill while parsing
- [tip *For sequences only:* To keep things simple, unlike __Spirit.qi__, __x3__ no
- longer supports passing more than one attribute to the `parse` and `phrase_parse` functions.
- Use a `std::tuple` instead, and be sure to include `boost/fusion/adapted/std_tuple.hpp` in this case.
- ]
- [heading The Attribute of Alternative Parsers]
- Alternative parsers are all about - well - alternatives. In
- order to store possibly different result (attribute) types from the different
- alternatives we use the data type __boost_variant__. The main attribute
- propagation rule of these components is:
- a: A, b: B --> (a | b): variant<A, B>
- Alternatives have a second very important attribute propagation rule:
- a: A, b: A --> (a | b): A
- often simplifying things significantly. If all sub expressions of
- an alternative expose the same attribute type, the overall alternative
- will expose exactly the same attribute type as well.
- [endsect]
- [/////////////////////////////////////////////////////////////////////////////]
- [section:more_compound_attributes More About Attributes of Compound Components]
- While parsing input, it is often desirable to combine some
- constant elements with variable parts. For instance, let us look at the example
- of parsing a complex number, which is written as `(real, imag)`,
- where `real` and `imag` are the variables representing the real and imaginary
- parts of our complex number. This can be achieved by writing:
- '(' >> double_ >> ", " >> double_ >> ')'
- Literals (such as `'('` and `", "`) do /not/ expose any attribute
- (strictly speaking, they expose the special type `unused_type`, but in this
- context `unused_type` is interpreted as if the component does not expose any
- attribute at all). It is very important to understand that the literals don't
- consume any of the elements of a fusion sequence passed to this component
- sequence. They match input, but they neither expose an attribute nor fill
- any part of the attribute passed in. The following example shows this:
- // requires #include <boost/fusion/include/std_pair.hpp> to adapt std::pair as a fusion sequence
- // the following parses "(1.0, 2.0)" into a pair of doubles
- std::string input("(1.0, 2.0)");
- std::string::iterator strbegin = input.begin();
- std::pair<double, double> p;
- x3::parse(strbegin, input.end(),
- '(' >> x3::double_ >> ", " >> x3::double_ >> ')', // parser grammar
- p); // attribute to fill while parsing
- where the first element of the pair passed in as the attribute is still
- associated with the first `double_`, and the second element is associated with
- the second `double_` parser.
- This behavior should be familiar, as it conforms to the way other input and
- output formatting libraries such as `scanf`, `printf` or `boost::format`
- handle their variable parts. In this context you can think of __x3__'s
- primitive components (such as the `double_` above) as
- type-safe placeholders for the attribute values.
- [tip *For sequences only:* To keep things simple, unlike __Spirit.qi__, __x3__ no
- longer supports passing more than one attribute to the `parse` and `phrase_parse` functions.
- Use a `std::tuple` instead, and be sure to include `boost/fusion/adapted/std_tuple.hpp` in this case.
- ]
- Let's take a look at this from a more formal perspective:
- a: A, b: Unused --> (a >> b): A
- which reads as:
- [:Given `a` and `b` are parsers, and `A` is the attribute type of
- `a`, and `unused_type` is the attribute type of `b`, then the attribute type
- of `a >> b` will be `A` as well. This rule applies regardless of
- the position of the element exposing the `unused_type`.]
- This rule is the key to understanding attribute handling in
- sequences as soon as literals are involved. It is as if elements with
- `unused_type` attributes 'disappeared' during attribute propagation. Notably,
- this is true not only for sequences but for any compound component. For
- instance, for alternative components the corresponding rule is:
- a: A, b: Unused --> (a | b): A
- again allowing the overall attribute type of an expression to be simplified.
- [endsect]
- [/////////////////////////////////////////////////////////////////////////////]
- [section:nonterminal_attributes Attributes of Nonterminals]
- Nonterminals are the main means of constructing more complex parsers out of
- simpler ones. The nonterminals in the parser world are very similar to functions
- in an imperative programming language. They can be used to encapsulate parser
- expressions for a particular input sequence. After being defined, the
- nonterminals can be used as 'normal' parsers in more complex expressions
- whenever the encapsulated input needs to be recognized. Parser nonterminals in
- __x3__ usually return a value (the synthesized attribute).
- The type of the synthesized attribute has to be explicitly specified when
- defining the particular nonterminal. Example (ignore ID for now):
- x3::rule<ID, int> r;
- [endsect]
|