[/============================================================================== Copyright (C) 2001-2015 Joel de Guzman Copyright (C) 2001-2011 Hartmut Kaiser Distributed under the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) ===============================================================================/] [section:employee Employee - Parsing into structs] It's a common question in the __spirit_list__: How do I parse and place the results into a C++ struct? Of course, at this point, you already know various ways to do it, using semantic actions. There are many ways to skin a cat. Spirit X3, being fully attributed, makes it even easier. The next example demonstrates some features of Spirit X3 that make this easy. In the process, you'll learn about: * More about attributes * Auto rules * Some more built-in parsers * Directives First, let's create a struct representing an employee: namespace client { namespace ast { struct employee { int age; std::string forename; std::string surname; double salary; }; }} Then, we need to tell __fusion__ about our employee struct to make it a first-class fusion citizen that the grammar can utilize. If you don't know fusion yet, it is a __boost__ library for working with heterogeneous collections of data, commonly referred to as tuples. Spirit uses fusion extensively as part of its infrastructure. In fusion's view, a struct is just a form of a tuple. You can adapt any struct to be a fully conforming fusion tuple: BOOST_FUSION_ADAPT_STRUCT( client::ast::employee, age, forename, surname, salary ) Now we'll write a parser for our employee. Inputs will be of the form: employee{ age, "forename", "surname", salary } [#__tutorial_employee_parser__] Here goes: namespace parser { namespace x3 = boost::spirit::x3; namespace ascii = boost::spirit::x3::ascii; using x3::int_; using x3::lit; using x3::double_; using x3::lexeme; using ascii::char_; x3::rule const employee = "employee"; auto const quoted_string = lexeme['"' >> +(char_ - '"') >> '"']; auto const employee_def = lit("employee") >> '{' >> int_ >> ',' >> quoted_string >> ',' >> quoted_string >> ',' >> double_ >> '}' ; BOOST_SPIRIT_DEFINE(employee); } The full cpp file for this example can be found here: [@../../../example/x3/employee.cpp employee.cpp] Let's walk through this one step at a time (not necessarily from top to bottom). [heading Rule Declaration] We are assuming that you already know about rules. We introduced rules in the previous [tutorial_roman Roman Numerals example]. Please go back and review the previous tutorial if you have to. x3::rule employee = "employee"; [heading Lexeme] lexeme['"' >> +(char_ - '"') >> '"']; `lexeme` inhibits space skipping from the open brace to the closing brace. The expression parses quoted strings. +(char_ - '"') parses one or more chars, except the double quote. It stops when it sees a double quote. [heading Difference] The expression: a - b parses `a` but not `b`. Its attribute is just `A`; the attribute of `a`. `b`'s attribute is ignored. Hence, the attribute of: char_ - '"' is just `char`. [heading Plus] +a is similar to Kleene star. Rather than match everything, `+a` matches one or more. Like it's related function, the Kleene star, its attribute is a `std::vector` where `A` is the attribute of `a`. So, putting all these together, the attribute of +(char_ - '"') is then: std::vector [heading Sequence Attribute] Now what's the attribute of '"' >> +(char_ - '"') >> '"' ? Well, typically, the attribute of: a >> b >> c is: fusion::vector where `A` is the attribute of `a`, `B` is the attribute of `b` and `C` is the attribute of `c`. What is `fusion::vector`? - a tuple. [note If you don't know what I am talking about, see: [@http://tinyurl.com/6xun4j Fusion Vector]. It might be a good idea to have a look into __fusion__ at this point. You'll definitely see more of it in the coming pages.] [heading Attribute Collapsing] Some parsers, especially those very little literal parsers you see, like `'"'`, do not have attributes. Nodes without attributes are disregarded. In a sequence, like above, all nodes with no attributes are filtered out of the `fusion::vector`. So, since `'"'` has no attribute, and `+(char_ - '"')` has a `std::vector` attribute, the whole expression's attribute should have been: fusion::vector > But wait, there's one more collapsing rule: If the attribute is followed by a single element `fusion::vector`, The element is stripped naked from its container. To make a long story short, the attribute of the expression: '"' >> +(char_ - '"') >> '"' is: std::vector [heading Rule Definition] Again, we are assuming that you already know about rules and rule definitions. We introduced rules in the previous [tutorial_roman Roman Numerals example]. Please go back and review the previous tutorial if you have to. employee = lit("employee") >> '{' >> int_ >> ',' >> quoted_string >> ',' >> quoted_string >> ',' >> double_ >> '}' ; BOOST_SPIRIT_DEFINE(employee); Applying our collapsing rules above, the RHS has an attribute of: fusion::vector These nodes do not have an attribute: * `lit("employee")` * `'{'` * `','` * `'}'` [note In case you are wondering, `lit("employee")` is the same as "employee". We had to wrap it inside `lit` because immediately after it is `>> '{'`. You can't right-shift a `char[]` and a `char` - you know, C++ syntax rules.] Recall that the attribute of `parser::employee` is the `ast::employee` struct. Now everything is clear, right? The `struct employee` *IS* compatible with `fusion::vector`. So, the RHS of `start` uses start's attribute (a `struct employee`) in-situ when it does its work. [endsect]