It's a common question in the Spirit Mailing List: How do I parse and place the results into a C++ struct? Of course, at this point, you already know various ways to do it, using semantic actions. There are many ways to skin a cat. Spirit X3, being fully attributed, makes it even easier. The next example demonstrates some features of Spirit X3 that make this easy. In the process, you'll learn about:
First, let's create a struct representing an employee:
namespace client { namespace ast
{
    struct employee
    {
        int age;
        std::string forename;
        std::string surname;
        double salary;
    };
}}
Then, we need to tell Boost.Fusion about our employee struct to make it a first-class fusion citizen that the grammar can utilize. If you don't know fusion yet, it is a Boost library for working with heterogeneous collections of data, commonly referred to as tuples. Spirit uses fusion extensively as part of its infrastructure.
In fusion's view, a struct is just a form of a tuple. You can adapt any struct to be a fully conforming fusion tuple:
BOOST_FUSION_ADAPT_STRUCT( client::ast::employee, age, forename, surname, salary )
Now we'll write a parser for our employee. Inputs will be of the form:
employee{ age, "forename", "surname", salary }
namespace parser
{
    namespace x3 = boost::spirit::x3;
    namespace ascii = boost::spirit::x3::ascii;

    using x3::int_;
    using x3::lit;
    using x3::double_;
    using x3::lexeme;
    using ascii::char_;

    x3::rule<class employee, ast::employee> const employee = "employee";

    auto const quoted_string = lexeme['"' >> +(char_ - '"') >> '"'];

    auto const employee_def =
        lit("employee")
        >> '{'
        >> int_ >> ','
        >> quoted_string >> ','
        >> quoted_string >> ','
        >> double_
        >> '}'
        ;

    BOOST_SPIRIT_DEFINE(employee);
}
The full cpp file for this example can be found here: employee.cpp
Let's walk through this one step at a time (not necessarily from top to bottom).
We are assuming that you already know about rules. We introduced rules in the previous Roman Numerals example. Please go back and review the previous tutorial if you have to.
x3::rule<class employee, ast::employee> const employee = "employee";
lexeme['"' >> +(char_ - '"') >> '"'];
lexeme inhibits space skipping for everything between its brackets, so whitespace inside the quotes is kept as part of the string. The expression parses quoted strings.
+(char_ - '"') parses one or more chars, except the double quote. It stops when it sees a double quote.
The expression:
a - b
parses a but not b. Its attribute is just A, the attribute of a. b's attribute is ignored. Hence, the attribute of:
char_ - '"'
is just char.
+a is similar to the Kleene star. Rather than matching zero or more, +a matches one or more. Like its related function, the Kleene star, its attribute is a std::vector<A> where A is the attribute of a. So, putting all these together, the attribute of
+(char_ - '"')
is then:
std::vector<char>
Now what's the attribute of '"' >> +(char_ - '"') >> '"'?
Well, typically, the attribute of:
a >> b >> c
is:
fusion::vector<A, B, C>
where A is the attribute of a, B is the attribute of b and C is the attribute of c. What is fusion::vector? A tuple.
Note | |
---|---|
If you don't know what I am talking about, see: Fusion Vector. It might be a good idea to have a look into Boost.Fusion at this point. You'll definitely see more of it in the coming pages. |
Some parsers, especially those very little literal parsers you see, like '"', do not have attributes.
Nodes without attributes are disregarded. In a sequence, like above, all nodes with no attributes are filtered out of the fusion::vector.
So, since '"' has no attribute, and +(char_ - '"') has a std::vector<char> attribute, the whole expression's attribute should have been:
fusion::vector<std::vector<char> >
But wait, there's one more collapsing rule: if the attribute is a single-element fusion::vector, the element is stripped from its container. To make a long story short, the attribute of the expression:
'"' >> +(char_ - '"') >> '"'
is:
std::vector<char>
Again, we are assuming that you already know about rules and rule definitions. We introduced rules in the previous Roman Numerals example. Please go back and review the previous tutorial if you have to.
auto const employee_def =
    lit("employee")
    >> '{'
    >> int_ >> ','
    >> quoted_string >> ','
    >> quoted_string >> ','
    >> double_
    >> '}'
    ;

BOOST_SPIRIT_DEFINE(employee);
Applying our collapsing rules above, the RHS has an attribute of:
fusion::vector<int, std::string, std::string, double>
These nodes do not have an attribute:
lit("employee")
'{'
','
'}'
Note | |
---|---|
In case you are wondering, |
Recall that the attribute of parser::employee is the ast::employee struct.
Now everything is clear, right? The struct employee IS compatible with fusion::vector<int, std::string, std::string, double>. So, the RHS of employee uses employee's attribute (a struct employee) in-situ when it does its work.