- [/==============================================================================
- Copyright (C) 2001-2011 Joel de Guzman
- Copyright (C) 2001-2011 Hartmut Kaiser
- Distributed under the Boost Software License, Version 1.0. (See accompanying
- file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
- ===============================================================================/]
- [section Preface]
- [:['["Examples of designs that meet most of the criteria for
- "goodness" (easy to understand, flexible, efficient) are a
- recursive-descent parser, which is traditional procedural
- code. Another example is the STL, which is a generic library of
- containers and algorithms depending crucially on both traditional
- procedural code and on parametric polymorphism.]] [*--Bjarne
- Stroustrup]]
- [heading History]
- [heading /80s/]
- In the mid-80s, Joel wrote his first calculator in Pascal. It was an
- unforgettable coding experience: he was amazed at how a mutually
- recursive set of functions could model a grammar specification. In time,
- the skills he acquired from that academic experience became very
- practical as he was tasked to do some parsing. For instance, whenever he
- needed to perform any form of binary or text I/O, he tried to approach
- each task somewhat formally by writing a grammar using Pascal-like
- syntax diagrams and then a corresponding recursive-descent parser. This
- process worked very well.
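- As a rough illustration of how mutually recursive functions can mirror a
- grammar, here is a minimal sketch in C++ (not the original Pascal code).
- The grammar, the `calc` namespace, and the `evaluate` function are all
- assumptions made for this example.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical illustration; none of these names come from Spirit.
// Grammar (one function per production):
//   expr   ::= term   { ('+' | '-') term }
//   term   ::= factor { ('*' | '/') factor }
//   factor ::= number | '(' expr ')'
// Integer arithmetic only; no overflow checks are performed.
namespace calc {

struct State {
    const char* it;   // current position
    const char* end;  // one past the last character
};

inline void skip_spaces(State& s) {
    while (s.it != s.end && std::isspace(static_cast<unsigned char>(*s.it))) ++s.it;
}

bool expr(State& s, long& out);  // forward declaration enables the mutual recursion

// factor ::= number | '(' expr ')'
bool factor(State& s, long& out) {
    skip_spaces(s);
    if (s.it == s.end) return false;
    if (*s.it == '(') {
        ++s.it;
        if (!expr(s, out)) return false;  // recurse back into the top production
        skip_spaces(s);
        if (s.it == s.end || *s.it != ')') return false;
        ++s.it;
        return true;
    }
    if (!std::isdigit(static_cast<unsigned char>(*s.it))) return false;
    out = 0;
    while (s.it != s.end && std::isdigit(static_cast<unsigned char>(*s.it)))
        out = out * 10 + (*s.it++ - '0');
    return true;
}

// term ::= factor { ('*' | '/') factor }
bool term(State& s, long& out) {
    if (!factor(s, out)) return false;
    for (;;) {
        skip_spaces(s);
        if (s.it == s.end || (*s.it != '*' && *s.it != '/')) return true;
        const char op = *s.it++;
        long rhs = 0;
        if (!factor(s, rhs)) return false;
        if (op == '*') out *= rhs;
        else if (rhs == 0) return false;  // reject division by zero
        else out /= rhs;
    }
}

// expr ::= term { ('+' | '-') term }
bool expr(State& s, long& out) {
    if (!term(s, out)) return false;
    for (;;) {
        skip_spaces(s);
        if (s.it == s.end || (*s.it != '+' && *s.it != '-')) return true;
        const char op = *s.it++;
        long rhs = 0;
        if (!term(s, rhs)) return false;
        out = (op == '+') ? out + rhs : out - rhs;
    }
}

}  // namespace calc

// Succeeds only if the whole input is a valid expression.
bool evaluate(const std::string& input, long& result) {
    calc::State s{input.data(), input.data() + input.size()};
    long value = 0;
    if (!calc::expr(s, value)) return false;
    calc::skip_spaces(s);
    if (s.it != s.end) return false;
    result = value;
    return true;
}
```

- Each production maps to exactly one function, and the call from `factor`
- back into `expr` is what makes the set mutually recursive.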
- [heading /90s/]
- The arrival of the Internet and the World Wide Web magnified the need
- for parsing a thousand-fold. At one point Joel had to write an HTML
- parser for a Web browser project. Using the W3C formal specifications,
- he easily wrote a recursive-descent HTML parser. With the influence of
- the Internet, RFC specifications were abundant. SGML, HTML, XML, email
- addresses and even those seemingly trivial URLs were all formally
- specified using small EBNF-style grammar specifications. Joel had more
- parsing to do, and he wished for a tool similar to larger parser
- generators such as YACC and ANTLR, where a parser is built automatically
- from a grammar specification.
- This ideal tool would be able to parse anything from email addresses and
- command lines, to XML and scripting languages. Scalability was a primary
- goal. The tool would be able to do this without incurring a heavy
- development load, which was not possible with the above-mentioned parser
- generators. The result was Spirit.
- Spirit was a personal project that was conceived when Joel was involved
- in R&D in Japan. Inspired by the GoF's composite and interpreter
- patterns, he realized that he could model a recursive-descent parser with
- hierarchical-object composition of primitives (terminals) and composites
- (productions). The original version was implemented with run-time
- polymorphic classes. A parser was generated at run time by feeding in
- production rule strings such as:
- "prod ::= {'A' | 'B'} 'C';"
- A compile function compiled the parser, dynamically creating a hierarchy
- of objects and linking semantic actions on the fly. A very early text
- can be found here: __early_spirit__.
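- The composition described above can be sketched as follows. This is a
- minimal, hypothetical reconstruction and not the original Spirit code: the
- class names (`Parser`, `Char`, `Alternative`, `Sequence`, `Repeat`) and the
- helpers `make_prod` and `matches` are assumptions, and the object hierarchy
- for the sample rule is built by hand rather than compiled from the rule
- string.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical sketch of the composite/interpreter idea with run-time
// polymorphism. Contract: parse() advances `pos` on success and leaves it
// unchanged on failure.
struct Parser {
    virtual ~Parser() = default;
    virtual bool parse(const std::string& in, std::size_t& pos) const = 0;
};
using ParserPtr = std::shared_ptr<const Parser>;

// Primitive (terminal): matches one literal character.
struct Char : Parser {
    char c;
    explicit Char(char ch) : c(ch) {}
    bool parse(const std::string& in, std::size_t& pos) const override {
        if (pos < in.size() && in[pos] == c) { ++pos; return true; }
        return false;
    }
};

// Composite: a | b (try a, then b from the same position).
struct Alternative : Parser {
    ParserPtr a, b;
    Alternative(ParserPtr l, ParserPtr r) : a(std::move(l)), b(std::move(r)) {}
    bool parse(const std::string& in, std::size_t& pos) const override {
        return a->parse(in, pos) || b->parse(in, pos);
    }
};

// Composite: a followed by b.
struct Sequence : Parser {
    ParserPtr a, b;
    Sequence(ParserPtr l, ParserPtr r) : a(std::move(l)), b(std::move(r)) {}
    bool parse(const std::string& in, std::size_t& pos) const override {
        const std::size_t save = pos;
        if (a->parse(in, pos) && b->parse(in, pos)) return true;
        pos = save;  // backtrack on failure
        return false;
    }
};

// Composite: { p }, zero or more repetitions of p.
struct Repeat : Parser {
    ParserPtr p;
    explicit Repeat(ParserPtr x) : p(std::move(x)) {}
    bool parse(const std::string& in, std::size_t& pos) const override {
        for (;;) {
            const std::size_t before = pos;
            // Stop when p fails or matches nothing (guards against looping forever).
            if (!p->parse(in, pos) || pos == before) return true;
        }
    }
};

// Hand-built object hierarchy for: prod ::= {'A' | 'B'} 'C';
ParserPtr make_prod() {
    ParserPtr a_or_b = std::make_shared<Alternative>(
        std::make_shared<Char>('A'), std::make_shared<Char>('B'));
    return std::make_shared<Sequence>(
        std::make_shared<Repeat>(a_or_b), std::make_shared<Char>('C'));
}

// True if the parser consumes the entire input.
bool matches(const Parser& p, const std::string& in) {
    std::size_t pos = 0;
    return p.parse(in, pos) && pos == in.size();
}
```

- Every call to `parse` here goes through a virtual function, which is the
- run-time cost that a statically typed design can avoid.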
- [heading /2001 to 2006/]
- Versions 1.0 to 1.8 were a complete rewrite of the original Spirit parser
- using expression templates and static polymorphism, inspired by the
- works of Todd Veldhuizen (__exprtemplates__, C++ Report, June
- 1995). Initially, the static-Spirit version was meant only to replace
- the core of the original dynamic-Spirit. Dynamic-Spirit needed a parser
- to implement itself anyway. The original employed a hand-coded
- recursive-descent parser to parse the input grammar specification
- strings. It was at this time that Hartmut Kaiser joined the Spirit
- development.
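- To give a feel for expression templates and static polymorphism, here is a
- small sketch of the technique. It is not the Spirit API: the types `ch`,
- `alternative`, `sequence`, `kleene`, the CRTP base `parser`, and the helper
- `matches` are all hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch of the expression-template technique (not Spirit code).
// Contract: parse() advances `it` on success and leaves it unchanged on failure.

// CRTP base: tags a type as a parser so the operators below apply only to parsers.
template <class Derived>
struct parser {
    const Derived& derived() const { return static_cast<const Derived&>(*this); }
};

// Terminal: matches a single character.
struct ch : parser<ch> {
    char c;
    explicit ch(char x) : c(x) {}
    bool parse(const char*& it, const char* end) const {
        if (it != end && *it == c) { ++it; return true; }
        return false;
    }
};

// l | r
template <class L, class R>
struct alternative : parser<alternative<L, R>> {
    L l; R r;
    alternative(const L& a, const R& b) : l(a), r(b) {}
    bool parse(const char*& it, const char* end) const {
        return l.parse(it, end) || r.parse(it, end);
    }
};

// l >> r
template <class L, class R>
struct sequence : parser<sequence<L, R>> {
    L l; R r;
    sequence(const L& a, const R& b) : l(a), r(b) {}
    bool parse(const char*& it, const char* end) const {
        const char* save = it;
        if (l.parse(it, end) && r.parse(it, end)) return true;
        it = save;  // backtrack on failure
        return false;
    }
};

// *p (zero or more)
template <class P>
struct kleene : parser<kleene<P>> {
    P p;
    explicit kleene(const P& x) : p(x) {}
    bool parse(const char*& it, const char* end) const {
        for (;;) {
            const char* before = it;
            // Stop when p fails or matches nothing.
            if (!p.parse(it, end) || it == before) return true;
        }
    }
};

// The operators build a nested type that encodes the grammar at compile time.
template <class L, class R>
alternative<L, R> operator|(const parser<L>& a, const parser<R>& b) {
    return alternative<L, R>(a.derived(), b.derived());
}
template <class L, class R>
sequence<L, R> operator>>(const parser<L>& a, const parser<R>& b) {
    return sequence<L, R>(a.derived(), b.derived());
}
template <class P>
kleene<P> operator*(const parser<P>& p) {
    return kleene<P>(p.derived());
}

// True if the parser consumes the entire input.
template <class P>
bool matches(const parser<P>& p, const std::string& in) {
    const char* it = in.data();
    const char* end = it + in.size();
    return p.derived().parse(it, end) && it == end;
}
```

- With these definitions, `*(ch('A') | ch('B')) >> ch('C')` has the type
- `sequence<kleene<alternative<ch, ch>>, ch>`. The whole grammar is known to
- the compiler, and no virtual calls are involved.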
- After its initial "open-source" debut in May 2001, static-Spirit became
- a success. Around November 2001, the Spirit website had an activity
- percentile of 98%, making it the number one parser tool at Source Forge
- at the time. Not bad for a niche project like a parser library. The
- "static" portion of Spirit was forgotten and static-Spirit simply became
- Spirit. The library soon evolved to acquire more dynamic features.
- Spirit was formally accepted into __boost__ in October 2002. Boost is a
- peer-reviewed, open collaborative development effort around a collection
- of free Open Source C++ libraries covering a wide range of domains. The
- Boost Libraries have become widely known as an industry standard for
- design and implementation quality, robustness, and reusability.
- [heading /2007/]
- Over the years, especially after Spirit was accepted into Boost, Spirit
- has served its purpose quite admirably. [*/Classic-Spirit/] (versions
- prior to 2.0) focused on transduction parsing, where the input string is
- merely translated to an output string. Many parsers fall into the
- transduction type. When the time came to add attributes to the parser
- library, it was done in a rather ad-hoc manner, with the goal of being 100%
- backward compatible with Classic Spirit. As a result, some parsers have
- attributes and some don't.
- Spirit V2 is another major rewrite. Spirit V2 grammars are fully
- attributed (see __attr_grammar__) which means that all parser components
- have attributes. To do this efficiently and elegantly, we had to use a
- couple of infrastructure libraries. Some did not exist, some were quite
- new when Spirit debuted, and some needed work. __mpl__ is an important
- infrastructure library, yet is not sufficient to implement Spirit V2.
- Another library had to be written: __fusion__. Fusion sits between MPL
- and STL -- between compile time and runtime -- mapping types to values.
- Fusion is a direct descendant of both MPL and __boost_tuples__. Fusion
- is now a full-fledged __boost__ library. __phoenix__ also had to be
- beefed up to support Spirit V2. The result is __phoenix__. Last
- but not least, Spirit V2 uses an __exprtemplates__ library called
- __boost_proto__.
- Even though it has evolved and matured to become a multi-module library,
- Spirit is still used for micro-parsing tasks as well as scripting
- languages. As in C++, you only pay for features that you need. The power
- of Spirit comes from its modularity and extensibility. Instead of giving
- you a sledgehammer, it gives you the right ingredients to easily create
- a sledgehammer.
- [heading New Ideas: Spirit V2]
- Just before the development of Spirit V2 began, Hartmut came across the
- __string_template__ library that is a part of the ANTLR parser
- framework. [footnote Quote from http://www.stringtemplate.org/: It is a
- Java template engine (with ports for C# and Python) for generating
- source code, web pages, emails, or any other formatted text output.]
- The concepts presented in that library led Hartmut to
- the next step in the evolution of Spirit. Parsing and generation are
- tightly connected to a formal notation, or a grammar. The grammar
- describes both input and output, and therefore, a parser library should
- have a grammar-driven output. This duality is expressed in Spirit by the
- parser library __qi__ and the generator library __karma__ using the same
- component infrastructure.
- The idea of creating a lexer library well integrated with the Spirit
- parsers is not new. This has been discussed almost since Classic-Spirit
- (pre V2) initially debuted. Several attempts to integrate existing lexer
- libraries and frameworks with Spirit have been made and served as a
- proof of concept and usability (for example see __wave__: The Boost
- C/C++ Preprocessor Library, and __slex__: a fully dynamic C++ lexer
- implemented with Spirit). Based on these experiences we added __lex__: a
- fully integrated lexer library to the mix, allowing the user to take
- advantage of the power of regular expressions for token matching,
- removing pressure from the parser components, simplifying parser
- grammars. Again, Spirit's modular structure allowed us to reuse the same
- underlying component library as for the parser and generator libraries.
- [heading How to use this manual]
- Each major section (there are 3: __sec_qi__, __sec_karma__, and
- __sec_lex__) is roughly divided into 3 parts:
- # Tutorials: A step-by-step guide with heavily annotated code. These
- are meant to get the user acquainted with the library as quickly as
- possible. The objective is to build the confidence of the user in
- using the library through abundant examples and detailed
- instructions. Examples speak volumes and we have volumes of
- examples!
- # Abstracts: A high-level summary of key topics. The objective is to
- give the user a high-level view of the library, the key concepts,
- background and theories.
- # Reference: Detailed formal technical reference. We start with a quick
- reference -- an easy-to-use table that maps into the reference proper.
- The reference proper starts with C++ concepts followed by
- models of the concepts.
- Some icons are used to mark certain topics indicative of their relevance.
- These icons precede some text to indicate:
- [table Icons
- [[Icon] [Name] [Meaning]]
- [[__note__] [Note] [Generally useful information (an aside that
- doesn't fit in the flow of the text)]]
- [[__tip__] [Tip] [Suggestion on how to do something
- (especially something that is not obvious)]]
- [[__important__] [Important] [Important note on something to take
- particular notice of]]
- [[__caution__] [Caution] [Take special care with this - it may
- not be what you expect and may cause bad
- results]]
- [[__danger__] [Danger] [This is likely to cause serious
- trouble if ignored]]
- ]
- This documentation is automatically generated by the Boost QuickBook
- documentation tool. QuickBook can be found in the __boost_tools__.
- [heading Support]
- Please direct all questions to Spirit's mailing list. You can subscribe
- to the __spirit_list__. The mailing list has a searchable archive. A
- search link to this archive is provided on __spirit__'s home page. You
- may also read and post messages to the mailing list through
- __spirit_general__ (thanks to __gmane__). The news group mirrors the
- mailing list. Here is a link to the archives: __mlist_archive__.
- [endsect] [/ Preface]