C++: Beyond the Standard Library
Pages: 1, 2


I saved the best for last. Some members of the C++ standard committee started Boost as a platform for exploring future directions for the C++ standard library. Already, parts of Boost are being added to the standard library (such as type traits, regular expressions, smart pointers, enhanced binders, and adapters). You can expect to see other parts of Boost added in the future.

Boost is big and eclectic. It contains everything from type traits to quaternions (a generalization of imaginary numbers to four dimensions), from parsing to linear algebra, from simple memory utilities to threads and synchronization.

This section presents only a few of the packages in Boost that are part of the proposed library extension: tuples, smart pointers, lambda expressions, and the Spirit parser generator.

Building and Installing Boost

The web site has instructions for building and installing Boost. You need to install Boost Jam, which is like make, but is designed for greater portability and ease of use. Follow the directions on the web site to build and install Boost.


A tuple is a generalization of the standard pair class template. Instead of being limited to two elements, a tuple can contain up to 10 elements (and the maximum can be extended if necessary). The following example shows one way to use boost::tuple:

#include <numeric>
#include <iostream>
#include <ostream>
#include <vector>
#include "boost/tuple/tuple.hpp"

// Store count, sum, and sum of squares.
typedef boost::tuple<std::size_t, double, double> Stats;

// Accumulate statistics.
Stats stats(Stats s, double x)
  s.get<1>() += x;
  s.get<2>() += x * x;
  return s;

int main()
  std::vector<double> v;
  ... fill v with data ...

  Stats s = std::accumulate(v.begin(), v.end(),
                  boost::make_tuple(0U, 0.0, 0.0), stats);
  std::cout << "count = " << s.get<0>() << '\n';
  std::cout << "mean  = " << s.get<1>() / s.get<0>() << '\n';

Smart Pointers

Boost has several smart pointer class templates. They solve a number of problems that the standard auto_ptr<> class template does not. For example, you cannot store an auto_ptr<> object in a standard container, but you can store a boost::shared_ptr<> object. Boost has several other smart pointer templates; for the sake of brevity, the following example only shows shared_ptr<>:

#include <algorithm>
#include <iostream>
#include <iterator>
#include <ostream>
#include <string>
#include <vector>
#include "boost/smart_ptr.hpp"

// A company has employee. Each employee can be Exempt
// or NonExempt. Certain Exempt employees are Managers,
// which are distinguished by having a group of Employees.
// All the memory for said employees is managed
// automatically by Boost shared_ptr<> templates.

class Employee {
  Employee(const std::string& name) : name_(name) {}
  virtual ~Employee() {}
  const std::string name() const { return name_; }
  virtual void print(std::ostream&);
  const std::string name_;
typedef boost::shared_ptr<Employee> employee;

void Employee::print(std::ostream& out)
  out << name() << '\n';

class Exempt : public Employee {
  Exempt(const std::string& name) : Employee(name) {}

class NonExempt : public Employee {
  NonExempt(const std::string& name) : Employee(name) {}

class Manager : public Exempt {
  Manager(const std::string& name) : Exempt(name) {}
  void add(Employee* e) { group_.push_back(employee(e)); }
  void add(employee e) { group_.push_back(e); }
  virtual void print(std::ostream&);
  std::vector<employee> group_;
typedef boost::shared_ptr<Manager> manager;

void Manager::print(std::ostream& out)
  out << name() << " { ";
  std::copy(group_.begin(), group_.end(),
    std::ostream_iterator<employee>(out, ""));
  out << "}\n";

// Make it easier to print any kind of employee.
template<typename charT, typename traits>
std::basic_ostream<charT,traits>& operator<<
     (std::basic_ostream<charT,traits>& out, employee e)
  return out;

int main()
  manager ceo(new Manager("I. M. Portant"));
  manager muddle(new Manager("Muddled manager"));
  muddle->add(new Exempt("J. Doe"));
  muddle->add(new NonExempt("J. Dough"));

Lambda Expressions

C++ supports multiple programming paradigms, including procedural, object-oriented, and generic programming. You probably don't think of C++ when it comes to functional programming, though. Nonetheless, Boost supports functional programming in C++, albeit with some limitations.

Central to most functional programming languages is the ability to write a function on the fly. These unnamed functions are typically called Lambda Functions. (The name comes from the lambda-calculus. If you don't know about it, don't worry.)

For example, suppose you have a vector of numbers and you want to print the numbers, one per line. You can use an ostream_iterator as follows:

vector<int> data;
copy(data.begin(), data.end(), ostream_iterator<int>(cout, "\n"));

Or you can use a lambda function:

for_each(data.begin(), data.end(), 
cout << _1 << '\n');

The last argument to for_each must be a function pointer or a functor&mdash;an object that has an overloaded operator(), which can be called with a single argument, namely, each element of the range. In this case, the Boost lambda library overloads the operator<< (and all the other overloadable operators) to create a functor. The _1 is a placeholder for the functor's argument, that is, the values from the range.

As the output formatting grows more complex, the value of the lambda function increases. For example, a common problem is printing the contents of a map. You cannot use ostream_iterator because it requires operator<<, which is not defined for std::pair. Instead, you can write your own print function, and call it from for_each, but that moves the print logic away from where it is needed. Instead, you can use a lambda function:

map<string, unsigned long> counts;
typedef pair<const string, unsigned long> mypair;
for_each(counts.begin(), counts.end(),
  cout << bind(&mypair::first, _1) << '\t' 
       << bind(&mypair::second, _1) << '\n');

The nature of the lambda function requires the bind function, which delays binding the member pointer with an object until runtime. Again _1 refers to each element of the range. Every time for_each iterates, bind binds the element of the range to the member pointer, and evaluates the expression.

Lambda expressions are used heavily in the Spirit parser generator, as described in the next section.

Spirit Parser Generator

Perhaps the most interesting and exciting part of the latest Boost release (1.30.0) is the Spirit parser generator. Spirit takes advantage of C++ operator overloading and expression templates to let you define a grammar for a recursive-descent parser in C++. You don't need an external parser-generating tool such as YACC or ANTLR. The drawback is that you cannot define grammars that are as complex as what parser-generators can handle, but Spirit is robust and can handle many complicated tasks.

You can use Spirit without Boost. Spirit is hosted at SourceForge.

Spirit makes heavy use of lambda functions and expression templates. Following is a simple grammar in BNF (Backus-Naur Form):

start ::= expression
factor ::= '(' expression ')' | integer
term ::= factor '*' factor | factor '/' factor
expression ::= term '+' term | term '-' term

Following is the same grammar in YACC syntax:

start: expression;
factor: '(' expression ')' | integer;
term: factor '*' factor | factor '/' factor;
expression: term '+' term | term '-' term;

Following is the same grammar in Spirit syntax:

start = expression;
factor = '(' >> expression >> ')' | int_p;
term = factor >> '*' >> factor | factor >> '/' >> factor;
expression = term >> '+' >> term | term >> '-' >> term;

Notice how little difference there is. The most significant difference is the use of the >> operator to separate symbols in each production. All the operators are normal, albeit overloaded, C++ operators. The production names are ordinary objects of type rule<>.

As grammars grow more complex, the C++ syntax forces more differences. For example, optional items use the ! operator. The Kleene star, which is postfix in BNF, uses the unary * operator, which is prefix in C++. For example, suppose the grammar is extended to allow a series of terms in an expression:

expression = term >> *( ('+' >> term) | ('-' >> term) ) ;

Spirit generates a recursive descent grammar, so you cannot use left-recursive rules, such as the following:

// illegal: causes infinite recursion in Spirit
expression = term | expression >> '+' >> term | expression 
             >> '-' >> term ;

All left-recursive rules are trivially rewritten to avoid left-recursion. (See any textbook on parsing for details. For example, consult the "Dragon Book": Compilers: Principles, Techniques, and Tools, by Alfred V. Aho, et al.)

In a real parser, you need more than just a grammar. You also need semantic actions: the parser must do something when it matches a production. Specify semantic actions as lambda expressions.

Spirit was added to Boost recently, and the author of Spirit wrote his own Lambda library, called Phoenix. Spirit was built to use Phoenix (or Phoenix was written to implement Spirit, depending on your point of view). In time, as Spirit and Boost continue to merge, the Boost lambda library will incorporate the best features of Phoenix into a single lambda library. Until then, you need to use Phoenix, although you can mix in some Boost if you really want to confuse yourself.

You can define a context for each production. The context can store information, and you can assign to the context in a semantic action, or obtain the context from another production. Semantic actions are specified in square brackets. For example, suppose the earlier grammar takes a double value that is computed on the fly. In Phoenix, you use arg1 instead of Boost's _1.

start = expression [start.value = arg1];
factor = '(' >> expression[factor.value=arg1] >> ')'
       | int_p[factor.value = arg1];
term = factor[term.value=arg1] >> '*' >> factor[term.value *= arg1]
     | factor[term.value=arg1] >> '/' >> factor[term.value /= arg1] ;
expression = term[expression.value=arg1] >> '+' >> term[expression.value += arg1]
           | term[expression.value=arg1] >> '-' >> term[expression.value -= arg1];

You can also call functions or functors, use binders to call arbitrary member functions, and so on. Numerous built-in parsers come with Spirit, such as int_p to parse an integer, real_p to parse a real number, and so on.

A parser gets its input from a scanner. You can scan a null-terminated character string, a range of characters specified by two iterators, or write a custom scanner. Spirit lets you control every aspect of the scanning and parsing, or you can choose the defaults, which work well for traditional text-oriented languages.

Click here for a complete example of a simple calculator.

Ray Lischner is the author of Delphi in a Nutshell and O'Reilly's upcoming C++ in a Nutshell.

O'Reilly & Associates will soon release (May 2003) C++ in a Nutshell.

Return to the O'Reilly Network.