[section Object Code] Let's look at some assembly. All assembly here was produced with Clang 4.0 with `-O3`. Given these definitions: [arithmetic_perf_decls] Here is a _yap_-based arithmetic function: [arithmetic_perf_eval_as_yap_expr] and the assembly it produces: arithmetic_perf[0x100001c00] <+0>: pushq %rbp arithmetic_perf[0x100001c01] <+1>: movq %rsp, %rbp arithmetic_perf[0x100001c04] <+4>: mulsd %xmm1, %xmm0 arithmetic_perf[0x100001c08] <+8>: addsd %xmm2, %xmm0 arithmetic_perf[0x100001c0c] <+12>: movapd %xmm0, %xmm1 arithmetic_perf[0x100001c10] <+16>: mulsd %xmm1, %xmm1 arithmetic_perf[0x100001c14] <+20>: addsd %xmm0, %xmm1 arithmetic_perf[0x100001c18] <+24>: movapd %xmm1, %xmm0 arithmetic_perf[0x100001c1c] <+28>: popq %rbp arithmetic_perf[0x100001c1d] <+29>: retq And for the equivalent function using builtin expressions: [arithmetic_perf_eval_as_cpp_expr] the assembly is: arithmetic_perf[0x100001e10] <+0>: pushq %rbp arithmetic_perf[0x100001e11] <+1>: movq %rsp, %rbp arithmetic_perf[0x100001e14] <+4>: mulsd %xmm1, %xmm0 arithmetic_perf[0x100001e18] <+8>: addsd %xmm2, %xmm0 arithmetic_perf[0x100001e1c] <+12>: movapd %xmm0, %xmm1 arithmetic_perf[0x100001e20] <+16>: mulsd %xmm1, %xmm1 arithmetic_perf[0x100001e24] <+20>: addsd %xmm0, %xmm1 arithmetic_perf[0x100001e28] <+24>: movapd %xmm1, %xmm0 arithmetic_perf[0x100001e2c] <+28>: popq %rbp arithmetic_perf[0x100001e2d] <+29>: retq If we increase the number of terminals by a factor of four: [arithmetic_perf_eval_as_yap_expr_4x] the results are the same: in this simple case, the _yap_ and builtin expressions result in the same object code. However, increasing the number of terminals by an additional factor of 2.5 (for a total of 90 terminals), the inliner can no longer do as well for _yap_ expressions as for builtin ones. More complex nonarithmetic code produces more mixed results. For example, here is a function using code from the Map Assign example: std::map make_map_with_boost_yap () { return map_list_of ("<", 1) ("<=",2) (">", 3) (">=",4) ("=", 5) ("<>",6) ; } By contrast, here is the Boost.Assign version of the same function: std::map make_map_with_boost_assign () { return boost::assign::map_list_of ("<", 1) ("<=",2) (">", 3) (">=",4) ("=", 5) ("<>",6) ; } Here is how you might do it "manually": std::map make_map_manually () { std::map retval; retval.emplace("<", 1); retval.emplace("<=",2); retval.emplace(">", 3); retval.emplace(">=",4); retval.emplace("=", 5); retval.emplace("<>",6); return retval; } Finally, here is the same map created from an initializer list: std::map make_map_inializer_list () { std::map retval = { {"<", 1}, {"<=",2}, {">", 3}, {">=",4}, {"=", 5}, {"<>",6} }; return retval; } All of these produce roughly the same amount of assembly instructions. Benchmarking these four functions with Google Benchmark yields these results: [table Runtimes of Different Map Constructions [[Function] [Time (ns)]] [[make_map_with_boost_yap()] [1285]] [[make_map_with_boost_assign()] [1459]] [[make_map_manually()] [985]] [[make_map_inializer_list()] [954]] ] The _yap_-based implementation finishes in the middle of the pack. In general, the expression trees produced by _yap_ get evaluated down to something close to the hand-written equivalent. There is an abstraction penalty, but it is small for reasonably-sized expressions. [endsect]