Thursday, May 22, 2014

Using The Pigeonhole Principle in C++ Metaprogramming

The Pigeonhole Principle is one of the most obvious fundamentals in mathematics. It is so obvious that you may be surprised that there is even a name for it. It states that:

"If n items are put into m containers, with n > m, then at least one container must contain more than one item."

Alternatively,

"If there are n items and m containers, with n > m, and only one item can fit in a container, then at least one item must remain out."

For those who prefer visuals and really hate math:


Even though the principle is simple it has been used to prove many complex mathematical theorems and lemmas. Here is one I find quite interesting:

"Incompressible strings of every length exist."

Alternatively,

"There is a file of every size that your favorite zip program can't compress."

The proof is left as an exercise for the reader.

So, does the Pigeonhole Principle show up in programming? Of course it does. That's why std::vector must allocate memory when its capacity is full. OK, but does it manifest in more interesting ways? As it turns out, it has been used in compile-time meta-programming to achieve interesting results. It shows up in two distinct flavors: in preprocessor meta-programming and in template meta-programming.

The Pigeonhole Principle in C++ Preprocessor Meta-programming

Check out the following example. Also available here. The original author of this trick is unknown to me.
#include <iostream>

#define COUNT_ARGS(...)     PP_NARG_IMPL(__VA_ARGS__,PP_RSEQ_N()) 
#define PP_NARG_IMPL(...)   PP_ARG_N(__VA_ARGS__) 
#define PP_ARG_N( _1, _2, _3, _4, _5, _6, _7, _8, _9, _10, N, ...) N 
#define PP_RSEQ_N() 10,9,8,7,6,5,4,3,2,1,0 

int main()
{
  std::cout << COUNT_ARGS(a,b,c,d); // prints 4
}
COUNT_ARGS is a "simple" macro that counts the number of variadic arguments it is called with. It does that using a preprocessor trick based on the Pigeonhole Principle. Here is how the macro expands:
  1. The COUNT_ARGS macro substitutes the arguments (a,b,c,d) in the __VA_ARGS__ part before calling PP_NARG_IMPL. The PP_RSEQ_N macro is a list of integers from 10 to 0, which is substituted in the PP_NARG_IMPL. Therefore, the PP_NARG_IMPL macro is "called" with actual arguments = a,b,c,d,10,9,8,7,6,5,4,3,2,1,0
  2. The PP_NARG_IMPL macro simply forwards its arguments to the PP_ARG_N macro.
  3. The PP_ARG_N macro is where the Pigeonhole Principle comes into play. It has 11 named parameters: _1 through _10, and N. Note that _1, _2, etc. are not special; they are just macro parameters with an underscore at the beginning. You may rename them one, two, three, four, etc. and it won't make a difference. PP_ARG_N always expands to its 11th argument, the one bound to N.
  4. The original call supplies 15 actual arguments, but PP_ARG_N has only 10 "slots" (_1 through _10) before N; anything beyond N spills into its variadic tail. Obviously, not all of the arguments fit in the slots. The first 10 actual arguments fill the 10 slots, so N "picks up" the first actual argument that does not get a slot, i.e., the 11th.
  5. Because the reversed sequence 10,9,...,0 is appended after the real arguments, the 11th actual argument always coincides with the number of arguments originally passed. PP_ARG_N therefore expands to exactly the count.
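The same counting trick is often used to dispatch to a differently named macro based on arity. Below is a minimal sketch of that idea; it assumes the COUNT_ARGS machinery above is in scope, and the MAKE_TUPLE_* helper names are made up for illustration.

#include <tuple>
#include <utility>

#define CAT_IMPL(a, b) a##b
#define CAT(a, b)      CAT_IMPL(a, b)  // extra indirection so COUNT_ARGS(...) expands before pasting

// Pick MAKE_TUPLE_2 or MAKE_TUPLE_3 depending on how many arguments were passed.
#define MAKE_TUPLE(...)       CAT(MAKE_TUPLE_, COUNT_ARGS(__VA_ARGS__))(__VA_ARGS__)
#define MAKE_TUPLE_2(a, b)    std::make_pair(a, b)
#define MAKE_TUPLE_3(a, b, c) std::make_tuple(a, b, c)

auto p = MAKE_TUPLE(1, 2);    // std::pair<int, int>
auto t = MAKE_TUPLE(1, 2, 3); // std::tuple<int, int, int>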
Needless to say, that's clever! Now let's proceed with template meta-programming.

The Pigeonhole Principle in C++ Template Meta-programming

Check out the following example. Also available here.
int main()
{
 auto x = ::nth<7>(0,"1",'2',3,"4",'5',6,"7");
 std::cerr << x << std::endl;
}
The goal is to access the N-th element (counting from zero) in a variadic function argument list. The output of the above program should be 7.

There are many ways to go about implementing it, most using recursion of some sort. However, there is one implementation I came across, which I find particularly interesting. Why? You guessed it ... It uses the Pigeonhole Principle to avoid recursion.

Originally, the code below was posted by Roland Bock on the boost developers mailing list. If you prefer more comments, please see the same example with comments by LJEvans.
#include <utility>
#include <iostream>

namespace detail
{
  struct any { template<typename T> any(T &&) {} };

  template<typename T, typename U> struct first { typedef T type; };

  template<typename ...Ts>
  struct select_impl 
  {
    template<typename U, typename ...Vs>
    static U &&select(typename first<any, Ts>::type..., U &&u, Vs &&...) 
    {
      return static_cast<U&&>(u);
    }
  };

  template<std::size_t... Idx, typename... Ts>
  static auto select(const std::index_sequence<Idx...>&, Ts&&... ts)
  {
     return select_impl<decltype(Idx)...>::select(static_cast<Ts&&>(ts)...);
  }
}

template<std::size_t N, typename ...Ts>
auto nth(Ts &&...ts)
{
  return detail::select(std::make_index_sequence<N>(), static_cast<Ts&&>(ts)...);
}

int main()
{
 auto x = ::nth<7>(0,"1",'2',3,"4",'5',6,"7"); // prints 7
 std::cerr << x << std::endl;
}
Here is how the nth<7>(...) function works in the example above.
  1. N is 7 and Ts is a variadic parameter pack of integers, character strings, and plain characters.
  2. The std::make_index_sequence is a new addition in C++14 that produces an instance of std::index_sequence given a compile-time integral constant. Here, it produces std::index_sequence<0,1,2,3,4,5,6>.
  3. The formal arguments to the nth function (captured in the parameter pack ts) are forwarded to detail::select using a static_cast. This function must return the nth argument among the forwarded arguments.
  4. In detail::select, the Idx parameter pack represents the indices from 0 to 6. It is deduced by the compiler from the type of the index_sequence instance.
  5. The select_impl class template is instantiated with the decltype of each member of the Idx parameter pack. decltype(Idx)... expands into a list of types, one per member of Idx; here it is just std::size_t repeated 7 times. The remaining arguments to select_impl::select are forwarded as before.
  6. select_impl::select has access to the Ts parameter pack, which sits at the class-template level. Recall that it is std::size_t repeated 7 times, and only its length matters. The list of formal parameters of select_impl::select is broken into 3 parts: a run of leading parameters at the beginning (one per member of Ts, i.e., N of them), U&& in the middle, and everything else in Vs.
  7. Those first N arguments to select_impl::select are "absorbed" using the detail::any class. detail::any has a single-argument constructor template that converts an argument of any type to any. In our example, the arguments at indices 0 through 6 are all converted to any. The conversion is set up by the in-place parameter pack expansion 'typename first<any, Ts>::type...': for every member of the Ts parameter pack, the 'first' meta-function is applied, and it yields 'any' every time.
  8. With the first N arguments out of the way, U&& necessarily binds to the argument at index N, i.e., the N-th argument counting from zero. This is where the Pigeonhole Principle springs back into action.
  9. The remaining arguments after the N-th (if any) are left unused in the Vs parameter pack.
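To see the principle at work one more time, here are a couple of extra compile-time checks (my own additions; they assume the nth() template shown above is in scope):

#include <type_traits>

void nth_sanity_checks()
{
  // The leading 'any' parameters absorb every argument before index N, so U
  // (and therefore the deduced return type of nth) is exactly the N-th argument's type.
  static_assert(std::is_same<decltype(::nth<0>(42, "hi", 'c')), int>::value,  "index 0 is the int");
  static_assert(std::is_same<decltype(::nth<2>(42, "hi", 'c')), char>::value, "index 2 is the char");
}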

So, there it is: returning the N-th argument in an argument list without using recursion. In practice, however, std::make_index_sequence is typically implemented using recursion. So, the above code is not truly recursion-free.

OK ... So you read it all! I'm sure you found the use of the Pigeonhole Principle in processing variadics in C++ very interesting.

Wednesday, May 07, 2014

A tale of noexcept swap for user-defined classes in C++11

C++11 has taken another stab at function exception specifications for the masses. The shiny new ‘noexcept’ keyword is phasing out its zombie cousin ‘throw’. The well-accepted guideline for the old exception specification is: "don’t use them!" That’s good for us, because now we don’t have to bother (the reasons for not using throw specifications aren’t for the faint-hearted anyway). The noexcept feature does not appear to be a walk in the park either. So hold on tight!

noexcept is meta-programmable! a.k.a. conditional noexcept. It is possible to specify conditionally whether a function may throw. The noexcept specification of functions can be inspected at compile-time, and functions can derive their own noexcept specification from exception specifications found elsewhere in the program. Meta-programs are not off-limits here. An excellent introduction to the noexcept feature and its history can be found in the June’11 Overload Journal and on Andrzej's C++ blog. I won’t repeat that here, but a quick motivation is in order.

Compile-time knowledge about a move-constructor that it does not throw (i.e., annotated with noexcept(true)) can be used for improved run-time performance. For example, std::vector<T>::resize, std::vector<T>::reserve automatically use T’s move-constructor instead of a copy-constructor if T's move-constructor does not throw. Moving of T objects instead of copying would likely achieve higher performance. The standard library provides std::move_if_noexcept function that does move only if it is guaranteed to not fail. Otherwise, it simply resorts to copy-construction. More on this can be found here. I'll return to the topic of writing noexcept move-ctor towards the end of this post.
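As a rough illustration of what that buys you, here is a minimal sketch (my own, not the actual standard library code) of how a vector-like container could relocate its elements during reallocation:

#include <cstddef>
#include <new>
#include <utility>

template <class T>
void relocate(T* src, std::size_t n, T* dst)
{
  for (std::size_t i = 0; i < n; ++i)
  {
    // Moves when T's move-constructor is noexcept (or T is not copyable);
    // otherwise copies, so that a throw cannot corrupt the old buffer.
    ::new (static_cast<void*>(dst + i)) T(std::move_if_noexcept(src[i]));
  }
}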

But the post is about swapping, so let's talk about that. Here is how std::pair::swap is declared in C++11.

void pair::swap(pair& p)
  noexcept(noexcept(swap(first, p.first)) &&
           noexcept(swap(second, p.second)));


It basically says that pair's swap will throw if swapping either the first or the second member throws. The swapping of first and second possibly resolves to their own namespace-level swap via ADL, or else simply picks up std::swap (because pair::swap is declared in the std namespace). This declaration seems to indicate a few things.

  1. For a user-defined type it is now much more preferable to have its own namespace-level swap overload instead of relying on the vanilla std::swap. The namespace-level swap function, being an extension of the type’s interface, is vastly more efficient and can often be written in a way that does not throw. The std::swap function, on the other hand, may throw because it uses a copy-constructor and copy-assignment operator while swapping two objects. If you are like me, your member copy-assignment operator will use the copy-and-swap idiom for strong exception safety. That means std::swap will create and destroy 3 copies. What a waste! Moreover, that increases the chance of throwing. Bottom line: if T has a member swap, a swap function should be added in the same namespace as T.

  2. There are some assumptions buried in the above reasoning: (1) a move-constructor is not available for the user-defined T, and (2) it is possible to implement the member swap of T so that it does not throw, and you can actually say so.

  3. As far as the move-constructor being unavailable is concerned, I think there are a large number of C++03 libraries that will acquire move semantics more slowly than you might desire. Till then, std::swap will, unfortunately, use the copy-ctor and copy-assign. A legacy class with a user-declared copy-ctor won’t get an implicitly generated move-ctor (and if one were generated anyway, it would likely do the wrong thing).

  4. The noexcept specification of the user-defined swap had better be accurate. Can we really write a nothrow member swap and say so confidently? Obviously, it depends on the members of the user-defined type that we’re swapping. To me, it’s all uphill from here.


Let's look at how the standard library uses noexcept for some of its own classes. We already saw std::pair before. How about std::tuple, which is a generalization of a struct; perhaps that's why user-defined classes should follow the same basic principle. Tuple’s member swap is declared as

void tuple::swap(tuple& rhs) noexcept(see below)
// The expression inside noexcept is equivalent to the
// logical AND of the following expressions:

noexcept(swap(declval<Ti&>(), declval<Ti&>()))
// where Ti is the i-th type in the variadic list of types.


Tuple’s member swap inspects the noexcept specifications of its members’ swap functions and derives its own specification from that. That makes sense: if swapping any of the members throws, the tuple’s swap throws. The declval function belongs to the same clique as std::move, std::forward, etc. It is a way to obtain an rvalue reference to a type without using a constructor. It suffices to say that declval<T> returns "T&&" and declval<T&> returns "T& &&", which collapses to "T&".
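A tiny illustration of that last point (my own addition, not from the standard text):

#include <type_traits>
#include <utility>

static_assert(std::is_same<decltype(std::declval<int>()),  int&&>::value, "declval<T> yields T&&");
static_assert(std::is_same<decltype(std::declval<int&>()), int&>::value,  "declval<T&> collapses to T&");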

Consider a typical C++03 user-defined legacy class that manages its own resources.

namespace L 
{
struct legacy 
{
  legacy();                             // Does not throw
  legacy(const legacy &);               // This may throw
  legacy & operator = (const legacy &); // This may throw (strong exception safe)
  ~legacy() throw();                    // Does not throw
  void swap(legacy &) throw();          // Does not throw
};
} // namespace L


The above legacy class is well-designed except for the fact that it does not have any namespace-level swap. To make it a better citizen in C++11, it needs a namespace-level swap. Interestingly enough, there is substantial code out there that has a specialization of the std::swap in the std namespace. Considering the popularity of this anti-pattern, it probably makes sense to add a swap in two places. It is not too much work in the first place and that may help people who always use a qualified swap (std::swap) habitually.

namespace L 
{
  void swap(legacy &, legacy &) noexcept;
}

namespace std 
{
  template <>
  void swap(L::legacy &, L::legacy &) noexcept;
}


Let’s try to use this modernized legacy class in our C++11 class: Modern.

namespace M 
{
struct Modern 
{
  int count;
  std::string str;
  std::array<L::legacy, 5> data;
  /// default-ctor, copy-ctor, copy-assign, move-ctor, move-assign are defined = default.
  void swap(Modern & p) 
    noexcept(noexcept(swap(std::declval<int &>(),
                           std::declval<int &>())) &&
             noexcept(swap(std::declval<std::string &>(),
                           std::declval<std::string &>())) &&
             noexcept(swap(std::declval<std::array<L::legacy, 5> &>(),
                           std::declval<std::array<L::legacy, 5> &>())));
};
} // namespace M


That’s one awful looking swap. My eyes bleed when I look at it.

It is doing exactly the same thing as std::tuple. It simply checks whether swapping two integers, two strings, or two arrays can throw. If any of them can, the member swap of Modern throws. You can skip the check for integer swapping, but what's left is not pretty at all.

Note: The text in italics below is not accurate anymore as noted in the comment below. C++11 std::swap uses move-construction and move-assignment internally. As a result, std::swap in C++11 is generally very efficient IF the move-operations are defined and they are really more efficient than their copy counterparts. In such cases, there is little need to write a custom namespace-level swap. However, member swap may still be useful. See the "default-construct-and-swap" technique for implementing a move-constructor below.
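For reference, C++11's generic std::swap is essentially the following (a simplified sketch, not the actual library source):

#include <type_traits>
#include <utility>

template <class T>
void swap_sketch(T& a, T& b)
  noexcept(std::is_nothrow_move_constructible<T>::value &&
           std::is_nothrow_move_assignable<T>::value)
{
  T tmp(std::move(a)); // move, not copy
  a = std::move(b);
  b = std::move(tmp);
}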

Couple of important things to note here:
  1. Because we added a namespace-level swap for the L::legacy class, we dodged several problems. If we did not have the free swap, the compiler would instantiate std::swap if it is in scope. Note, however, that the vanilla std::swap may throw, defeating the purpose of Modern::swap. If no swap is found at all, the compiler issues an error.

  2. We hope that the swaps of std::string and std::array do not throw. As mentioned in [container.requirements.general], std::string::swap does not throw, but std::array<T>'s swap may throw if an unqualified call to swap(T &, T &) throws. Once again, the namespace-level swap will be chosen here, if available. If that does not exist, a specialization of std::swap will be chosen. If none of them exist, std::swap will be instantiated, giving our Modern::swap a noexcept(false) specification.

I like my strong exception safety in my little copy-assignment operator, so I don’t want my swap to throw; nor do I want my program to std::terminate() because an exception escaped a noexcept swap. With all that in mind, I would rewrite the swap as follows.

void Modern::swap(Modern &) noexcept 
{
  using std::swap; // bring std::swap into scope so the unqualified calls below consider it (and ADL)
  static_assert(noexcept(swap(std::declval<int &>(),
                              std::declval<int &>())) &&
                noexcept(swap(std::declval<std::string &>(),
                              std::declval<std::string &>())) &&
                noexcept(swap(std::declval<std::array<L::legacy, 5> &>(),
                              std::declval<std::array<L::legacy, 5> &>())), 
                "throwing swap");
  //.... remaining swap code
}


This is not unlike what’s proposed by others, but there’s more. The static_assert looks awful and redundant. There is already a standard library utility that does the same check: std::tuple. As mentioned before, std::tuple’s swap throws if any member swap throws. We use it here to make our job a lot easier.

void Modern::swap(Modern & that) noexcept 
{
  typedef std::tuple<int, std::string, std::array <L::legacy, 5> > MemberTuple;
  static MemberTuple *t = 0;
  static_assert(noexcept(t->swap(*t)), "throwing swap");
  // .... remaining swap code
}


If you are an enthusiastically paranoid C++ programmer, you may want to use a conditional noexcept with a little meta-function (is_nothrow_swappable_all) to save typing.

template<typename... T>
struct is_nothrow_swappable_all
{
  // Declaration only; t is used solely in the unevaluated noexcept operand below,
  // so it never needs a definition (an in-class initializer is not allowed here).
  static std::tuple<T...> *t;
  enum { value = noexcept(t->swap(*t)) };
};

void Modern::swap(Modern & that) 
   noexcept(is_nothrow_swappable_all<int, std::string, std::array<L::legacy, 5> >::value);


The "remaining swap code" section above is also rather important. The recommended (by Dave and Howard) way to write it is to use unqualified swap but bring std::swap in the scope:

void Modern::swap(Modern & that) 
  noexcept(is_nothrow_swappable_all<int, std::string, std::array<L::legacy, 5> >::value)
{
  using std::swap;
  swap(this->count, that.count);
  swap(this->str, that.str);
  swap(this->data, that.data);
}


The appropriate swap (namespace-level via ADL or std::swap) will be chosen depending upon availability. That’s exactly what tuple’s swap noexcept checker is going to do for us.

Back to the move-constructor, as promised! As mentioned before, the noexcept specification of a move-ctor may have performance implications, so you want to get it right. For the M::Modern class we could have defaulted the move-ctor (= default). That would give it a noexcept(false) specification automatically, because internally it uses L::legacy’s copy constructor, which may throw. As a consequence, std::vector::resize is not as fast as it could be. We can do better.

Implementing a move-ctor using the default-construct-and-swap idiom turns out to be quite handy. Default-construct-and-swap does what it says: it first delegates to a noexcept default constructor and then swaps. To implement a noexcept move-ctor this way, you really need to make sure that swap does not throw. As always, you can rely on a conditional noexcept. I'm using the std::is_nothrow_constructible type trait to test the obvious!

Modern::Modern(Modern && m) 
  noexcept(std::is_nothrow_constructible<Modern>::value &&
           noexcept(m.swap(m)))
  : Modern()           
{
  swap(m);
}


As long as Modern's default constructor and member swap are noexcept, the move-ctor is noexcept.
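If you want to double-check that the end result is what you hoped for, a one-line compile-time test does it (my addition; it only passes if Modern's default constructor and member swap really are noexcept, as discussed above):

#include <type_traits>

static_assert(std::is_nothrow_move_constructible<M::Modern>::value,
              "Modern's move-constructor should be noexcept");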

Finally, the move-assign operator can be simply implemented as a single swap operation but for some classes that may not be accurate.

Modern & Modern::operator = (Modern && m) noexcept
{
  swap(m);
  return *this;
}


Acknowledgements: I want to thank Howard Hinnant (former Library Working Group chair) for reviewing this article and providing me valuable feedback. Any inaccuracies and mistakes left in the article are strictly mine.

Sunday, May 04, 2014

Fun with Lambdas: C++14 Style (part 2)


Look at some interesting examples of C++11/14 lambdas and how they interact with other language features and libraries. I hope to find some time to add some explanations. See part 1 if you missed it.

  • Associative containers and lambdas
    std::set<int, std::function<bool(int, int)>> 
      numbers([](int i, int j) { return i < j; });
    
  • Recursive Lambdas (see Creating recursive lambdas and returning them too!)
    auto make_fibo() 
    {
      return [](int n) {
        std::function<int(int)> recurse;
        recurse = [&](int n){ 
           return (n<=2)? 1 : recurse(n-1) + recurse(n-2); 
        }; 
        return recurse(n);
      };
    }
    
  • Composable list manipulation (e.g., cpplinq, narl, LEESA)
    Box boxes[] = { ... };
    int sum_of_weights = 
         cpplinq::from_array(boxes)
      >> where([](const Box & box) { 
           return box.color == Color::RED;
         })
      >> select([](const Box & box) {
           return box.get_weight();
         })
      >> sum();
    
    
  • Overloaded Lambdas
    template <class... F>
    struct overload : F... {
      overload(F... f) : F(f)... {}
      using F::operator()...;  // needed to make the call unambiguous (C++17 syntax;
                               // in C++14 you would use recursive inheritance instead)
    };
    
    template <class... F>
    auto make_overload(F... f) {
      return overload<F...>(f...);   
    }
    
    auto f = 
        make_overload([](int i) { /* print */ },
                      [](double d) { /* print */ });
    f(10); // int 
    f(9.99); // double
    
  • Type Switch (simple pattern matching) (see type_switch.cpp and this paper)
    struct Base { 
      virtual ~Base() {} 
    };
    struct Derived : Base {};
    
    template <class Poly>
    void test(Poly& p) {  
      match(p)(
        [](int i)             { cout << "int";       },
        [](std::string &)     { cout << "string";    },
        [](Derived &)         { cout << "Derived";   },     
        [](Base &)            { cout << "Base";      },    
        otherwise([](auto x)  { cout << "Otherwise"; })
      );  
    }
    Derived d;
    Base &b = d;
    std::string cpptruths = "C++ Truths";
    boost::any something = cpptruths;
    
    test(10);        // int
    test(cpptruths); // string
    test(something); // string
    test(b);         // Derived
    test(9.99);      // Otherwise
    
  • Converting shared_ptr between boost and std (see StackOverflow)
    template <typename T>
    boost::shared_ptr<T> 
    make_shared_ptr(std::shared_ptr<T> ptr) 
    {      
      return boost::shared_ptr<T>(ptr.get(), 
        [ptr](T*) mutable { ptr.reset(); });
    }
    
    template <typename T>
    std::shared_ptr<T> 
    make_shared_ptr(boost::shared_ptr<T> ptr)
    {      
      return std::shared_ptr<T>(ptr.get(), 
        [ptr](T*) mutable { ptr.reset(); });
    }
    
  • In-place parameter pack expansion 
    template <class... T>
    void foreach(T... args) 
    {  
      bool b[] = { [=](){ 
        std::cout << args << "\n"; 
        return true; 
      }()... }; 
    }
    
    foreach(10, 20.2, true);
    
  • Memoization (see original)
    template <typename ReturnType, 
              typename... Args>
    auto memoize(ReturnType (*func)(Args...))
    {
        std::map<std::tuple<Args...>, ReturnType> cache;
    
        return ([=](Args... args) mutable  
        {
            std::tuple<Args...> t(args...);
            if (cache.find(t) == cache.end())                
            {
              std::cout << "not found\n";
              cache[t] = func(args...);
            }
            return cache[t];
        });
    }
    
  • Finally, slides




Monday, March 24, 2014

Why we need compile-time reflection in C++1y

Programs need data. That's a no-brainer. Programs are only as good as the data you provide them. Based on what kind of data is consumed, programs can be divided into two broad categories: (1) those that operate on regular data (a file), and (2) those that operate on other programs. The first kind of program is abundant. Your browser, for instance, is showing you this page--its data. The second kind is more interesting; such programs are called meta-programs.

Meta-programs need data too. As with other programs, meta-programs are only as good as the data you provide them. So what do we feed them? ... Well, in C++, more important than 'what' is 'when'. (Remember Morpheus?) A C++ program is just a sequence of bits the compiler is trying to understand. So, while the compiler is trying to make sense of your program, most of it gets translated (to assembly) but some of it gets executed. Quite intriguing! We're talking about compile-time meta-programming.

Coming back to the 'what': we want to be able to feed in whatever is available at compile-time: types, members, functions, arguments, namespaces, line numbers, file names; all are fair game. Less obvious things are relationships among types: convertibility, parent/child, base/derived, container/iterator, friends, and more.

A C++ compiler already has this information but it is not in a form a meta-program can use. So we're in a soup, where we can run programs (at compile-time) but there is no data! So the next question is 'how' do we make the data available to our meta-programs? And that brings me to what I like to call the Curiously Recurring Template Meta-Programming  (CRTMP) pattern.

Curiously Recurring Template Meta-Programming Pattern

The idea is rather general and many have done it successfully before: Make data available to meta-programs without offending the compiler and do something interesting with it.

Let's look at the subjects (players) in this pattern: (1) the compiler, (2) the meta-program, and, last but not least, (3) the programmer, because machines haven't taken over yet and humans still write most programs as of today.

The compile-time data must make sense to all three above. Today, C++ programmers, because we don't mind pain, create that data in a form that is understood by the former two. The prime examples are the traits idiom, the type_traits library, and sometimes code generators that parse C++ files and spit out relationships between classes.  For example, LEESA's gen-meta.py script generates typelists (Boost MPL vectors) for classes that contain other classes (think XML data-binding). Effectively it builds a compile-time tree of the XML node types.

When things are not auto-generated, we make them palatable to fellow programmers using macros. To many, macros are as obnoxious as the data they hide/generate, but let's move on. There are many examples of super-charged macros too: Boost SIMD, pre-variadic Boost MPL, smart enumerations, and many more. When macros are used in a clever way (abused!) they really do look like magic. I got first-hand experience of that while developing the RefleX library.

RefleX is a compile-time, reflection-based type modeling library in C++ for DDS Topics. It is open-source, but you need RTI Connext DDS to play with it. It essentially transforms your native C/C++ type into a serializable type representation called a TypeObject and marshals your data into what is called a DynamicData object. Note that both type and data are serialized. There are systems--perhaps many we owe our modern life to--that need to distribute types and data over the network for discovery, interoperability, compatibility, and other reasons.

Here's an example:


The RTI_ADAPT_STRUCT macro expands to about 120 lines of C++ code, which is primarily reflection information about ShapeType and it can be used at compile-time. It is based on the BOOST_FUSION_ADAPT_STRUCT macro. The macro opens the guts of the specified type to the RefleX library. The meta-programs in RefleX use this "data" to do their business. The reflection information includes member types, member names, enumerations, and other ornaments such as a "key". The point is that the same CRTMP pattern is used to "export" information about a native C++ type.
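To get a feel for what such adapt-struct macros look like, here is the Boost.Fusion analogue that RTI_ADAPT_STRUCT is modeled on; the ShapeType members below are my guess at the canonical DDS example, and RefleX's actual macro takes a similar but not identical argument list.

#include <string>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/fusion/include/at_c.hpp>

struct ShapeType
{
  std::string color;
  int x;
  int y;
  int shapesize;
};

// After this, Boost.Fusion (and any meta-program built on it) can walk over
// ShapeType's members as a compile-time sequence.
BOOST_FUSION_ADAPT_STRUCT(
    ShapeType,
    (std::string, color)
    (int, x)
    (int, y)
    (int, shapesize))

int get_x(const ShapeType & s) { return boost::fusion::at_c<1>(s); }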

So, the last two open-source C++ libraries I wrote use the CRTMP pattern: In one, "data" is generated using a Python script and in the other using a macro. CRTMP makes C++ libraries remarkably powerful. The reality is there is nothing novel about it. It is seen everywhere.

The natural step in the evolution of an idiom/pattern is first-class language support. If something is so prevalent, the language itself should absorb it and eliminate the crud involved in developing and writing CRTMP-based libraries.

That brings us to the main point of this post: compile-time reflection. We need it. Period. It's a natural step of evolution from where C++ is now. When available, it will make a vast amount of compile-time data available to C++ meta-programs. They will run faster, look nicer, and they will knock your socks off! It is mind-boggling what has been achieved using template and preprocessor meta-programming. Compile-time reflection will push it two notches up. So stay tuned for C++1y.

Wednesday, March 12, 2014

Fun with Lambdas: C++14 Style (part 1)

It's common knowledge that Functional Programming is spreading like wildfire in mainstream languages. The latest converts: Java 8 and C++, both of which now support lambdas. So, let the lambdas begin! And may the fun be ever on your side. The same text is available in slide form on Slideshare. This blog post and the talk/slides are inspired by JSON inventor Douglas Crockford.

Write an Identity function that takes an argument and returns the same argument.
auto Identity = [](auto x) {
  return x;
};
Identity(3); // 3
Write 3 functions add, sub, and mul that take 2 parameters each and return their sum, difference, and product respectively.
auto add = [](auto x, auto y) {
  return x + y;
}; 
auto sub = [](auto x, auto y) {
  return x - y;
};
int mul (int x, int y) {
  return x * y;
};

Write a function, identityf, that takes an argument and returns an inner class object that returns that argument.
auto identityf = [](auto x) {
  class Inner {
    int x;
    public: Inner(int i): x(i) {}
    int operator() () { return x; }
  };
  return Inner(x);
};
identityf(5)(); // 5

Write a function, identityf, that takes an argument and returns a function that returns that argument.
auto identityf = [](auto x) {
  return [=]() { return x; };
};
identityf(5)(); // 5

Lambda != Closure
  • A lambda is just a syntax sugar to define anonymous functions and function objects. 
  • A closure in C++ is a function object which closes over the environment in which it was created. The lambda on line #2 above, when evaluated, produces a closure that closes over x.
  • A lambda is a syntactic construct (expression), and a closure is a run-time object, an instance of a closure type. 
  • C++ closures do not extend the lifetime of their context. (If you need this use shared_ptr)
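A tiny illustration of the distinction (my own addition):

#include <type_traits>

void lambda_vs_closure()
{
  int x = 42;
  auto c1 = [x]{ return x; }; // evaluating a lambda expression creates a closure object
  auto c2 = [x]{ return x; }; // textually identical, but a different lambda expression...
  static_assert(!std::is_same<decltype(c1), decltype(c2)>::value,
                "...so c1 and c2 have distinct closure types");
}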
Write a function that produces a function that returns values in a range.
auto fromto = [](auto start, auto finish) {    
  return [=]() mutable {      
    if(start < finish)        
      return start++;      
    else        
      throw std::runtime_error("Complete");    
  };  
};
auto range = fromto(0, 10); 
range(); // 0
range(); // 1
Write a function that adds from two invocations.
auto addf = [](auto x) {
  return [=](auto y) { 
    return x+y; 
  };
};
addf(5)(4); // 9
Write a function swap that swaps the arguments of a binary function.
auto swap =[](auto binary) {
  return [=](auto x, auto y) {
    return binary(y, x);
  };
};
swap(sub)(3, 2); // -1
Write a function twice that takes a binary function and returns a unary function that passes its argument to the binary function twice.
auto twice =[](auto binary) {
  return [=](auto x) {
    return binary(x, x);
  };
};
twice(add)(11); // 22
Write a function that takes a binary function and makes it callable with two invocations.
auto applyf = [](auto binary) {
  return [=](auto x) { 
    return [=](auto y) {
      return binary(x, y); 
    };
  };
};
applyf(mul)(3)(4); // 12
Write a function that takes a function and an argument and returns a function that takes the second argument and applies the function.
auto curry = [](auto binary, auto x) {
  return [=](auto y) { 
    return binary(x, y);
  };
};
curry(mul, 3)(4); // 12
Currying (schönfinkeling)
  • Currying is the technique of transforming a function that takes multiple arguments in such a way that it can be called as a chain of functions, each with a single argument.
  • In lambda calculus functions take a single argument only.
  • Must know Currying to understand Haskell.
  • Currying != Partial function application
Partial Function Application
auto addFour = [](auto a, auto b, 
                  auto c, auto d) {
  return a+b+c+d;
};
auto partial = [](auto func, auto a, auto b) {
  return [=](auto c, auto d) { 
    return func(a, b, c, d);
  };
};
partial(addFour,1,2)(3,4); //10
Without creating a new function show 3 ways to create the inc function.
auto inc = curry(add, 1);
auto inc = addf(1);
auto inc = applyf(add)(1);
Write a function composeu that takes two unary functions and returns a unary function that calls them both.
auto composeu =[](auto f1, auto f2) {
  return [=](auto x) {
    return f2(f1(x));
  };
};
composeu(inc, curry(mul, 5))(3); // 20
Write a function that returns a function that allows a binary function to be called exactly once.
auto once = [](auto binary) {    
  bool done = false;    
  return [=](auto x, auto y) mutable {
    if(!done) {        
      done = true;        
      return binary(x, y);      
    }      
    else        
      throw std::runtime_error("once!");     
  };  
};
once(add)(3,4); // 7
Write a function that takes a binary function and returns a function that takes two arguments and a callback.
auto binaryc = [](auto binary) {    
  return [=](auto x, auto y, auto callbk) {
   return callbk(binary(x,y));    
  };  
};
binaryc(mul)(5, 6, inc); // 31
binaryc(mul)(5,6,[](int a) { return a+1; });
Write 3 functions:
  • unit – same as Identityf
  • stringify – that stringifies its argument and applies unit to it
  • bind – that takes a result of unit and returns a function that takes a callback and returns the result of callback applied to the result of unit.
auto unit = [](auto x) { 
  return [=]() { return x; };
};
auto stringify = [](auto x) {    
  std::stringstream ss;
  ss << x;
  return unit(ss.str());
};

auto bind = [](auto u) {    
  return [=](auto callback) {
   return callback(u());    
  };  
}; 
Then verify.
std::cout << "Left Identity "  
          << stringify(15)() 
          << "==" 
          << bind(unit(15))(stringify)()
          << std::endl;

std::cout << "Right Identity " 
          << stringify(5)() 
          << "=="
          << bind(stringify(5))(unit)()
          << std::endl;
Why are unit and bind special? Read more about them here.

Sunday, March 09, 2014

Fun with Lambdas: C++14 Style

I am presenting at the SF Bay Area Association of C/C++ Users (ACCU) meetup on Wed, Mar 12th. Topic: Fun with Lambdas: C++14 Style. Slides and the blog will be available here so stay tuned.


Wednesday, October 16, 2013

Creating Recursive Lambdas ... and returning them too!

Ever since C++ adopted lambda expressions, many have stumbled upon the question of whether they can be recursive. IMO, if an anonymous function needs to call itself, why not create a named function/functor in the first place? But I'm not here today to argue whether that is good or bad design. So here it is: the most common way to create a recursive lambda in C++11.
void test() 
{
  std::function<int(int)> fib = [&fib](int n)
  {
    return (n <= 2)? 1 : fib(n-1) + fib(n-2);
  };
}
The way it works is quite interesting. First, I created a std::function object, fib, and captured it in the lambda by reference. At the point of capture, fib is still empty (default-constructed) because nothing has been assigned to it yet. The compiler knows the type of fib, so it does not complain. It happily creates a closure holding a reference to the still-empty std::function. Right after that, the closure is assigned to the fib object, so fib gets initialized. The reference inside the lambda therefore refers to a fully initialized std::function, and the lambda simply uses it to call itself.

A problem with this approach is that the recursive lambda cannot be returned. Why? When the function ends, so does the fib object, and consequently the reference inside the closure becomes dangling. Capture by value is also futile, because in that case the closure object ends up with a copy of the still-empty std::function, which throws std::bad_function_call upon invocation.
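To make the problem concrete, here is what the broken version would look like (my own illustration):

#include <functional>

std::function<int(int)> broken_create()
{
  std::function<int(int)> fib = [&fib](int n)
  {
    return (n <= 2) ? 1 : fib(n-1) + fib(n-2);
  };
  return fib; // the returned copy still holds a reference to the local fib;
              // calling it after broken_create() returns is undefined behavior
}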

Is there a way out? As it turns out, there is!

UPDATE: Within hours of uploading the original post, I came to understand a sleeker way to create and return recursive lambdas. Thanks to the generous people out there on reddit/r/cpp.
std::function<int(int)> create()
{
  int foo = 20;
  std::function<int(int)> f = 
    [=](int n) mutable {
         std::function<int(int)> recurse;
         recurse = [&](int n) { 
            foo = 10;
            return (n<=2)? 1 : recurse(n-1) + recurse(n-2); 
         };  
         return recurse(n);
       };  
  std::cout << f(6) << ", foo=" << foo << "\n"; // prints "8, foo=20"
  return f;
}
The technique involves creating nested lambdas. You don't have to return the inner recursive lambda (which becomes somewhat clunky, as described later in this post). Instead, you return the outer lambda, which, when invoked, creates a recursive lambda using the same technique described at the beginning. The outer lambda also invokes the inner one and returns the result, because the inner lambda does not live beyond the call.

Capturing the right state at the right place may become important if your recursive lambda is stateful. The outer lambda (copied into f) captures its state by value; i.e., it captures foo by value. The inner lambda captures its state by reference. However, the foo it refers to is not the local foo variable in function create. Instead, it refers to the copy captured by the outer lambda. So when you modify foo inside the inner lambda, you modify the copy inside the outer lambda, and the local foo remains unchanged. Further, to allow the modification, the outer lambda must be mutable.

This behavior does not appear surprising at all when you consider the "context" in which the inner lambda runs. The create function might have returned long before the inner lambda ever gets a chance to execute. The compiler is smart enough to figure that out, and so the context of the capture for the inner lambda is limited to the state captured by the outer lambda.

foo is referenced inside the inner lambda only. But syntactically it lives inside the outer lambda too. And therefore, it must be captured by the outer lambda. This allows clear separation of who manages the state and who manages recursion.

Of course, there are many other combinations of capturing state for the inner and outer lambdas. Those are left to the reader as an exercise!

Finally, returning the std::function object will make a copy of the outer lambda and in turn the state captured by it. In most cases this is desirable. But if it is not, you could create a shared_ptr to the std::function object and use that to pass the stateful recursive lambda around.

std::shared_ptr<std::function<int(int)>> is not callable, however. I mean, you have to dereference it before you can call it like a function. So it cannot be passed to STL algorithms. A wrapper may be what you need to take care of that.
template <class T>
class func_wrapper // version 3
{
  std::shared_ptr<std::function<T>> func;

public:

  func_wrapper(std::function<T> f)
    : func(std::make_shared<std::function<T>>(std::move(f)))
  { }

  template <class... Args>
  auto operator () (Args&&... args) const 
    -> decltype((*func)(std::forward<Args>(args)...)) 
  {
    return (*func)(std::forward<Args>(args)...);
  }
};
Continue reading if you are interested in avoiding nested lambdas.

WARNING: Hacks ahead!
The idea is something like this: if the std::function is captured by value, we get two copies. One is uninitialized and the other is initialized. If somehow, during the initialization of the std::function object, we could go back and modify the copy inside the closure, maybe we would get lucky.

So here is how I have to rewrite the above recursive lambda. Because I also intend to return it, I capture the variable by value.
const func_wrapper<int(int)> create()
{
  func_wrapper<int(int)> fib;
  fib = [fib](int n)
  {
    return (n <= 2)? 1 : fib(n-1) + fib(n-2);
  };
  return fib;
}

The func_wrapper class does a bunch of interesting things and most of my discussion is going to center around it. Note, however, that I've separated the default initialized fib object from the assignment. It is very important to do it that way.

The reason behind separating the default construction from assignment is that copy-constructor and copy-assignment operators are called separately for the fib object. First, the fib object is default-initialized so we have something "decent" to work with. When the fib object is captured by-value, its copy-constructor is called as part of the usual capture semantics. The copy constructor is the only chance we get to "interact" with the func_wrapper inside the closure object. Later on, copy-assignment operator is called on fib with closure object as the right hand side.

These two functions give us just the right opportunity to set up some references inside func_wrapper to change the state of the captured func_wrapper. So let's look at the func_wrapper class without further ado.
template <class T>
class func_wrapper
{
  std::shared_ptr<std::function<T>> func;
  func_wrapper *captured;

public:

  func_wrapper() 
    : captured(nullptr)
  {}

  func_wrapper(const func_wrapper &) = default;
  func_wrapper(func_wrapper &&) = default;

  // non-const copy-ctor
  func_wrapper(func_wrapper & f)
    : func(f.func),
      captured(nullptr)
  {
    f.captured = this;
  }

  template <class U> 
  const func_wrapper & operator = (U&& closure)
  {
    func = std::make_shared<std::function<T>>();
    if(captured)
      captured->func = func;

    (*func) = std::forward<U>(closure);
    return *this;
  }

  template <class... Args>
  auto operator () (Args&&... args) const 
    -> decltype((*func)(std::forward<Args>(args)...)) 
  {
    return (*func)(std::forward<Args>(args)...);
  }
};

The func_wrapper class is small but quite interesting (at least, I think so!). The role of func_wrapper is to behave just like std::function but the main difference is that func_wrapper uses a shared_ptr to a std::function. This way, multiple func_wrapper objects can share the same std::function object. You know where I am going with this, right?!

The default constructor, copy-constructor (const version), and the move-constructor of func_wrapper are very typical. There is also a non-const copy-constructor: "func_wrapper(func_wrapper &)". This constructor is called only when the right-hand-side object is a non-const lvalue. I.e., this is the constructor that gets invoked when the fib object is captured by-value in the lambda.

What's unusual about this constructor is that it modifies the right-hand-side lvalue. That is allowed because the parameter is non-const. We store the this pointer of the nameless func_wrapper (inside the closure) into the named fib object (outside the closure). I.e., the named fib object now "links" to the nameless object inside the closure. The func_wrapper inside the closure links to NULL. You can think of it as a little linked list of func_wrapper objects. The shared_ptrs in both objects are just default-initialized at this stage.

Right after the copy construction of the captured variables, the compiler creates a closure object and assigns it to the named func_wrapper object ("fib"). The template assignment operator (with a universal reference) of func_wrapper is invoked. The first thing we do is allocate an empty std::function object using make_shared. This initializes the func shared pointer in the named func_wrapper ("fib" again). We cannot use the lambda closure object to create the std::function just yet, because the std::function constructor is going to make an internal copy of the closure. Our captured pointer, however, points to the one inside the closure object, which is still on the stack. We have to modify the shared_ptr inside the closure object before passing it on to the std::function. That's what we do next.

We assign the (*this).func object to the captured func object only if something has been captured. As it turns out, we have captured a pointer to the func_wrapper inside the closure object. So we assign the shared_ptr to it.

At this stage, the reference count of the empty std::function object is 2: one reference inside the "fib" object and the other inside the closure object.

Now we are ready to assign the closure object to initialize the empty std::function. We use std::forward to pass on the closure object to the std::function assignment operator. Thus, std::function does not really make copies of the closure but instead moves it. The last step is to return *this.

At this stage, the lambda is ready to invoke itself recursively. The first call is made, of course, using the named func_wrapper.
int main(void)
{
  auto const fib = create();
  std::cout << fib(6); // prints 8
}
The func_wrapper has an overloaded function call operator, just like std::function. It simply uses std::forward to pass the arguments on to the closure held by the shared_ptr.

Note that the fib object in main is const. This ensures that no further modifications can be made to this recursive lambda wrapper. After all, this is the only reference we have to an anonymous recursive lambda closure. The fib object can be passed by value to other functions without issues, because the const version of the copy-constructor gets invoked in those cases. That constructor makes no modifications to the rhs lvalue.

So, are we done? Not quite, unfortunately! There's a memory leak!

Fixing the Memory Leak

Note that the shared_ptr inside the closure object points back to the std::function that holds that very closure. That's a cycle! Even though the named fib object goes out of scope, the reference count of the std::function never reaches zero because there is always at least one shared_ptr keeping it alive. Therefore, you end up leaking memory.

So we need std::weak_ptr. And here is an improved version of the func_wrapper that does not leak.
template <class T>
class func_wrapper // version 2
{
  std::shared_ptr<std::function<T>> func;
  std::weak_ptr<std::function<T>> weak_func;
  func_wrapper *captured;

public:

  func_wrapper() 
    : captured(nullptr)
  {}

  func_wrapper(const func_wrapper &) = default;
  func_wrapper(func_wrapper &&) = default;

  func_wrapper(func_wrapper & f)
    : func(f.func),
      weak_func(f.weak_func),
      captured(nullptr)
  {
    f.captured = this;
  }

  template <class U> 
  typename 
    std::enable_if<
      is_callable<typename std::remove_reference<U>::type>::value,
      const func_wrapper &>::type
  operator = (U&& closure)
  {
    weak_func = func = std::make_shared<std::function<T>>();
    if(captured)
       captured->weak_func = func;
    (*func) = std::forward<U>(closure);
    return *this;
  }

  template <class... Args>
  auto operator () (Args&&... args) const 
    -> decltype((*func)(std::forward<Args>(args)...)) 
  {
    if(func)
      return (*func)(std::forward<Args>(args)...);
    else
      return (*weak_func.lock())(std::forward<Args>(args)...);
  }
};


The overall concept is the same, except that we never initialize the shared_ptr inside the closure object. Instead we use a std::weak_ptr. std::weak_ptr does not increase the reference count. When it needs the object it refers to, it has to create a shared_ptr (by locking the weak_ptr) and dereference that one. If, in the meanwhile, the reference count has dropped to zero, the std::function and the closure are reclaimed as expected.

There are two ways to invoke the recursive closure from within the overloaded operator(). If there is a named wrapper ("fib"), its shared_ptr is active, so we use that one in the first condition of operator(). Within the closure, however, there is a "half-baked" func_wrapper that does not have a strong reference, only a weak one. So the second condition turns the weak reference into a strong one (by calling .lock) and forwards the arguments just as usual.

Additionally, I'm using std::enable_if in the template assignment operator to make sure that what you are assigning is indeed a callable object. is_callable is a non-standard trait I found here.

So there you have it. A reusable solution in C++11 for recursive lambdas that you can return and use just like any other object. BTW, if you know a compelling use-case for using recursive lambdas, I'm all ears!