Folds for Imperative Programmers

Coming from an imperative language such as Java/C/C# and moving to a functional language, you will almost certainly encounter ‘folding functions’. There are usually two folding functions (sometimes there are minor variants) — one folds to the right and the other folds to the left. In Haskell, these functions are named foldr and foldl.

This first encounter with folding functions can often be quite daunting, but I propose an alternative approach. In fact, these folding functions are almost certainly quite familiar to the imperative programmer — just without consciousness. If I had a dollar for every time I saw something like this (I’d buy every CS student a book on the lambda calculus :)):

  ...
  B accum = b;
  for(final A a : as) {
    accum = f.f(a, accum);
  }
  return accum;

Look familiar? Not so daunting? Great, because it’s a fold — specificially, a fold to the left. It is a fold to the left because the sequence (list, array, whatever) is iterated from the left/start to the right/end. Conversely, a fold right iterates from the right/end of the sequence to the left/start.

Let’s take a closer look at the ‘fold left’ above. The first line initialises an accumulator of type B to the given argument. Then the sequence (of A) is iterated and each element is passed to some function and its result assigned to the accumulator. Finally, the accumulated value is returned.

Have I just trivialised a seemingly complicated concept? Not at all! I have undermined its apparent complexity perhaps. On the contrary, it is the imperative language version that has complicated a trivial concept. Let’s take a look at a complete and compilable example:

class Foldr {
  static interface F<A, B> {
    B f(A a, B b);
  }

  static <A, B> B foldr(final F<A, B> f, final B b, final A[] as) {
    B accum = b;

    for(int i = as.length - 1; i >= 0; i--) {
      accum = f.f(as[i], accum);
    }

    return accum;
  }
}

class Foldl {
  static interface F<A, B> {
    A f(A a, B b);
  }

  static <A, B> A foldl(final F<A, B> f, final A a, final B[] bs) {
    A accum = a;

    for(final B b : bs) {
      accum = f.f(accum, b);
    }

    return accum;
  }
}

public class Folds {
  public static void main(final String[] args) {
    final Integer[] i = new Integer[]{7,8,9,42,11,13,45,54,45,46,64,74};

    final Integer right = Foldr.foldr(new Foldr.F<Integer, Integer>() {
      public Integer f(final Integer a, final Integer b) {
        return a - b;
      }
    }, 79, i);

    System.out.println(right);

    final Integer left = Foldl.foldl(new Foldl.F<Integer, Integer>() {
      public Integer f(final Integer a, final Integer b) {
        return a - b;
      }
    }, 97, i);

    System.out.println(left);
  }
}

Phew! What an effort — pass me a beer!

We notice in the main method that the types of both A and B are the same (Integer). This is purely consequential and these types may, and often do, differ. Also notice that our function in each case is not commutative. That is, f(a, b) is not necessarily equivalent to f(b, a). We know this because a - b is not necessarily equivalent to b - a. If the functions were commutative, then the result of a fold right would be equivalent to the result of a fold left. If you execute the code above, you will not get the same output for each fold. Try changing the implementation of each function to be commutative (like +) and observe the equivalent results.

Let’s take a look at the Haskell equivalent at an interpreter:

Prelude> let i = [7,8,9,42,11,13,45,54,45,46,64,74]
Prelude> foldr (-) 97 i
41
Prelude> foldl (-) 79 i
-321

Same outputs as the imperative code? Good, so that’s folds out of the way — easy peasey :)
By the way, yes the Haskell equivalent is supposed to be 100 times shorter — that is the nature of a relatively expressive programming language.

10 Responses to “Folds for Imperative Programmers”

  1. Jeremy Says:

    You’ve got it backwards. A “fold left” starts on the left and ends on the right. A “fold right” starts on the right and ends on the left.

    The difference matters especially with structures like functional cons lists. A fold left can be written tail-recursively; a fold right cannot (except by reversing the list and folding left).

  2. Ryan Says:

    I’m pretty sure I get the concept, but I don’t see why foldr (-) 97 i evaluates to 41. When I do it by hand (97 - 74 - 64 …), I’m getting -321. Am I seriously confused, or is it just a typo?

  3. Jaak Says:

    http://foldr.com/
    http://foldl.com/

    definitions of folds in haskell should be:

    foldl _ e [] = e
    foldl f e (x : xs) = foldl f (f e x) xs

    and

    foldr _ e [] = e
    foldr f e (x : xs) = f x (foldr f e xs)

    so “foldr (-) 97 i” is

    (7 - (8 - (9 - (… (74 - 97))))

    Yes. you got them backwards.

  4. Tony Morris Says:

    WOOPS!!

    Sorry to anyone for whom I have caused confusion. It is me who was confused. My sincerest apologies. Ryan, I have updated in an attempt to correct your confusion for which I accept full responsibility.

  5. Ryan Says:

    Ah, ok, I see now. Thanks Tony and Jaak!

  6. nobody Says:

    Another mistake: associativity is

    (a + b) + c = a + (b + c),

    not

    a + b = b + a,

    which is commutativity. (But associativity is the property that makes the two forms equivalent if they both terminate.)

  7. nobody Says:

    (Actually, the ‘n’ in ‘fold c n xs’ needs to commute with the elements of ‘xs’ too.)

  8. Tony Morris Says:

    Man, what a screw up!
    I will bathe in the consolation that any future mistakes of such a trivial magnitude will be picked up immediately :)
    Thanks to those who did and sorry to those who I have misled.

  9. sean Says:

    By the way, yes the Haskell equivalent is supposed to be 100 times shorter — that is the nature of a relatively expressive programming language.

    Nope. That’s the nature of implementing one language’s idioms in another. For an example going the other way, say we have a function
    char * echo(char * str) { return str; } // C
    or
    pre = id — haskell
    and it’s called all over our program. Now say we want to count the number of times it’s called. In C, we just do this:

    int echo_count = 0;
    char * echo(char * str) { ++echo_count; return str; } // C

    Global state is easy and idiomatic. In Haskell, we’re hosed — “proper hosed”. Time to rewrite our whole program.

  10. Tony Morris Says:

    Hi sean, the premise of your statement is not correct. You say, “Now say we want to count the number of times it’s called.”, but this is completely orthogonal to a software requirement, which is always based on a lambda abstraction. Rather than start talking about Church’s theories, I will point out that the only reason you ever care about “the number of times it’s called”, is because you’re using C - not producing any useful lambda abstraction. In other words, there are lots of things you cannot do with Haskell (this is not one of them), but it is great that you can’t because they have nothing to do with software (and usually, as in this case, everything to do with restrictions placed by the von Neumann machine (ick!)).

    Furthermore, the ability to perform destructive update does not disqualify the ability to perform what you are asking using Haskell if indeed you correct it to be a sensible expression of software - that is, you pass the “number of times it’s called” as an additional argument/return value. This can be achieved cleanly by using the State monad.

Leave a Reply