Folds for Imperative Programmers
Coming from an imperative language such as Java/C/C# and moving to a functional language, you will almost certainly encounter ‘folding functions’. There are usually two folding functions (sometimes there are minor variants) — one folds to the right and the other folds to the left. In Haskell, these functions are named foldr and foldl.
This first encounter with folding functions can often be quite daunting, but I propose an alternative approach. In fact, these folding functions are almost certainly quite familiar to the imperative programmer — just without consciousness. If I had a dollar for every time I saw something like this (I’d buy every CS student a book on the lambda calculus :)):
...
B accum = b;
for(final A a : as) {
accum = f.f(a, accum);
}
return accum;
Look familiar? Not so daunting? Great, because it’s a fold — specificially, a fold to the left. It is a fold to the left because the sequence (list, array, whatever) is iterated from the left/start to the right/end. Conversely, a fold right iterates from the right/end of the sequence to the left/start.
Let’s take a closer look at the ‘fold left’ above. The first line initialises an accumulator of type B to the given argument. Then the sequence (of A) is iterated and each element is passed to some function and its result assigned to the accumulator. Finally, the accumulated value is returned.
Have I just trivialised a seemingly complicated concept? Not at all! I have undermined its apparent complexity perhaps. On the contrary, it is the imperative language version that has complicated a trivial concept. Let’s take a look at a complete and compilable example:
class Foldr {
static interface F<A, B> {
B f(A a, B b);
}
static <A, B> B foldr(final F<A, B> f, final B b, final A[] as) {
B accum = b;
for(int i = as.length - 1; i >= 0; i--) {
accum = f.f(as[i], accum);
}
return accum;
}
}
class Foldl {
static interface F<A, B> {
A f(A a, B b);
}
static <A, B> A foldl(final F<A, B> f, final A a, final B[] bs) {
A accum = a;
for(final B b : bs) {
accum = f.f(accum, b);
}
return accum;
}
}
public class Folds {
public static void main(final String[] args) {
final Integer[] i = new Integer[]{7,8,9,42,11,13,45,54,45,46,64,74};
final Integer right = Foldr.foldr(new Foldr.F<Integer, Integer>() {
public Integer f(final Integer a, final Integer b) {
return a - b;
}
}, 79, i);
System.out.println(right);
final Integer left = Foldl.foldl(new Foldl.F<Integer, Integer>() {
public Integer f(final Integer a, final Integer b) {
return a - b;
}
}, 97, i);
System.out.println(left);
}
}
Phew! What an effort — pass me a beer!
We notice in the main method that the types of both A and B are the same (Integer). This is purely consequential and these types may, and often do, differ. Also notice that our function in each case is not commutative. That is, f(a, b) is not necessarily equivalent to f(b, a). We know this because a - b is not necessarily equivalent to b - a. If the functions were commutative, then the result of a fold right would be equivalent to the result of a fold left. If you execute the code above, you will not get the same output for each fold. Try changing the implementation of each function to be commutative (like +) and observe the equivalent results.
Let’s take a look at the Haskell equivalent at an interpreter:
Prelude> let i = [7,8,9,42,11,13,45,54,45,46,64,74] Prelude> foldr (-) 97 i 41 Prelude> foldl (-) 79 i -321
Same outputs as the imperative code? Good, so that’s folds out of the way — easy peasey ![]()
By the way, yes the Haskell equivalent is supposed to be 100 times shorter — that is the nature of a relatively expressive programming language.
January 6th, 2007 at 2:10 am
You’ve got it backwards. A “fold left” starts on the left and ends on the right. A “fold right” starts on the right and ends on the left.
The difference matters especially with structures like functional cons lists. A fold left can be written tail-recursively; a fold right cannot (except by reversing the list and folding left).
January 6th, 2007 at 4:50 am
I’m pretty sure I get the concept, but I don’t see why foldr (-) 97 i evaluates to 41. When I do it by hand (97 - 74 - 64 …), I’m getting -321. Am I seriously confused, or is it just a typo?
January 6th, 2007 at 5:19 am
http://foldr.com/
http://foldl.com/
definitions of folds in haskell should be:
foldl _ e [] = e
foldl f e (x : xs) = foldl f (f e x) xs
and
foldr _ e [] = e
foldr f e (x : xs) = f x (foldr f e xs)
so “foldr (-) 97 i” is
(7 - (8 - (9 - (… (74 - 97))))
Yes. you got them backwards.
January 6th, 2007 at 8:16 am
WOOPS!!
Sorry to anyone for whom I have caused confusion. It is me who was confused. My sincerest apologies. Ryan, I have updated in an attempt to correct your confusion for which I accept full responsibility.
January 6th, 2007 at 9:44 am
Ah, ok, I see now. Thanks Tony and Jaak!
January 7th, 2007 at 2:19 am
Another mistake: associativity is
(a + b) + c = a + (b + c),
not
a + b = b + a,
which is commutativity. (But associativity is the property that makes the two forms equivalent if they both terminate.)
January 8th, 2007 at 12:42 pm
(Actually, the ‘n’ in ‘fold c n xs’ needs to commute with the elements of ‘xs’ too.)
January 13th, 2007 at 6:58 am
Man, what a screw up!
I will bathe in the consolation that any future mistakes of such a trivial magnitude will be picked up immediately
Thanks to those who did and sorry to those who I have misled.
March 9th, 2007 at 9:20 am
Nope. That’s the nature of implementing one language’s idioms in another. For an example going the other way, say we have a function
char * echo(char * str) { return str; } // C
or
pre = id — haskell
and it’s called all over our program. Now say we want to count the number of times it’s called. In C, we just do this:
int echo_count = 0;
char * echo(char * str) { ++echo_count; return str; } // C
Global state is easy and idiomatic. In Haskell, we’re hosed — “proper hosed”. Time to rewrite our whole program.
March 10th, 2007 at 6:17 am
Hi sean, the premise of your statement is not correct. You say, “Now say we want to count the number of times it’s called.”, but this is completely orthogonal to a software requirement, which is always based on a lambda abstraction. Rather than start talking about Church’s theories, I will point out that the only reason you ever care about “the number of times it’s called”, is because you’re using C - not producing any useful lambda abstraction. In other words, there are lots of things you cannot do with Haskell (this is not one of them), but it is great that you can’t because they have nothing to do with software (and usually, as in this case, everything to do with restrictions placed by the von Neumann machine (ick!)).
Furthermore, the ability to perform destructive update does not disqualify the ability to perform what you are asking using Haskell if indeed you correct it to be a sensible expression of software - that is, you pass the “number of times it’s called” as an additional argument/return value. This can be achieved cleanly by using the State monad.