Archive for February, 2007

Glue?

Saturday, February 24th, 2007

I recently noted that I’d had stitches in 23 sites on my body — including, among other things, an amputation of my right-index finger. It was last Saturday (17 Feb 07) that I was training as usual at my squash club, when I was struck just above the eye by the end of my opponent’s swing, drawing lots of blood as you would expect with an elevated heart rate during squash and sending my opponent into mild shock at the sight of it (te he :)).

Anyway, my count still remains at 23 — why? Because the emergency department used glue! I suppose the injury was not large enough to warrant the use of sutures, but glue!? Who’d uv thunk?! It has healed perfectly and my first match in this weekend’s Charles Butcher Tournament starts in a couple of hours. Revenge is definitely on the agenda! Watch your back Brad :)

Statefulness and the Abstract Universe

Thursday, February 22nd, 2007

Quite often, I hear comments from people who are trying to get a grasp on functional programming about entities which are “inherently stateful” or “intrinsically mutable” (example). The commenters often point to a disk or network to make the point. In this post, I am going to attempt to portray a slightly deeper understanding of this topic to erase the facade and bring to fruition the fact that this “inherent state” (or whatever) is in fact an illusion. This will require some thought experiment and the manipulation of some abstract entities in order to attain this insight.

I will first start off by describing the distinction between the physical universe as we observe it and the abstract universe. The physical universe is made up of matter; for example, a house, a typewriter and even a hard disk. The abstract universe is one that is, well, abstract — one that is not physical. If we consider the number 2. We call it ‘a number’, a physical entity, yet we cannot hold it in our hands or point to its physical representation. Therefore, 2 is better described as a representation of “twoness” or “the concept of two”. The symbol ‘2′ is used to denote the concept of two, among other possible representations of the same concept:

  • two
  • . .
  • 00000010

…and so on. Anyone interested in defining the concept of two, with a bent for history, might be interested in learning about the discovery of zero.

Now, this point touches on some philosophical grounds that I’d rather not go into, so instead, I will make a note of the fact that a pure evolutionary Atheist might not accept this distinction (perhaps claiming that 2 is in fact physical as a chemical signal in the brain), while a Theologian would describe the abstract universe as your ‘Soul’ (or some such). I don’t mean to encroach on this philosophical point and I am hoping I can get away with using terminology that can be translated to the relevant philosophical view.

So, when you write a function, in any language, f(2), you’re not passing a 2 or 2 itself to that function, but instead, a representation of the concept of two to the concept of your function and nothing further. Here’s the clincher: All software exists in the abstract universe. Even your C program that does clever pointer arithmetic and your assembly program that moves the hard disk head. Yes, that FILE* is in fact, an abstraction of a pointer, not a pointer itself. None of these are physical entities, but abstractions of physical entities that manifest themselves somehow — none of these manifestations are relevant to the software author.

It’s all quite simple so far, but if you can pass ‘a representation of the concept of two’, then what is to stop you from passing ‘a representation of the concept of the file system’? or the network? Nothing, that’s what. In fact, you could quite plausibly argue that this is exactly what Haskell’s IO monad is doing — an abstraction that represents these physical entities using the expressive type system of Haskell. In exactly the same way that 2 is an abstraction that represents a physical entity i.e. it will eventually manifest itself as electrical signals in your computer hardware (or physically manifest somehow anyway).

Those of you who are concerned about passing an entire file system as a function argument and the impact on performance might be interested in delving further into a topic called Lazy Evaluation and Weak Head Normal Form (WHNF). I’d rather not reiterate the work of many others who are more dedicated, so I’ll just point out that concerns for performance are definitely valid, but the impact on performance is not there (in fact, often quite the contrary — performance improves!).

The distinction between ‘2′ and ‘the file system’ and ‘the network’ from a software developer’s perspective is entirely superficial and should be abandoned. One cannot be ‘inherently stateful’ while the other isn’t. It’s one or the other, so which is it? (hint: n**ther :))

Refunctoring

Thursday, February 15th, 2007

Ask your average Java (or C#) programmer to ‘write a method that takes a list of integers, adds 10 to each element, converts the result to a String, prepends *** to it and returns the resulting list’. It’s quite easy and you can certainly postulate what kind of start would be made on this. Here is how I suspect most Java programmers would solve this problem:

...
static List<String> addTenAndConvert(List<Integer> list) {
  List<String> result = new LinkedList<String>();
 
  for(Integer i : list) {
    String s = "***" + String.valueOf(i + 10);
    result.add(s);
  }
 
  return result;
}

Simple enough. Many people would also write so-called ‘unit tests’ to assist in verifying that the method meets the specified requirement. It is interesting to note at this point that ‘for all’ (hint) list arguments, the length of the return value will always be equal. So for the argument, {1,2,3} with a length of 3, the result is {”***11″, “***12″, “***13″} also with a length of 3 and this property holds across all lists.

But then, I ask for another method that does all the same, but this time, prepends “AAA” instead of “***”. Whatchya gunna do? Copy/paste/change? No, you’re going to refunctor (to be defined in time to come), that’s what. You might pass an additional argument of type String and prepend that. Sounds fair enough. But then I will ask instead that you append “BBB” instead of prepending “AAA”. Pass an additional argument of type boolean?

Let’s take a look at what we’ve got:

...
static List<String> addTenAndConvert(
  List<Integer> list, String s, boolean prepend) {
  List<String> result = new LinkedList<String>();
 
  for(Integer i : list) {
    String ss = prepend ? s + String.valueOf(i + 10) :
        i.toString() + s;
    result.add(ss);
  }
 
  return result;
}

But then, I ask you for another requirement — this time devastating. I ask instead that it is not a List argument, but a List argument and a List return type. Going to copy/paste this time, right? The conversion from each String element to the new Integer element is left unspecified for now, but feel free to dream something up. In fact, feel free to ‘refactor’ it out:

interface Convert {
  Integer convert(String s);
}

…then change the method like so:

...
static List<Integer> stringListToIntegerList(
    Convert c, List<String> list) {
  List<Integer> result = new LinkedList<Integer>();
 
  for(String s : list) {
    Integer i = c.convert(s);
    result.add(i);
  }
 
  return result;
}

Notice that our method’s name is becoming less specific as it is refactored and becoming more abstract in behaviour. Further, we cannot write our method addTenAndConvert in terms of our method stringListToIntegerList by passing a different implementation of Convert because our types do not match. However, notice that the transformation on each list has nothing to do with specific types — these types are actually unbounded polymorphic types. Let’s try again.

interface Convert<T, U> {
  U convert(T t);
}
 
static <T, U> List<U> fooForNow(Convert<T, U> c, List<T> list) {
  List<U> result = new LinkedList<U>();
 
  for(T t : list) {
    U u = c.convert(t);
    result.add(u);
  }
 
  return result;
}

How’s that look!!? What shall we call this method? I have called it fooForNow, but if you’ll just let me call it map instead, just because. We can now write our method addTenAndConvert with our newly found abstraction — the map function:

static List<String> addTenAndConvert(List<Integer> list) {
  Convert<Integer, String> c = new Convert<Integer, String>() {
    public String convert(Integer i) {
      return "***" + String.valueOf(i + 10);
    }
  };
 
  return map(c, list);
}

How funky is that? Funky enough for you to start looking at funktional programming? I’ll let you in on a little secret. This relatively high euphoric point of using Java/C# (et. al.) is the very basis and starting point of most functional programming languages. In fact, the map function is typically included in the standard libraries! Functional programming is Java version 42, seriously. It is a natural extension to what it is that most Java/C# programmers are already doing. It is not some esoteric, orthogonal, unrelated paradigm that has nothing to do with anything except the point of singularity.

If we consider our Convert type to actually represent ‘anything that can convert a T to a U’, we call map a higher-order function, since it takes a function (from T to U) as an argument. Whether or not to name this function argument (to Convert in our example) and hide the behaviour of this conversion behind some implementation sets the premise for the ‘DD (Dynamic Dispatch) versus HOF (Higher-Order Function) issue’.

If we look at the type for the Haskell map function, we see it like this:

map :: (a -> b) -> [a] -> [b]

That is, as the first argument, it takes a function from a type ‘a’ to a type ‘b’ (we called these T and U, but Haskell type parameters must be lower-case). The second argument is a list of the type ‘a’ and the return type is a list of the type ‘b’. That’s exactly what we just wrote in 43 trillion lines of Java code! OK, maybe I am exaggerabating a bit there.

And in case you’re wondering, here is how addTenAndConvert looks in Haskell:

let addTenAndConvert = map (("***" ++) . show . (+10)) -- tidy eh?

Why ‘Refunctoring’? First, I refuse to reuse euphemistic terms that have served only to denigrate the standard of the software development industry, therefore, I cannot use ‘Refactoring’ (oooh! did I just say that!!?). To show this, consider the fact that the entire notion of ‘Refactoring’ is done away with one simple phrase ‘Functional Programming’ in the case shown and hundreds more. Of course, you might decide to redefine Refactoring to a different context, but then, redefining terms often leads to confusion (the term ‘function’ is itself a perfect example). To further demonstrate this point, open the book or catalog on Refactoring and turn to the section titled, ‘Replace Parameter with Method’. Then say out loud, ‘partial application’ — if you don’t know why you’re saying that, learn what partial application is, then perhaps it might dawn on you. A useful exercise is to apply this technique to all so-called ‘Refactorings’ and see how many are eliminated by existing programming techniques that don’t get very much air time in advertising material (though, this might be changing thankfully). More useful might be to see how many are left standing (want to know my conjecture?).

Anyway, Refunctoring is what most people really are doing. They are converting their code that is written in not-too-powerful programming languages to mimic the power that is already available in functional programming languages. They are representing higher-order functions with (importantly, co-variant) polymorphic type parameters, refactoring out what partial application already provides and much more. This is being done, despite having been invented many, many years ago, but the fact that it is being done mandates a name, albeit how absurd it might be. I use the term Refunctoring, for no other reason than it was suggested to me by a bemused colleague (you know who you are :)) one day and upon reflection, I have decided that it is entirely appropriate.

You don’t think the map function is a special case do you?

Ignorance is mostly bliss, but not always

Friday, February 9th, 2007

I have been using Scala for the past few weeks to write a web framework that ‘fixes’ the convoluted Java Servlet Specification. I have succeeded in doing so and I have a prototype running on Apache Tomcat right now as I write this post, however I didn’t get to this point easily; not without first having to do such things as:

  • write a lazy cons list without any methods declared on the type — rather, functions over the type that are declared in a Scala object (roughly analogous to Java static methods)
  • conceding to using mutable state/destructive update at some points — particularly converting strict and/or side-effecting Java types to nice, lazy, pure types
  • missing out on the syntactic niceties that monads provide — Scala does not support higher-order kinds
  • Scala is what I consider a very pragmatic (to steal a buzzword from the hype generators) approach to software development as it exists today. It satisfies the incompetent assertions of the illegitimate authority (aka management) — “you must use Java (et al.)”, while still allowing a reasonably concise expression and programming paradigm e.g. Scala has pattern matching, case classes, variance annotations, higher-order functions and implicit defs (similar to Haskell type-classes). These features alone dramatically reduce the amount of repetitious work that you otherwise do in more mainstream languages by one of those crazy orders of magnitude that would never be believed by a mainstream programmer and so could easily be explained away as nonsense with support from other ill-informed peers (those who know what I mean, know what I mean).

    I had read about CAL, but didn’t ever give it much consideration until today. And it is today that I realise that my blissful ignorance has cost me. CAL is very Haskell-like with full support for using Java types (just as Scala does) and compiles straight to the JVM (just as Scala does) — making up for the deficiencies of the Java runtime compiler using techniques such as tail recursive call elimination (again, just as Scala does).

    However, upon further reading (I admit to not having used it in great detail just yet), CAL does more — much more. It is lazily evaluated and it is pure functional. It has higher-order kinds and we know how important that is. It has monads and even the Monad data type subtypes the Functor data type! Not even Haskell does that (though I’m sure things would be different given hindsight ala CAL)! Type parameters are covariant, given its pure functional nature and it makes expressing strictness much easier than it does in other lazy languages (apparently). Of course, this is all potentially a reiteration of hyperbole, since I’ve yet to try it out — but it does look promising, especially given that it has now been released under a BSD licence.

    The point is, my ignorance (of CAL) has cost me, since I have a well thought-out, very usable web application framework (sorry HAppS) that has been written in a possibly (probably) inferior language. Here I was thinking I had the best of both worlds — satisfying the pointy-haired boss, while still getting on with writing software. Though, it could be worse, much worse — I could have written it in Java/C# like many other participants in this circus are doing.