Archive for December, 2006

The Dawkins Delusion

Saturday, December 30th, 2006

While the Dawkins argument against Agnosticism is the usual and incredibly weak and contradictory proposition with a touch of cognitive dissonance (sound familiar Dawkins?), I must otherwise extend an overriding congratulations and approval of any concise, calculated and importantly, successful, attack against Abrahamic theism. I strongly urge anyone of any faith to consider The God Delusion if only to learn what “all the fuss is about”, but do approach with extreme scepticism - as I’d imagine Dawkins himself would recommend.

The most ironic logical proposition on the internet

Wednesday, December 20th, 2006

I generally don’t like to dabble in silliness that can be easily misconstrued as elitist (or some other negative label), but nevertheless, I feel compelled to record what I believe is the most ironic logical proposition that is available on the internet. I am not recording this so that we can all point and laugh or anything mindless like that, but instead, to reflect on just how easy it is to make a crucial mistake in judgment particularly when you least expect it. I believe that the irony of this statement supersedes that of any propositions that have exited from the mouth of President Bush or any other equally vulnerable world leader, etc. though many have come close.

If we pop on over to a document titled, Beating the Averages, we find a proposition (or assumption for the remainder of the section) that is somewhat striking. Here it is:

Lisp is so great not because of some magic quality visible only to devotees, but because it is simply the most powerful language available.

I have lost a little bit of the context in citing this statement, so I encourage anyone to read the full script. The irony lies in the fact that the author (Paul Graham) sets about describing what he calls the “Blub Paradox”, which is defined by how computer programmers can only look back at what they once knew about computer programming languages, but cannot make sense of what might be ahead - or might not even be aware that there is anything ahead! What the author probably does not know, is that in attempt to describe the “Blub Programmer” in the third person, he has inadvertently labelled himself a Blub Programmer by his very own definition.

I wonder if Mr. Graham has learned of this mistake, but I sure hope someone points it out to me if (when) I make it! :)

Fix it Sun!

Tuesday, December 19th, 2006

This is a call out to renew pressure for Sun to fix the tail call elimination defect/RFE that has existed in HotSpot since inception and still exists today. The IBM JIT is very much capable of tail call elimination, as is the .NET CLR (it even has an instruction built-in!) but still Sun’s weak compiler dominates our industry and stifles further innovation in software development.

It has long been known that imperative programming has some inherent weaknesses and people all around the place, who recognise this weakness, are trying to remedy it, only to be stopped in their tracks by the fact that this industry continues to use a very poorly implemented compiler.

Today, fixing this defect is not in Sun’s interest (nor that of IBM). In fact, doing so may well speed up the death of its programming language by many orders of magnitude. I postulate that as Sun squeezes its last few breaths out of Java and as more and more people enlighten themselves with much more powerful alternatives, Sun will inevtiably concede to this pressure. I declare that I have an agenda to speed up this process. Therefore, I would like to apply pressure to Sun to fix this defect so that thousands of existing software clients across the world have a smoother migration path into a more innovative and ultimately, more productive, software development environment. After all, corporations are not necessarily all about stifling innovation, are they?

Maybe Monad in Java

Monday, December 18th, 2006

As I have shown previously, the problem of partial function in Java is not easily solved. The solution of the Maybe algebraic data type, while definitely superior to existing options, is cumbersome to implement and requires some functions defined over the type (isJust, isNothing, etc.) in order to be complete. As some point out, there is a preference for continuation passing to prevent the need for a cast (even though this cast would be hidden). This prompted me to provide a more complete, yet still incomplete solution.

It is also worth noting that the solution with continuation passing will not work on the Sun VM due to its lack of tail call elimination. i.e. some concessions must be made somewhere - manual compiler optimisations if you will :) There were also some observers who didn’t quite get it. Nevertheless, many of us will go on using null/exceptions in Java or their own implementation of Maybe in Java (feel free to steal mine if you like), while others will use Haskell’s Data.Maybe, Scala’s Some/None, Nice’s option types or whatever.

All this aside, I am writing this post to introduce the Maybe Monad in Java, not to demonstrate anything meaningful or clever. Rather, I wish to simply point out to Java programmers what this abstract and apparently mystical notion of a monad - taken from the mathematical branch of category theory - really is. We now know that exceptions are only ever used for representing a partial function (even for I/O!) and we know that the Java programming language has built-in language and API support for exceptions. Let’s just revisit what the concept of a ‘partial function’ means exactly. It means that “there exists”[see note 1] some x take from our universe of discourse (U) such that f(x) is undefined.

Sound scary? It’s not really. Suppose we have a data type, int. The size of our universe of discourse is finite; it is 2^32. We can represent this as |U| = 2^32. The notation f(x) represents some function with x applied. Having a universe of discourse of finite size implies that we can do an exhaustive proof for behaviour of the function, though not necessarily in a reasonable amount of time.

Let’s call a function ‘divide’ and x can be two ints, a and b. The size of our universe of discourse of our arguments is 2^32 * 2^32 (i.e. 2 ^ 64) because that is the total number of combinations for the values of its arguments that are available. We can say that for our function, divide, it holds that for all[see note 2] a and for b = 0, the function is undefined. That is, regardless of the value of a, if the value of b is 0, the function is undefined:

int a = ... // anything
int b = 0;
divide(a, b); // undefined

If we try to use this function with b = 0, we will observe an exception (ArithmeticException). If we were using reference types, we could write our own divide function that instead returns null or returns Maybe<Integer> and Nothing for the case of an undefined function application.

Have we got it? Great! :) The Maybe Monad is used for defining a partial function that might call 2 or more partial functions; if any of those functions called are undefined, then our function that we are writing is undefined. This might be called, “threading partial function through a computation“. We could write it in pseudo-code:

f x = if isNothing (g x)
            then return Nothing
            else if isNothing (h (g x))
              then return Nothing
              else return maybeToJust (h (g x))

Very cumbersome to write and I sure hope that the function f doesn’t need to use more than 2 partial functions (g and h) in order to define! It would be great if you could write something similar to this instead…

f x = h (g x)

…and not have to worry about “threading partial function through the computation“. Those familiar with Haskell, or those who are trying to understand monads in a Haskell context, should hopefully recognise the latter syntax as do-notation. Those who are not familiar with do-notation can just know that Haskell provides language support for this notion of threading computation, represented by the more familiar term, monads. This support is provided with a language keyword called do (very different to do/while in an imperative language!).

Anyway, this post is getting a bit long and I still haven’t introduced Java’s monad for threading partial function through a computation. Luckily, it is not very long to write so without further ado, I introduce it as follows:

throws

“WTF!!? You mean I have been using monads all along!!? How can that be!?” Well, it has been stated more than once, that many people have already used monads, even if they didn’t call it that. In recent years, fundamental computer programming concepts have often been confused with euphemisms like “design patterns” or some such, so rather than speculate at an analogy that might be familiar, I will instead turn focus on the Java throws keyword.

Consider the following rough Java code:

T foo() throws IOException {
  H handle = openFile(); // this method declares throws IOException
 
  try {
    maybeDoSomeStuff(handle); // this method declares throws IOException
    // then
    return someFunction(handle); // this method declares throws IOException
  } finally {
    close(handle); // this method declares throws IOException
  }
}

We see quite clearly that if any of the invoked methods are undefined then our entire method (foo) is also undefined - by returning with an exception instead of a type T. Sound familiar yet? Since we know that exceptions are only ever used to represent partial function, then we can infer that the throws keyword is indeed a (poor man’s) Maybe Monad in disguise. I note at this point that the Maybe Monad is one of a possible squillion monads that are used in every day programming; even in the weakest of programming languages. Here are a couple that are used very often though not necessarily with built-in language support:

  • The list monad - building up a list or threading a list through a computation
  • The state monad - passing “state” through a computation

The lesson from this somewhat lossy representation of a monad - that is hopefully familiar to most people - is that monads are not anything overly complicated or abstruse despite initial appearances. It requires very little brain power to digest the concept coming from a nil set of knowledge, but unfortunately, it appears to take a considerable effort if the subject has been “tainted” with ill-conceived presuppositions such as imperative programming languages.

Be assured, the lesson is brief.
Be warned, the lesson is very self-confrontational.

[1] This is called existential quantification and is often denoted by the symbol ∃
[2] This is called universal quantification and is often denoted by the symbol ∀

Software requirements do not change

Sunday, December 10th, 2006

There is quite a prolific myth that seems to have a grip on the collective software development industry. Often in communication with another software developer, the notion of a “change in requirement” will surface and I am forced to remind that developer that no such thing ever occurs - that the internalisation of this logical fallacy is in fact, quite detrimental to the purpose of creating software, fulfilling software requirements and should be abandoned. Since no such thing as a software requirement change ever occurs, the remainder of the conversation is invalidated making communication very difficult. Who’s responsibility is it to establish a sound knowledge in the fundamental concepts of their selected profession? Not mine - I am forced to cease the conversation. After all, I don’t explain to my doctor what my problem is - only my symptoms - hence, I am not a doctor. But I’ll bet doctors speak in a “language” that is beyond my capability of comprehending (though I have picked up a term or two having lived with a midwife for years :)). We software developers should be doing the same, but it does not seem to be the case.

Software is a function from some given inputs to some given outputs (a lambda if you will) and importantly, given some inputs, the same output should always result, always. It should not return some other result; it should not format your hard disk every second time and it shouldn’t cause the universe to implode or anything crazy like that - just consistently return the same value for its given inputs. This property is called referential transparency. Let’s look at a specific piece of software called ‘touch’. It takes as input:

  1. a file system
  2. a list of characters

It provides one output - a new file system. If a file of the given name exists on the given file system, return the new file system with that file having an updated modified time stamp, otherwise, return the new file system with an additional empty file of the given name. That’s it - nothing more to it. We might even express this software more formally as follows:

-- "Given a FileSystem,
--   then a list of characters (Char),
--   return a new FileSystem."
-- :: means 'is of type'
touch :: FileSystem -> [Char] -> FileSystem
touch fs filename = if fs `contains` filename
  then fs `updateTimestamp` filename
    else fs `addEmptyFile` filename

The ‘touch’ software is so fundamental, that most reasonable operating systems provide it without any explicit user intervention - try running it and see.

While you are free to come up with all sorts of wild and complex examples of software, I’m going to simplify a little with a new example. Suppose a client approaches you, a software developer, on Monday, with the following requirement; “given two integers, sum them, multiply by 4 and give me the result.” Here is the software written in Java:

int software(int a, int b) {
    return (a + b) * 4;
}

Easy peasey! On a side note, write it in Haskell, then play ’spot the difference’ with the client problem statement and the code. Fun hey? :)
If the client approaches you on Tuesday with the following, “given two integers, subtract the first from the last. multiply by 4 and give me the result.”, then it is imperative to understand that this is a new requirement. A software requirement holds from inception until the collapse of the universe. You might argue, “but the old one and the same one look almost the same”, in which case, feel free to write the second one by applying your favourite delta compression algorithm, but by no means should the former be replaced by the latter. You might also argue, “but the client said it is a change”. Well I hope my doctor doesn’t follow any advice that I give him on my condition at the time since I am not well versed in medical practice. I do not say, “I have a headache, I need to fix it with these drugs” or if I did, the doctor, being aware of my naivety, would ignore my advice and only consider the symptom (I would hope!). Likewise, clients do not project their false view of software into their final product and instead, trust the better judgment of a more trained practitioner (ideally).

If we take the common (not that I accept its legitimacy in this form) approach of unit testing and I write assert(software(42, 43) == 340), then this statement should hold forever. If it does not, then you have broken your contract to all clients of your software. The typical refutation to this assertion involves the developer ascertaining that no software clients have ever existed and so it is safe to continue. For example, if there is even a slight suspicion of a client, that developer might first deduce that if there is, then those client(s) are in the room. Then that developer asks all people in the room to commit their code so that she can search for occurrences of its use. If none are found, the software is removed, code committed and all persons in the room are instructed to update. Otherwise, if it is found that one or more clients exist, manual contingency measures are engaged.

The regularity of the occurrence of no clients bound to your software is often over-estimated. Just take a look at the mess that Microsoft has itself in with an attempt to ’separate out the components of Windows’ or any of the other countless examples that surround us. You might say, “but I am a consultant with only one client”. This is not true; in fact, the client/provider relationship can often be self-referential. It is entirely coincidental that no clients have bound to your software, but rarely is it so.

defmacro[1] = defmacro[0] + 1

Wednesday, December 6th, 2006

I read an article a while back that made some claims that were oh so true. One such claim I will reproduce here because it is one that is in violation of mainstream, but portrays so much truth in so few words. I hope my reproduction is OK with the original author.

Most people I’ve met have read the Design Patterns book by the Gang of Four. Any self respecting programmer will tell you that the book is language agnostic and the patterns apply to software engineering in general, regardless of which language you use. This is a noble claim. Unfortunately it is far removed from the truth.

Doesn’t it look shiny? No polish required :) I think it would make a great wall hanging.