Strong Type Systems

I have been reading articles lately that make a case for or against ’strong type systems’ by referring to artifacts that purport the exact opposite of a strong type system such as Java, C/C++, C# or a few other weak type systems. This is unfortunate, since the remainder of the article becomes immediately invalid within the first two or three sentences and I doubt that the author is even aware of the fact that they are wasting their own time while at the same time, misleading any potential readers. Quite often, these articles have trendy or appealing writing styles, such as “Parabola“, which attempts to universally refute the legitimacy of type safe (T.S. in the article) systems by making a rather clever, albeit erroneous, metaphor with a scenario in an airport.

There are countless articles that talk about the problems with null, but is this a problem with type systems? No, it is a very poor solution to work around one of the fundamental contradictions that plagues imperative programming. That is, even if we must keep this contradiction and imperative programming for some reason (which we don’t), there are better solutions than null. So this leaves a problem for any author who wishes to provide a higher understanding for their readers; do they explain all the problems with null in more formal, concise and correct terms knowing that the whole problem disappears with a better foundational grounding (eliminating imperative programming), or do they simply present some alternatives that may be unfamiliar and hope for initiative on the part of the reader who will use (the ever so common) gut instinct to feel that they have a superior solution? A horrible problem that is thankfully not mine.

Just what denotes Java et. al. having a weak type system? On what basis is this statement made? The answer to this question is not trivial by any means and answering it requires a reasonable amount of foundational mathematics knowledge, which leads to specialisation in type theory — both of which not many software developers have, nor have the initiative to acquire (this is sad :(). In any case, a brief attempt can be made. There is no doubting that these weak type systems disallow a certain amount of expressivity and force a rigid and sometimes unreasonable method of development. That is not to say that dynamic typing is the solution. Just like Dawkins said that since the beginning of life is so improbable as to be impossible (on the surface), then the existence of a supernatural creator is not the only alternative solution — referring to Darwinism. Whether or not you believe in a supernatural creator, it would be obscene and somewhat ignorant to immediately jump to such a conclusion without rational reason — such as critically analysing alternatives. Likewise, it is obscene to jump to the conclusion that dynamic typing solves the problems of (weak) existing type systems without considering alternatives, such as expressive type systems.

Using Java et. al., it is impossible to express certain fundamental programming concepts and many more concepts that are specific to the problem at hand. One such fundamental programming concept is the monadic bind computation. Before I start going into detail about this, I am compelled to first dispel a very common myth. Many ‘programmers’ believe that monads (and therefore, monadic bind) are not relevant to their problem, since their language does not support such a concept directly. Few programmers actually realise that they are in fact using monads, just without language support and often in a somewhat contrived manner — a result of the lack of formality in reasoning. In fact, I postulate that in my time working on IBM WebSphere 6 and the IBM JDK 1.5, I saw somewhere in the order of hundreds of monads littered throughout thousands of lines of code. Furthermore, Java/C# has one particular monad built right into the language. Providing concrete examples is worthy of a topic on its own, so I will instead take the liberty of claiming that if anyone were to send me their 2^gazillion LOC application, I would be capable of finding one monad per one thousand LOC — worst case scenario. At least, please assume so for now. Yes, monads are extremely important and common to programming, even if you don’t ever know it explicitly. Great, glad that is out of the way :)
We cannot express monadic bind with these weak type systems because they do not have higher-order types. What is a higher-order type? A few people with much more time, specialisation and knowledge have done a better job than I am at least willing and probably capable, of doing. Instead, I will attempt to demonstrate what happens when you attempt to generalise monadic bind with a weak type system. Haskell is a purely functional language that does have higher-order types. If we ask a Haskell interpreter for the type of bind (>>=), we get the following:

Prelude> :type (>>=)
(>>=) :: (Monad m) => m a -> (a -> m b) -> m b

This is akin to a Java/C# interface (called Monad) containing one function (>>=) or ‘bind’, however, we will note that there is no analogy for this expression in a weak type system. This fact, along with many others, seems to be quite a hurdle in grasping the very concept of the bind computation itself. If we examine this type expression, we could say that the bind function (>>=) takes two arguments:

  1. a monad with a type parameter ‘a’
  2. a function from ‘a’ to a monad with a type parameter ‘b’

The function returns a monad with a type parameter ‘b’. ‘Implementing this interface’ would make the class itself an instance of Monad and so itself can be used as the function arguments. This is one of many things that higher-order types gives us.

Let’s attempt this with a weak type system, Java. First, we acknowledge that Java has no first-class functions (cannot pass functions as arguments), but we can loosely emulate it (good enough for our purposes) with a single interface:

interface F<X, Y> {
  Y f(X x);
}

Let’s attempt to write the Monad interface:

interface Monad<A> {
  <B> Monad<B> bind(Monad<A> ma, F<A, Monad<B>> f);
}

Looking good so far. But don’t be fooled — there is a huge problem. Namely, that the interface cannot be implemented correctly. Keeping with tradition, let’s attempt to write the Maybe monad:

final class Maybe<A> implements Monad<A> {
  // BZZT! WRONG!
  Maybe<A> bind(final Maybe<A> ma, final F<A, Maybe<B>> f) {

We hit our first problem, can we work around it? In Java, we have co-variant return types, but we do not have higher-order types. This means that we can keep our return type declaration, but not the types of our arguments — they must match the interface. Then, we can resort to casting — an abomination that indicates a flaw of the type system.

Let’s try again:

final class Maybe<A> implements Monad<A> {
  public <B> Maybe<B> bind(final Monad<A> ma, final F<A, Monad<B>> f) {
    if(ma instanceof Maybe) {
    ...

This solution works, but we have to admit to having to work around the type system by resorting to casting. As a result, I will no longer call it a solution, but instead, it is a work around — a flaw in the expressitivity of the type system. The lack of higher-order types in common type systems is one of many problems with quite devastating consequences. In attempting to model a fundamental computational concept, we find that the type system of Java (and many other programming languages with weak type systems) is not a sound premise from which to extrapolate conclusions about type systems in general. If we were to do so, we would be making a fundamental mistake. In fact, this particular problem is one that dynamic typing appears to solve. But don’t let these superficial appearances delude you. A stronger type system is the correct solution.

Here are some conclusions from these findings:

  • Higher-order types are fundamental to a strong type system
  • Java/C/C++/C# et. al. do not support higher-order types, therefore, do not have a strong type system
  • Monadic bind is a relatively trivial, yet extremely common programming concept
  • Monadic bind cannot be easily expressed in common weak type systems
  • Dynamic typing is not the only apparent solution to resolving the problems of weak type systems

Here are some questions to ponder:

  • What if we tried to model a concept that is not as trivial as the bind computation? Does the difficulty of expression grow linearly to the complexity of the concept?
  • What other concepts are supported by strong type systems but not our common weak type systems?
  • Can we come up with other examples of common, trivial concepts that cannot be expressed because these weak type systems are missing some other feature besides higher-order types?
  • Most importantly, what does a comparison between a type system that exemplifies sound, strong and expressive type safety with dynamic typing (or no typing) give us?

It is this latter question that needs to be explored before drawing any potentially misleading conclusions about type systems. It is a crucial mistake to make a comparison to weak type systems that do not in any way exemplify soundness, as anything else but unsound examples of type systems. The implication that dynamic typing is a correct solution is a blind-man’s fallacy.

A compilable solution follows:

interface F<X, Y> {
  Y f(X x);
}

interface Monad<A> {
  <B> Monad<B> bind(Monad<A> ma, F<A, Monad<B>> f);
}

final class Maybe<A> implements Monad<A> {
  public <B> Maybe<B> bind(final Monad<A> ma, final F<A, Monad<B>> f) {
    if(ma instanceof Maybe) {
      final A j = ((Maybe<A>)ma).just();
      if (j == null) {
        return new Maybe<B>();
      } else {
        final Monad<B> b = f.f(j);

        if(b instanceof Maybe) {
          return (Maybe<B>)b;
        } else {
          throw new Error("Just because we don't have higher-order types," +
            "doesn't mean we start doing silly stuff");
        }
      }
    } else {
      throw new Error("I said stop doing silly stuff! Please!");
    }
  }

  // Let's just hack for now.
  // null denotes 'Nothing'.
  // Many programmers will be familiar
  // with this idiom (did I say hack?)
  // anyway.
  //
  // A more complete solution is provided at
  // http://blog.tmorris.net/revisiting-maybe-in-java
  private final A a;

  public Maybe() {
    this.a = null;
  }

  public Maybe(final A a) {
    this.a = a;
  }

  public A just() {
    return a;
  }
}

24 Responses to “Strong Type Systems”

  1. Cale Gibbard Says:

    One terminology issue: the word “monad” refers to the type constructor itself, not to the values of the type constructed. Usually those are referred to as monadic values, or actions, or computations.

    That is “Maybe” is a monad, but “Just 5″ isn’t, and IO is a monad, but getLine isn’t.

    You might already know this, but a little bit of the wording seemed to make me wonder.

  2. Tony Morris Says:

    Thanks Cale for the tip. Since I only just wrote this post a few hours ago, I will reflect later on your point and update accordingly.

  3. Cale Gibbard Says:

    Overall though, I’d like to say that I totally agree with the point of the article. :)

  4. Lorenz Goht Says:

    You’re not using very good terminology. In a ‘weakly’ typed system, type errors are not caught at all. In a strongly typed system type errors are caught. What you call ‘weak’ type systems are really dynamic type systems. C is the only language in common use which is weakly typed.

    The distinction you are trying to draw is between static and dynamic typing.

  5. Ricky Clarkson Says:

    There is a difference between a weakly-typed system, like PHP, and an inflexibly-typed system, like Java.

    Weak in terms of typing has quite a specific definition, which is not the same as “inflexible”.

    In this case, I think you’ve made a mistake with the code. Here’s a compilable version of a Monad interface with bind defined:

    http://docs.google.com/View?docid=dx5mfkq_17fnd4ss

  6. Tony Morris Says:

    Hi Lorenz,
    This is an argument that I have had before and I have to admit, I am not convinced by your statement.

    What happens in Ruby when you try this?
    “abc” * “def”

    It is a type error - of course it is caught at runtime, therefore, Ruby is weakly-typed; not completely untyped. In fact, conceiving a completely untyped environment is quite difficult.

    On the other side, we might say that Haskell is weakly-typed, since:
    head []
    Again, it is a type error that is caught at runtime (many mainstream programmers don’t immediately recognise this as a type error).

    Further, we might say that Epigram or Coq. are strongly-typed, while Haskell is not. Whatever the case, there is no clear and concise use of terminology, since attempting to do so runs into a contradiction. I am simply using “weakly-typed” and “strongly-typed” however I please, since that is what everyone else does and “what everyone else does” is the very definition of communication.

    I just decided to draw the line between higher-order types in this post - I’ll happily move to some other definition if someone proposes it in a conversation. Whatever you do, don’t go saying that a dynamically-typed language is not typed or weakly-typed and expect me to faithfully agree. I will only agree for the purposes of conversing.

    Hope this makes sense :)

  7. Jonathan Says:

    I have a “stupid” question. I have much more experience with C++ templates than Java generics, which may be the source of my confusion, so I apologize for the noise if this is so.

    It appears from your definition of the function that Maybe.bind is not polymorphic over any specialization of Monad. The expression of Maybe as a specialization of Monad is therefore an improper use of the Java type system. Phrased differently, what is the point of creating a Maybe that inherits from Monad? If it stood by itself, than wouldn’t the signature for Maybe be:

    final class Maybe&lt;A&gt; {
    public &lt;B&gt; Maybe&lt;B&gt; bind(final Maybe&lt;A&gt; ma, final F&lt;A&gt;&gt; f)
    {…}
    // Constructors for Nothing and Just, and Just’s accessor as before…
    }

    In this case, the type errors of calling bind with some other Monad or a function that takes some other monad are caught at compile-time.

    As a last question, the interface Monad suggests that you expect bind to be polymorphic over all Monad. Why do you have this expectation?

  8. Tony Morris Says:

    Thanks Ricky for your definitions regarding type systems - I respectfully, but adamantly disagree :)
    Furthermore, I point out that your code example is not correct at all - quite the opposite. To make this case I need only refer to the use of assignment in your constructors. I hope we both agree that imperative programming is not a sound premise from which to draw conclusions about ‘rightness’. More specifically, as soon as one accesses the (implicit or not) ‘this’ reference, one also forfeits the right to make any claims about soundness.

    If you want further elaboration on this misunderstanding, I’ll happily engage.

  9. Isaac Gouy Says:

    Tony Morris wrote I am simply using “weakly-typed” and “strongly-typed” however I please, since that is what everyone else does and “what everyone else does” is the very definition of communication.

    That becomes tiresome quite quickly so we sometimes refer to dictionaries or some textbook -

    So what is “strong typing”? This appears to be a meaningless phrase, and people often use it in a non-sensical fashion. To some it seems to mean “The language has a type checker”. To others it means “The language is sound” (that is, the type checker and run-time system are related). To most, it seems to just mean, “A language like Pascal, C or Java, related in a way I can’t quite make precise”. If someone uses this phrase, be sure to ask them to define it for you. (For amusement, watch them squirm.)’

    Chapter 28 Type Soundness
    http://www.cs.brown.edu/~sk/Publications/Books/ProgLangs/

    Tony Morris wrote It is a type error - of course it is caught at runtime, therefore, Ruby is weakly-typed; not completely untyped. In fact, conceiving a completely untyped environment is quite difficult.

    Well it’s a type error assuming no one has defined a method * on Strings. It is caught at runtime - Ruby is type-safe, and Ruby is dynamically type-checked.

    Perhaps you’ll agree that BCPL is an example of an “untyped environment”? You can download an implementation here:
    http://www.cl.cam.ac.uk/~mr10/BCPL.html

  10. Ricky Clarkson Says:

    Tony, really, you know as well as I do that I can easily remove the assignments from the constructor.

    I have done, the code is updated. http://docs.google.com/View?docid=dx5mfkq_17fnd4ss

    “More specifically, as soon as one accesses the (implicit or not) ‘this’ reference, one also forfeits the right to make any claims about soundness.”

    Can you explain that? I don’t know of a problem with ‘this’. Maybe you are complaining that polymorphism exists in competition with pattern matching (and other forms of double dispatch). Note that your own contains ‘this’.

  11. Tony Morris Says:

    Hi Ricky, no I don’t know as well as you do. I think you have it wrong. You cannot remove the assignment from the constructor (though you can move it). This is not quite true, since you can and when you do, you have the code that you originally said was wrong, which it isn’t.

    When I looked at your link, I still see an assignment occurring, so I’m not sure if you have made a mistake with that.

    To answer your question regarding ‘what is the problem with accessing the implicit this context (implicit on every non-static method)’ is exactly this issue that I think you have some confusion with. That is, there are not two separate issues here - there is one. It is fundamentally unsound to access the ‘this’ context - I think the very existence (and correctness) of pure functional programming is enough evidence to support this. Accessing the ‘this context’ logically implies the existence of imperative code and I think we’d both agree that change (from the perspective of software) is just an illusion purported by the hardware.

    Remove the assignment (the flawed premise of your claim) and you’ll have the code I originally gave and maintain the assertion that it is correct.

  12. Ricky Clarkson Says:

    “You cannot remove the assignment from the constructor (though you can move it)”

    There is still an implicit assign, sure, but that’s a language implementation detail and is irrelevant. We can imagine a compiler that doesn’t make that assignment (for stackless bytecode, for example).

    “When I looked at your link, I still see an assignment occurring”

    Your code has 2 explicit assignments, mine has none.

    “It is fundamentally unsound to access the ‘this’ context - I think the very existence (and correctness) of pure functional programming is enough evidence to support this.”

    No, the existence of one thing does not disprove the existence (or soundness) of another, unless one conflicts with the other. Imperative programming exists, and is correct (albeit limited).

    You can consider constructs like classes, methods, fields, to be emulating functions if you like, and that enables conversation, but doesn’t make them unsound. If I write in Roman numerals, and add two numbers together, getting the correct answer, it doesn’t really matter that I used Roman numerals to do that. The mathematical soundness is still complete.

  13. Tony Morris Says:

    Hi Ricky,
    I don’t know how else to explain what you’re missing, but I will try another tact. Your function ‘bind’ is impure - this logically implies the existence of imperative code or ‘implicit state’ (or something illegitimate that you seem to believe may be considered legitimate (this is another post for another day)).

    You might suggest that I have used assignment, but of course, this is easily replaced. You make the point that your code can easily be rewritten without the assignment. I agree, and when you do, you will have the function signature that I originally gave.

  14. Ricky Clarkson Says:

    My function bind is not impure; given the same inputs it will always give the same outputs, with no side-effects (except object creation and CPU time, which can both be optimised away through result cacheing).

    You may consider my function ‘bind’ as a curried function. Now a curried function is one that contains some state.

    E.g.,

    int operateOn(Object obj)

    operate=operateOn.curry(obj);

    Now operate has some state, but that state will never change. That doesn’t make this imperative code, or impure.

    If I rewrite operate as a method of Object, then it is just as pure as the curried function.

  15. Tony Morris Says:

    “My function bind is not impure.”

    So it seems we have identified your mistake. Your function is indeed impure. I truly struggle to find a way of pointing this out, since it is bleedingly obvious. Perhaps a use case if you don’t see it immediately upon reflection.

    Further, a curried function has nothing to do with state, but I can see how you have become confused. I still maintain that if you were to fix your code, you’d end up with my original function signature.

    It is not a secret that referential transparency is a property that is relative to the observer and you have simply shifted your point of observation, into the murky waters of imperative programming. You will then freely admit that there is an additional function argument that is not expressed in your code? It is represented by ‘this’ in Java.

    Passing your entire universe of discourse is guaranteed to provide referential transparency, regardless. Regress from that point and I think you’ll see your mistake.

  16. Ricky Clarkson Says:

    Any non-static method involves ‘this’ in Java. As your bind method is non-static, it’s no better than mine.

    Without pattern matching as part of the language, the optimum case is that the object can match patterns for you - this is why my Maybe type looks like:

    public interface Maybe&lt;T&gt;
    {
    &lt;R&gt; R apply(Lazy&lt;R&gt; ifNothing,Function&lt;T,R&gt; ifJust);
    }

    (not the same as the one posted earlier; this one is from real code)

    Lazy has T invoke() and Function has R invoke(T).

    I implement bind as a static method, and only implement it for Maybe, not for any general Monad interface, as I don’t have a meaningful abstraction for monads in general. You can consider apply as emulating pattern matching; it is the minimum and maximum required to enable all use cases of Maybe.

  17. Tony Morris Says:

    So this is a side issue - which is ‘better’? I will claim that a non-static method is by definition, not ‘better’ than just about anything (I say ‘just about’ because I can’t think of anything - but do not rule out the possibility). In fact, I will blame it (and its implications) on most (all?) software problems that plague the industry. But again, this is a side issue. I’ll go into more detail about it another time.

  18. Matías Giovannini Says:

    You just don’t know how to use F-bounded polymorphism. The Maybe monad can be written in Java perfectly, without excessive trouble. I grant that the lack of type inference makes things quite ugly, but it is doable (I have the code). In this case, the key insight is that Maybe is not rank-two, but rank-one. A simple encoding of the typeclass as a functor (much as you would do in ML) gives the solution.

    Why would you want to use a monad in Java is entirely another mystery. It is analogous to advocating the elimination of the while loop in favor of right folds. You should keep in mind that monads were always and are still a semantic concept (categories belong to the domain of semantics), not a syntactic one. That Haskell implementers found it appropriate to model (_not_ implement) sequencing, side-effecting, etc. doesn’t mean that they must be tacked onto any “imperative” problem whatsoever (whatever that is, if it is anything at all).

  19. Karsten Wagner Says:

    The problem here is that Tony obviously don’t know Java well enough to make the statements he makes. While there really are examples which shows the advantages of Haskell’s type-system. Funnily the example Tony presented here was simply based on a blatant lack of understanding of the Java type system.

    And the idea that ‘using this makes a method impure’ is so obviously wrong. Even for Java beginners. Just look at the String-class as an example. ‘this’ is simply an implicit parameter to a call, it can’t break purity more or less than any explicit parameter can. I really have ask myself, if Tony even knows the meaning of ‘pure’ in this context.

  20. Tony Morris Says:

    Hi Karsten,
    Would you mind substantiating your claims? The difference between ‘String’ and the given example is that String has (effectively) all fields declared final. This means that ‘this’ as an implicit argument maintains referential transparency. Without this restriction, ‘this’ becomes a mutating variable. Of course, referential transparency is relative to the observer, so no doubt a counter-claim might simply take a different point of observation.

    In fact, it is possible to demonstrate that the function void touch(String filename) is referentially transparent, simply by taking a stance that most audiences accept as plausible. The difference is of course, that the so-called ‘implicit’ arguments are not expressed by the function type, which leads to errors. In fact, I have yet to find a piece of software, be it the touch function or J2EE Jumping Joey’s database application, that cannot be written as a lambda expression. Maybe it is not me who does not understand purity? I can’t really refute any of your other claims, since they are simply unsupported.

    Further, there is no attempt to legitimise the use of a monad type in Java - simply to point out that higher-order types exist to Java users and postulate that they might be important.

  21. Tony Morris Says:

    http://kawagner.blogspot.com/2007/02/why-monads-are-evil.html

    Yep, it’s not (necessarily) me who doesn’t understand purity :)

  22. Deleted Says:

    Deleted pointless comment

  23. Jonathan2 Says:

    Clearly I arrive late. Apologies.

    May I re-ask Jonathan’s question?

    My understanding, from having read OOSC (instead of TAPL, which I hope to rectify but have not yet), is that method arguments can only be contravariant to the type hierarchy. I think you claim that Java’s type system is weak (in this instance) because it does not allow contravariant method arguments. How do functional types deal with this - it seems from my understanding - contradiction?

  24. &nbsp; Idempotence versus Referential Transparency&nbsp;by&nbsp;λ Tony&#8217;s blog λ Says:

    [...] I will assume that my reader is roughly familiar with the concept of referential transparency, but not necessarily its implications (since there are many), even though some clearly demonstrate a lack of understanding of this concept, strangely accusing others of same (go figure?) and even trying to proclaim that Java has higher-ranked types while suggesting that others do not understand Java&#8217;s (mediocre) type system. Ignorance is bliss and all that, anyway on with the story&#8230; [...]

Leave a Reply