λ Tony's blog λ
The weblog of Tony Morris

# Finding the Levenshtein Distance in Scala

Posted on April 24, 2008, in Programming

In the spirit of esoteric code snippets like this one, I thought I’d put my two bob in :)

I have written the Levenshtein Distance algorithm using Scala below. The Levensthein Distance algorithm is a Dynamic Programming Algorithm (DPA). This implementation is a little different to the Python one (which creates the arrays explicitly and fills them using loops):

• The code more closely represents the mathematical definition of the algorithm

• The code is easier to reason about because the destructive updates occur behind the scenes (in the memoisation library)

• The code below requires a third party library (Scalaz) note that Scalaz comes with demo code including the levenshtein distance and other DPAs

• The code has a better complexity than the typical loopy version by using lazy evaluation (notice that ‘c’ is not always evaluated)

• While the code memoises with an array, you could use say, a map and save some space as well

• The code builds the call stack as it traverses the matrix (the loopy one does not)

import scalaz.memo.Memo
import scalaz.memo.SizedMemo.arraySizedMemo

object Levenshtein {
def levenshtein[A](x: Array[A], y: Array[A]): Int = {
val im = arraySizedMemo
val m = im[Memo[Int, Int]](x.length + 1)
// the call matrix
def mx(i: Int, j: Int): Int = if(i == 0) j else if(j == 0) i else {
def f = (n: Int) => im[Int](y.length + 1)
val a = m(f)(i - 1)(mx(i - 1, _))(j) + 1
val b = m(f)(i - 1)(mx(i - 1, _))(j - 1) + (if(x(i - 1) == y(j - 1)) 0 else 1)
lazy val c = m(f)(i)(mx(i, _))(j - 1) + 1
if(a < b) a else if(b <= c) b else c
}
mx(x.length, y.length)
}

def main(args: Array[String]) =
println(levenshtein(args(0).toCharArray, args(1).toCharArray))
}

To run this code:

$wget http://projects.workingmouse.com/public/scalaz/artifacts/2.4/scalaz.jar # download Scalaz 2.4$ scalac -classpath scalaz.jar Levenshtein.scala # compile
\$ scala -classpath .:scalaz.jar Levenshtein algorithm altruistic # find the distance!
<strong>6</strong>