Sunday, September 6, 2009

Impressed with Haskell's concurrency

I have been playing with both Ocaml and Haskell* lately. In Ocaml, I have been rewriting a lot of simple Python scripts I use at work to see how they compare. They tend to be faster and more trustworthy. Ocaml, though, has pretty bad concurrency support. They have some thread library that wraps the OS threads, no real good message passing interface. Even the guys at Jane St have listed that as a complaint**.

So I decided to check out Haskell again. With Haskell I have always thought it was a great language, but could never grok it. That sounds a little weird, but reading Haskell code is simply great. When you figure out what it does, it tends to be very expressive and flexible. After looking at what Haskell offers for concurrency, I think that in theory it could give Erlang a serious run for its money. Here's why:

- Haskell, in general and in my opinion, is a safer language to write code in than Erlang. One can do a lot of work with Dialyzer to verify their code, but they are still somewhat limited. Haskells type system is very expressive and powerful. Haskell also has ADT's, which as far as I know Erlang still does not. For an idea of why ADT's are so powerful when it comes to writing safe code, see the Caml Trading video. And in this case, by 'safe' I mean writing correct code.

- Haskell produces generally fast code. GHC does some impressive optimizations, and supercompilation*** is most likely getting going to be part of GHC in the near future. It should noted that while Haskell can produce fast results, its lazyness can also be unpredictable in the optimizations (both speed and memory) that it performs. This is believed, by many, to be a big drawback to Haskell. I haven't done enough work in Haskell to really say how bad this is but Real World Haskell does give a complete chapter to diagnosing performance problems and optimizing them.

- Haskell's threading implementation uses green threads with a many-to-one mapping to OS threads. This means you can take advantage of multiple cores without modifying your Haskell code. The Erlang VM does this too, with SMP support. The greenthread implementation also appears to be blazing fast. Haskell is number one in the threadring solution on the language shootout. Erlang is about 4x slower. Take from that what you will.

- You can implement Erlang-like best-case programming in Haskell, from what I can tell. Haskell supports Exceptions, and you can throw exceptions to other threads using asynchronous exceptions. This gives you a way to 'link' threads like you can link Erlang processes. It is still programmer driven, so you can make a mistake, point for Erlang.

- STM, while I haven't read up on this, it seems like a pretty great way to handle synchronization between threads.

- Monads. When reading up on concurrency in Haskell, monads seemed to often come up as valuable in ensuring correctness. For example, in an STM transaction, one wants to restrict the transaction from doing things that cannot be rolled back, such as IO. This is done by the 'atomically' function taking an STM monad as input and wrapping the output in an IO monad. What this all means is one can't do IO in the atomically block unless it's unsafe. The ability to restrict what the user can do, when one wants to, with the type system is quite powerful.

That being said, Haskell does have some clear losses next to Erlang. The biggest drawback is the lack of a distributed model. There is distributed Haskell, I have not researched it much but I'm under the impression it is not 'there' yet. Erlang is easy to learn, very easy. Haskell is not. I have found that, in order to write really good Haskell code, one has to keep a lot of stuff in their head at once. Perhaps this is just because I am new but I have found Haskell better to read than to write. Erlang is not like this, while the syntax has some peculiarities to it, it is not hard to pick up and start writing good Erlang. I think Ocaml shares more in common to Erlang in this regard. The hump one needs to get over in order to write solid Haskell code is a real and legitimate reason to not choose Haskell.


All that being said, I admittedly am fairly new to Haskell so these opinions will change over time, I'm planning on putting more effort into writing real projects in Haskell to see how it goes. Needless to say, I am impressed by what Haskell has to offer for concurrency. If I'm factually wrong on anything here, please correct me, this is all based on some reading research I've been doing on Haskell, not my actual experiences.

* When I say Haskell, I really mean GHC here

** Yaron Minsky's Caml Trading lecture, very good! Makes a great, practical argument, for Ocaml - http://ocaml.janestreet.com/?q=node/61

*** Supercompilation http://community.haskell.org/~ndm/downloads/slides-supercompilation_for_haskell-03_mar_2009.pdf