2020-02-22 17:47:52

A use for monads

In my last post, I commented on the fact that the random package, for some reason, does not use a monadic interface. I thought this would be an excellent opportunity to discuss monads, since they seem to be one of the concepts that scare new Haskell programmers the most. Specifically, I'm going to try to give you an intuition for why monads are useful in the first place and why you'd want to use them in your real-life programs, and we're going to build a (very simple) monadic interface for random.

Our motivation

To start with, here's the code for generating a random colour in our program:

randomColour :: (RandomGen g) => g -> (RGB Word8, g)
randomColour gen =
  let (w1, gen1) = random gen
      (w2, gen2) = random gen1
      (w3, gen3) = random gen2
   in (RGB w1 w2 w3, gen3)

It's understandable enough, but the main problem with it is that we have to be constantly juggling the generators. We need to come up with a unique name for each of them, use them in the correct line, and if we wanted to add another value in the future (say, an alpha channel) then we'd need to be careful to return the value of that last generator in our function. It's doable, but clunky. What's the solution to this? Well, ideally, what we would want to do is something like this:

randomColour gen =
  let w1 = random
      w2 = random
      w3 = random
   in RGB w1 w2 w3

Or, if we're feeling ambitious, even something like:

randomColour gen = RGB random random random

That would be so much clearer, wouldn't it? Unfortunately, that's not possible in Haskell. Functions (in particular, the random function) are pure: if we run them with the same arguments, we'll get the same result. That's one of Haskell's biggest strengths, and what makes it so easy to test and debug, but it's holding us back here. But what if there was a way around this? What we need... is a context.

The core idea is this: we could have a datatype, let's call it M, that represents an action in a context. This would be a type constructor; that means we can't just have an M, but we could have an M Int, M String, M Word8, or whatever else. (For OOP programmers, this is similar to generic types.) So, instead of having our function return a Word8, we have it return an M Word8, which represents an action that, when run in a context, will give us a Word8. Sort of like... a promise that "I will give you a Word8 in the future when you provide me with a context". In our case, the context is the RandomGen generator.

Well, it might seem like we haven't gained much, right? We've just pushed the problem back one step. Now we can delay actually getting the Word8, big deal. But here's the trick: we can work within that context. So, say I have three M Word8s. What if I could combine all of them into an M (RGB Word8)? Remember, we're still dealing with actions that will be run in the future here, so we don't need any context so far. And now, what if we want our action to return a list of colours instead of just one? We could turn our M (RGB Word8) into an M [RGB Word8]. Then, once we have what we need, we finally give it a context (a RandomGen) and claim back our [RGB Word8]. We only need to provide one generator, and the program takes care of the rest. Wouldn't that be much more comfortable to work with?

Well, that's exactly what a monad is. Monad is not a single type, but a whole class of types (a typeclass, similar to interfaces in OOP); any type that implements Monad represents a context, and values of that type represent actions in that context.
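If "an action waiting for a context" still feels abstract, here's a minimal sketch of the idea written by hand, with no typeclasses involved. Everything in it is made up for illustration: Gen and step are a toy generator (not the algorithm the random package uses), RGB is a stand-in for the real colour type, and M, word8, combine3 and run are hypothetical names.

```haskell
import Data.Word (Word8)

-- Hypothetical stand-in for a generator: a number we keep scrambling.
newtype Gen = Gen Int

-- One toy linear congruential step. NOT what the random package does.
step :: Gen -> (Word8, Gen)
step (Gen s) =
  let s' = (s * 1103515245 + 12345) `mod` 2147483648
   in (fromIntegral (s' `div` 65536), Gen s')

-- "M a": an action that yields an `a` once it is handed a context.
newtype M a = M (Gen -> (a, Gen))

-- Stand-in for the RGB type used in this post.
data RGB a = RGB a a a deriving (Eq, Show)

-- The basic action: a promise of one Word8.
word8 :: M Word8
word8 = M step

-- Combine three actions into one. The generator juggling still exists,
-- but it is written once, here, instead of in every function.
combine3 :: (a -> b -> c -> d) -> M a -> M b -> M c -> M d
combine3 f (M ma) (M mb) (M mc) = M $ \g0 ->
  let (a, g1) = ma g0
      (b, g2) = mb g1
      (c, g3) = mc g2
   in (f a b c, g3)

-- Three M Word8s become one M (RGB Word8); no context needed yet.
colour :: M (RGB Word8)
colour = combine3 RGB word8 word8 word8

-- Finally hand over the context and claim the value back.
run :: M a -> Gen -> a
run (M f) g = fst (f g)
```

Calling run colour (Gen 42) twice gives the same colour both times, so nothing here breaks purity: the randomness lives entirely in the context we pass in at the end.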

Writing a monad (code)

In order to show how to work with monads, we're going to write our own, based on an already existing monad from the mtl package: the State monad. State is a very useful monad that allows us to keep a read/write state in our context, and its actions are read/write operations. In our case, we'll want our state to be a RandomGen (we pick the particular type StdGen, since RandomGen is also a typeclass):

type RandomM = State StdGen

(Note: type is an awful choice of keyword. It should be either synonym or alias.)

Now, we need an action that returns any value that can be generated randomly. We'll call it random, which normally would overlap with the function of the same name in the random package, but we're assuming that won't be a problem, since we won't be using both our module and the one in random at the same time:

random :: (Random a) => RandomM a
random = state R.random

The implementation of this function might seem a bit confusing, but take a look at the documentation for the state and random functions and you'll see that they do exactly what we want: use the generator to create a value and a new generator, and store that new one in our state. We also wrote an equivalent randomR function, to get a random value in a range.

And finally, we need a way to run our actions given a context:

runRandom :: RandomM a -> StdGen -> a
runRandom = evalState

Once again, evalState does exactly what we want. (Yeah, the tools in these libraries are incredibly complete.) We could have made our functions more generic, to accept any RandomGen and not just a StdGen, but I didn't bother since we won't need it in this case.
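Put together with its imports, the whole module might look roughly like the sketch below. The import lists assume the mtl and random packages. The body of randomR is my guess at the obvious definition, since it isn't shown above, and twoDice is a hypothetical action added only to have something to run.

```haskell
import Control.Monad.State (State, evalState, state)
import System.Random (Random, StdGen)
import qualified System.Random as R

-- Our context: a State monad whose state is a StdGen.
type RandomM = State StdGen

-- Generate any Random value, storing the new generator in the state.
random :: (Random a) => RandomM a
random = state R.random

-- Assumed definition: same idea, but within an inclusive range.
randomR :: (Random a) => (a, a) -> RandomM a
randomR range = state (R.randomR range)

-- Run an action with an initial generator, discarding the final one.
runRandom :: RandomM a -> StdGen -> a
runRandom = evalState

-- Hypothetical example action: roll two six-sided dice.
twoDice :: RandomM (Int, Int)
twoDice = do
  a <- randomR (1, 6)
  b <- randomR (1, 6)
  return (a, b)
```

In GHCi, something like runRandom twoDice (R.mkStdGen 2020) hands the action its context and gets a pair of rolls back.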

Using our monad in practice (code)

Well, that's all the tools we need. As a refresher, here's what we (ideally) wanted our function to look like:

randomColour gen =
  let w1 = random
      w2 = random
      w3 = random
   in RGB w1 w2 w3

And here's what it ends up looking like:

randomColour :: RandomM (RGB Word8)
randomColour = do
  w1 <- random
  w2 <- random
  w3 <- random
  return $ RGB w1 w2 w3

That's not half bad, is it? Like I said, the first form is impossible because of Haskell's purity, but the second form uses what we call a do block, which is a way of generating new monadic values (actions, remember?) that do what we want. Here's how to read that block:

randomColour = do         -- randomColour is an action that DOES this:
  w1 <- random            -- Run the "random" action and call its result "w1"
  w2 <- random            -- Run the "random" action and call its result "w2"
  w3 <- random            -- Run the "random" action and call its result "w3"
  return $ RGB w1 w2 w3   -- Run an action that does nothing and returns as its result "RGB w1 w2 w3"

And the action resulting from that block does all of these actions sequentially and returns whatever is returned by its last line, which in this case is the RGB that we want. No generator juggling required, since it all happens in the context. We'll give the randomFlag function a similar treatment:

randomFlag :: RandomM Flag
randomFlag = do
  n <- randomR (3, 7)
  cs <- randomColours n
  return $ Flag {stripes = n, colours = cs}

If you want an exercise, try to read this do block and figure out what it does. It's very similar to the last one.
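The randomColours action used in that block isn't shown here, so the following is only a guess at how it could be written: replicateM from Control.Monad repeats an action n times and collects the results, which is exactly the "M (RGB Word8) into M [RGB Word8]" step from earlier. The RGB type below is a local stand-in so the sketch compiles on its own.

```haskell
import Control.Monad (replicateM)
import Control.Monad.State (State, evalState, state)
import Data.Word (Word8)
import System.Random (Random, StdGen, mkStdGen)
import qualified System.Random as R

-- Local stand-in for the RGB type used in this post.
data RGB a = RGB a a a deriving (Eq, Show)

type RandomM = State StdGen

random :: (Random a) => RandomM a
random = state R.random

runRandom :: RandomM a -> StdGen -> a
runRandom = evalState

randomColour :: RandomM (RGB Word8)
randomColour = do
  w1 <- random
  w2 <- random
  w3 <- random
  return $ RGB w1 w2 w3

-- Assumed definition: run randomColour n times, threading the
-- generator from one run into the next, and collect the results.
randomColours :: Int -> RandomM [RGB Word8]
randomColours n = replicateM n randomColour
```

For example, runRandom (randomColours 5) (mkStdGen 1) yields a list of five colours from a single generator.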

Remember: up until this point, we don't actually have any flags, just actions that will return a flag when given the right context! So, the last step is to swap out this line in our previous code:

let image = rawFlag $ randomFlag (generator name)

For this one:

let image = rawFlag $ runRandom randomFlag (generator name)

And that's all! Our flag generator works just like before, but now, with more monads:

The monads pride flag!

Addendum: The monad laws and IO

I've mostly glossed over what monads actually are in favour of how they work, because I think the latter is more important for people trying to get a feel for functional programming. If you're left wondering about how this idea is formalized, let me explain that quickly. If not, you can skip this part; you won't miss much.

One big reason why monads are so powerful is that they are everywhere. Any type can be a monad as long as it follows a set of laws, which basically describe the utility that we need to be able to write do blocks. The laws define a few operations that we should be able to do on a monad, and a few rules stating that these operations do sensible things with each other. I won't go into detail on the rules, but here are the operations. If a type is a (useful) monad, then we should be able to:

  1. Transform an action's return value with a unary function
  2. Create an action that does nothing but return a pure value
  3. Combine two (or n) actions into one with a binary (or n-ary) function
  4. Chain the result of one action into the next one

Remember, typeclasses are like interfaces: they define a few operations that we should be able to perform for a type, and, when we implement the typeclass for a type, we just say how each of these operations is actually done. So, in order to make a type into a monad, all we have to do is implement the four operations that correspond to the four points above, respectively: fmap, pure (also called return), <*> and >>=. (Strictly speaking, fmap belongs to Functor, and pure and <*> to Applicative; both are superclasses of Monad, so every monad has them.) Under the hood, do blocks aren't magic, they just use these definitions (particularly >>=). Feel free to check out the Monad documentation (or look them up on Hoogle) if you want to learn what each of them does!
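To see those four operations in one place, here's a sketch of a miniature state-like monad written from scratch. Counter and tick are hypothetical names invented for this example: the context is just an Int that goes up by one each time we look at it. The last three definitions are the same action written as a do block, with >>= by hand, and with <*>.

```haskell
-- A hypothetical miniature monad: the context is an Int counter.
newtype Counter a = Counter { runCounter :: Int -> (a, Int) }

-- 1. Transform an action's return value with a unary function.
instance Functor Counter where
  fmap f (Counter run) = Counter $ \n ->
    let (a, n') = run n
     in (f a, n')

instance Applicative Counter where
  -- 2. An action that does nothing but return a pure value.
  pure a = Counter $ \n -> (a, n)
  -- 3. Combine two actions, running the left one first.
  Counter runF <*> Counter runA = Counter $ \n ->
    let (f, n') = runF n
        (a, n'') = runA n'
     in (f a, n'')

-- 4. Chain the result of one action into the next one.
instance Monad Counter where
  Counter run >>= next = Counter $ \n ->
    let (a, n') = run n
     in runCounter (next a) n'

-- The only primitive action: return the counter, then bump it.
tick :: Counter Int
tick = Counter $ \n -> (n, n + 1)

-- The same action, three ways.
pairDo :: Counter (Int, Int)
pairDo = do
  a <- tick
  b <- tick
  return (a, b)

pairBind :: Counter (Int, Int)
pairBind = tick >>= \a -> tick >>= \b -> return (a, b)

pairAp :: Counter (Int, Int)
pairAp = (,) <$> tick <*> tick
```

All three behave identically: runCounter pairDo 10 gives ((10,11),12), and so do pairBind and pairAp. The <*> version is also about as close as we can get to the "ambitious" one-liner from the beginning of the post.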

As a last note, you might notice that one very useful operation is missing from that list: we can't extract an action's value. This tends to trip people up ("I have an M Int, how do I get an Int from it?"), but it makes perfect sense if you stop and think about it for a moment. A monad represents an action in a context. If all we have is an action and no context, then we can't actually get any value. It's up to each monad to define how we run it, and that operation is not part of the Monad typeclass. Why? Well, because of (arguably) the reason monads exist in the first place: the IO monad, which represents input and output with the world outside of our program.

An IO String represents an action that gathers a String from somewhere as input. It might be as simple as a user typing on a console, or as complex as a series of HTTP requests to a web service. The key part is, we can't know what that value will be ahead of time, because it will change every time. You can't rely on a user to always type the same things on a keyboard. This breaks Haskell's promise of purity... and it sounds awfully similar to what we went through with our RandomM monad, doesn't it? The difference is that, with RandomM, we can give it a generator to actually run the action; with IO, we can't give it a context. In a way, the outside world is the context. So how do we actually run an IO action? Simple: we call it main and make it the entry point of our program. That's it. Short of escape hatches like unsafePerformIO (or typing the action into GHCi), there is no other way to run an IO action in Haskell. Hopefully you can see why all this heavy machinery that we have for monads, like do blocks, isn't just convenient, but crucial in the case of IO.
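The two kinds of context can also meet. As a small sketch (die and rollDie are hypothetical names, and this isn't how the flag generator gets its generator), here's an IO action that asks the outside world for a fresh StdGen and uses it to run a RandomM action:

```haskell
import Control.Monad.State (State, evalState, state)
import System.Random (StdGen, newStdGen)
import qualified System.Random as R

type RandomM = State StdGen

-- A pure action: it needs a generator before it can produce anything.
die :: RandomM Int
die = state (R.randomR (1, 6))

-- An IO action: the outside world supplies the generator, so the
-- result can differ from one run of the program to the next.
rollDie :: IO Int
rollDie = do
  gen <- newStdGen
  return (evalState die gen)
```

Notice that rollDie still isn't a number. To actually see a roll, it has to end up inside main, for example as main = rollDie >>= print.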

Well, that's all I have to say about monads for now. As always, feedback via email or fediverse is greatly appreciated, and I really hope this post helped a few ideas click in your head!

Posted by Emi Socks | Permanent link