Last week at FP Lunch I talked about how to generalise the notion of *monad* using lax functors to \(\mathsf{Cat}\).

Let’s begin by reviewing the classical definition. A **monad** is given by the following data:

- a category \(\mathcal{C}\);
- an endofunctor \(T : \mathcal{C}\to \mathcal{C}\);
- natural transformations \(\eta : I \to T\) and \(\mu : T \circ T \to T\);

satisfying certains laws (namely: \(\mu \circ \eta T = \mu \circ T \eta = \mathsf{id}\) and \(\mu \circ T \mu = \mu \circ \mu T\)). Note that the category \(\mathcal{C}\) is considered to be part of the data, rather than fixed beforehand.

In this post, I will illustrate a compact formulation of the above definition that can easily be generalised to include other similar notions, which appear from time to time in functional programming.

Here is the punchline:

A monad is a lax 2-functor from the terminal 2-category 1 to \(\mathsf{Cat}\).

To make sense of this definition, we need to venture into the marvellous world of *higher categories*. If we take the definition of category that we are familiar with, we can regard it as some sort of 1-dimensional structure: we have a set of objects, which we can picture as points, and a set of arrows between them, which we imagine as (oriented) 1-dimensional lines.

It is then relatively easy to go one dimension up, and imagine an entity with 3 levels of structure: objects, morphisms, and 2-dimensional “cells” connecting arrows. This is what we call a 2-category.

More precisely, a 2-category is given by:

- a set of
*objects*(or 0-cells) - for any two objects \(x, y\), a set of
*morphisms*(or 1-cells) \(\mathsf{hom}(x, y)\); - for any two objects \(x, y\), and any two morphisms \(f, g : \mathsf{hom}(x, y)\), a set of
*2-morphisms*\(\mathsf{hom}_2 (f, g)\).

Of course, this cannot really be the complete definition of 2-category, as we also need to be able to compose morphisms and 2-morphisms, but we won’t go into much detail here. The interested reader can find more details on this nLab page.

The primary example of 2-category is \(\mathsf{Cat}\), the 2-category of categories. Objects of \(\mathsf{Cat}\) are (ordinary) categories (also called 1-categories), morphisms are functors, and 2-morphisms are natural transformations. Another example is the terminal 2-category, containing only 1 object, and no non-identity morphisms or 2-morphisms.

As always happens in mathematics, every new structure that we define is accompanied by a corresponding notion of morphism. Given 2-categories \(\mathcal{C}\) and \(\mathcal{D}\), we want to define what it means to give a “map” \(\mathcal{C}\to \mathcal{D}\) that respects the 2-category structure. We call such maps *2-functors*.

As it turns out, there are multiple ways to give a definition of 2-functor. They differ in the amount of *strictness* that they require. More precisely, a 2-functor \(\mathcal{C}\to \mathcal{D}\) is given by:

- a function \(F\) mapping objects of \(\mathcal{C}\) to objects of \(\mathcal{D}\);
- for any two objects \(x, y\) of \(\mathcal{C}\), a function (also denoted \(F\)) mapping morphisms between \(x\) and \(y\) in \(\mathcal{C}\) to morphisms between \(F x\) and \(F y\) in \(\mathcal{D}\);
- for any two objects \(x, y\) of \(\mathcal{C}\), and morphisms \(f, g : \mathsf{hom}(x, y)\), a function mapping 2-morphisms between \(f\) and \(g\) to 2-morphisms between \(F f\) and \(F g\);

subject to certain “functoriality” properties. We can make this functoriality requirement precise in a number of different (non-equivalent) ways.

First, we might directly generalise the functoriality properties for functors, and require: \[ \begin{aligned} & F\mathsf{id}= \mathsf{id}, \\ & F (g \circ f) = F g \circ F f. \end{aligned} \]

If we do that, we get the notion of *strict functor*. However, the elements appearing in the above equations are objects of certain categories (namely, \(\mathsf{hom}\)-categories of \(\mathcal{D}\)), and if category theory has taught us anything, it is the idea that comparing objects of categories up to equality is often not very fruitful.

Therefore, we are naturally lead to the notion of *pseudofunctor*, which weakens the equalities to *isomorphisms*: \[
\begin{aligned}
& F\mathsf{id}\cong \mathsf{id}, \\
& F (g \circ f) \cong F g \circ F f.
\end{aligned}
\]

However, we are interested in an even weaker notion here, called *lax 2-functor*, which replaces the isomorphisms above with arbitrary (possibly not invertible) 2-morphisms: \[
\begin{aligned}
& \mathsf{id}\to F \mathsf{id}, \\
& F g \circ F f \to F (g \circ f).
\end{aligned}
\]

The direction of the arrows can be reversed, yielding the dual notion of *oplax functor*, which we won’t need here.

Now we understand all the terminology used in the definition above. Let \(F : 1 \to \mathsf{Cat}\) be a lax 2-functor. At the level of objects, \(F\) maps the unique object of \(1\) to \(\mathsf{Cat}\), which amounts to just picking a single category \(\mathcal{C}\). At the level of morphisms, we map the single (identity) morphisms of 1 to a functor \(T: \mathcal{C}\to \mathcal{C}\). Now, the “lax structure” produces 2-morphisms in \(\mathsf{Cat}\) (i.e. natural transformations): \(\eta : I \to T\) and \(\mu : T \circ T \to T\).

So it looks like lax 2-functors to \(\mathsf{Cat}\), at least ignoring certain details that we haven’t discussed, correspond perfectly to the classical definition of monad. I encourage the interested reader to look at the complete definition of lax functor, and verify that everything does indeed match, including the monad laws.

After all this work, generalising the definition is now extremely easy: just replace the 2-category 1 with a more general category. A simple example is: given a monoid \(S\), regard \(S\) as a 2-category with 1 object and no non-trivial 2-morphism. Lax functors \(S \to \mathsf{Cat}\) are exactly Wadler’s indexed monads.

It is also possible (although slightly more involved) to recover Atkey’s parameterised monads as lax functors. I’ll leave this as a fun exercise.