Free monads in category theory (part 1)

Introduction

Free monads can be used in Haskell for modelling a number of different concepts: trees with arbitrary branching, terms with free variables, or program fragments of an EDSL.

This series of posts is not an introduction to free monads in Haskell, but to the underlying theory. In the following, we will work in the category \(\mathsf{Set}\) of sets and functions. However, most of what we say can be trivially generalised to an arbitrary category.

Algebras of a functor

If \(F\) is an endofunctor on \(\mathsf{Set}\), an algebra of \(F\) is a set \(X\) (called its carrier), together with a morphism \(FX → X\). Algebras of \(F\) form a category, where morphisms are functions of their respective carriers that make the obvious square commute.

Bartosz Milewski wrote a nice introductory post on functor algebras from the point of view of functional programming, which I strongly recommend reading to get a feel for why it is useful to consider such objects.

More abstractly, a functor \(F : \mathsf{Set}→ \mathsf{Set}\) generalises the notion of a signature of an algebraic theory. For a signature with \(a_i\) operators of arity \(i\), for \(i = 0, \ldots, n\), the corresponding functor is the polynomial: \[ F X = a₀ + a₁ × X + ⋯ + a_n × X^n, \] where the natural number \(a_i\) denotes a finite set of cardinality \(a_i\).

For example, the theory of monoids has 1 nullary operation, and 1 binary operation. That results in the functor: \[ F X = 1 + X^2 \]

Suppose that \((X, θ)\) is an algebra for this particular functor. That is, \(X\) is a set, and \(θ\) is a function \(1 + X² → X\). We can split \(θ\) into its two components: \[ θ_e : 1 → X, \] which we can simply think of as an element of \(X\), and \[ θ_m : X × X → X. \]

So we see that an algebra for \(F\) is exactly a set, together with the operations of a monoid. However, there is nothing that tells us that \(X\) is indeed a monoid with those operations!

In fact, for \(X\) to be a monoid, the operations above need to satisfy the following laws: \[ \begin{aligned} & θ_m (θ_e(∗), x) = x \\ & θ_m (x, θ_e(∗)) = x \\ & θ_m (θ_m (x, y), z) = θ_m (x, θ_m (y, z)). \end{aligned} \]

However, any two operations \(θ_e\) and \(θ_m\) with the above types can be assembled into an \(F\)-algebra, regardless of whether they do satisfy the monoid laws or not.

“Lawful” algebras

The above example shows that functor algebras don’t quite capture the general notion of “algebraic structure” in the usual sense. They can express the idea of a set equipped with operations complying to a given signature, but we cannot enforce any sort of laws on those operations.

For the monoid example above, we noticed that we can realise any actual monoid as an \(F\)-algebra (for \(FX = 1 + X²\)), but that not every such algebra is a monoid. This means that monoids can be regarded as the objects of the subcategory of \(\mathsf{Alg}_F\) consisting of the “lawful” algebras (exercise: make this statement precise and prove it).

Therefore, we have the following commutative diagram of functors:

and it is easy to see that \(U\) (which is just the restriction of the obvious forgetful functor \(\mathsf{Alg}_F → \mathsf{Set}\) on the right side of the diagram) has a left adjoint \(L\), the functor that returns the free monoid on a set.

Explicitly, \(LX\) has \(X^*\) as carrier (i.e. the set of lists of elements of \(X\)), and the algebra is given by the coproduct of the function \(1 → X^*\) that selects the empty list, and the list concatenation function \(X^* × X^* → X^*\).

In Haskell, this algebra looks like:

The endofunctor \(UL\), obtained by taking the carrier of the free monoid, is a monad, namely the list monad.

Algebras of a monad

Given a monad \((T, η, μ)\) on \(\mathsf{Set}\), a monad algebra of \(T\) is an algebra \((X, θ)\) of the underlying functor of \(T\), such that the following two diagrams commute:

In Haskell notation, this means that the following two equations are satisfied:

In the case where the monad \(T\) returns the set of “terms” of some language for a given set of free variables, a monad algebra can be thought of as an evaluation function.

The first law says that a variable is evaluated to itself, while the second law expresses the fact that when you have a “term of subterms”, you can either evaluate every subterm and then evaluate the resulting term, or regard it as a single term and evaluate it directly, and these two procedures should give the same result.

Naturally, monad algebras of \(T\) form a full subcategory of \(\mathsf{Alg}_T\) which we denote by \(\mathsf{mAlg}_T\).

We can now go back to our previous example, and look at what the monad algebras for the list monad are. Suppose we have a set \(X\) and a function \(θ : X^* → X\) satisfying the two laws stated above.

We can now define a monoid instance for \(X\). In Haskell, it looks like this:

The monoid laws follow easily from the monad algebra laws. Verifying them explicitly is a useful (and fun!) exercise. Vice versa, any monoid can be given a structure of a \(T\)-algebra, simply by taking mconcat as \(θ\).

Therefore, we can extend the previous diagram of functors with an equivalence of categories:

where the top-left equivalence (which is actually an isomorphism) is determined by the Monoid instance that we defined above, while its inverse is given by mconcat.

Let’s step back at this whole derivation, and reflect on what it is exactly that we have proved. We started with some category of “lawful” algebras, a subcategory of \(\mathsf{Alg}_F\) for some endofunctor \(F\). We then observed that the forgetful functor from this category to \(\mathsf{Set}\) admits a left adjoint \(L\). We then considered monad algebras of the monad \(UL\), and we finally observed that these are exactly those “lawful” algebras that we started with!

Monadic functors

We will now generalise the previous example to an arbitrary category of algebra-like objects.

Suppose \(\mathsf{D}\) is a category equipped with a functor \(G : \mathsf{D} → \mathsf{Set}\). We want to think of \(G\) as some sort of “forgetful” functor, stripping away all the structure on the objects of \(\mathsf{D}\), and returning just their carrier.

To make this intuition precise, we say that \(G\) is monadic if:

  1. \(G\) has a left adjoint \(L\)
  2. The comparison functor \(\mathsf{D} → \mathsf{mAlg}_T\) is an equivalence of categories, where \(T = GL\).

The comparison functor is something that we can define for any adjunction \(L ⊢ G\), and it works as follows. For any object \(A : \mathsf{D}\), it returns the monad algebra structure on \(G A\) given by \(G \epsilon\), where \(\epsilon\) is the counit of the adjunction (exercise: check all the details).

So, what this definition is saying is that a functor is monadic if it really is the forgetful functor for the category of monad algebras for some monad. Sometimes, we say that a category is monadic, when the functor \(G\) is clear.

The monoid example above can then be summarised by saying that the category of monoids is monadic.

Conclusion

I’ll stop here for now. In the next post we will look at algebraically free monads and how they relate to the corresponding Haskell definition.

Another proof of function extensionality

The fact that the univalence axiom implies function extensionality is one of the most well-known results of Homotopy Type Theory.

The original proof by Voevodsky has been simplified over time, and eventually assumed the distilled form presented in the HoTT book.

All the various versions of the proof have roughly the same outline. They first show that the weak function extensionality principle (WFEP) follows from univalence, and then prove that this is enough to establish function extensionality.

Following the book, WFEP is the statement that contractible types are closed under \(Π\), i.e.:

WFEP :  i j  Set _
WFEP i j = {A : Set i}{B : A  Set j}
          ((x : A)  contr (B x))
          contr ((x : A)  B x)

WFEP implies function extensionality

Showing that WFEP implies function extensionality does not need univalence, and is quite straightforward. First, we define what we mean by function extensionality:

Funext :  i j  Set _
Funext i j = {A : Set i}{B : A  Set j}
            {f g : (x : A)  B x}
            ((x : A)  f x  g x)
            f  g

Now we want to show the following:

wfep-to-funext :  {i}{j}  WFEP i j  Funext i j
wfep-to-funext {i}{j} wfep {A}{B}{f}{g} h = p
  where

To prove that \(f\) and \(g\) are equal, we show that they both have values in the following dependent type, which we can think of as a subtype of \(B(x)\) for all \(x : A\):

    C : A  Set j
    C x = Σ (B x) λ y  f x  y

We denote by \(f'\) and \(g'\) the range restrictions of \(f\) and \(g\) to \(C\):

    f' g' : (x : A)  C x
    f' x = (f x , refl)
    g' x = (g x , h x)

where we made use of the homotopy \(h\) between \(f\) and \(g\) to show that \(g\) has values in \(C\). Now, \(C(x)\) is a singleton for all \(x : A\), so, by WFEP, \(f'\) and \(g'\) have the same contractible type, hence they are equal:

    p' : f'  g'
    p' = contr⇒prop (wfep  x  singl-contr (f x))) f' g'

The fact that \(f\) and \(g\) are equal then follows immediately by applying the first projection and (implicitly) using \(η\) conversion for \(Π\)-types:

    p : f  g
    p = ap  u x  proj₁ (u x)) p'

In the book, the strong version of extensionality, i.e.

StrongFunext :  i j  Set _
StrongFunext i j = {A : Set i}{B : A  Set j}
                  {f g : (x : A)  B x}
                  ((x : A)  f x  g x)
                  (f  g)

is obtained directly using a more sophisticated, but very similar argument.

Proving WFEP

Now we turn to proving WFEP itself. Most of the proofs I know use the fact that univalence implies a certain congruence rule for function-types, i.e. if \(B\) and \(B'\) are equivalent types, then \(A → B\) and \(A → B'\) are also equivalent, and furthermore the equivalence is given by precomposing with the equivalence between \(B\) and \(B'\).

However, if we have η conversion for record types, there is a much simpler way to obtain WFEP from univalence.

The idea is as follows: since \(B(x)\) is contractible for all \(x : A\), univalence implies that \(B(x) ≡ ⊤\), so the contractibility of \((x : A) → B(x)\) is a consequence of the contractibility of \(A → ⊤\), which is itself an immediate consequence of the definitional \(η\) rule for \(⊤\):

record  : Set j where
  constructor tt

⊤-contr : contr 
⊤-contr = tt , λ { tt  refl }

contr-exp-⊤ :  {i}{A : Set i}  contr (A  )
contr-exp-⊤ =  _  tt) ,  f  refl)

However, the proof sketch above is missing a crucial step: even though \(B(x)\) is pointwise equal to \(⊤\), in order to substitute \(⊤\) for \(B(x)\) in the \(Π\)-type, we need to show that \(B ≡ λ \_ → ⊤\), but we’re not allowed to use function extensionality, yet!

Fortunately, we only need a very special case of function extensionality. So the trick here is to apply the argument to this special case first, and then use it to prove the general result.

First we prove WFEP for non-dependent \(Π\)-types, by formalising the above proof sketch.

nondep-wfep :  {i j}{A : Set i}{B : Set j}
             contr B
             contr (A  B)
nondep-wfep {A = A}{B = B} hB = subst contr p contr-exp-⊤
  where
    p : (A  )  (A  B)
    p = ap  X  A  X) (unique-contr ⊤-contr hB)

Since \(B\) is non-dependent in this case, the proof goes through without function extensionality, so we don’t get stuck in an infinite regression: two iterations are enough!

Now we can prove the special case of function extensionality that we will need for the proof of full WFEP:

funext' :  {i j}{A : Set i}{B : Set j}
         (f : A  B)(b : B)(h : (x : A)  b  f x)
          _  b)  f
funext' f b h =
  ap  u x  proj₁ (u x))
       (contr⇒prop (nondep-wfep (singl-contr b))
                     _  (b , refl))
                     x  f x , h x))

Same proof as for wfep-to-funext, only written more succinctly.

Finally, we are ready to prove WFEP:

wfep :  {i j}  WFEP i j
wfep {i}{j}{A}{B} hB = subst contr p contr-exp-⊤
  where
    p₀ :  _  )  B
    p₀ = funext' B   x  unique-contr ⊤-contr (hB x))

    p : (A   {j})  ((x : A)  B x)
    p = ap  Z  (x : A)  Z x) p₀

By putting it all together we get function extensionality:

funext :  {i j}  Funext i j
funext = wfep-to-funext wfep

Avoiding η for records

This proof can also be modified to work in a theory where \(⊤\) does not have definitional η conversion.

The only point where η is used is in the proof of contr-exp-⊤ above. So let’s define a version of \(⊤\) without η, and prove contr-exp-⊤ for it.

data  : Set j where
  tt : 

⊤-contr : contr 
⊤-contr = tt , λ { tt  refl }

We begin by defining the automorphism \(k\) of \(⊤\) which maps everything to \(\mathsf{tt}\). Clearly, \(k\) is going to be the identity, but we can’t prove that until we have function extensionality.

k :   
k _ = tt

k-we : weak-equiv k
k-we tt = Σ-contr ⊤-contr  _  h↑ ⊤-contr tt tt)

Now we apply the argument sketched above, based on the fact that univalence implies congruence rules for function types. We extract an equality out of \(k\), and then transport it to the exponentials:

k-eq :   
k-eq = ≈⇒≡ (k , k-we)

k-exp-eq :  {i}(A : Set i)  (A  )  (A  )
k-exp-eq A = ap  X  A  X) k-eq

If we were working in a theory with computational univalence, coercion along k-exp-eq would reduce to precomposition with \(k\). In any case, we can manually show that this is the case propositionally by using path induction and the computational rule for ≈⇒≡:

ap-comp :  {i k}{A : Set i}{X X' : Set k}
         (p : X  X')
         (f : A  X)
         coerce (ap  X  A  X) p) f
         coerce p  f
ap-comp refl f = refl

k-exp-eq-comp' :  {i}{A : Set i}(f : A  )
                coerce (k-exp-eq A) f
                λ _  tt
k-exp-eq-comp' f = ap-comp k-eq f
                 · ap  c  c  f)
                      (uni-coherence (k , k-we))

Now it’s easy to conclude that \(A → ⊤\) is a mere proposition (hence contractible): given functions \(f g : A → ⊤\), precomposing them with \(k\) makes them both equal to \(λ \_ → \mathsf{tt}\). Since precomposing with \(k\) is an equivalence by the computational rule above, \(f\) must be equal to \(g\).

prop-exp-⊤ :  {i}{A : Set i}  prop (A  )
prop-exp-⊤ {i}{A} f g = ap proj₁
  ( contr⇒prop (coerce-equiv (k-exp-eq A)  _  tt))
      (f , k-exp-eq-comp' f)
      (g , k-exp-eq-comp' g) )

contr-exp-⊤ :  {i}{A : Set i}  contr (A  )
contr-exp-⊤ =  _  tt) , prop-exp-⊤ _

Families and fibrations

Introduction

The notion of family of “objects” indexed over an object of the same type is ubiquitous is mathematics and computer science.

It appears everywhere in topology and algebraic geometry, in the form of bundles, covering maps, or, more generally, fibrations.

In type theory, it is the fundamental idea captured by the notion of dependent type, on which Martin-Löf intuitionistic type theory is based.

Definition

Restricting ourselves to \(\mathrm{Set}\), the category of sets, for the time being (and ignoring issues of size), it is straightforward to give a formal definition of what a family of sets is:

Given a set A, a family over A is a function from A to the objects of the category of sets (or equivalently, on the other side of the adjunction, a functor from A regarded as a discrete category to \(\mathrm{Set}\)).

This is a perfectly valid definition, but it has two problems:

  1. It can be awkward to work with functions between objects of different “sorts” (like sets and universes).

  2. It is not clear how to generalize the idea to other categories, like \(\mathrm{Top}\) (the category of topological spaces and continuous maps), for example. In fact, we would like a family of spaces to be “continuous” in some sense, but in order for that to make sense, we would need to define a topology on the class of topological spaces.

Display maps

Fortunately, there is a very simple construction that helps bringing this definition to a form which is much easier to work with.

Let’s start with a family of sets B over A, defined as above: B : A → Set.

Define the “total space” of the family as the disjoint union (or dependent sum) of all the sets of the family (I’ll use type theoretic notation from now on):

E = Σ (a : A) . B a

The fibration (or display map) associated to the family B is defined to be the first projection:

proj₁ : E → A

So far, we haven’t done very much. The interesting observation is that we can always recover a family of sets from any function E → A.

In fact, suppose that now E is any set, and p : E → A any function. We can define a family of sets:

B : A → Set
B a = p ⁻¹ a

as the function that associates to each point in A, its inverse image (or fiber) in E.

It is now straightforward to check that these two mappings between families and fibrations are inverses of one another.

Intuitively, given a family B, the corresponding fibration maps each element of all possible sets in the family to its “index” in A. Viceversa, given a fibration p : E → A, the corresponding family is just the family of fibers of p.

Here is formalization in Agda of this correspondence as an isomorphism between families and fibrations. This uses agda-base instead of the standard library, as it needs univalence in order to make the isomorphism explicit.

Examples of constructions

Once we understand how families and fibrations are really two views of the same concept, we can look at a number of constructions for families, and check how they look like in the world of fibrations.

Dependent sum

The simplest construction is the total space:

E = Σ (x : A). B x

As we already know, this corresponds to the domain of the associated fibration.

Dependent product

Given a family of sets B over A, a choice function is a function that assigns to each element x of A, an element y of B x. This is called a dependent function in type theory.

The corresponding notion for a fibration p : E → A is a function s : A → E such that for each x : A, the index of s x is exactly x. In other words, p ∘ s ≡ id, i.e. s is a section of p.

The set of such sections is called the dependent product of the family B.

Pullbacks

Let A and A' be two sets, and B a family over A. Suppose we have a function

r : A' → A

We can easily define a family B' over A' by composing with r:

B' : A' → Set
B' x = B (r x)

What does the fibration p' : E' → A' associated to B' look like in terms of the fibration p : E → A associated to B?

Well, given an element b in the total space of B', b is going to be in B' x for some x : A'. Since B' x ≡ B (r x) by definition, b can also be regarded as an element of the total space of B. So we have a map s : E' → E, and we can draw the following diagram:

The commutativity of this diagram follows from the immediate observation that the index above s b is exactly r x.

Now, given elements x : A', and b : E, saying that p b ≡ r x is equivalent to saying that b is in B (r x). In that case, b can be regarded as an element of B' x, which means that there exists a b' in E' such that p' b' ≡ x and s b' ≡ b.

What this says is that the above diagram is a pullback square.

Adjoints

It is important to note that the previous constructions are related in interesting ways.

Let’s look at a simple special case of the pullback construction, i.e. when B is a trivial family of just one element. That means that the display map p associated to B is the canonical map

p : B → 1

So, if A' is any other type, we get that the pullback of p along the unique map r : A' → 1 is the product B × A.

This defines a functor

\[ A^\ast : \mathrm{Set} → \mathrm{Set}/A \]

where \(\mathrm{Set}/A\) denotes the slice category of sets over A. Furthermore, the dependent product and dependent sum constructions defined above give rise to functors:

\[ Σ_A, Π_A : \mathrm{Set}/A → \mathrm{Set} \]

Now, it is clear that, given a fibration p : X → A and a set Y, functions X → Y are the same as morphisms X → Y × A in the slice category. So \(Σ_A\) is left adjoint to \(A^\ast\).

Dually, functions from Y to the set of sections of p correspond to functions Y × A → X in the slice category, thus giving an adjuction between \(A^*\) and \(Π_A\).

So we have the following chain of adjunctions:

\[ Σ_A \vdash A^* \vdash Π_A \]

Conclusion

The correspondence between indexed families and fibrations exemplified here extends well beyond the category of sets, and can be abstracted using the notions of Cartesian morphisms and fibred categories.

In type theory, it is useful to think of this correspondence when working with models of dependently typed theories in locally cartesian closed categories, and I hope that the examples given here show why slice categories and pullback functors play an important role in that setting.

Continuation-based relative-time FRP

In a previous post I showed how it is possible to write asynchronous code in a direct style using the ContT monad. Here, I’ll extend the idea further and present an implementation of a very simple FRP library based on continuations.

Monadic events

Let’s start by defining a callback-based Event type:

A value of type Event a represents a stream of values of type a, each occurring some time in the future. The on function connects a callback to an event, and returns an object of type Dispose, which can be used to disconnect from the event:

The interesting thing about this Event type is that, like the simpler variant we defined in the above post, it forms a monad:

First of all, given a value of type a, we can create an event occurring “now” and never again:

Note that the notion of “time” for an Event is relative.

All time-dependent notions about Events are formulated in terms of a particular “zero” time, but this origin of times is not explicitly specified.

This makes sense, because, even though the definition of Event uses the IO monad, an Event object, in itself, is an immutable value, and can be reused multiple times, possibly with different starting times.

The definition of >>= is slightly more involved.

We call the function f every time an event occurs, and we connect to the resulting event each time using the helper function addD, accumulating the corresponding Dispose object in an IORef.

The resulting Dispose object is a function that reads the IORef accumulator and calls dispose on that.

Monadic bind
Monadic bind

As the diagram shows, the resulting event e >>= f includes occurrences of all the events originating from the occurrences of the initial event e.

Event union

Classic FRP comes with a number of combinators to manipulate event streams. One of the most important is event union, which consists in merging two or more event streams into a single one.

In our case, event union can be implemented very easily as an Alternative instance:

An empty Event never invokes its callback, and the union of two events is implemented by connecting a callback to both events simultaneously.

Other combinators

We need an extra primitive combinator in terms of which all other FRP combinators can be implemented using the Monad and Alternative instances.

The once combinator truncates an event stream at its first occurrence. It can be used to implement a number of different combinators by recursion.

Behaviors and side effects

We address behaviors and side effects the same way, using IO actions, and a MonadIO instance for Event:

Now we can implement something like the apply combinator in reactive-banana:

Events can also perform arbitrary IO actions, which is necessary to actually connect an Event to user-visible effects:

Executing event descriptions

An entire GUI program can be expressed as an Event value, usually by combining a number of basic events using the Alternative instance.

A complete program can be run with:

Underlying assumptions

For this simple system to work, events need to possess certain properties that guarantee that our implementations of the basic combinators make sense.

First of all, callbacks must be invoked sequentially, in the order of occurrence of their respective events.

Furthermore, we assume that callbacks for the same event (or simultaneous events) will be called in the order of connection.

Many event-driven frameworks provide those guarantees directly. For those that do not, a driver can be written converting underlying events to Event values satisfying the required ordering properties.

Conclusion

It’s not immediately clear whether this approach can scale to real-world GUI applications.

Although the implementation presented here is quite simplistic, it could certainly be made more efficient by, for example, making Dispose stricter, or adding more information to Event to simplify some common special cases.

This continuation-based API is a lot more powerful than the usual FRP combinator set. The Event type combines the functionalities of both the classic Event and Behavior types, and it offers a wider interface (Monad rather than only Applicative).

On the other hand, it is a lot less safe, in a way, since it allows to freely mix IO actions with event descriptions, and doesn’t enforce a definite separation between the two. Libraries like reactive-banana do so by distinguishing beween “network descriptions” and events/behaviors.

Finally, there is really no sharing of intermediate events, so expensive computations occurring, say, inside an accumE can end up being unnecessarily performed more than once.

This is not just an implementation issue, but a consequence of the strict equality model that this FRP formulation employs. Even if two events are identical, they might not actually behave the same when they are used, because they are going to be “activated” at different times.