# Free monads in category theory (part 2)

## Introduction

In the previous post, I introduced the notion of monadic functor, exemplified by the forgetful functor from the category of monoids to $$\set$$. We saw that monoids form a subcategory of the category of algebras of the functor $$F$$ defined by $$F X = 1 + X²$$, and we observed that those are the same as the monad algebras of the list monad.

More generally, we can try different subcategories of $$\cat{Alg}_F$$ and check whether they are monadic as well. So let’s start with possibly the simplest one: the whole of $$\cat{Alg}_F$$.

This leads us to the following definition: we say that an endofunctor $$F$$ admits an algebraically free monad if $$\cat{Alg}_F$$ is monadic. The corresponding monad is called the algebraically free monad over $$F$$.

Informally, the algebraically free monad over $$F$$ is a monad $$T$$ such that monad algebras of $$T$$ are the same as functor algebras of $$F$$.

Unfortunately, not all functors admit an algebraically free monad. For example, it is easy to see that the powerset functor does not.

## Free monads as initial algebras

The free package on Hackage defines something called “free monad” for every Haskell functor. What does this have to do with the notion of algebraically free monad defined above?

Here is the definition of Free from the above package:

data Free f a
= Pure a
| Free (f (Free f a))

Translating into categorical language, we can define, for an endofunctor $$F$$, the functor $$F^*$$, which returns, for a set $$X$$, a fixpoint of the functor $G Y = X + F Y.$

Let’s assume that the fixpoint is to be intended as inductive, i.e. as an initial algebra. Therefore, we get, for all objects $$X$$, an initial algebra: $X + F (F^* X) → F^* X.$

Of course, those initial algebras might not exist, but they do if we choose $$F$$ carefully. For example, if $$F$$ is polynomial, then all the functors $$G$$ above are also polynomial, thus they have initial algebras.

In general, if we assume that those initial algebras all exist, then we can prove that the resulting functor $$F^*$$ is a monad, and is indeed the algebraically free monad over $$F$$.

We will first show that $$F^*$$ allows us to define a left adjoint $$L$$ for the forgetful functor $$U : \cat{Alg}_F → \set$$. In fact, for any set $$X$$, let the carrier of $$L X$$ be $$F^* X$$, and define the algebra morphism by restriction from the initial algebra structure on $$F^* X$$: $F (F^* X) → X + F (F^* X) → F^* X.$

By definition, $$F^* X$$ is the initial object in the category of algebras of the functor $$Y ↦ X + F Y$$. Moreover, it is easy to see that the latter category is equivalent to the comma category $$(X ↓ U)$$, where the equivalence maps $$F^* X$$ to the obvious morphism $$X → U L X$$. By the characterisation of adjunctions in terms of universal arrows, it follows that $$L$$ is left adjoint to $$U$$. Clearly, $$U L = F^*$$, therefore $$F^*$$ is a monad.

To conclude the proof, we need to show that the adjunction $$L ⊣ U$$ is monadic, i.e. that the comparison functor from $$F$$-algebras to $$F^*$$-algebras is an equivalence. One way to do that is to appeal to Beck’s monadicity theorem. Verifying the hypotheses is a simple exercise.

It is also instructive to look at the comparison functor as implemented in haskell:

iter :: Functor f => (f x → x) → (Free f x → x)
iter θ (Pure x) = x
iter θ (Free t) = θ (fmap (iter θ) t)

and its inverse

uniter : Functor f => (Free f x → x) → (f x → x)
uniter ψ = ψ . liftF
where liftF = Free . fmap Pure

It is not hard to prove directly, using equational reasoning, that iter θ is a monad algebra, and that iter and uniter are inverses to each other.

## Algebraically free monads are free

The documentation for Free says:

A Monad n is a free Monad for f if every monad homomorphism from n to another monad m is equivalent to a natural transformation from f to m

which doesn’t look at all like our definition of algebraically free monad. Rather, this says that $$N$$ is defined to be the free monad over $$F$$ if the canonical natural transformation $$F → N$$ is a universal arrow from $$F$$ to the forgetful functor $$\cat{Mon}(\set) → \cat{Func}(\set, \set)$$.

If that forgetful functor had a left adjoint, then we could just say that the free monad is obtained by applying this left adjoint to any endofunctor. This is actually the case if we replace $$\set$$ with a so-called algebraically complete category, such as the ones modelled by Haskell, where the left adjoint is given by the (higher order) functor Free.

In $$\set$$, however, we need to stick to the more awkward definition in terms of universal arrows, as not all functors are going to admit free monads. In any case, the relationship with the previously defined notion of algebraically free monad is not immediately clear.

Fortunately, we can prove that a monad is algebraically free if and only if it is free! Proving that an algebraically free monad $$F^*$$ on $$F$$ is free amounts to proving that the following natural transformation (corresponding to liftF in the Haskell code above): $\require{AMScd} \begin{CD} F X @>{F η}>> F (F^* X) @>>> F^* X \end{CD}$ is universal, which is a simple exercise.

To prove the converse, we will be using Haskell notation. Suppose given a functor f, and a monad t that is free on f. Therefore, we have a natural transformation:

l :: f x → t x

and a function that implements the universal property for t:

hoist :: Monad m => (∀ x . f x → m x) → t x → m x

Now we define a functor $$\set → \cat{Alg}_f$$ which is going to be the left adjoint of the forgetful functor. The carrier of this functor is given by t itself, so we only need to define the algebra morphism:

alg :: f (t x) → t x
alg u = join (l u)

To show that this functor is the sought left adjoint, we have to fix a type x and an f-algebra θ : f y → y, define functions:

φ :: (x → y) → (t x → y)
ψ :: (t x → y) → (x → y)

then prove that φ g is an f-algebra morphism for all g : x → y, and that φ and ψ are inverses to each other.

The function ψ is easy to implement:

ψ a h = h . return

Defining φ is a bit more involved. The only tool at our disposal to define functions out of t x is hoist. For that, we need a monad m, and a natural transformation f → m.

The trick is to consider the continuation monad Cont y. Using θ, we define a natural transformation

w :: f z → Cont y z
w u = Cont (\k -> θ (fmap k u))

on which we can apply the universal property of t to get φ:

φ g = (runCont id) . hoist w . fmap g

From here, the proof proceeds by straightforward equational reasoning, and is left as an exercise.

## Conclusion

We looked at two definitions of “free monad”, proved that they are equivalent, and shown the relationship with the Haskell definition of Free. In the next post, we will resume our discussion of algebraic theories “with laws” and try to approach them from the point of view of free monads and monadic functors.

# Free monads in category theory (part 1)

## Introduction

Free monads can be used in Haskell for modelling a number of different concepts: trees with arbitrary branching, terms with free variables, or program fragments of an EDSL.

This series of posts is not an introduction to free monads in Haskell, but to the underlying theory. In the following, we will work in the category $$\set$$ of sets and functions. However, most of what we say can be trivially generalised to an arbitrary category.

## Algebras of a functor

If $$F$$ is an endofunctor on $$\set$$, an algebra of $$F$$ is a set $$X$$ (called its carrier), together with a morphism $$FX → X$$. Algebras of $$F$$ form a category, where morphisms are functions of their respective carriers that make the obvious square commute.

Bartosz Milewski wrote a nice introductory post on functor algebras from the point of view of functional programming, which I strongly recommend reading to get a feel for why it is useful to consider such objects.

More abstractly, a functor $$F : \set → \set$$ generalises the notion of a signature of an algebraic theory. For a signature with $$a_i$$ operators of arity $$i$$, for $$i = 0, \ldots, n$$, the corresponding functor is the polynomial: $F X = a₀ + a₁ × X + ⋯ + a_n × X^n,$ where the natural number $$a_i$$ denotes a finite set of cardinality $$a_i$$.

For example, the theory of monoids has 1 nullary operation, and 1 binary operation. That results in the functor: $F X = 1 + X^2$

Suppose that $$(X, θ)$$ is an algebra for this particular functor. That is, $$X$$ is a set, and $$θ$$ is a function $$1 + X² → X$$. We can split $$θ$$ into its two components: $θ_e : 1 → X,$ which we can simply think of as an element of $$X$$, and $θ_m : X × X → X.$

So we see that an algebra for $$F$$ is exactly a set, together with the operations of a monoid. However, there is nothing that tells us that $$X$$ is indeed a monoid with those operations!

In fact, for $$X$$ to be a monoid, the operations above need to satisfy the following laws: \begin{aligned} & θ_m (θ_e(∗), x) = x \\ & θ_m (x, θ_e(∗)) = x \\ & θ_m (θ_m (x, y), z) = θ_m (x, θ_m (y, z)). \end{aligned}

However, any two operations $$θ_e$$ and $$θ_m$$ with the above types can be assembled into an $$F$$-algebra, regardless of whether they do satisfy the monoid laws or not.

## “Lawful” algebras

The above example shows that functor algebras don’t quite capture the general notion of “algebraic structure” in the usual sense. They can express the idea of a set equipped with operations complying to a given signature, but we cannot enforce any sort of laws on those operations.

For the monoid example above, we noticed that we can realise any actual monoid as an $$F$$-algebra (for $$FX = 1 + X²$$), but that not every such algebra is a monoid. This means that monoids can be regarded as the objects of the subcategory of $$\cat{Alg}_F$$ consisting of the “lawful” algebras (exercise: make this statement precise and prove it).

Therefore, we have the following commutative diagram of functors: $\require{AMScd} \begin{CD} \mathsf{Mon} @>⊆>> \mathsf{Alg}_F\\ @VUVV @VVV\\ \mathsf{Set} @>=>> \mathsf{Set} \end{CD}$

and it is easy to see that $$U$$ (which is just the restriction of the obvious forgetful functor $$\cat{Alg}_F → \set$$ on the right side of the diagram) has a left adjoint $$L$$, the functor that returns the free monoid on a set.

Explicitly, $$LX$$ has $$X^*$$ as carrier (i.e. the set of lists of elements of $$X$$), and the algebra is given by the coproduct of the function $$1 → X^*$$ that selects the empty list, and the list concatenation function $$X^* × X^* → X^*$$.

In Haskell, this algebra looks like:

alg :: Either () ([x], [x]) → [x]
alg (Left _) = []
alg (Right (xs, ys)) = xs ++ ys

The endofunctor $$UL$$, obtained by taking the carrier of the free monoid, is a monad, namely the list monad.

Given a monad $$(T, η, μ)$$ on $$\set$$, a monad algebra of $$T$$ is an algebra $$(X, θ)$$ of the underlying functor of $$T$$, such that the following two diagrams commute:

$\begin{CD} X @>η>> T X \\ @V=VV @VVθV\\ X @>=>> X \end{CD}$

$\begin{CD} T(T X) @>μ>> T X \\ @V{T θ}VV @VVθV \\ T X @>θ>> X \end{CD}$

In Haskell notation, this means that the following two equations are satisfied:

θ (return x) = x
θ (fmap θ m) = θ (join m)

In the case where the monad $$T$$ returns the set of “terms” of some language for a given set of free variables, a monad algebra can be thought of as an evaluation function.

The first law says that a variable is evaluated to itself, while the second law expresses the fact that when you have a “term of subterms”, you can either evaluate every subterm and then evaluate the resulting term, or regard it as a single term and evaluate it directly, and these two procedures should give the same result.

Naturally, monad algebras of $$T$$ form a full subcategory of $$\cat{Alg}_T$$ which we denote by $$\cat{mAlg}_T$$.

We can now go back to our previous example, and look at what the monad algebras for the list monad are. Suppose we have a set $$X$$ and a function $$θ : X^* → X$$ satisfying the two laws stated above.

We can now define a monoid instance for $$X$$. In Haskell, it looks like this:

instance Monoid X where
empty = θ []
mappend x y = θ [x, y]

The monoid laws follow easily from the monad algebra laws. Verifying them explicitly is a useful (and fun!) exercise. Vice versa, any monoid can be given a structure of a $$T$$-algebra, simply by taking mconcat as $$θ$$.

Therefore, we can extend the previous diagram of functors with an equivalence of categories: $\begin{CD} \mathsf{mAlg}_T @>≅>> \mathsf{Mon} @>⊆>> \mathsf{Alg}_F\\ @VVV @VUVV @VVV\\ \mathsf{Set} @>=>> \mathsf{Set} @>=>> \mathsf{Set} \end{CD}$ where the top-left equivalence (which is actually an isomorphism) is determined by the Monoid instance that we defined above, while its inverse is given by mconcat.

Let’s step back at this whole derivation, and reflect on what it is exactly that we have proved. We started with some category of “lawful” algebras, a subcategory of $$\cat{Alg}_F$$ for some endofunctor $$F$$. We then observed that the forgetful functor from this category to $$\set$$ admits a left adjoint $$L$$. We then considered monad algebras of the monad $$UL$$, and we finally observed that these are exactly those “lawful” algebras that we started with!

We will now generalise the previous example to an arbitrary category of algebra-like objects.

Suppose $$\cat{D}$$ is a category equipped with a functor $$G : \cat{D} → \set$$. We want to think of $$G$$ as some sort of “forgetful” functor, stripping away all the structure on the objects of $$\cat{D}$$, and returning just their carrier.

To make this intuition precise, we say that $$G$$ is monadic if:

1. $$G$$ has a left adjoint $$L$$
2. The comparison functor $$\cat{D} → \cat{mAlg}_T$$ is an equivalence of categories, where $$T = GL$$.

The comparison functor is something that we can define for any adjunction $$L ⊢ G$$, and it works as follows. For any object $$A : \cat{D}$$, it returns the monad algebra structure on $$G A$$ given by $$G \epsilon$$, where $$\epsilon$$ is the counit of the adjunction (exercise: check all the details).

So, what this definition is saying is that a functor is monadic if it really is the forgetful functor for the category of monad algebras for some monad. Sometimes, we say that a category is monadic, when the functor $$G$$ is clear.

The monoid example above can then be summarised by saying that the category of monoids is monadic.

## Conclusion

I’ll stop here for now. In the next post we will look at algebraically free monads and how they relate to the corresponding Haskell definition.

# Another proof of function extensionality

The fact that the univalence axiom implies function extensionality is one of the most well-known results of Homotopy Type Theory.

The original proof by Voevodsky has been simplified over time, and eventually assumed the distilled form presented in the HoTT book.

All the various versions of the proof have roughly the same outline. They first show that the weak function extensionality principle (WFEP) follows from univalence, and then prove that this is enough to establish function extensionality.

Following the book, WFEP is the statement that contractible types are closed under $$Π$$, i.e.:

WFEP : ∀ i j → Set _
WFEP i j = {A : Set i}{B : A → Set j}
→ ((x : A) → contr (B x))
→ contr ((x : A) → B x)

### WFEP implies function extensionality

Showing that WFEP implies function extensionality does not need univalence, and is quite straightforward. First, we define what we mean by function extensionality:

Funext : ∀ i j → Set _
Funext i j = {A : Set i}{B : A → Set j}
→ {f g : (x : A) → B x}
→ ((x : A) → f x ≡ g x)
→ f ≡ g

Now we want to show the following:

wfep-to-funext : ∀ {i}{j} → WFEP i j → Funext i j
wfep-to-funext {i}{j} wfep {A}{B}{f}{g} h = p
where

To prove that $$f$$ and $$g$$ are equal, we show that they both have values in the following dependent type, which we can think of as a subtype of $$B(x)$$ for all $$x : A$$:

    C : A → Set j
C x = Σ (B x) λ y → f x ≡ y

We denote by $$f'$$ and $$g'$$ the range restrictions of $$f$$ and $$g$$ to $$C$$:

    f' g' : (x : A) → C x
f' x = (f x , refl)
g' x = (g x , h x)

where we made use of the homotopy $$h$$ between $$f$$ and $$g$$ to show that $$g$$ has values in $$C$$. Now, $$C(x)$$ is a singleton for all $$x : A$$, so, by WFEP, $$f'$$ and $$g'$$ have the same contractible type, hence they are equal:

    p' : f' ≡ g'
p' = contr⇒prop (wfep (λ x → singl-contr (f x))) f' g'

The fact that $$f$$ and $$g$$ are equal then follows immediately by applying the first projection and (implicitly) using $$η$$ conversion for $$Π$$-types:

    p : f ≡ g
p = ap (λ u x → proj₁ (u x)) p'

In the book, the strong version of extensionality, i.e.

StrongFunext : ∀ i j → Set _
StrongFunext i j = {A : Set i}{B : A → Set j}
→ {f g : (x : A) → B x}
→ ((x : A) → f x ≡ g x)
≅ (f ≡ g)

is obtained directly using a more sophisticated, but very similar argument.

### Proving WFEP

Now we turn to proving WFEP itself. Most of the proofs I know use the fact that univalence implies a certain congruence rule for function-types, i.e. if $$B$$ and $$B'$$ are equivalent types, then $$A → B$$ and $$A → B'$$ are also equivalent, and furthermore the equivalence is given by precomposing with the equivalence between $$B$$ and $$B'$$.

However, if we have η conversion for record types, there is a much simpler way to obtain WFEP from univalence.

The idea is as follows: since $$B(x)$$ is contractible for all $$x : A$$, univalence implies that $$B(x) ≡ ⊤$$, so the contractibility of $$(x : A) → B(x)$$ is a consequence of the contractibility of $$A → ⊤$$, which is itself an immediate consequence of the definitional $$η$$ rule for $$⊤$$:

record ⊤ : Set j where
constructor tt

⊤-contr : contr ⊤
⊤-contr = tt , λ { tt → refl }

contr-exp-⊤ : ∀ {i}{A : Set i} → contr (A → ⊤)
contr-exp-⊤ = (λ _ → tt) , (λ f → refl)


However, the proof sketch above is missing a crucial step: even though $$B(x)$$ is pointwise equal to $$⊤$$, in order to substitute $$⊤$$ for $$B(x)$$ in the $$Π$$-type, we need to show that $$B ≡ λ \_ → ⊤$$, but we’re not allowed to use function extensionality, yet!

Fortunately, we only need a very special case of function extensionality. So the trick here is to apply the argument to this special case first, and then use it to prove the general result.

First we prove WFEP for non-dependent $$Π$$-types, by formalising the above proof sketch.

nondep-wfep : ∀ {i j}{A : Set i}{B : Set j}
→ contr B
→ contr (A → B)
nondep-wfep {A = A}{B = B} hB = subst contr p contr-exp-⊤
where
p : (A → ⊤) ≡ (A → B)
p = ap (λ X → A → X) (unique-contr ⊤-contr hB)

Since $$B$$ is non-dependent in this case, the proof goes through without function extensionality, so we don’t get stuck in an infinite regression: two iterations are enough!

Now we can prove the special case of function extensionality that we will need for the proof of full WFEP:

funext' : ∀ {i j}{A : Set i}{B : Set j}
→ (f : A → B)(b : B)(h : (x : A) → b ≡ f x)
→ (λ _ → b) ≡ f
funext' f b h =
ap (λ u x → proj₁ (u x))
(contr⇒prop (nondep-wfep (singl-contr b))
(λ _ → (b , refl))
(λ x → f x , h x))

Same proof as for wfep-to-funext, only written more succinctly.

Finally, we are ready to prove WFEP:

wfep : ∀ {i j} → WFEP i j
wfep {i}{j}{A}{B} hB = subst contr p contr-exp-⊤
where
p₀ : (λ _ → ⊤) ≡ B
p₀ = funext' B ⊤ (λ x → unique-contr ⊤-contr (hB x))

p : (A → ⊤ {j}) ≡ ((x : A) → B x)
p = ap (λ Z → (x : A) → Z x) p₀

By putting it all together we get function extensionality:

funext : ∀ {i j} → Funext i j
funext = wfep-to-funext wfep

### Avoiding η for records

This proof can also be modified to work in a theory where $$⊤$$ does not have definitional η conversion.

The only point where η is used is in the proof of contr-exp-⊤ above. So let’s define a version of $$⊤$$ without η, and prove contr-exp-⊤ for it.

data ⊤ : Set j where
tt : ⊤

⊤-contr : contr ⊤
⊤-contr = tt , λ { tt → refl }

We begin by defining the automorphism $$k$$ of $$⊤$$ which maps everything to $$\mathsf{tt}$$. Clearly, $$k$$ is going to be the identity, but we can’t prove that until we have function extensionality.

k : ⊤ → ⊤
k _ = tt

k-we : weak-equiv k
k-we tt = Σ-contr ⊤-contr (λ _ → h↑ ⊤-contr tt tt)

Now we apply the argument sketched above, based on the fact that univalence implies congruence rules for function types. We extract an equality out of $$k$$, and then transport it to the exponentials:

k-eq : ⊤ ≡ ⊤
k-eq = ≈⇒≡ (k , k-we)

k-exp-eq : ∀ {i}(A : Set i) → (A → ⊤) ≡ (A → ⊤)
k-exp-eq A = ap (λ X → A → X) k-eq

If we were working in a theory with computational univalence, coercion along k-exp-eq would reduce to precomposition with $$k$$. In any case, we can manually show that this is the case propositionally by using path induction and the computational rule for ≈⇒≡:

ap-comp : ∀ {i k}{A : Set i}{X X' : Set k}
→ (p : X ≡ X')
→ (f : A → X)
→ coerce (ap (λ X → A → X) p) f
≡ coerce p ∘ f
ap-comp refl f = refl

k-exp-eq-comp' : ∀ {i}{A : Set i}(f : A → ⊤)
→ coerce (k-exp-eq A) f
≡ λ _ → tt
k-exp-eq-comp' f = ap-comp k-eq f
· ap (λ c → c ∘ f)
(uni-coherence (k , k-we))

Now it’s easy to conclude that $$A → ⊤$$ is a mere proposition (hence contractible): given functions $$f g : A → ⊤$$, precomposing them with $$k$$ makes them both equal to $$λ \_ → \mathsf{tt}$$. Since precomposing with $$k$$ is an equivalence by the computational rule above, $$f$$ must be equal to $$g$$.

prop-exp-⊤ : ∀ {i}{A : Set i} → prop (A → ⊤)
prop-exp-⊤ {i}{A} f g = ap proj₁
( contr⇒prop (coerce-equiv (k-exp-eq A) (λ _ → tt))
(f , k-exp-eq-comp' f)
(g , k-exp-eq-comp' g) )

contr-exp-⊤ : ∀ {i}{A : Set i} → contr (A → ⊤)
contr-exp-⊤ = (λ _ → tt) , prop-exp-⊤ _

# Free Applicative Functors

After my post on option parsers with applicative functors, I’ve been working on a paper to develop the idea of “free applicative” contained in that post.

A draft of the paper, joint work with Ambrus Kaposi, has been submitted to ICFP 2013, and is available here.

# Families and fibrations

## Introduction

The notion of family of “objects” indexed over an object of the same type is ubiquitous is mathematics and computer science.

It appears everywhere in topology and algebraic geometry, in the form of bundles, covering maps, or, more generally, fibrations.

In type theory, it is the fundamental idea captured by the notion of dependent type, on which Martin-Löf intuitionistic type theory is based.

## Definition

Restricting ourselves to $$\mathrm{Set}$$, the category of sets, for the time being (and ignoring issues of size), it is straightforward to give a formal definition of what a family of sets is:

Given a set A, a family over A is a function from A to the objects of the category of sets (or equivalently, on the other side of the adjunction, a functor from A regarded as a discrete category to $$\mathrm{Set}$$).

This is a perfectly valid definition, but it has two problems:

1. It can be awkward to work with functions between objects of different “sorts” (like sets and universes).

2. It is not clear how to generalize the idea to other categories, like $$\mathrm{Top}$$ (the category of topological spaces and continuous maps), for example. In fact, we would like a family of spaces to be “continuous” in some sense, but in order for that to make sense, we would need to define a topology on the class of topological spaces.

## Display maps

Fortunately, there is a very simple construction that helps bringing this definition to a form which is much easier to work with.

Let’s start with a family of sets B over A, defined as above: B : A → Set.

Define the “total space” of the family as the disjoint union (or dependent sum) of all the sets of the family (I’ll use type theoretic notation from now on):

E = Σ (a : A) . B a

The fibration (or display map) associated to the family B is defined to be the first projection:

proj₁ : E → A

So far, we haven’t done very much. The interesting observation is that we can always recover a family of sets from any function E → A.

In fact, suppose that now E is any set, and p : E → A any function. We can define a family of sets:

B : A → Set
B a = p ⁻¹ a

as the function that associates to each point in A, its inverse image (or fiber) in E.

It is now straightforward to check that these two mappings between families and fibrations are inverses of one another.

Intuitively, given a family B, the corresponding fibration maps each element of all possible sets in the family to its “index” in A. Viceversa, given a fibration p : E → A, the corresponding family is just the family of fibers of p.

Here is formalization in Agda of this correspondence as an isomorphism between families and fibrations. This uses agda-base instead of the standard library, as it needs univalence in order to make the isomorphism explicit.

## Examples of constructions

Once we understand how families and fibrations are really two views of the same concept, we can look at a number of constructions for families, and check how they look like in the world of fibrations.

### Dependent sum

The simplest construction is the total space:

E = Σ (x : A). B x

As we already know, this corresponds to the domain of the associated fibration.

### Dependent product

Given a family of sets B over A, a choice function is a function that assigns to each element x of A, an element y of B x. This is called a dependent function in type theory.

The corresponding notion for a fibration p : E → A is a function s : A → E such that for each x : A, the index of s x is exactly x. In other words, p ∘ s ≡ id, i.e. s is a section of p.

The set of such sections is called the dependent product of the family B.

### Pullbacks

Let A and A' be two sets, and B a family over A. Suppose we have a function

r : A' → A

We can easily define a family B' over A' by composing with r:

B' : A' → Set
B' x = B (r x)

What does the fibration p' : E' → A' associated to B' look like in terms of the fibration p : E → A associated to B?

Well, given an element b in the total space of B', b is going to be in B' x for some x : A'. Since B' x ≡ B (r x) by definition, b can also be regarded as an element of the total space of B. So we have a map s : E' → E, and we can draw the following diagram:

$\newcommand{\ra}[1]{\kern-1.5ex\xrightarrow{\ \ #1\ \ }\phantom{}\kern-1.5ex} \newcommand{\ras}[1]{\kern-1.5ex\xrightarrow{\ \ \smash{#1}\ \ }\phantom{}\kern-1.5ex} \newcommand{\da}[1]{\bigg\downarrow\raise.5ex\rlap{\scriptstyle#1}} \begin{array}{c} E' & \ra{s} & E \\ \da{p'} & & \da{p} \\ A' & \ra{r} & A \\ \end{array}$

The commutativity of this diagram follows from the immediate observation that the index above s b is exactly r x.

Now, given elements x : A', and b : E, saying that p b ≡ r x is equivalent to saying that b is in B (r x). In that case, b can be regarded as an element of B' x, which means that there exists a b' in E' such that p' b' ≡ x and s b' ≡ b.

What this says is that the above diagram is a pullback square.

It is important to note that the previous constructions are related in interesting ways.

Let’s look at a simple special case of the pullback construction, i.e. when B is a trivial family of just one element. That means that the display map p associated to B is the canonical map

p : B → 1

So, if A' is any other type, we get that the pullback of p along the unique map r : A' → 1 is the product B × A.

This defines a functor

$A^\ast : \mathrm{Set} → \mathrm{Set}/A$

where $$\mathrm{Set}/A$$ denotes the slice category of sets over A. Furthermore, the dependent product and dependent sum constructions defined above give rise to functors:

$Σ_A, Π_A : \mathrm{Set}/A → \mathrm{Set}$

Now, it is clear that, given a fibration p : X → A and a set Y, functions X → Y are the same as morphisms X → Y × A in the slice category. So $$Σ_A$$ is left adjoint to $$A^\ast$$.

Dually, functions from Y to the set of sections of p correspond to functions Y × A → X in the slice category, thus giving an adjuction between $$A^*$$ and $$Π_A$$.

So we have the following chain of adjunctions:

$Σ_A \vdash A^* \vdash Π_A$

## Conclusion

The correspondence between indexed families and fibrations exemplified here extends well beyond the category of sets, and can be abstracted using the notions of Cartesian morphisms and fibred categories.

In type theory, it is useful to think of this correspondence when working with models of dependently typed theories in locally cartesian closed categories, and I hope that the examples given here show why slice categories and pullback functors play an important role in that setting.

# Continuation-based relative-time FRP

In a previous post I showed how it is possible to write asynchronous code in a direct style using the ContT monad. Here, I’ll extend the idea further and present an implementation of a very simple FRP library based on continuations.

> {-# LANGUAGE DoRec, BangPatterns #-}
> import Control.Applicative
> import Data.IORef
> import Data.Monoid
> import Data.Void

Let’s start by defining a callback-based Event type:

> newtype Event a = Event { on :: (a -> IO ()) -> IO Dispose }

A value of type Event a represents a stream of values of type a, each occurring some time in the future. The on function connects a callback to an event, and returns an object of type Dispose, which can be used to disconnect from the event:

> newtype Dispose = Dispose { dispose :: IO () }
>
> instance Monoid Dispose where
>   mempty = Dispose (return ())
>   mappend d1 d2 = Dispose $do > dispose d1 > dispose d2 The interesting thing about this Event type is that, like the simpler variant we defined in the above post, it forms a monad: > instance Monad Event where First of all, given a value of type a, we can create an event occurring “now” and never again: > return x = Event$ \k -> k x >> return mempty

Note that the notion of “time” for an Event is relative.

All time-dependent notions about Events are formulated in terms of a particular “zero” time, but this origin of times is not explicitly specified.

This makes sense, because, even though the definition of Event uses the IO monad, an Event object, in itself, is an immutable value, and can be reused multiple times, possibly with different starting times.

>   e >>= f = Event $\k -> do > dref <- newIORef mempty > addD dref e$ \x ->
>       addD dref (f x) k
>     return . Dispose $> readIORef dref >>= dispose > > addD :: IORef Dispose -> Event a -> (a -> IO ()) -> IO () > addD d e act = do > d' <- on e act > modifyIORef d (mappend d') The definition of >>= is slightly more involved. We call the function f every time an event occurs, and we connect to the resulting event each time using the helper function addD, accumulating the corresponding Dispose object in an IORef. The resulting Dispose object is a function that reads the IORef accumulator and calls dispose on that. As the diagram shows, the resulting event e >>= f includes occurrences of all the events originating from the occurrences of the initial event e. ## Event union Classic FRP comes with a number of combinators to manipulate event streams. One of the most important is event union, which consists in merging two or more event streams into a single one. In our case, event union can be implemented very easily as an Alternative instance: > instance Functor Event where > fmap = liftM > > instance Applicative Event where > pure = return > (<*>) = ap > > instance Alternative Event where > empty = Event$ \_ -> return mempty
>   e1 <|> e2 = Event $\k -> do > d1 <- on e1 k > d2 <- on e2 k > return$ d1 <> d2

An empty Event never invokes its callback, and the union of two events is implemented by connecting a callback to both events simultaneously.

## Other combinators

We need an extra primitive combinator in terms of which all other FRP combinators can be implemented using the Monad and Alternative instances.

> once :: Event a -> Event a
> once e = Event $\k -> do > rec d <- on e$ \x -> do
>              dispose d
>              k x
>   return d

The once combinator truncates an event stream at its first occurrence. It can be used to implement a number of different combinators by recursion.

> accumE :: a -> Event (a -> a) -> Event a
> accumE x e = do
>   f <- once e
>   let !x' = f x
>   pure x' <|> accumE x' e
>
> takeE :: Int -> Event a -> Event a
> takeE 0 _ = empty
> takeE 1 e = once e
> takeE n e | n > 1 = do
>   x <- once e
>   pure x <|> takeE (n - 1) e
> takeE _ _ = error "takeE: n must be non-negative"
>
> dropE :: Int -> Event a -> Event a
> dropE n e = replicateM_ n (once e) >> e

## Behaviors and side effects

We address behaviors and side effects the same way, using IO actions, and a MonadIO instance for Event:

> instance MonadIO Event where
>   liftIO m = Event $\k -> do > m >>= k > return mempty > > newtype Behavior a = Behavior { valueB :: IO a } > > getB :: Behavior a -> Event a > getB = liftIO . valueB Now we can implement something like the apply combinator in reactive-banana: > apply :: Behavior (a -> b) -> Event a -> Event b > apply b e = do > x <- e > f <- getB b > return$ f x

Events can also perform arbitrary IO actions, which is necessary to actually connect an Event to user-visible effects:

> log :: Show a => Event a -> Event ()
> log e = e >>= liftIO . print

## Executing event descriptions

An entire GUI program can be expressed as an Event value, usually by combining a number of basic events using the Alternative instance.

A complete program can be run with:

> runEvent :: Event Void -> IO ()
> runEvent e = void $on e absurd > > runEvent_ :: Event a -> IO () > runEvent_ = runEvent . (>> empty) ## Underlying assumptions For this simple system to work, events need to possess certain properties that guarantee that our implementations of the basic combinators make sense. First of all, callbacks must be invoked sequentially, in the order of occurrence of their respective events. Furthermore, we assume that callbacks for the same event (or simultaneous events) will be called in the order of connection. Many event-driven frameworks provide those guarantees directly. For those that do not, a driver can be written converting underlying events to Event values satisfying the required ordering properties. ## Conclusion It’s not immediately clear whether this approach can scale to real-world GUI applications. Although the implementation presented here is quite simplistic, it could certainly be made more efficient by, for example, making Dispose stricter, or adding more information to Event to simplify some common special cases. This continuation-based API is a lot more powerful than the usual FRP combinator set. The Event type combines the functionalities of both the classic Event and Behavior types, and it offers a wider interface (Monad rather than only Applicative). On the other hand, it is a lot less safe, in a way, since it allows to freely mix IO actions with event descriptions, and doesn’t enforce a definite separation between the two. Libraries like reactive-banana do so by distinguishing beween “network descriptions” and events/behaviors. Finally, there is really no sharing of intermediate events, so expensive computations occurring, say, inside an accumE can end up being unnecessarily performed more than once. This is not just an implementation issue, but a consequence of the strict equality model that this FRP formulation employs. Even if two events are identical, they might not actually behave the same when they are used, because they are going to be “activated” at different times. # Pipes 2.0 vs pipes-core With the release of pipes 2.0 by Gabriel Gonzalez, I feel it’s time to address the question of whether my fork will eventually be merged or not. The short answer is no, I will continue to maintain my separate incarnation pipes-core. In this post, I will discuss the reasoning behind this decision, and hopefully explain the various trade-offs that the two libraries make. ## The issue with termination pipes 1.0 can be quite accurately described as “composable monadic stream processors”. “Composable” alludes to horizontal composition (i.e. the Category instance), while “monadic” refers to vertical composition. The existence of a Monad instance has a number of consequences, the most important being the fact that pipes can carry a “return value”, and, in particular, they can terminate. The fact that pipes can terminate poses the greatest challenge when reasoning about the properties of (horizontal) composition, but termination is also one of the nicest features of pipes, so we want to deal with this difficulty appropriately. Termination implies that any pipe has to deal somehow with the fact that its upstream pipe can exit before yielding a value, which basically means that an await can fail. Gabriel’s pipes address this issue by simply “propagating termination downstream”. A pipe awaiting on a terminated pipe is forcibly terminated itself, and the upstream return value is returned. My guarded pipes idea (later integrated into pipes-core), proposes a new primitive tryAwait :: Pipe a b m (Maybe a) that returns Nothing when upstream terminates before providing a value. Using tryAwait, a pipe can then handle a failure caused by termination, and either return a value, or use the upstream value (the latter can be accomplished by simply awaiting again). ## Exception handling Once you realize that pipes should be able to handle failure on await, it becomes very natural to extend the idea to other kinds of failure. That’s exactly the rationale behind pipes-core. It introduces slightly more involved primitives that take into account the fact that actions in the base monad, as well as pipes themselves, can throw an exception at any time. One very interesting consequence of built-in exception handling is that the “guarded pipes” concept can be integrated seamlessly by introducing a special BrokenPipe exception. The exception handling implementation in pipes-core works in any monad, and deals with asynchronous exceptions correctly. Of course, actual exceptions thrown from Haskell code can only be caught when the base monad is IO. ## What about finalization? Since all the finalization primitives in Control.Exception are implemented on top of exception handling primitives like catch and mask, I initially believed that finalization would follow automatically from exception handling capabilities in pipes. Unfortunately, there is a fundamental operational difference between IO and Pipe, which makes exception handling alone insufficient to guarantee finalization of resources. The problem is that some of the pipes in a pipeline are not guaranteed to be executed at all. In fact, a pipe only plays a role in pipeline execution if its downstream pipe awaits at some point (or if it is the last one). The same applies to “portions” of pipes, so a pipe can execute partially, and then be completely forgotten, even if no exceptional condition occurs. After a number of failed attempts (including the broken 0.0.1 release of pipes-core), I realized that Gabriel’s finalizer passing idea was the right one, and used it to replace my flawed ensure primitive. ## Balancing safety and dynamicity The question remains of how to guarantee that a pipe never awaits again after its upstream terminated. My solution is dynamic: if upstream terminated because of an exception (that has been handled), just throw the exception again on await; if upstream terminated normally, throw a BrokenPipe exception. Gabriel’s solution is static: a pipe is not allowed to await again after termination, and the invariant is enforced by the types. The static solution has obvious advantages, but, on closer inspection, it reveals a number of downsides: 1. It prevents Pipe from forming a Monad; the solution implemented in pipes 2.0 is to separate the Monad instance from the Category instance, and suggesting that the Monad instance should actually be replaced with an indexed monad. 2. It doesn’t provide any exception handling mechanism, and doesn’t guarantee that finalizers will be called in case any exception occurs. I imagine that some sort of exception support could be layered on top of the current solution, but I’m guessing it’s not going to be straightforward. 3. Folds are not compositional. This can be clearly seen in the tutorial, where strict is not defined in terms of toList. With pipes-core, you would simply have: strict = consume >>= mapM yield -- note that toList is called consume in pipes-core ## What’s next for pipes-core The current version of pipes-core already provides exception handling and guaranteed finalization in the face of asynchronous exceptions. Things that could be improved in its finalization support are: 1. Finalization is currently guaranteed, but not always prompt. When an exception handler is provided, upstream finalization gets delayed unnecessarily. 2. It is not possible to prematurely force finalization. I haven’t yet seen an example where this would be useful, but it would be nice to have it for completeness. I think I know how these points can be addressed, and hopefully they will make it into the next release. For future releases, I’d like to focus on performance. Aside from micro-optimizations, I can see two main areas that would benefit from improvements: the Monad instance and the Category instance. The current monadic bind unfortunately displays a quadratic behavior, since it basically works like a naive list concatenation function. The Codensity transformation should address that. For the Category instance, it would be interesting to explore whether it is possible to achieve some form of fusion of intermediate data structures, similarly to classic stream fusion for lists. This is probably going to be more of a challenge, and will likely require some significant restructuring, but the prospective benefits are enormous. There is some research on this topic and an initial attempt I plan to draw ideas from. My last point is about the absence of an unawait primitive for Pipe. There has been quite a lot of discussion on this topic, but I remain unconvinced that having builtin parsing capabilities is a good thing. Whenever there is a need to chain unconsumed input, there are a few viable options already: 1. Return leftover data, and add some manual wiring so that it’s passed to the “next” pipe. 2. Use PutbackPipe from pipes-extra. 3. Use an actual parser library and convert the parser to a Pipe (see pipes-attoparsec). In all the examples I have seen, however, pipes are composable enough that all the special logic to deal with boundaries of chunked streams can be implemented in a single “filter” pipe, and the rest of the pipeline can ignore the issue altogether. # Applicative option parser There are quite a few option parsing libraries on Hackage already, but they either depend on Template Haskell, or require some boilerplate. Although I have nothing against the use of Template Haskell in general, I’ve always found its use in this case particularly unsatisfactory, and I’m convinced that a more idiomatic solution should exist. In this post, I present a proof of concept implementation of a library that allows you to define type-safe option parsers in Applicative style. The only extension that we actually need is GADT, since, as will be clear in a moment, our definition of Parser requires existential quantification. > {-# LANGUAGE GADTs #-} > import Control.Applicative Let’s start by defining the Option type, corresponding to a concrete parser for a single option: > data Option a = Option > { optName :: String > , optParser :: String -> Maybe a > } > > instance Functor Option where > fmap f (Option name p) = Option name (fmap f . p) > > optMatches :: Option a -> String -> Bool > optMatches opt s = s == '-' : '-' : optName opt For simplicity, we only support “long” options with exactly 1 argument. The optMatches function checks if an option matches a string given on the command line. We can now define the main Parser type: > data Parser a where > NilP :: a -> Parser a > ConsP :: Option (a -> b) > -> Parser a -> Parser b > > instance Functor Parser where > fmap f (NilP x) = NilP (f x) > fmap f (ConsP opt rest) = ConsP (fmap (f.) opt) rest > > instance Applicative Parser where > pure = NilP > NilP f <*> p = fmap f p > ConsP opt rest <*> p = > ConsP (fmap uncurry opt) ((,) <$> rest <*> p)

The Parser GADT resembles a heterogeneous list, with two constructors.

The NilP r constructor represents a “null” parser that doesn’t consume any arguments, and always returns r as a result.

The ConsP constructor is the combination of an Option returning a function, and an arbitrary parser returning an argument for that function. The combined parser applies the function to the argument and returns a result.

The definition of (<*>) probably needs some clarification. The variables involved have types:

opt :: Option (a -> b -> c)
rest :: Parser a
p :: Parser b

and we want to obtain a parser of type Parser c. So we uncurry the option, obtaining:

fmap uncurry opt :: Option ((a, b) -> c)

and compose it with a parser for the (a, b) pair, obtained by applying the (<*>) operator recursively:

(,) <$> rest <*> p :: Parser (a, b) This is already enough to define some example parsers. Let’s first add a couple of convenience functions to help us create basic parsers: > option :: String -> (String -> Maybe a) -> Parser a > option name p = ConsP (fmap const (Option name p)) (NilP ()) > optionR :: Read a => String -> Parser a > optionR name = option name p > where > p arg = case reads arg of > [(r, "")] -> Just r > _ -> Nothing And a record to contain the result of our parser: > data User = User > { userName :: String > , userId :: Integer > } deriving Show A parser for User is easily defined in applicative style: > parser :: Parser User > parser = User <$> option "name" Just <*> optionR "id"

To be able to actually use this parser, we need a “run” function:

> runParser :: Parser a -> [String] -> Maybe (a, [String])
> runParser (NilP x) args = Just (x, args)
> runParser (ConsP _ _) [] = Nothing
> runParser p (arg : args) =
>   case stepParser p arg args of
>     Nothing -> Nothing
>     Just (p', args') -> runParser p' args'
>
> stepParser :: Parser a -> String -> [String] -> Maybe (Parser a, [String])
> stepParser p arg args = case p of
>   NilP _ -> Nothing
>   ConsP opt rest
>     | optMatches opt arg -> case args of
>         [] -> Nothing
>         (value : args') -> do
>           f <- optParser opt value
>           return (fmap f rest, args')
>     | otherwise -> do
>         (rest', args') <- stepParser rest arg args
>         return (ConsP opt rest', args')

The idea is very simple: we take the first argument, and we go over each option of the parser, check if it matches, and if it does, we replace it with a NilP parser wrapping the result, consume the option and its argument from the argument list, then call runParser recursively.

Here is an example of runParser in action:

> ex1 :: Maybe User
> ex1 = fst <$> runParser parser ["--name", "fry", "--id", "1"] > {- Just (User {userName = "fry", userId = 1}) -} The order of arguments doesn’t matter: > ex2 :: Maybe User > ex2 = fst <$> runParser parser ["--id", "2", "--name", "bender"]
> {- Just (User {userName = "bender", userId = 2}) -}

Missing arguments will result in a parse error (i.e. Nothing). We don’t support default values but they are pretty easy to add.

> ex3 :: Maybe User
> ex3 = fst <$> runParser parser ["--name", "leela"] > {- Nothing -} I think the above Parser type represents a pretty clean and elegant solution to the option parsing problem. To make it actually usable, I would need to add a few more features (boolean flags, default values, a help generator) and improve error handling and performance (right now parsing a single option is quadratic in the size of the Parser), but it looks like a fun project. Does anyone think it’s worth adding yet another option parser to Hackage? # Monoidal instances for pipes In this post, I’m going to introduce a new class of combinators for pipes, with an interesting categorical interpretation. I will be using the pipe implementation of my previous post. > {-# LANGUAGE MultiParamTypeClasses #-} > {-# LANGUAGE FlexibleInstances #-} > {-# LANGUAGE TypeFamilies #-} > {-# LANGUAGE GeneralizedNewtypeDeriving #-} > module Blog.Pipes.MonoidalInstances where > > import Blog.Pipes.Guarded hiding (groupBy) > import qualified Control.Arrow as A > import Control.Category > import Control.Categorical.Bifunctor > import Control.Category.Associative > import Control.Category.Braided > import Control.Category.Monoidal > import Control.Monad (forever) > import Control.Monad.Free > import Data.Maybe > import Data.Void > import Prelude hiding ((.), id, filter, until) When pipes were first released, some people noticed the lack of an Arrow instance. In fact, it is not hard to show that, even identifying pipes modulo some sort of observational equality, there is no Arrow instance that satisfies the arrow laws. The problem, of course, is with first, because we already have a simple implementation of arr. If we try to implement first we immediately discover that there’s a problem with the Yield case: first (Yield x c) = yield (x, ???) >> first c Since ??? can be of any type, the only possible value is bottom, which of course we don’t want to introduce. Alternative definitions of first that alter the structure of a yielding pipe are not possible if we want to satisfy the law: first p >+> pipe fst == pipe fst >+> p Concretely, the problem is that the cartesian product in the type of first forces a sort of “synchronization point” that doesn’t necessarily exist. This is better understood if we look at the type of (***), of which first can be thought of as a special case: (***) :: Arrow k => k a b -> k a' b' -> k (a, a') (b, b') first = (*** id) If the two input pipes yield at different times, there is no way to faithfully match their yielded values into a pair. There are hacks around that, but they don’t behave well compositionally, and exhibit either arbitrarily large space leaks or data loss. This has been addressed before: stream processors, like those of the Fudgets library, being very similar to Pipes, have the same problem, and some resolutions have been proposed, although not entirely satisfactory. ## Arrows as monoidal categories It is well known within the Haskell community that Arrows correspond to so called Freyd categories, i.e. premonoidal categories with some extra structures. Using the Monoidal class by Edward Kmett (now in the categories package on Hackage), we can try to make this idea precise. Unfortunately, we have to use a newtype to avoid overlapping instances in the case of the Hask category: > newtype ACat a b c = ACat { unACat :: a b c } > deriving (Category, A.Arrow) First, cartesian products are a bifunctor in the category determined by an Arrow. > instance A.Arrow a => PFunctor (,) (ACat a) (ACat a) where > first = ACat . A.first . unACat > instance A.Arrow a => QFunctor (,) (ACat a) (ACat a) where > second = ACat . A.second . unACat > instance A.Arrow a > => Bifunctor (,) (ACat a) (ACat a) (ACat a) where > bimap (ACat f) (ACat g) = ACat$ f A.*** g

Now we can say that products are associative, using the associativity of products in Hask:

> instance A.Arrow a => Associative (ACat a) (,) where
>   associate = ACat $A.arr associate > instance A.Arrow a => Disassociative (ACat a) (,) where > disassociate = ACat$ A.arr disassociate

Where we use the Disassociative instance to express the inverse of the associator. And finally, the Monoidal instance:

> type instance Id (ACat a) (,) = ()
> instance A.Arrow a => Monoidal (ACat a) (,) where
>   idl = ACat $A.arr idl > idr = ACat$ A.arr idr
> instance A.Arrow a => Comonoidal (ACat a) (,) where
>   coidl = ACat $A.arr coidl > coidr = ACat$ A.arr coidr

Where, again, the duals are actually inverses. Also, products are symmetric:

> instance A.Arrow a => Braided (ACat a) (,) where
>   braid = ACat $A.arr braid > instance A.Arrow a => Symmetric (ACat a) (,) As you see, everything is trivially induced by the cartesian structure on Hask, since A.arr gives us an identity-on-objects functor. Note, however, that the Bifunctor instance is legitimate only if we assume a strong commutativity law for arrows: first f >>> second g == second g >>> first f which we will, for the sake of simplicity. ## Replacing products with arbitrary monoidal structures Once we express the Arrow concept in terms of monoidal categories, it is easy to generalize it to arbitrary monoidal structures on Hask. In particular, coproducts work particularly well in the category of pipes: > instance Monad m > => PFunctor Either (PipeC m r) (PipeC m r) where > first = PipeC . firstP . unPipeC > > firstP :: Monad m => Pipe a b m r > -> Pipe (Either a c) (Either b c) m r > firstP (Pure r) = return r > firstP (Free (M m)) = lift m >>= firstP Yielding a sum is now easy: just yield on the left component. > firstP (Free (Yield x c)) = yield (Left x) >> firstP c Awaiting is a little bit more involved, but still easy enough: receive left and null values normally, and act like an identity on the right. > firstP (Free (Await k)) = go > where > go = tryAwait > >>= maybe (firstP$ k Nothing)
>                      (either (firstP . k . Just)
>                              (\x -> yield (Right x) >> go))

And of course we have an analogous instance on the right:

> instance Monad m
>       => QFunctor Either (PipeC m r) (PipeC m r) where
>   second = PipeC . secondP . unPipeC
>
> secondP :: Monad m => Pipe a b m r
>         -> Pipe (Either c a) (Either c b) m r
> secondP (Pure r) = return r
> secondP (Free (M m)) = lift m >>= secondP
> secondP (Free (Yield x c)) = yield (Right x) >> secondP c
> secondP (Free (Await k)) = go
>         where
>           go = tryAwait
>            >>= maybe (secondP $k Nothing) > (either (\x -> yield (Left x) >> go) > (secondP . k . Just)) And a bifunctor instance obtained by composing first and second in arbitrary order: > instance Monad m > => Bifunctor Either (PipeC m r) > (PipeC m r) (PipeC m r) where > bimap f g = first f >>> second g At this point we can go ahead and define the remaining instances in terms of the identity-on-objects functor given by pipe: > instance Monad m => Associative (PipeC m r) Either where > associate = PipeC$ pipe associate
> instance Monad m => Disassociative (PipeC m r) Either where
>   disassociate = PipeC $pipe disassociate > > type instance Id (PipeC m r) Either = Void > instance Monad m => Monoidal (PipeC m r) Either where > idl = PipeC$ pipe idl
>   idr = PipeC $pipe idr > instance Monad m => Comonoidal (PipeC m r) Either where > coidl = PipeC$ pipe coidl
>   coidr = PipeC $pipe coidr > > instance Monad m => Braided (PipeC m r) Either where > braid = PipeC$ pipe braid
> instance Monad m => Symmetric (PipeC m r) Either

## Multiplicative structures

There is still a little bit of extra structure that we might want to exploit. Since PipeC m r is a monoidal category, it induces a (pointwise) monoidal structure on its endofunctor category, so we can speak of monoid objects there. In particular, if the identity functor is a monoid, it means that we can define a “uniform” monoid structure for all the objects of our category, given in terms of natural transformations (i.e. polymorphic functions).

We can represent this specialized monoid structure with a type class (using kind polymorphism and appropriately generalized category-related type classes, it should be possible to unify this class with Monoid and even Monad, similarly to how it’s done here):

> class Monoidal k p => Multiplicative k p where
>   unit :: k (Id k p) a
>   mult :: k (p a a) a

Dually, we can have a sort of uniform coalgebra:

> class Comonoidal k p => Comultiplicative k p where
>   counit :: k a (Id k p)
>   comult :: k a (p a a)

The laws for those type classes are just the usual laws for a monoid in a (not necessarily strict) monoidal category:

first unit . mult == idl
second unit . mult == idr
mult . first mult == mult . second mult . associate

first counit . comult == coidl
second counit . comult == coidr
first diag . diag == disassociate . second diag . diag

Now, products have a comultiplicative structure on Hask (as in every category with finite products), given by the terminal object and diagonal natural transformation:

> instance Comultiplicative (->) (,) where
>   counit = const ()
>   comult x = (x, x)

while coproducts have a multiplicative structure:

> instance Multiplicative (->) Either where
>   unit = absurd
>   mult = either id id

that we can readily transport to PipeC m r using pipe:

> instance Monad m => Multiplicative (PipeC m r) Either where
>   unit = PipeC $pipe absurd > mult = PipeC$ pipe mult

Somewhat surprisingly, pipes also have a comultiplicative structure of their own:

> instance Monad m => Comultiplicative (PipeC m r) Either where
>   comult = PipeC . forever $do > x <- await > yield (Left x) > yield (Right x) ## Heterogeneous metaprogramming All the combinators we defined can actually be used in practice, and the division in type classes certainly sheds some light on their structure and properties, but there’s actually something deeper going on here. The fact that the standard Arrow class uses (,) as monoidal structure is not coincidental: Hask is a cartesian closed category, so to embed Haskell’s simply typed λ-calculus into some other category structure, we need at the very least a way to transport cartesian products, i.e. a premonoidal functor. However, as long as our monoidal structure is comultiplicative and symmetric, we can always recover a first-order fragment of $$\lambda$$-calculus inside the “guest” category, and we don’t even need an identity-on-objects functor (see for example this paper). The idea is that we can use the monoidal structure of the guest category to represent contexts, where weakening is given by counit, contraction by comult, and exchange by swap. There is an experimental GHC branch with a preprocessor which is able to translate expressions written in an arbitrary guest language into Haskell, given instances of appropriate type classes , which correspond exactly to the ones we have defined above. ## Examples This exposition was pretty abstract, so we end with some examples. We first need to define a few wrappers for our monoidal combinators, so we don’t have to deal with the PipeC newtype: > split :: Monad m => Pipe a (Either a a) m r > split = unPipeC comult > > join :: Monad m => Pipe (Either a a) a m r > join = unPipeC mult > > (*+*) :: Monad m => Pipe a b m r -> Pipe a' b' m r > -> Pipe (Either a a') (Either b b') m r > f *+* g = unPipeC$ bimap (PipeC f) (PipeC g)
>
> discardL :: Monad m => Pipe (Either Void a) a m r
>
> discardR :: Monad m => Pipe (Either a Void) a m r
> discardR = unPipeC idr

Now let’s write a tee combinator, similar to the tee command for shell pipes:

> tee :: Monad m => Pipe a Void m r -> Pipe a a m r
> tee p = split >+> firstP p >+> discardL
>
> printer :: Show a => Pipe a Void IO r
> printer = forever $await >>= lift . print > > ex6 :: IO () > ex6 = do > (sourceList [1..5] >+> > tee printer >+> > (fold (+) 0 >>= yield) $$> printer) > return () > {- ex6 == mapM_ print [1,2,3,4,5,15] -} Another interesting exercise is reimplementing the groupBy combinator of the previous post: > groupBy :: Monad m => (a -> a -> Bool) -> Pipe a [a] m r > groupBy p = > -- split the stream in two > split >+> > > -- yield Nothing whenever (not (p x y)) > -- for consecutive x y > ((consec >+> > filter (not . uncurry p) >+> > pipe (const Nothing)) *+* > > -- at the same time, let everything pass through > pipe Just) >+> > > -- now rejoin the two streams > join >+> > > -- then accumulate results until a Nothing is hit > forever (until isNothing >+> > pipe fromJust >+> > (consume >>= yield)) > > -- yield consecutive pairs of values > consec :: Monad m => Pipe a (a, a) m r > consec = await >>= go > where > go x = await >>= \y -> yield (x, y) >> go y > > ex7 :: IO () > ex7 = do (sourceList [1,1,2,2,2,3,4,4] > >+> groupBy (==) > >+> pipe head >$$ printer) > return () > {- ex7 == mapM_ print [1,2,3,4] -} # An introduction to guarded pipes Pipes are a very simple but powerful abstraction which can be used to implement stream-based IO, in a very similar fashion to iteratees and friends, or conduits. In this post, I introduce guarded pipes: a slight generalization of pipes which makes it possible to implement a larger class of combinators. > {-# LANGUAGE NoMonomorphismRestriction #-} > module Blog.Pipes.Guarded where > > import Control.Category > import Control.Monad.Free > import Control.Monad.Identity > import Data.Maybe > import Data.Void > import Prelude hiding (id, (.), until, filter) The idea behind pipes is straightfoward: fix a base monad m, then construct the free monad over a specific PipeF functor: > data PipeF a b m x = M (m x) > | Yield b x > | Await (Maybe a -> x) > > instance Monad m => Functor (PipeF a b m) where > fmap f (M m) = M$ liftM f m
>   fmap f (Yield x c) = Yield x (f c)
>   fmap f (Await k) = Await (f . k)
>
> type Pipe a b m r = Free (PipeF a b m) r

Generally speaking, a free monad can be thought of as an embedded language in CPS style: every summand of the base functor (PipeF in this case), is a primitive operation, while the x parameter represents the continuation at each step.

In the case of pipes, M corresponds to an effect in the base monad, Yield produces an output value, and Await blocks until it receives an input value, then passes it to its continuation. You can see that the Await continuation takes a Maybe a type: this is the only thing that distinguishes guarded pipes from regular pipes (as implemented in the pipes package on Hackage). The idea is that Await will receive Nothing whenever the pipe runs out of input values. That will give it a chance to do some cleanup or yield extra outputs. Any additional Await after that point will terminate the pipe immediately.

We can write a simplistic list-based (strict) interpreter formalizing the semantics I just described:

> evalPipe :: Monad m => Pipe a b m r -> [a] -> m [b]
> evalPipe p xs = go False xs [] p

The boolean parameter is going to be set to True as soon as we execute an Await with an empty input list.

A Pure value means that the pipe has terminated spontaneously, so we return the accumulated output list:

>   where
>     go _ _ ys (Pure r) = return (reverse ys)

>     go t xs ys (Free (M m)) = m >>= go t xs ys

Save yielded values into the accumulator:

>     go t xs ys (Free (Yield y c)) = go t xs (y : ys) c

If we still have values in the input list, feed one to the continuation of an Await statement.

>     go t (x:xs) ys (Free (Await k)) = go t xs ys $k (Just x) If we run out of inputs, pass Nothing to the Await continuation… > go False [] ys (Free (Await k)) = go True [] ys (k Nothing) … but only the first time. If the pipe awaits again, terminate it. > go True [] ys (Free (Await _)) = return (reverse ys) To simplify the implementation of actual pipes, we define the following basic combinators: > tryAwait :: Monad m => Pipe a b m (Maybe a) > tryAwait = wrap$ Await return
>
> yield :: Monad m => b -> Pipe a b m ()
> yield x = wrap $Yield x (return ()) > > lift :: Monad m => m r -> Pipe a b m r > lift = wrap . M . liftM return and a couple of secondary combinators, very useful in practice. First, a pipe that consumes all input and never produces output: > discard :: Monad m => Pipe a b m r > discard = forever tryAwait then a simplified await primitive, that dies as soon as we stop feeding values to it. > await :: Monad m => Pipe a b m a > await = tryAwait >>= maybe discard return now we can write a very simple pipe that sums consecutive pairs of numbers: > sumPairs :: (Monad m, Num a) => Pipe a a m () > sumPairs = forever$ do
>   x <- await
>   y <- await
>   yield $x + y we get: > ex1 :: [Int] > ex1 = runIdentity$ evalPipe sumPairs [1,2,3,4]
> {- ex1 == [3, 7] -}

## Composing pipes

The usefulness of pipes, however, is not limited to being able to express list transformations as monadic computations using the await and yield primitives. In fact, it turns out that two pipes can be composed sequentially to create a new pipe.

> infixl 9 >+>
> (>+>) :: Monad m => Pipe a b m r -> Pipe b c m r -> Pipe a c m r
> (>+>) = go False False
>   where

When implementing evalPipe, we needed a boolean parameter to signal upstream input exhaustion. This time, we need two boolean parameters, one for the input of the upstream pipe, and one for its output, i.e. the input of the downstream pipe. First, if downstream does anything other than waiting, we just let the composite pipe execute the same action:

>     go _ _ p1 (Pure r) = return r
>     go t1 t2 p1 (Free (Yield x c)) = yield x >> go t1 t2 p1 c
>     go t1 t2 p1 (Free (M m)) = lift m >>= \p2 -> go t1 t2 p1 p2

then, if upstream is yielding and downstream is waiting, we can feed the yielded value to the downstream pipe and continue from there:

>     go t1 t2 (Free (Yield x c)) (Free (Await k)) =
>       go t1 t2 c $k (Just x) if downstream is waiting and upstream is running a monadic computation, just let upstream run and keep downstream waiting: > go t1 t2 (Free (M m)) p2@(Free (Await _)) = > lift m >>= \p1 -> go t1 t2 p1 p2 if upstream terminates while downstream is waiting, finalize downstream: > go t1 False p1@(Pure _) (Free (Await k)) = > go t1 True p1 (k Nothing) but if downstream awaits again, terminate the whole composite pipe: > go _ True (Pure r) (Free (Await _)) = return r now, if both pipes are waiting, we keep the second pipe waiting and we feed whatever input we get to the first pipe. If the input is Nothing, we set the first boolean flag, so that next time the first pipe awaits, we can finalize the downstream pipe. > go False t2 (Free (Await k)) p2@(Free (Await _)) = > tryAwait >>= \x -> go (isNothing x) t2 (k x) p2 > go True False p1@(Free (Await _)) (Free (Await k)) = > go True True p1 (k Nothing) > go True True p1@(Free (Await _)) p2@(Free (Await _)) = > tryAwait >>= \_ -> {- unreachable -} go True True p1 p2 This composition can be shown to be associative (in a rather strong sense), with identity given by: > idP :: Monad m => Pipe a a m r > idP = forever$ await >>= yield

So we can define a Category instance:

> newtype PipeC m r a b = PipeC { unPipeC :: Pipe a b m r }
>
> instance Monad m => Category (PipeC m r) where
>   id = PipeC idP
>   (PipeC p2) . (PipeC p1) = PipeC $p1 >+> p2 ## Running pipes A runnable pipe, also called Pipeline, is a pipe that doesn’t yield any value and doesn’t wait for any input. We can formalize this in the types as follows: > type Pipeline m r = Pipe () Void m r Disregarding bottom, calling await on such a pipe does not return any useful value, and yielding is impossible. Another way to think of Pipeline is as an arrow (in PipeC) from the terminal object to the initial object of Hask1. Running a pipeline is straightforward: > runPipe :: Monad m => Pipeline m r -> m r > runPipe (Pure r) = return r > runPipe (Free (M m)) = m >>= runPipe > runPipe (Free (Await k)) = runPipe$ k (Just ())
> runPipe (Free (Yield x c)) = absurd x

where the impossibility of the last case is guaranteed by the types, unless of course the pipe introduced a bottom value at some point.

The three primitive operations tryAwait, yield and lift, together with pipe composition and the runPipe function above, are basically all we need to define most pipes and pipe combinators. For example, the simple pipe interpreter evalPipe can be easily rewritten in terms of these primitives:

> evalPipe' :: Monad m => Pipe a b m r -> [a] -> m [b]
> evalPipe' p xs = runPipe $> (mapM_ yield xs >> return []) >+> > (p >> discard) >+> > collect id > where > collect xs = > tryAwait >>= maybe (return$ xs [])
>                          (\x -> collect (xs . (x:)))

Note that we use the discard pipe to turn the original pipe into an infinite one, so that the final return value will be taken from the final pipe.

## Extra combinators

The rich structure on pipes (category and monad) makes it really easy to define new higher-level combinators. For example, here are implementations of some of the combinators in Data.Conduit.List, translated to pipes:

> sourceList = mapM_ yield
> sourceNull = return ()
> fold f z = go z
>   where
>     go x = tryAwait >>= maybe (return x) (go . f x)
> consume = fold (\xs x -> xs . (x:)) id >>= \xs -> return (xs [])
> take n = (isolate n >> return []) >+> consume
> drop n = replicateM n await >> idP
> pipe f = forever $await >>= yield . f -- called map in conduit > concatMap f = forever$ await >>= mapM_ yield . f
> until p = go
>   where
>     go = await >>= \x -> if p x then return () else yield x >> go
> groupBy (~=) = p >+>
>   forever (until isNothing >+>
>            pipe fromJust >+>
>            (consume >>= yield))
>   where
>     -- the pipe p yields Nothing whenever the current item y
>     -- and the previous one x do not satisfy x ~= y, and behaves
>     -- like idP otherwise
>     p = await >>= \x -> yield (Just x) >> go x
>     go x = do
>       y <- await
>       unless (x ~= y) $yield Nothing > yield$ Just y
>       go y
> isolate n = replicateM_ n $await >>= yield > filter p = forever$ until (not . p)

To work with the equivalent of sinks, it is useful to define a source to sink composition operator:

> infixr 2 $$> ($$) :: Monad m => Pipe () a m r' -> Pipe a Void m r -> m (Maybe r)
> p1 $$p2 = runPipe  (p1 >> return Nothing) >+> liftM Just p2 which ignores the source return type, and just returns the sink return value, or Nothing if the source happens to terminate first. So we have, for example: > ex2 :: Maybe [Int] > ex2 = runIdentity  sourceList [1..10] >+> isolate 4$$ consume
> {- ex2 == Just [1,2,3,4] -}
>
> ex3 :: Maybe [Int]
> ex3 = runIdentity $sourceList [1..10] $$discard > {- ex3 == Nothing -} > > ex4 :: Maybe Int > ex4 = runIdentity sourceList [1,1,2,2,2,3,4,4] > >+> groupBy (==) > >+> pipe head >$$ fold (+) 0 > {- ex4 == Just 10 -} > > ex5 :: Maybe [Int] > ex5 = runIdentity$ sourceList [1..10]
>                 >+> filter (\x -> x mod 3 == 0)
>                   consume
> {- ex5 == Just [3, 6, 9] -}

## Pipes in practice

You can find an implementation of guarded pipes in my fork of pipes. There is also a pipes-extra repository where you can find some pipes to deal with chunked ByteStreams and utilities to convert conduits to pipes.

I hope to be able to merge this into the original pipes package once the guarded pipe concept has proven its worth. Without the tryAwait primitive, combinators like fold and consume cannot be implemented, nor even a simple stateful pipe like one to split a chunked input into lines. So I think there are enough benefits to justify a little extra complexity in the definition of composition.

1. In reality, Hask doesn’t have an initial object, and the terminal object is actually Void, because of non-strict semantics.