Families and fibrations

Introduction

The notion of a family of “objects” indexed over an object of the same type is ubiquitous in mathematics and computer science.

It appears everywhere in topology and algebraic geometry, in the form of bundles, covering maps, or, more generally, fibrations.

In type theory, it is the fundamental idea captured by the notion of dependent type, on which Martin-Löf intuitionistic type theory is based.

Definition

Restricting ourselves to \(\mathrm{Set}\), the category of sets, for the time being (and ignoring issues of size), it is straightforward to give a formal definition of what a family of sets is:

Given a set A, a family over A is a function from A to the objects of the category of sets (or equivalently, on the other side of the adjunction, a functor from A regarded as a discrete category to \(\mathrm{Set}\)).

This is a perfectly valid definition, but it has two problems:

  1. It can be awkward to work with functions between objects of different “sorts” (like sets and universes).

  2. It is not clear how to generalize the idea to other categories, like \(\mathrm{Top}\) (the category of topological spaces and continuous maps), for example. In fact, we would like a family of spaces to be “continuous” in some sense, but in order for that to make sense, we would need to define a topology on the class of topological spaces.

Display maps

Fortunately, there is a very simple construction that brings this definition into a form which is much easier to work with.

Let’s start with a family of sets B over A, defined as above: B : A → Set.

Define the “total space” of the family as the disjoint union (or dependent sum) of all the sets of the family (I’ll use type theoretic notation from now on):

E = Σ (a : A) . B a

The fibration (or display map) associated to the family B is defined to be the first projection:

proj₁ : E → A

So far, we haven’t done very much. The interesting observation is that we can always recover a family of sets from any function E → A.

In fact, suppose now that E is any set, and p : E → A is any function. We can define a family of sets:

B : A → Set
B a = p ⁻¹ a

as the function that associates to each point of A its inverse image (or fiber) in E.

It is now straightforward to check that these two mappings between families and fibrations are inverses of one another.
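Concretely, checking this amounts to verifying the two round trips, up to canonical isomorphism:

\[ \mathrm{proj}_1^{-1}(a) \cong B\,a \qquad \Sigma\,(a : A).\,p^{-1}(a) \cong E \]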

Intuitively, given a family B, the corresponding fibration maps each element of all possible sets in the family to its “index” in A. Vice versa, given a fibration p : E → A, the corresponding family is just the family of fibers of p.

Here is a formalization in Agda of this correspondence as an isomorphism between families and fibrations. This uses agda-base instead of the standard library, as it needs univalence in order to make the isomorphism explicit.

Examples of constructions

Once we understand how families and fibrations are really two views of the same concept, we can look at a number of constructions for families, and check what they look like in the world of fibrations.

Dependent sum

The simplest construction is the total space:

E = Σ (x : A). B x

As we already know, this corresponds to the domain of the associated fibration.

Dependent product

Given a family of sets B over A, a choice function is a function that assigns to each element x of A an element y of B x. This is called a dependent function in type theory.

The corresponding notion for a fibration p : E → A is a function s : A → E such that for each x : A, the index of s x is exactly x. In other words, p ∘ s ≡ id, i.e. s is a section of p.

The set of such sections is called the dependent product of the family B.
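In symbols, writing p for the display map of B:

\[ \Pi\,(x : A).\,B\,x \;\cong\; \{\, s : A \to E \mid p \circ s = \mathrm{id}_A \,\} \]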

Pullbacks

Let A and A' be two sets, and B a family over A. Suppose we have a function

r : A' → A

We can easily define a family B' over A' by composing with r:

B' : A' → Set
B' x = B (r x)

What does the fibration p' : E' → A' associated to B' look like in terms of the fibration p : E → A associated to B?

Well, given an element b in the total space of B', b is going to be in B' x for some x : A'. Since B' x ≡ B (r x) by definition, b can also be regarded as an element of the total space of B. So we have a map s : E' → E, and we can draw the following diagram:
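\[
\begin{array}{ccc}
E' & \xrightarrow{\;s\;} & E \\
{\scriptstyle p'}\downarrow & & \downarrow{\scriptstyle p} \\
A' & \xrightarrow{\;r\;} & A
\end{array}
\]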

The commutativity of this diagram follows from the immediate observation that the index of s b is exactly r x.

Now, given elements x : A', and b : E, saying that p b ≡ r x is equivalent to saying that b is in B (r x). In that case, b can be regarded as an element of B' x, which means that there exists a b' in E' such that p' b' ≡ x and s b' ≡ b.

What this says is that the above diagram is a pullback square.

Adjoints

It is important to note that the previous constructions are related in interesting ways.

Let’s look at a simple special case of the pullback construction: the case when the base is the one-element set 1, so that B is just a single set. That means that the display map p associated to B is the canonical map

p : B → 1

So, if A is any set, we get that the pullback of p along the unique map r : A → 1 is the product B × A, with the associated display map being the projection B × A → A.

This defines a functor

\[ A^\ast : \mathrm{Set} → \mathrm{Set}/A \]

where \(\mathrm{Set}/A\) denotes the slice category of sets over A. Furthermore, the dependent product and dependent sum constructions defined above give rise to functors:

\[ Σ_A, Π_A : \mathrm{Set}/A → \mathrm{Set} \]

Now, it is clear that, given a fibration p : X → A and a set Y, functions X → Y are the same as morphisms X → Y × A in the slice category. So \(Σ_A\) is left adjoint to \(A^\ast\).

Dually, functions from Y to the set of sections of p correspond to morphisms Y × A → X in the slice category, thus giving an adjunction between \(A^*\) and \(Π_A\).
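Explicitly, for any object p : X → A of the slice and any set Y, the two adjunctions are given by the natural isomorphisms:

\[ \mathrm{Set}(\Sigma_A\,p,\,Y) \cong \mathrm{Set}/A\,(p,\,A^\ast Y) \qquad \mathrm{Set}/A\,(A^\ast Y,\,p) \cong \mathrm{Set}(Y,\,\Pi_A\,p) \]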

So we have the following chain of adjunctions:

\[ Σ_A \dashv A^* \dashv Π_A \]

Conclusion

The correspondence between indexed families and fibrations exemplified here extends well beyond the category of sets, and can be abstracted using the notions of Cartesian morphisms and fibred categories.

In type theory, it is useful to think of this correspondence when working with models of dependently typed theories in locally cartesian closed categories, and I hope that the examples given here show why slice categories and pullback functors play an important role in that setting.

Continuation-based relative-time FRP

In a previous post I showed how it is possible to write asynchronous code in a direct style using the ContT monad. Here, I’ll extend the idea further and present an implementation of a very simple FRP library based on continuations.

Monadic events

Let’s start by defining a callback-based Event type:
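import Control.Applicative
import Control.Monad (ap, liftM, unless)
import Control.Monad.IO.Class (MonadIO(..))
import Data.IORef

-- a minimal sketch: an event is anything a callback can be connected to
newtype Event a = Event { on :: (a -> IO ()) -> IO Dispose }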

A value of type Event a represents a stream of values of type a, each occurring some time in the future. The on function connects a callback to an event, and returns an object of type Dispose, which can be used to disconnect from the event:
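-- disconnecting is just running an IO action
newtype Dispose = Dispose { dispose :: IO () }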

The interesting thing about this Event type is that, like the simpler variant defined in the previous post, it forms a monad:

First of all, given a value of type a, we can create an event occurring “now” and never again:
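-- a sketch: pure fires immediately and leaves nothing to disconnect
-- (Functor and Applicative are recovered from the monadic operations)
instance Functor Event where
  fmap = liftM

instance Applicative Event where
  pure x = Event $ \k -> do
    k x
    return (Dispose (return ()))
  (<*>) = ap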

Note that the notion of “time” for an Event is relative.

All time-dependent notions about Events are formulated in terms of a particular “zero” time, but this origin of times is not explicitly specified.

This makes sense, because, even though the definition of Event uses the IO monad, an Event object, in itself, is an immutable value, and can be reused multiple times, possibly with different starting times.

The definition of >>= is slightly more involved.

We call the function f every time an event occurs, and we connect to the resulting event each time using the helper function addD, accumulating the corresponding Dispose object in an IORef.

The resulting Dispose object is a function that reads the IORef accumulator and calls dispose on that.
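In code, the instance could look like this:

instance Monad Event where
  e >>= f = Event $ \k -> do
    ds <- newIORef []
    let addD d = modifyIORef ds (d :)   -- accumulate Dispose objects
    -- every occurrence x of e spawns the event f x, whose occurrences
    -- are forwarded to the callback k
    on e (\x -> on (f x) k >>= addD) >>= addD
    return . Dispose $ readIORef ds >>= mapM_ dispose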

(figure: monadic bind)

As the diagram shows, the resulting event e >>= f includes occurrences of all the events originating from the occurrences of the initial event e.

Event union

Classic FRP comes with a number of combinators to manipulate event streams. One of the most important is event union, which merges two or more event streams into a single one.

In our case, event union can be implemented very easily as an Alternative instance:

An empty Event never invokes its callback, and the union of two events is implemented by connecting a callback to both events simultaneously.
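A sketch of the instance:

instance Alternative Event where
  empty = Event $ \_ -> return (Dispose (return ()))
  e1 <|> e2 = Event $ \k -> do
    d1 <- on e1 k
    d2 <- on e2 k
    return . Dispose $ dispose d1 >> dispose d2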

Other combinators

We need an extra primitive combinator in terms of which all other FRP combinators can be implemented using the Monad and Alternative instances.

The once combinator truncates an event stream at its first occurrence. It can be used to implement a number of different combinators by recursion.
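One possible implementation (a sketch that merely suppresses later occurrences; a production version would also disconnect from the underlying event):

once :: Event a -> Event a
once e = Event $ \k -> do
  fired <- newIORef False
  on e $ \x -> do
    f <- readIORef fired
    unless f $ writeIORef fired True >> k x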

Behaviors and side effects

We address behaviors and side effects the same way, using IO actions, and a MonadIO instance for Event:
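instance MonadIO Event where
  liftIO m = Event $ \k -> do
    m >>= k                        -- run the action once, fire "now"
    return (Dispose (return ()))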

Now we can implement something like the apply combinator in reactive-banana:
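-- a sketch: here a "behavior" is an IO action sampling a current value
apply :: IO (a -> b) -> Event a -> Event b
apply b e = do
  x <- e            -- for every occurrence of e
  f <- liftIO b     -- sample the behavior at that time
  return (f x)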

Events can also perform arbitrary IO actions, which is necessary to actually connect an Event to user-visible effects:
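-- a hypothetical helper: print every occurrence of an event
printE :: Show a => Event a -> Event ()
printE e = e >>= liftIO . print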

Executing event descriptions

An entire GUI program can be expressed as an Event value, usually by combining a number of basic events using the Alternative instance.

A complete program can be run with:
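-- a sketch: activate the network with a trivial callback, then enter
-- the main loop of the underlying framework (passed in as an argument)
runEvent :: IO () -> Event a -> IO ()
runEvent mainLoop e = do
  _ <- on e (\_ -> return ())
  mainLoop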

Underlying assumptions

For this simple system to work, events need to possess certain properties that guarantee that our implementations of the basic combinators make sense.

First of all, callbacks must be invoked sequentially, in the order of occurrence of their respective events.

Furthermore, we assume that callbacks for the same event (or simultaneous events) will be called in the order of connection.

Many event-driven frameworks provide those guarantees directly. For those that do not, a driver can be written that converts the underlying events to Event values satisfying the required ordering properties.

Conclusion

It’s not immediately clear whether this approach can scale to real-world GUI applications.

Although the implementation presented here is quite simplistic, it could certainly be made more efficient by, for example, making Dispose stricter, or adding more information to Event to simplify some common special cases.

This continuation-based API is a lot more powerful than the usual FRP combinator set. The Event type combines the functionalities of both the classic Event and Behavior types, and it offers a wider interface (Monad rather than only Applicative).

On the other hand, it is a lot less safe, in a way, since it allows IO actions to be mixed freely with event descriptions, and doesn’t enforce a definite separation between the two. Libraries like reactive-banana enforce such a separation by distinguishing between “network descriptions” and events/behaviors.

Finally, there is really no sharing of intermediate events, so expensive computations occurring, say, inside an accumE can end up being unnecessarily performed more than once.

This is not just an implementation issue, but a consequence of the strict equality model that this FRP formulation employs. Even if two events are identical, they might not actually behave the same when they are used, because they are going to be “activated” at different times.

Pipes 2.0 vs pipes-core

With the release of pipes 2.0 by Gabriel Gonzalez, I feel it’s time to address the question of whether my fork will eventually be merged or not.

The short answer is no, I will continue to maintain my separate incarnation pipes-core. In this post, I will discuss the reasoning behind this decision, and hopefully explain the various trade-offs that the two libraries make.

The issue with termination

pipes 1.0 can be quite accurately described as “composable monadic stream processors”. “Composable” alludes to horizontal composition (i.e. the Category instance), while “monadic” refers to vertical composition.

The existence of a Monad instance has a number of consequences, the most important being the fact that pipes can carry a “return value”, and, in particular, they can terminate.

The fact that pipes can terminate poses the greatest challenge when reasoning about the properties of (horizontal) composition, but termination is also one of the nicest features of pipes, so we want to deal with this difficulty appropriately.

Termination implies that any pipe has to deal somehow with the fact that its upstream pipe can exit before yielding a value, which basically means that an await can fail.

Gabriel’s pipes address this issue by simply “propagating termination downstream”. A pipe awaiting on a terminated pipe is forcibly terminated itself, and the upstream return value is returned.

My guarded pipes idea (later integrated into pipes-core) proposes a new primitive, tryAwait:
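tryAwait :: Monad m => Pipe a b m (Maybe a)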

It returns Nothing when upstream terminates before providing a value.

Using tryAwait, a pipe can then handle a failure caused by termination, and either return a value, or use the upstream return value (the latter can be accomplished by simply awaiting again).

Exception handling

Once you realize that pipes should be able to handle failure on await, it becomes very natural to extend the idea to other kinds of failure.

That’s exactly the rationale behind pipes-core. It introduces slightly more involved primitives that take into account the fact that actions in the base monad, as well as pipes themselves, can throw an exception at any time.

One very interesting consequence of built-in exception handling is that the “guarded pipes” concept can be integrated seamlessly by introducing a special BrokenPipe exception.

The exception handling implementation in pipes-core works in any monad, and deals with asynchronous exceptions correctly. Of course, actual exceptions thrown from Haskell code can only be caught when the base monad is IO.

What about finalization?

Since all the finalization primitives in Control.Exception are implemented on top of exception handling primitives like catch and mask, I initially believed that finalization would follow automatically from exception handling capabilities in pipes.

Unfortunately, there is a fundamental operational difference between IO and Pipe, which makes exception handling alone insufficient to guarantee finalization of resources.

The problem is that some of the pipes in a pipeline are not guaranteed to be executed at all. In fact, a pipe only plays a role in pipeline execution if its downstream pipe awaits at some point (or if it is the last one).

The same applies to “portions” of pipes, so a pipe can execute partially, and then be completely forgotten, even if no exceptional condition occurs.

After a number of failed attempts (including the broken 0.0.1 release of pipes-core), I realized that Gabriel’s finalizer passing idea was the right one, and used it to replace my flawed ensure primitive.

Balancing safety and dynamicity

The question remains of how to guarantee that a pipe never awaits again after its upstream terminated.

My solution is dynamic: if upstream terminated because of an exception (that has been handled), just throw the exception again on await; if upstream terminated normally, throw a BrokenPipe exception.

Gabriel’s solution is static: a pipe is not allowed to await again after termination, and the invariant is enforced by the types.

The static solution has obvious advantages, but, on closer inspection, it reveals a number of downsides:

  1. It prevents Pipe from forming a Monad; the solution implemented in pipes 2.0 is to separate the Monad instance from the Category instance, and to suggest that the Monad instance should actually be replaced with an indexed monad.
  2. It doesn’t provide any exception handling mechanism, and doesn’t guarantee that finalizers will be called in case any exception occurs. I imagine that some sort of exception support could be layered on top of the current solution, but I’m guessing it’s not going to be straightforward.
  3. Folds are not compositional. This can be clearly seen in the tutorial, where strict is not defined in terms of toList. With pipes-core, you would simply have:
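-- a sketch, assuming a toList fold that gathers all upstream values;
-- the Monad instance lets it run to completion before anything is yielded
toList :: Monad m => Pipe a b m [a]
toList = tryAwait >>= maybe (return []) (\x -> fmap (x:) toList)

strict :: Monad m => Pipe a a m ()
strict = toList >>= mapM_ yield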

What’s next for pipes-core

The current version of pipes-core already provides exception handling and guaranteed finalization in the face of asynchronous exceptions. Things that could be improved in its finalization support are:

  1. Finalization is currently guaranteed, but not always prompt. When an exception handler is provided, upstream finalization gets delayed unnecessarily.
  2. It is not possible to prematurely force finalization. I haven’t yet seen an example where this would be useful, but it would be nice to have it for completeness.

I think I know how these points can be addressed, and hopefully they will make it into the next release.

For future releases, I’d like to focus on performance. Aside from micro-optimizations, I can see two main areas that would benefit from improvements: the Monad instance and the Category instance.

The current monadic bind unfortunately displays quadratic behavior, since it basically works like a naive list concatenation function. The Codensity transformation should address that.
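For reference, the standard construction (available in the kan-extensions package) represents a computation by its action on continuations, so that binds reassociate to the right:

{-# LANGUAGE RankNTypes #-}

newtype Codensity m a = Codensity
  { runCodensity :: forall r. (a -> m r) -> m r }

-- bind composes functions instead of traversing a structure:
--   m >>= f = Codensity (\k -> runCodensity m (\x -> runCodensity (f x) k))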

For the Category instance, it would be interesting to explore whether it is possible to achieve some form of fusion of intermediate data structures, similarly to classic stream fusion for lists.

This is probably going to be more of a challenge, and will likely require some significant restructuring, but the prospective benefits are enormous. There is some research on this topic and an initial attempt I plan to draw ideas from.

My last point is about the absence of an unawait primitive for Pipe. There has been quite a lot of discussion on this topic, but I remain unconvinced that having built-in parsing capabilities is a good thing.

Whenever there is a need to chain unconsumed input, there are a few viable options already:

  1. Return leftover data, and add some manual wiring so that it’s passed to the “next” pipe.
  2. Use PutbackPipe from pipes-extra.
  3. Use an actual parser library and convert the parser to a Pipe (see pipes-attoparsec).

In all the examples I have seen, however, pipes are composable enough that all the special logic to deal with boundaries of chunked streams can be implemented in a single “filter” pipe, and the rest of the pipeline can ignore the issue altogether.

Applicative option parser

There are quite a few option parsing libraries on Hackage already, but they either depend on Template Haskell, or require some boilerplate. Although I have nothing against the use of Template Haskell in general, I’ve always found its use in this case particularly unsatisfactory, and I’m convinced that a more idiomatic solution should exist.

In this post, I present a proof of concept implementation of a library that allows you to define type-safe option parsers in Applicative style.

The only extension that we actually need is GADTs, since, as will be clear in a moment, our definition of Parser requires existential quantification.

Let’s start by defining the Option type, corresponding to a concrete parser for a single option:
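{-# LANGUAGE GADTs #-}

import Text.Read (readMaybe)

-- a sketch; the field names are one possible choice
data Option a = Option
  { optName   :: String              -- long name, without the leading "--"
  , optReader :: String -> Maybe a   -- parse the option's single argument
  }

-- options are functorial in their result, which (<*>) below relies on
instance Functor Option where
  fmap f (Option name r) = Option name (fmap f . r)

optMatches :: Option a -> String -> Bool
optMatches opt s = s == "--" ++ optName opt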

For simplicity, we only support “long” options with exactly one argument. The optMatches function checks whether an option matches a string given on the command line.

We can now define the main Parser type:
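data Parser a where
  NilP  :: a -> Parser a
  ConsP :: Option (a -> b) -> Parser a -> Parser b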

The Parser GADT resembles a heterogeneous list, with two constructors.

The NilP r constructor represents a “null” parser that doesn’t consume any arguments, and always returns r as a result.

The ConsP constructor is the combination of an Option returning a function, and an arbitrary parser returning an argument for that function. The combined parser applies the function to the argument and returns a result.
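In code, a sketch of the Functor and Applicative instances:

instance Functor Parser where
  fmap f (NilP x)         = NilP (f x)
  fmap f (ConsP opt rest) = ConsP (fmap (f .) opt) rest

instance Applicative Parser where
  pure = NilP
  NilP f         <*> p = fmap f p
  ConsP opt rest <*> p = ConsP (fmap uncurry opt) ((,) <$> rest <*> p)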

The definition of (<*>) probably needs some clarification. In the ConsP case, the variables involved have types:
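opt  :: Option (a -> b -> c)
rest :: Parser a
p    :: Parser b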

and we want to obtain a parser of type Parser c. So we uncurry the option, obtaining:
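fmap uncurry opt :: Option ((a, b) -> c)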

and compose it with a parser for the (a, b) pair, obtained by applying the (<*>) operator recursively:
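(,) <$> rest <*> p :: Parser (a, b)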

This is already enough to define some example parsers. Let’s first add a couple of convenience functions to help us create basic parsers:
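-- the helper names here are my choice
option :: String -> (String -> Maybe a) -> Parser a
option name reader = ConsP (fmap const (Option name reader)) (NilP ())

-- an option whose argument is parsed via its Read instance
optionR :: Read a => String -> Parser a
optionR name = option name readMaybe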

And a record to contain the result of our parser:
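-- an illustrative record
data User = User
  { userName :: String
  , userId   :: Int
  } deriving Show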

A parser for User is easily defined in applicative style:
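parser :: Parser User
parser = User <$> option "name" Just <*> optionR "id"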

To be able to actually use this parser, we need a “run” function:
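-- a sketch; stepParser is a helper that tries to match one option
-- of the parser against the current argument
runParser :: Parser a -> [String] -> Maybe a
runParser (NilP x) [] = Just x
runParser _        [] = Nothing
runParser p (arg : args) = do
  (p', args') <- stepParser p arg args
  runParser p' args'

stepParser :: Parser a -> String -> [String] -> Maybe (Parser a, [String])
stepParser (NilP _) _ _ = Nothing
stepParser (ConsP opt rest) arg args
  | optMatches opt arg = case args of
      (value : args') -> do
        f <- optReader opt value
        return (NilP f <*> rest, args')
      []              -> Nothing
  | otherwise = do
      (rest', args') <- stepParser rest arg args
      return (ConsP opt rest', args')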

The idea is very simple: we take the first argument and go over each option of the parser, checking if it matches. If it does, we replace the matched option with a NilP parser wrapping the result, consume the option and its argument from the argument list, and call runParser recursively.

Here is an example of runParser in action:
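runParser parser ["--name", "fry", "--id", "1"]
-- Just (User {userName = "fry", userId = 1})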

The order of arguments doesn’t matter:
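runParser parser ["--id", "1", "--name", "fry"]
-- Just (User {userName = "fry", userId = 1})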

Missing arguments will result in a parse error (i.e. Nothing). We don’t support default values but they are pretty easy to add.

I think the above Parser type represents a pretty clean and elegant solution to the option parsing problem. To make it actually usable, I would need to add a few more features (boolean flags, default values, a help generator) and improve error handling and performance (right now parsing a single option is quadratic in the size of the Parser), but it looks like a fun project.

Does anyone think it’s worth adding yet another option parser to Hackage?