
Fractals and Monads – Part 2

30 January 2019

Last time I discussed the power set monad and how it shows up in certain methods of building fractals.  This time I’ll go a bit further, and also explain some Haskell code I actually use to generate fractal pictures like this:

[Image: penta]

To do this, I’ll introduce Kleisli categories, the list monad, and a bit about how monads work in Haskell!

Recall we were considering fractals that can be built using a “replication” function

f\colon X \to \mathcal{P}X

on some set of “shapes” X.  As an example, we had considered a map that takes any triangle in the plane to a set of three new triangles, as shown here:

[Image: sierpinski-generator]

and we noted that “iterating” this map indefinitely gives us all of the triangular “holes” in a Sierpinski triangle:

[Image: steps-sierpinski]

I also described how iterating this map, if we do everything carefully, involves the structure of the power set monad.

In fact, considering maps like f leads us directly to the “Kleisli category” associated to the power set monad.  I want to explain this here, because in computer science people actually tend to think of monads essentially in terms of their associated Kleisli categories, and that’s how monads are implemented in Haskell.

Every monad has an associated Kleisli category, but since I’ve been working with the power set monad (\mathcal{P}, \eta, \mu) in the category of sets, I’ll start with that special case.

First, in the Kleisli category, which we’ll denote \mathrm{Set}_\mathcal{P}, the objects are the same as the original category \mathrm{Set}: they’re just sets.  A morphism from the set X to the set Y, however, is a function

f\colon X \to \mathcal{P}Y

For now, at least, I’ll avoid making up any special notation for morphisms in the Kleisli category, and instead freely regard f\colon X \to \mathcal{P}Y either as a morphism from X to \mathcal{P}Y in the category \mathrm{Set} of sets, or as a morphism from X to Y in the Kleisli category \mathrm{Set}_\mathcal{P}; these are really the same thing.  If this is confusing, whenever you’re thinking of f as a morphism in the Kleisli category, just think of the \text{``}\mathcal{P}\text{''} as decoration on the arrow, rather than on the codomain object Y.

Given two functions f\colon X \to \mathcal{P} Y and g\colon Y \to \mathcal{P} Z, the Kleisli composite g\bullet f is defined by

g\bullet f := \mu_Z\circ \mathcal{P} g\circ f

Checking the domain and codomain of each of the functions on the right, we can see that this indeed gives a morphism from X to Z in the Kleisli category:

\qquad \qquad X \stackrel f \longrightarrow \mathcal{P} Y \stackrel {\mathcal{P} g} \longrightarrow \mathcal{P} \mathcal{P} Z \stackrel {\mu_Z} \longrightarrow \mathcal{P} Z

You can check that \bullet is associative, and that the maps \eta_X\colon X \to \mathcal{P} X are identity morphisms, so we really do get a category. That’s all there is to the Kleisli category.
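
To make this concrete, here’s a small made-up example in the power set monad (the particular functions are just illustrations, nothing to do with the fractal maps yet).  Take f, g\colon \mathbb{N} \to \mathcal{P}\mathbb{N} with f(x) = \{x, x+1\} and g(y) = \{2y\}.  Unwinding the definition, the Kleisli composite applies g to each element of f(x) and then takes the union:

(g\bullet f)(x) = \mu_{\mathbb{N}}\big(\{\{2x\},\{2x+2\}\}\big) = \{2x, 2x+2\}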

Returning to the sort of function we had been considering for generating fractals:

f\colon X \to \mathcal{P}X

these are clearly just endomorphisms in the Kleisli category.  This also makes it clear what we mean by “iterating” such a function: it’s an endomorphism, so we can just compose it with itself as many times as we want.  In particular, analogously to how we define “powers” of ordinary functions by repeated composition, we can define the nth Kleisli power of f recursively by

\displaystyle f^{{\bullet}n} = \left\{\begin{array}{cc}\eta_X & n=0\\ f \bullet f^{{\bullet}(n-1)}& n\ge 1 \end{array}\right.

That is, the 0th power of f is the identity, while the nth power of f is the Kleisli composite of f with its (n-1)st power.

For example, consider the set T of all equilateral triangles in the plane, and consider the map f\colon T \to \mathcal{P}T defined so that f(t) has six elements, each 1/3 the size of the original triangle t and linked together around its boundary, as shown here:

[Image: koch-map]

Then, starting with an arbitrary triangle t \in T, the images under successive Kleisli powers of the generating endomorphism:

f^{\bullet 0}(t), f^{\bullet 1}(t), f^{\bullet 2}(t), f^{\bullet 3}(t), f^{\bullet 4}(t), \ldots

look like this:

[Image: koch-progression]

Some Haskell

Let’s do some Haskell. If you don’t know Haskell, don’t worry; I’ll explain what everything means. If you do know Haskell, this will be super easy, perhaps even obvious, though I may write things in ways you won’t expect, since I’m transitioning over from the categorical definition of monad rather than diving in with how monads are actually defined in Haskell.

The code we really need is almost embarrassingly short, mainly because monads are extremely well suited to the task at hand.  What we need most is just to be able to compute Kleisli powers!

Here’s how the formula for the Kleisli power of an endomorphism becomes a function in Haskell:

import Control.Monad

kpow :: Monad m => Int -> (a -> m a) -> a -> m a
kpow 0 _ = return
kpow n f = f <=< kpow (n-1) f

Let’s go through this line by line.

The first line just loads some tools for working with monads; there’s not much I need to say about that now.

The next line is the first one of real content; it’s the “type signature” of the function “kpow”.  The first bit, Monad m, just says that m is any monad; Haskell knows about several monads, and you can also define your own.   Then, the rest of that line:

Int -> (a -> m a) -> a -> m a

can be interpreted as declaring that we have a function that takes one integer (Int), the “power,” and one element of \hom(a, ma) (i.e. something of type a -> m a), and gives us another element of \hom(a, ma).  So if I were to write this line in standard mathematical notation, I would write:

\mathrm{kpow}:\mathbb{N} \times \hom(a, ma) \to \hom(a, ma)

where m is any monad, and a is any type.  We take an integer and a Kleisli endomorphism for the monad m, and return another endomorphism. Notice there’s a slight discrepancy here, in that an Int is any integer, whereas I used the natural numbers \mathbb{N}.  But the integer really is intended to be non-negative: if you fed a negative integer into kpow, the recursion would never terminate. My code doesn’t forbid it, but you shouldn’t do it.

The last two lines of the code give the actual formula for the Kleisli power. If you just look at my earlier formula:

\displaystyle f^{{\bullet}n} = \left\{\begin{array}{cc}\eta_X & n=0\\ f \bullet f^{{\bullet}(n-1)}& n\ge 1 \end{array}\right.

and compare this to these two lines:

kpow 0 _ = return
kpow n f = f <=< kpow (n-1) f

you can probably figure out all of the Haskell syntax just by matching up symbols.

The first line says “whenever the first argument of kpow is 0, the result is return.” This may be confusing if you’re used to how the keyword “return” is used in many other programming languages. Sorry; it’s not my fault! In Haskell, “return” means “\eta_x for any object x“, where the objects in this case are always data types.

The second line says “the nth Kleisli power of f is the composite of f with the (n-1)st power of f”, just as in the other formula. You probably guessed that the Kleisli composite f \bullet g is written f <=< g in Haskell.
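
By the way, since the nth Kleisli power is just the n-fold Kleisli composite of f with itself, you could also write kpow without explicit recursion. Here’s one way (just a sketch; kpow' is my own name for it, and it’s equivalent to the recursive version above):

kpow' :: Monad m => Int -> (a -> m a) -> a -> m a
kpow' n f = foldr (<=<) return (replicate n f)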

Here’s a dictionary relating the notation I’ve been using to Haskell notation for the same thing:

[Image: monad-dictionary]

I didn’t need to use \mu, a.k.a. join explicitly in the definition of Kleisli powers, since it’s already wrapped into the definition of Kleisli composition.
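
In fact, if you wanted to, you could write Kleisli composition yourself, following the categorical formula \mu_Z\circ \mathcal{P} g\circ f exactly, with join playing the role of \mu and fmap giving the action of the functor. Here’s a sketch, using the join already imported from Control.Monad above (kleisliCompose is just a name I’m making up; it does the same thing as Haskell’s built-in <=<):

-- Kleisli composition written exactly as "mu, then P g, then f",
-- read right to left: apply f, map g over the result, then flatten.
kleisliCompose :: Monad m => (b -> m c) -> (a -> m b) -> a -> m c
kleisliCompose g f = join . fmap g . f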

List monad

To do some examples I’ll use the list monad, which is a built-in monad in Haskell. This is very much like the power set monad, since “lists” are like sets in many ways, so I can introduce this pretty quickly. We really just need a very basic idea of how the list monad works.

A “list” of elements of “type” a can be written simply as

[x_1, x_2, \ldots x_n] \quad \text{where } x_i\in a

for a finite list, or [x_1, x_2, \ldots] for an infinite list.

Making lists of elements of a given type clearly gives a functor analogous to the power set functor: for any function f\colon a \to b we get a function that just acts element-wise on lists:

[x_1,x_2,\ldots, x_n] \mapsto [f(x_1),f(x_2), \ldots, f(x_n)]

mapping a list of elements of type a into a list of elements of type b.
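
In Haskell, this element-wise action is given by fmap (which for lists is the same as the familiar map function). For example, entering

fmap (*2) [1,2,3]

gives [2,4,6] as output.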

Based on what we’ve done with sets, you can probably guess how the two natural transformations in the list monad work, too.

First, any single item can be regarded as a list with only one item on it, so

\eta_a(x) = [x]

In Haskell, for example, you could take an element of type Int, say the integer 5, and turn it into a list of integers by using return 5. As long as Haskell has a context to know it is working in the list monad rather than some other monad, this will output [5]. Voilà!

Second, the natural transformation \mu turns lists of lists into lists.  If you’ve got a list of to-do lists, you can always turn it into a single list by first writing down the items on the first list, in order, then continuing with the items on the second list, and so on.

In Haskell, if you enter join [[1,2,3],[4,5]], you’ll get [1,2,3,4,5] as output. That’s \mu.

There’s one noteworthy subtlety, which is that infinite lists are also allowed. So, if you have a list of lists:

[Fun stuff to do, Work to do]

and the first item happens to be an infinite list, then if you try joining these lists up into one in the way I described, you’ll never get to any items on the second list. This is one way \mu in the list monad differs from \mu in the power set monad: In the former, as soon as one of the lists is infinite, everything after that gets thrown away. This last part is not obvious; you certainly could devise ways to join a list of infinite lists together and include all of the items, but that’s not how things work in this monad.

You can easily work out how Kleisli composition works from this, so we’ve now got everything we need.
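
For instance, here’s a throwaway example (the functions h and k are just made up for illustration): if

h x = [x, x + 1]
k y = [10 * y]

then (k <=< h) 3 evaluates to [30,40]: first h produces [3,4], then k is applied element-wise to give [[30],[40]], and join flattens that to [30,40].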

Cantor set in the list monad

Let’s do an example: the Cantor set. I’ll represent closed intervals on the real line by ordered pairs in Haskell. So, when I write (0.3, 1.5), I mean the set \{x\in \mathbb{R} : 0.3 \le x \le 1.5\}. I know, round brackets for closed intervals is weirdly non-standard, but in Haskell, square brackets are reserved for lists, and it’s better to keep things straight.

To generate the Cantor set, all we need is the Kleisli endomorphism in the list monad that describes the general step. This can be defined by a one-line Haskell function:

f (p,q) = [(p, p + (q-p)/3), (q-(q-p)/3, q)]

If we apply this to the unit interval (0, 1), here’s what we get:

[(0.0,0.3333333333333333),(0.6666666666666667,1.0)]

So this works: it’s mapping the closed interval [0,1] to a list of two closed intervals: [0, 1/3] and [2/3, 1].

Now we can compute any step in the construction of the Cantor set by computing Kleisli powers. Let’s just do one. Once you’ve loaded the function kpow, enter this in Haskell:

kpow 3 f (0,1)

and—kapow!—you’ll get this output:

[(0.0,3.7037037037037035e-2),
 (7.407407407407407e-2,0.1111111111111111), 
 (0.2222222222222222,0.25925925925925924),
 (0.2962962962962963,0.3333333333333333),
 (0.6666666666666667,0.7037037037037037),
 (0.7407407407407408,0.7777777777777778),
 (0.888888888888889,0.9259259259259259),
 (0.962962962962963,1.0)]

Ugly, isn’t it? But convert to fractions and you’ll see this is doing just the right thing.
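
Indeed, in fractions these eight intervals are

[0, \tfrac{1}{27}], [\tfrac{2}{27}, \tfrac{1}{9}], [\tfrac{2}{9}, \tfrac{7}{27}], [\tfrac{8}{27}, \tfrac{1}{3}], [\tfrac{2}{3}, \tfrac{19}{27}], [\tfrac{20}{27}, \tfrac{7}{9}], [\tfrac{8}{9}, \tfrac{25}{27}], [\tfrac{26}{27}, 1]

which is exactly the third stage of the usual middle-thirds construction of the Cantor set.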

[Image: cantor]

I’ll do more examples in the next post.

Fractals and Monads — Part 1

18 January 2019

A while back, I posted a bunch of fractal artwork that I did for fun, based on variations of the Koch snowflake; like this one, for example:

[Image: koch_triangles]

This is composed entirely of equilateral triangles, but it’s just one step in an iterated process leading to a fractal consisting of infinitely many Koch snowflakes at ever smaller scales.

I had a lot of fun coming up with designs like this, but actually I didn’t start out wanting to do new things with the Koch snowflake—that was just a fun tangent.  Originally, I was thinking about monads!

It’s high time I got back to what I was originally working on and explained what these fractals have to do with the “power set monad.”  To do that, I should probably start by explaining what the power set monad is.

For any set X, the power set \mathcal{P} X is the set of all its subsets.  If you’ve ever studied any set theory, this should be familiar.  Unfortunately, a typical first course in set theory may not go into how power sets interact with functions; this is what the “monad” structure of power sets is all about. This structure has three parts, each corresponding to a very common and important mathematical construction, but also simple enough that we often use it without even thinking about it.

Let me finally say what the monad structure of power sets actually is.

First, mapping sets to their power sets is actually a functor:  For any function f\colon X \to Y, there corresponds a function

\mathcal{P}f\colon \mathcal{P}X \to \mathcal{P}Y

which sends each element S \in \mathcal{P}X, i.e. each subset S \subseteq X, to its image:

(\mathcal{P}f)(S) = \{ f(s) : s \in S\}.

This makes \mathcal{P} a functor because the mapping of functions gets along nicely with identity functions and composition.   This functor is the first part of the power set monad.

You might be accustomed to writing the image of the set S under f simply as \text{``}f(S)\text{''}, rather than \mathcal{P}f(S).  But here, we need to be more careful: despite its frequent use in mathematics, \text{``}f(S)\text{''} is a notational abuse for which \mathcal{P}f(S) is the proper form.

To motivate the next piece in the power set monad, notice that there’s a strong sense in which \mathcal{P}f is just an extension of f, namely, on singleton sets, it essentially “is” f:

\mathcal{P}f(\{ x \}) = \{f(x)\}.

This might make it tempting to write \mathcal{P}f( x ) =  f(x) for x \in X, as a sort of “dual” version of the previous abuse of notation, though I don’t think anyone does that.

A better way to express how \mathcal{P}f extends f is to introduce the canonical map sending elements of X to the corresponding singleton sets:

\displaystyle \begin{array}{rcl} \eta_X \colon X & \to & \mathcal{P}X \\ x & \mapsto & \{x\} \end{array}

Then, the previous equation can be written \mathcal{P}f \circ \eta_X = \eta_Y \circ f, which is precisely the equation that makes \eta a natural transformation. This natural transformation is the second piece in the monad structure.

Anyone who’s taught elementary set theory can attest that some students at first have a hard time making the conceptual distinction between elements and singleton subsets of a set.  Indeed, conflating the two is a very “natural” thing to do, which is why this is a natural transformation.

The last piece of the power set monad is another natural transformation. Much like \eta, which formalizes a way to think of elements as being subsets, there is a natural transformation that lets us think of sets of subsets as being subsets.  This is given by, for each set X, a map:

\displaystyle \begin{array}{rcl} \mu_X \colon \mathcal{P}^2X & \to & \mathcal{P}X \\ S & \mapsto & \bigcup_{U\in S} U \end{array}

Here \mathcal{P}^2X is the power set of the power set of X, so an element of it is a set of subsets of X, and we turn this into a single subset by taking the union.
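
For example, taking X = \mathbb{N}, we have \mu_X(\{\{1,2\},\{2,3\},\emptyset\}) = \{1,2,3\}.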

That’s it!  Or, almost.  To form a monad, the data \mathcal{P}, \eta, \mu must be compatible in such a way that these diagrams commute:

[Image: monad_laws]
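
Written out as equations, these diagrams say that for each set X,

\mu_X \circ \mathcal{P}\mu_X = \mu_X \circ \mu_{\mathcal{P}X} \qquad \text{and} \qquad \mu_X \circ \mathcal{P}\eta_X = \mu_X \circ \eta_{\mathcal{P}X} = \mathrm{id}_{\mathcal{P}X}

The first is an associativity law for \mu; the second is a pair of unit laws relating \eta and \mu.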

It’s a simple exercise to check that this is indeed the case.  You can extract the general definition of monad from what I’ve written here: you need an endofunctor on some category, and two natural transformations such that the diagrams analogous to those above commute.

Now let’s come back to fractals:

[Image: img_9839.jpg]

Many kinds of fractals are created by taking a simple shape, making multiple copies, transforming them somehow—for example, by shrinking or rotating—and then repeating.

For example, here’s one way to build a Sierpinski triangle, or rather its complement, starting from a single triangle and a rule for how triangles should “reproduce”:

[Image: steps-sierpinski]

If we look closely, we see that this process naturally involves ingredients from the power set monad.

In particular, the basic building block is the self-replication rule that we can write as a function

f\colon T \to \mathcal{P}T

where T is the set of triangles in the plane.  We won’t need an explicit formula for the function, since a picture suffices to explain what it does:

[Image: sierpinski-generator]

We can get all of the triangular holes in the Sierpinski triangle by “iterating” f. But, what does this mean exactly?  It would be nice if we could just do \text{``}f(f(t))\text{''}, but, strictly speaking, this doesn’t parse: we can’t feed a set of triangles into f, since this is a function that expects a single triangle as input.

Fortunately, we know the solution to this problem: since f(t) \in \mathcal{P}T, we can’t feed it into f, but we can feed it into \mathcal{P}f.  Good so far.

The problem with this is that \mathcal{P}f(f(t)) again doesn’t live in T, nor does it live in \mathcal{P}T—it lives in \mathcal{P}^2T:

\mathcal{P}f \colon \mathcal{P}T \to \mathcal{P}^2 T

It seems we would be doomed to march into ever more deeply-nested sets, were it not for the natural transformation \mu.  Fortunately, using \mu_T we get

\mu_T \circ \mathcal{P}f \colon \mathcal{P}T \to \mathcal{P} T

So, now we can see the process of building a fractal as just repeatedly applying this function \mu_T \circ \mathcal{P}f.

One last problem: What do we apply it to?  We want to apply it to our initial triangle t, but that’s an element of T, not \mathcal{P}T.  Good thing we can think of this as an element of \mathcal{P}T using the natural transformation \eta!

So, here’s a recipe to build a wide variety of fractals, like some of the ones I’ve posted on this blog, but also tons of others, using the power set monad:

  1. Start with a replication rule f\colon X \to \mathcal{P}X on some set X of “shapes.”
  2. Pick a shape x\in X.
  3. Feed your initial shape into \eta_X to get the initial state.
  4. Iterate feeding the current state into \mu_X \circ \mathcal{P} f to get a new state, as many times as you want (see the Haskell sketch below).
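
Here’s a minimal Haskell sketch of that last step, using the list monad as a stand-in for the power set monad (the name iterateStep is just mine, for illustration):

import Control.Monad (join)

-- One step of the recipe: feed the current state (a list of shapes)
-- through "mu after the functor applied to f", i.e. replicate every
-- shape in the list and then flatten the result.
iterateStep :: (a -> [a]) -> [a] -> [a]
iterateStep f = join . fmap f

Starting from return x (that’s \eta_X applied to the initial shape) and applying iterateStep f over and over gives the successive stages pictured above.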

In fact, in calling this a “recipe,” I mean more than just an abstract mathematical recipe: this is essentially how I really drew all of those fractal pictures.  I generated them using Haskell—a functional programming language that knows about monads.  I didn’t use the power set monad, but rather Haskell’s built-in “list monad,” which works very similarly.