Experiments in Markov chains

This is page on Markov chains and Markov processes, focusing on their appearance in network theory.

Suppose you start with a molecule of ethane, which has 2 carbons and 6 hydrogens arranged like this:

Then suppose you remove one hydrogen. The result is a positively charged ion, or ‘cation’. This particular one is called an ‘ethyl cation’. People used to think it looked like this:

They also thought a hydrogen could hop from the carbon with 3 hydrogens attached to it to the carbon with 2. So, they drew a graph with a vertex for each way the hydrogens could be arranged, and an edge for each hop. It looked like this:

The red vertices come from arrangements where the *first* carbon has 2 hydrogens attached to it, and the blue vertices come from those where the *second* carbon has 2 hydrogens attached to it. So, each edge goes between a red vertex and a blue vertex.

This graph has 20 vertices, which are arrangements or ‘states’ of the ethyl cation. It has 30 edges, which are hops or ‘transitions’. Let’s see why those numbers are right.

First I need to explain the rules of the game. The rules say that the 2 carbon atoms are distinguishable: there’s a ‘first’ one and a ‘second’ one. The 5 hydrogen atoms are also distinguishable. But, all we care about is which carbon atom each hydrogen is bonded to: we don’t care about further details of its location. And we require that 2 of the hydrogens are bonded to one carbon, and 3 to the other.

So, there are 2 choices of which carbon has two hydrogens attached to it. Then there are $\binom{5}{2} = 10$ choices of which two hydrogens are attached to it. This gives a total of $2 \times 10 = 20$ states. These are the vertices of our graph.

The edges of the graph are transitions between states. The idea is that any hydrogen in the group of 3 can hop over to the group of 2. There are 3 choices for which hydrogen atom makes the jump. So, starting from any vertex in the graph there are 3 edges. This means there are $3 \times 20 / 2 = 30$ edges. That’s 3 edges touching each of 20 vertices, but then to avoid double-counting we have to divide by 2, since each edge actually touches 2 vertices.

The idea of using this graph in chemistry goes back to this paper:

• A. T. Balaban, D. Fǎrcaşiu and R. Bǎnicǎ, Graphs of multiple 1,2-shifts in carbonium ions and related systems, *Rev. Roum. Chim.* **11** (1966), 1205.

This paper is famous because it was the first to use graphs in chemistry to describe molecular transitions, as opposed to merely using them as pictures of molecules!

But this particular graph was already famous for other reasons. It’s called the **Desargues-Levi graph**, or **Desargues graph** for short:

• Desargues graph, Wikipedia.

There are lots of nice ways to draw the Desargues graph:

It’s very symmetrical. Clearly any permutation of the 5 hydrogens acts as a symmetry of the graph, and so does any permutation of the 2 carbons. This gives a symmetry group $S_5 \times S_2$, which has $5! \times 2! = 240$ elements. And in fact this turns out to be the full symmetry group of the Desargues graph.

The Desargues graph, its symmetry group, and further applications to chemistry are discussed here:

• Milan Randić, Symmetry properties of graphs of interest in chemistry: II: Desargues-Levi graph, *Int. Jour. Quantum Chem.* **15** (1997), 663-682.

If at any moment the state of the ethyl cation corresponds to some vertex of the Desargues graph, and it hops randomly along edges as time goes by, we can describe how this works using a random walk on the Desargues graph. This is a nice example of a Markov process. So, I’d like to say a bit more about this.

But first, an admission. After the paper by Balaban, Fǎrcaşiu and Bǎnicǎ came out, people gradually realized that the ethyl cation doesn’t really look like the drawing I showed you! In fact it’s a ‘nonclassical’ ion. In other words, its actual structure is not what you’d expect from the traditional ball-and-stick model where you take an ethane molecule, rip off a hydrogen and see what’s left. The ethyl cation really looks like this:

For more details, and pictures that you can actually rotate, see:

• Stephen Bacharach, Ethyl cation, Computational Organic Chemistry.

So, if we stubbornly insist on applying the Desargues graph to realistic chemistry, we need to find some other molecule to apply it to.

Luckily, there are lots of options! They’re called trigonal bipyramidal molecules. They look like this:

The 5 balls on the outside are called ‘ligands’: they could be atoms or bunches of atoms. In chemistry, ‘ligand’ just means something that’s stuck onto a central thing. For example, in phosphorus pentachloride the ligands are chlorine atoms, all attached to a central phosphorus atom:

It’s a colorless solid, but as you might expect, it’s pretty nasty stuff: it’s not flammable, but it reacts with water or heat to produce toxic chemicals like hydrogen chloride.

Another example is iron pentacarbonyl, where 5 carbon-oxygen ligands are attached to a central iron atom:

You can make this stuff by letting powdered iron react with carbon monoxide. It’s a straw-colored liquid with a pungent smell!

Whenever you’ve got a molecule of this shape, the ligands come in two kinds. There are the 2 ‘axial’ ones, and the 3 ‘equatorial’ ones:

And the molecule has $20$ states… but only if count the states a certain way. We have to treat all 5 ligands as distinguishable, but think of two arrangements of them as the same if we can rotate one to get the other. The trigonal bipyramid has a rotational symmetry group with 6 elements. So, there are $5! / 6 = 20$ states.

The transitions between states are devilishly tricky. They’re called **pseudorotations**, and they look like this:

If you look carefully, you’ll see what’s going on. First the 2 axial ligands move towards each other to become equatorial. Then 2 of the 3 equatorial ones swing out to become axial. This fancy dance is called the Berry pseudorotation mechanism.

To get from one state to another this way, we have to pick 2 of the 3 equatorial ligands to swing out and become axial. There are 3 choices here. So, if we draw a graph with states as vertices and transitions as edges, it will have 20 vertices and $20 \times 3 / 2 = 30$ edges. That sounds suspiciously like the Desargues graph!

**Puzzle 1**. Show that the graph with states of a trigonal bipyramidal molecule as vertices and pseudorotations as edges is indeed the Desargues graph.

I think this fact was first noticed here:

• Paul C. Lauterbur and Fausto Ramirez, Pseudorotation in trigonal-bipyramidal molecules, *J. Am. Chem. Soc.* **90** (1968), 6722–6726.

Now I want to go back to the Desargues graph and sketch how to use some fancy math to study the Markov process it describes. For this it’s probably easiest to go back to our naive model of an ethyl cation:

Even though ethyl cations don’t really look like this, and we should be talking about some trigonal bipyramidal molecule instead, it won’t affect the math to come. Mathematically, the two problems are isomorphic! So let’s stick with this nice simple picture.

We can be a bit more abstract, though. A state of the ethyl cation is like having 5 balls, with 3 in one pile and 2 in the other. And we can focus on the first pile and forget the second, because whatever isn’t in the first pile must be in the second.

Of course a mathematician calls a pile of things a ‘set’, and calls those things ‘elements’. So let’s say we’ve got a set with 5 elements. Draw a red dot for each 2-element subset, and a blue dot for each 3-element subset. Draw an edge between a red dot and a blue dot whenever the 2-element subset is *contained* in the 3-element subset. We get the Desargues graph:

Now, I keep saying that we get this, but it would be quite a lot of work to check that it’s true. It’s actually less work to draw a big graph with vertices for *all* the subsets of our 5-element set! If we draw an edge whenever an $n$-element subset is contained in an $(n+1)$-element subset, the Desargues graph will be sitting inside this big graph.

Here’s what it looks like:

This graph has $2^5$ vertices. It’s actually a picture of a 5-dimensional cube! The vertices are arranged in columns. There’s

• one 0-element subset,

• five 1-element subsets,

• ten 2-element subsets,

• ten 3-element subsets,

• five 4-element subsets,

• one 5-element subset.

So, the numbers of vertices in each column go like this:

$1 \quad 5 \quad 10 \quad 10 \quad 5 \quad 1$

which is a row in Pascal’s triangle. We get the Desargues graph if we keep only the vertices corresponding to 2- and 3-element subsets, like this:

So, this is yet another way to draw the Desargues graph!

Now, suppose that the process whereby our molecule hops from one state to another is ‘decoherent’ enough to be treated classically, rather than using quantum mechanics. Then we can describe this process as a random walk on the Desargues graph!

Note, all the transitions are reversible here: each edge in the Desargues graph really gives two edges in a directed graph, one pointing forward and one pointing backward. Thanks to the enormous amount of symmetry, the rates of all these transitions must be equal.

Let’s write $D$ for the set of vertices of the Desargues graph. A probability distribution of states of our molecule is a function

$\psi : D \to [0,1]$

with

$\sum_{i \in D} \psi = 1$

We can think of these probability distributions as living in this vector space:

$L^1(D) = \{ \psi: D \to \mathbb{C} \}$

I’m calling this space $L^1$ because of the general abstract nonsense explained in Part 5: probability distributions on any measure space $X$ live in a vector space called $L^1(X)$, consisting of the integrable functions on $X$. Today the notation is overkill, since *every* function on $D$ is integrable. It’s just a finite set, after all, and these ‘integrals’ are just finite sums! But humor me, please.

The point is that we’ve got a general setup that applies here. There’s a Hamiltonian:

$H : L^1(D) \to L^1(D)$

describing the rate at which the molecule randomly hops from one state to another… and the probability distribution $\Psi \in L^1(D)$ evolves in time according to the master equation

$\frac{d}{d t} \Psi(t) = H \Psi(t)$

But what’s the Hamiltonian $H$? It’s very simple, because it’s equally likely for the state to hop from any vertex $x \in D$ to any other vertex $y \in D$ that’s connected to $x$ by an edge. Lets write $(y x)$ to mean that $y$ is connected to $x$ by an edge. Then

$(H \Psi)(x) = \left( \sum_{y \in G : (y x)} \Psi(y) \right) - 3 \Psi(x)$

We’re subtracting $3 \Psi(x)$ because there are 3 edges coming out of each vertex $x$, so this is the rate at which the probability of staying at $x$ decreases.

This Hamiltonian an example of a graph Laplacian. We could do the same thing for any graph, but it would get a tiny bit more complicated when different vertices have different numbers of edges coming out of them. Suppose $G$ is the set of vertices of a graph. Let $n(x)$ be the number of vertices coming out of the vertex $x \in G$. As before, write $(y x)$ when there’s an edge connecting $x$ and $y$. Then the **graph Laplacian** is this operator on $L^1(G)$:

$(H \Psi)(x) = \left( \sum_{y \in G : (y x)} \Psi(y) \right) - n(x) \Psi(x)$

There is a huge amount to say about graph Laplacians, but let’s just say that $\exp(t H)$ describes time evolution for a random walk on the graph, where hopping from one vertex to any neighboring vertex has unit probability per unit time. We can make the hopping faster or slower by multiplying $H$ by a constant. (Here is a good time to admit that most people use a graph Laplacian that’s the negative of ours, and write time evolution as $\exp(-t H)$.

**Puzzle 2**. Show that for any finite graph $G$, the graph Laplacian $H: L^1(G) \to L^1(G)$ is **infinitesimal stochastic**, meaning that:

1) for any $\Psi \in L^1(G)$ we have

$\sum_{x \in G} (H \Psi)(x) = 0$

and

2) the off-diagonal entries of $H$ are nonnegative when we regard it as a $G \times G$ matrix in the obvious way.

This implies that for any $t \ge 0$, the operator $\exp(t H)$ is stochastic—just what we want.

Now I’d like to give a cute description of the graph Laplacian that only works for the Desargues graph. Since this graph has 20 vertices, 10 red and 10 blue, the space $L^1(D)$ is 20-dimensional, the direct sum of two 10-dimensional pieces. But if you’re a mathematician, when you see these numbers

$1 \quad 5 \quad 10 \quad 10 \quad 5 \quad 1$

you should instantly think of the exterior algebra

$\Lambda(\mathbb{C}^5) = \Lambda^0(\mathbb{C}^5) \oplus \Lambda^1(\mathbb{C}^5) \oplus \Lambda^2(\mathbb{C}^5) \oplus \Lambda^3(\mathbb{C}^5) \oplus \Lambda^5(\mathbb{C}^5)$

where the dimensions of the summands form that particular row of Pascal’s triangle. Indeed, there’s a natural isomorphism of vector spaces:

$L^1(D) \cong \Lambda^2(\mathbb{C}^5) \oplus \Lambda^3(\mathbb{C}^5)$

The symmetries of the Desargues graph are nicely evident now. $S_5$ acts to permute the basis vectors of $\mathbb{R}^5$. $S_2$ switches the two summands; up to some signs it’s just the Hodge star operator!

I can’t resist adding that if we were studying our molecule quantum-mechanically instead of stochastically, we’d use a complex wavefunction instead of a real probability distribution, and we’d think of that as living in $L^2(D)$ instead of $L^1(D)$. But since $D$ is a finite set, $L^1(D)$ and $L^2(D)$ are the same. The big difference is that for the quantum problem must use a different Hamiltonian, since time evolution should be unitary, not stochastic.

And while we’re talking about quantum mechanics, I can’t resist adding that $\Lambda(\C^5)$ is fundamental to the description of fermions in the $\mathrm{SU}(5)$ grand unified theory! Each of the 32 basis vectors corresponds to some sort of quark, lepton, antiquark or antilepton. For more details, try:

• John Baez and John Huerta, The algebra of grand unified theories, *AMS Notices*, **47** (2010), 483-552.

I never expected this math to show up in chemistry! But when I saw it here, I got me an idea.

Recall how the exterior algebra $\Lambda(\mathbb{C}^5)$ actually works. (If you don’t know, you can’t ‘recall’ tit, but I’m about to tell you, so you can pretend you knew.) The vector space $\mathbb{C}^5$ has a standard basis, say $e_1, e_2, e_3, e_4, e_5$. We get a basis for $\Lambda(\mathbb{C}^5)$ by taking ‘wedge products’ of these guys, for example

$e_1 \wedge e_3 \wedge e_2 \wedge e_4$

and these wedge products satisfy some rules, most notably

$e_i \wedge e_j = - e_j \wedge e_i$

and therefore

$e_i \wedge e_i = 0$

So, when we form our basis for $\Lambda(\mathbb{C}^5)$, we only need to take wedge products of $e_i$‘s in *increasing order*, since for example

$e_1 \wedge e_3 \wedge e_2 \wedge e_4 = - e_1 \wedge e_2 \wedge e_3 \wedge e_4$

and we don’t want the same $e_i$ to show up twice in one of these products, since then we’ll just get zero.

So, each $e_i$ can either be in the product, or not, so $\Lambda(\mathbb{C}^5)$ has $2^5$ different basis vectors, which is why it’s related to a 5-dimensional hypercube. And returning to our ethyl cation for a moment, we can say that each of these basis vectors corresponds to a state in which certain hydrogen atoms are attached to the first carbon atom. For example,

$e_1 \wedge e_2 \wedge e_5$

corresponds to the state where the 1st, 2nd and 5th hydrogen are attached to the first carbon atom. The rest are attached to the second carbon atom.

By definition, $\Lambda^p(\mathbb{C}^5)$ is the subspace whose basis consists of wedge products of $p$ different $e_i$‘s. So, this subspace has dimension $\binom{5}{p}$. That’s why the dimensions of these subspaces give this row of Pascal’s triangle:

$1 \quad 5 \quad 10 \quad 10 \quad 5 \quad 1$

Now, there are well-known **creation operators**

$a_i^\dagger : \Lambda(\mathbb{C}^5) \to \Lambda(\mathbb{C}^5)$

given by

$a_i^\dagger \Psi = e_i \, \wedge \, \Psi$

These have the effect of moving an extra hydrogen atom over to the first carbon atom. For example:

$a_2 (e_4 \wedge e_5) = e_2 \wedge e_4 \wedge e_5$

But what if that hydrogen atom were already there? Well, then we just get zero:

$a_2^\dagger (e_2 \wedge e_5) = e_2 \wedge e_2 \wedge e_5 = 0$

The creation operators have adjoints $a_i$ called **annihilation operators**, defined using the inner product on $\Lambda(\mathbb{C}^5)$ for which the basis we’re talking about is orthonormal:

$\langle \Psi, a_i \Phi \rangle = \langle a_i^\dagger \Psi, \Phi \rangle$

These have the effect of moving a given hydrogen atom away from the first carbon. For example:

$a_1 (e_1 \wedge e_3 \wedge e_5) = e_3 \wedge e_5$

If that hydrogen isn’t there, we get zero. For example:

$a_1 (e_2 \wedge e_3 \wedge e_5) = 0$

All this suggests a nice idea for the Hamiltonian of our system. Namely:

$H = \sum_{i = 1}^5 a_i + a_i^\dagger \qquad \qquad (\mathrm{guess} \; \#1)$

The $a_i$ terms take hydrogens off our carbon, and the $a_i^\dagger$ terms put them back on! Unfortunately this is wrong for several reasons.

For starters, in our ethyl cation problem, we want our carbon atom to have 2 or 3 hydrogens on it, not more or fewer. If it only has 2, we don’t want an annihilation operator to hit it and take away more! And if it has 3, we don’t want a creation operator to stick on more.

So, we need to think about how the space we’re really interested in sits inside the exterior algebra. There’s an inclusion

$\iota : \Lambda^2(\mathbb{C}^5) \oplus \Lambda^3(\mathbb{C}^5) \to \Lambda(\mathbb{C}^5)$

and the adjoint of this operator, namely

$\iota^\dagger : \Lambda(\mathbb{C}^5) \to \Lambda^2(\mathbb{C}^5) \oplus \Lambda^3(\mathbb{C}^5)$

is the orthogonal projection onto this subspace. So a better guess for our Hamiltonian is this:

$H = \iota^\dagger \; \left( \sum_{i = 1}^5 a_i + a_i^\dagger \right) \; \iota \qquad \qquad (\mathrm{guess}\; \#2)$

Sandwiching our original guess with $\iota$ and $\iota^\dagger$ this way keeps us in the space of states with 2 or 3 hydrogens. Now the annihilation operators will give zero if we’re in a state with only 2 hydrogens, and the creation operators will give zero if we’re in a state with 3.

But there’s still a problem with using an exterior algebra, which I have concealed up to now. By definition, the wedge product is ‘anticommutative’:

$e_i \wedge e_j = - e_j \wedge e_i$

So in fact, our creation operator sometimes introduces minus signs, like this:

$a_2^\dagger (e_1 \wedge e_3) = e_2 \wedge e_1 \wedge e_3 = -e_1 \wedge e_2 \wedge e_3$

I didn’t show you this before, because I wanted to trick you, or see if you were paying careful attention. We also sometimes get minus signs showing up in the annihilation operators. For example:

$a_2 (e_1 \wedge e_2 \wedge e_3) = - a_2 (e_2 \wedge e_1 \wedge e_3) = - e_1 \wedge e_3$

These minus signs are no good in probability theory! A probability distribution has to be a *nonnegative* linear combination of our basis vectors. I hoped for a while that we could weasel around this somehow, but I don’t think we can.

So in fact the exterior algebra is the wrong thing! Instead we want a different algebra, namely

$A = \mathbb{C}[e_1, \dots, e_5]/\langle e_1^2 = \cdots = e_5^2 = 0 \rangle$

This is the algebra of polynomials in 5 variables $e_1, \dots e_5$ modulo relations saying each variable gives zero when you square it. This algebra is *commutative*. So now we have

$e_i e_j = e_j e_i$

without minus signs, but we still have

$e_i e_i = 0$

Thus, we still have

$A = A_0 \oplus A_1 \oplus A_2 \oplus A_3 \oplus A_4 \oplus A_5$

where $A_p$ has a basis of consisting of products of $p$ different $e_i$‘s, arranged in increasing order. The dimension of $A_p$ matches that of $\Lambda^p(\mathbb{C}^5)$: they’re isomorphic as vector spaces, but not canonically so, thanks to those darn minus signs.

In short, $A$ is isomorphic to $\Lambda(\mathbb{C}^5)$ as a graded vector space, but not as an algebra. Nonetheless, we still have annihilation and creation operators defined almost as before:

$a_i^\dagger \Psi = e_i \, \Psi$

$\langle \Psi, a_i \Phi \rangle = \langle a_i^\dagger \Psi, \Phi \rangle$

where the inner product is chosen so that our basis is orthonormal. So everything works almost the same way. The Hamiltonian

$H = \iota^\dagger \; \left( \sum_{i = 1}^5 a_i + a_i^\dagger \right) \; \iota \qquad \qquad (\mathrm{guess}\; \#3)$

still makes sense, but now the nasty minus signs are gone! All the matrix elements of $H$ are nonnegative now, with respect to the basis we’re using.

Unfortunately, there’s still one more problem: this operator $H$ is self-adjoint, which is great for quantum mechanics, but it’s not infinitesimal stochastic, so $\exp(t H)$ won’t be stochastic.

Luckily this problem is easily fixed… and I think it’s the last problem. I think we just need to subtract 3 times the identity operator, much as we did when defining the Laplacian of the Desargues graph:

$H = \iota^\dagger \; \left( \sum_{i = 1}^5 a_i + a_i^\dagger \right) \; \iota - 3 I \qquad \qquad (\mathrm{guess}\; \#4)$

I claim that this operator is infinitesimal stochastic. Indeed I claim it’s just a disguised version of the graph Laplacian for the Desargues graph:

The vertices here correspond to the basis states $e_i \wedge e_j$ and $e_i \wedge e_j \wedge e_k$ ($i \lt j \lt k$). The edges correspond to ways an annihilation or creation operator can map one basis state to another. And we need a $-3 I$ term in our Hamiltonian because there are 3 edges coming out of each vertex.

**Puzzle 3**. Show that this is true: if we use the obvious isomorphism to identify $L^1(D)$ with the space $A^2 \oplus A^3$, the graph Laplacian of the Desargues graph equals

$H = \iota^\dagger \; \left( \sum_{i = 1}^5 a_i + a_i^\dagger \right) \; \iota - 3 I$

If I’m wrong, please let me know!

There’s much more to say, but I’ll be amazed if you’ve read this far, so let’s stop here. Anyway, if you ever actually see any iron pentacarbonyl or phosphorus pentachloride, you can now rejoice in the fact that it consists of many quadrillions of trigonal bipyramidal molecules, all doing pseudorotations, tracing out random walks on the Desargues graph.

One final thing. You may be curious as to why it’s called the ‘Desargues graph’. This name comes from Desargues’ theorem, a famous result in projective geometry. Suppose you have two triangles ABC and abc, like this:

Suppose the lines Aa, Bb, and Cc all meet at a single point, the ‘center of perspectivity’. Then the point of intersection of ab and AB, the point of intersection of ac and AC, and the point of intersection of bc and BC all lie on a single line, the ‘axis of perspectivity’! The converse is true too.

The **Desargues configuration** consists of all the actors in this drama:

• 10 points: A, B, C, a, b, c, the center of perspectivity, and the three points on the axis of perspectivity

and

• 10 lines: Aa, Bb, Cc, AB, AC, BC, ab, ac, bc and the axis of perspectivity

Given any configuration of points and lines, we can form its Levi graph by drawing a vertex for each point or line, and drawing an edge whenever a point lies on a line. And now for the punchline: Levi graph of the Desargues configuration is the ‘Desargues-Levi graph’!—or Desargues graph, for short.

But I don’t know how this is relevant to anything I’ve discussed. For now it’s just a tantalizing curiosity.

category: experiments