The Azimuth Project
Blog - network theory (part 19) (changes)

Showing changes from revision #14 to #15: Added | Removed | Changed

This page is a blog article in progress, written by John Baez and Jacob Biamonte. To see discussions of this article while it was being written, go to the Azimuth Forum. For the final polished version, visit the Azimuth Blog.

joint with Jacob Biamonte

It’s time to resume the network theory series! It’s been a long time, so we’ll assume you forgot everything we’ve said before, and make this post as self-contained as possible. Last time we started looking at a simple example: a diatomic gas.

Two atoms can recombine to form a diatomic molecule:

A+AA 2 A + A \to A_2

and conversely, a diatomic molecule can break apart into two atoms:

A 2A+A A_2 \to A + A

We can draw both these reactions using a Petri net:

where we’re writing BB instead of A 2A_2 to abstract away some detail that’s just distracting here. Or, equivalently, we can use a chemical reaction network:

Last time we looked at the rate equation for this chemical reaction network, and found equilibrium solutions of that equation. Now let’s look at the master equation, and find equilibrium solutions of that. This will serve as a review of three big theorems.

The master equation

We’ll start from scratch, in case you’re just tuning in. The master equation is all about how atoms or molecules or rabbits or wolves or other things interact randomly and turn into other things. So, let’s write write ψ m,n\psi_{m,n} for the probability that we have mm atoms of AA and nn molecule of BB in our container. These probabilities are functions of time, and master equation will say how they change.

First we need to pick a rate constant for each reaction. Let’s say the rate constant for this reaction is some number α>0\alpha > 0:

A+AB A + A \to B

while the rate constant for this reaction is some number β>0\beta > 0:

BA+A B \to A + A

Before we make it pretty using the ideas we’ve been explaining, the master equation says

ddtψ m,n(t)=α(m+2)(m+1)ψ m+2,n1(t)αm(m1)ψ m,n(t)+β(n+1)ψ m2,n+1(t)βnψ m,n(t) \displaystyle{ \frac{d}{d t} \psi_{m,n} (t)} = \alpha (m+2)(m+1)\psi_{m+2,n-1}(t) - \alpha m(m-1) \psi_{m,n}(t) + \beta (n+1) \psi_{m-2,n+1}(t) - \beta n \psi_{m,n}(t)

where we define ψ i,j\psi_{i,j} to be zero if either ii or jj is negative.

Yuck! Normally we don’t show you such nasty equations. Indeed the whole point of our work has been to show you that by packaging the equations in a better way, we can understand them using high-level concepts instead of mucking around with millions of scribbled symbols. But we thought we’d show you what’s secretly lying behind our beautiful abstract formalism, just once.

Each term has a meaning. For example, the first one:

α(m+2)(m+1)ψ m+2,n1(t) \alpha (m+2)(m+1)\psi_{m+2,n-1}(t)

means that the reaction A+ABA + A \to B will tend to increase the probability of there being mm atoms of AA and nn molecules of BB if we start with 2 more AA‘s and 1 fewer BB. This reaction can happen in (m+2)(m+1)(m+2)(m+1) ways if we start with m+2m+2 atoms of AA. And it happens at a probabilistic rate proportional to the rate constant for this reaction, α\alpha.

We won’t go through the rest of the terms. It’s a good exercise to do so, but there could easily be a typo in the formula, since it’s so long and messy. So let us know if you find one!

To simplify this mess, the key trick is to introduce a generating function that summarizes all the probabilities in a single power series:

Ψ= m,n0ψ m,ny mz n \Psi = \sum_{m,n \ge 0} \psi_{m,n} y^m \, z^n

It’s a power series in two variables, yy and zz, since we have two chemical species: AA‘s and BB’s.

Using this trick, the master equation looks like

ddtΨ(t)=HΨ(t) \frac{d}{d t} \Psi(t) = H \Psi(t)

where the Hamiltonian HH is a sum of terms, one for each reaction. This Hamiltonian is built from operators that annihilate and create AA‘s and BB’s. The annihilation and creation operators for AA atoms are:

a=y,a =y \displaystyle{ a = \frac{\partial}{\partial y} , \qquad a^\dagger = y }

The annihilation operator differentiates our power series with respect to the variable yy. The creation operator multiplies it by that variable. Similarly, the annihilation and creation operators for BB molecules are:

b=z,b =z \displaystyle{ b = \frac{\partial}{\partial z} , \qquad b^\dagger = z }

In Part 8 we explained a recipe that lets us stare at our chemical reaction network and write down this Hamiltonian:

H=α(b a 2a 2a 2)+β(a 2bb b) H = \alpha (b^\dagger a^2 - {a^\dagger}^2 a^2) + \beta ({a^\dagger}^2 b - b^\dagger b)

As promised, there’s one term for each reaction. But each term is itself a sum of two: one that increases the probability that our container of chemicals will be in a new state, and another that decreases the probability that it’s in its original state. We get a total of four terms, which correspond to the four terms in our previous way of writing the master equation.

Puzzle 1. Show that this new way of writing the master equation is equivalent to the previous one.

Equilibrium solutions

Now we will look for all equilibrium solutions of the master equation: in other words, solutions that don’t change with time. So, we’re trying to solve

HΨ=0 H \Psi = 0

Given the rather complicated form of the Hamiltonian, this seems tough. The challenge looks more concrete but perhaps more scary if we go back to our original formulation. We’re looking for probabilities ψ m,n\psi_{m,n}, nonnegative numbers that sum to one, such that

α(m+2)(m+1)ψ m+2,n1αm(m1)ψ m,n+β(n+1)ψ m2,n+1βnψ m,n=0 \alpha (m+2)(m+1)\psi_{m+2,n-1}- \alpha m(m-1) \psi_{m,n} + \beta (n+1) \psi_{m-2,n+1} - \beta n \psi_{m,n} = 0

for all mm and nn.

This equation looks rather horrid, but the good news is that it’s linear, so a linear combination of solutions is again a solution. This lets us simplify the problem using a conserved quantity.

Clearly, there’s a quantity that the reactions here don’t change:

What’s that? It’s the number of AA‘s plus twice the number of BB’s. After all, a BB can turn into two AA’s, or vice versa.

(Of course the secret reason is that BB is a diatomic molecule made of two AA‘s. But you’d be able to follow the logic here even if you didn’t know that, just by looking at the chemical reaction network… and sometimes this more abstract approach is handy! Indeed, the way chemists first discovered that certain molecules are made of certain atoms is by seeing which reactions were possible and which weren’t.)

Suppose we start in a situation where we know for sure that the number of BB‘s plus twice the number of AA’s equals some number kk:

ψ m,n=0unlessm+2n=k \psi_{m,n} = 0 \; unless \; m+2n = k

Then we know Ψ\Psi is initially of the form

Ψ= m+2n=kψ m,ny mz n \Psi = \sum_{m+2n = k} \psi_{m,n} y^m z^n

But since the number of AA‘s plus twice the number of BB’s is conserved, if Ψ\Psi obeys the master equation it will continue to be of this form!

Put a fancier way, we know that if a solution of the master equation starts in this subspace:

L k={Ψ:Ψ= m+2n=kψ m,ny mz nforsomeψ m,n} L_k = \{ \Psi: \; \Psi = \sum_{m+2n = k} \psi_{m,n} y^m z^n \; for \; some \; \psi_{m,n} \}

it will stay in this subspace. So, because the master equation is linear, we can take any solution Ψ\Psi and write it as a linear combination of solutions Ψ k\Psi_k, one in each subspace L kL_k.

In particular, we can do this for an equilibrium solution Ψ\Psi. And then all the solutions Ψ k\Psi_k are also equilibrium solutions: they’re linearly independent, so if one of them changed with time, Ψ\Psi would too.

This means we can just look for equilibrium solutions in the subspaces L kL_k. If we find these, we can get all equilibrium solutions by taking linear combinations.

Once we’ve noticed that, our horrid equation makes a bit more sense:

α(m+2)(m+1)ψ m+2,n1αm(m1)ψ m,n+β(n+1)ψ m2,n+1βnψ m,n=0 \alpha (m+2)(m+1)\psi_{m+2,n-1}- \alpha m(m-1) \psi_{m,n} + \beta (n+1) \psi_{m-2,n+1} - \beta n \psi_{m,n} = 0

Note that if the pair of subscripts m,nm, n obey m+2n=km + 2n = k, the same is true for the other pairs of subscripts here. So our equation relates the values of ψ m,n\psi_{m,n} for all choices of m,nm,n lying on this line segment:

2m+n=k,m,n0 2m+n = k , \qquad m ,n \ge 0

If you think about it a minute, you’ll see that if we know two of these values, we can keep using our equation to recursively work out all the rest. So, there are at most two linearly independent equilibrium solutions of the master equation in each subspace L kL_k.

Why at most two? Why not two? Well, we have to be a bit careful about what happens at the ends of the line segment: remember that ψ m,n\psi_{m,n} is defined to be zero when mm or nn becomes negative. If we think very hard about this, we’ll see there’s just one linearly independent equilibrium solution of the master equation in each subspace L kL_k. But this is the sort of nitty-gritty calculation that’s not fun to watch someone else do, so we won’t bore you with that.

Soon we’ll move on to a more high-level approach to this problem. But first, one remark. Our horrid equation

α(m+2)(m+1)ψ m+2,n1αm(m1)ψ m,n+β(n+1)ψ m2,n+1βnψ m,n=0 \alpha (m+2)(m+1)\psi_{m+2,n-1}- \alpha m(m-1) \psi_{m,n} + \beta (n+1) \psi_{m-2,n+1} - \beta n \psi_{m,n} = 0

resembles the usual discretized form of the equation

d 2ψdx 2=0 \displaystyle {\frac{d^2 \psi}{d x^2} = 0 }

namely:

ψ n12ψ n+ψ n+1=0 \psi_{n-1} - 2 \psi_{n} + \psi_{n+1} = 0

And this makes sense, since we get

d 2ψdx 2=0 \displaystyle {\frac{d^2 \psi}{d x^2} = 0 }

by taking the heat equation:

ψt= 2ψx 2 \displaystyle \frac{\partial \psi}{\partial t} = {\frac{\partial^2 \psi}{\partial x^2} }

and assuming ψ\psi doesn’t depend on time. So what we’re doing is a lot like looking for equilibrium solutions of the heat equation.

This makes perfect sense, since the heat equation describes how heat smears out as little particles of heat randomly move around. True, there don’t really exist ‘little particles of heat’, but the heat equation also describes the diffusion of any other kind of particles as they randomly move around undergoing Brownian motion. Similarly, our master equation describes a random walk on this line segment:

m+2n=k,m,n0m+2n = k , \qquad m , n \ge 0

or more precisely, the points on this segment with integer coordinates. The equilibrium solutions arise when the probabilities ψ m,n\psi_{m,n} have diffused as much as possible.

If you think about it this way, it should be physically obvious that there’s just one linearly independent equilibrium solution of the master equation for each value of kk.

There’s a general moral here, too, which we’re seeing in a special case: the master equation for a chemical reaction network really describes a bunch of random walks, one for each allowed value of the conserved quantities that can be built as linear combinations of number operators. In our case we have one such conserved quantity, but in general there may be more (or none). Futhermore, these ‘random walks’ are what we’ve been calling Markov processes.

Noether's theorem

We simplified our task of finding equilibrium solutions of the master equation by finding a conserved quantity. The idea of simplifying problems using conserved quantities is fundamental to physics: this is why physicists are so enamored with quantities like energy, momentum, angular momentum and so on.

Nowadays physicists often use ‘Noether’s theorem’ to get conserved quantities from symmetries. There’s a very simple version of Noether’s theorem for quantum mechanics, but in Part 11 we saw a version for stochastic mechanics, and it’s that version that is relevant now. Here’s a paper which explains it in detail:

• John Baez and Brendan Fong, Noether’s theorem for Markov processes.

We don’t really need Noether’s theorem now, since we found the conserved quantity and exploited it without even noticing the symmetry. Nonetheless it’s interesting to see how it relates to what we’re doing.

For the reaction we’re looking at now, the idea is that the subspaces L kL_k are eigenspaces of an operator that commutes with the Hamiltonian HH. It follows from standard math that a solution of the master equation that starts in one of these subspaces, stays in that subspace.

What is this operator? It’s built from ‘number operators’. The number operator for AA‘s is

N A=a a N_A = a^\dagger a

and the number operator for BB‘s is

N B=b b N_B = b^\dagger b

A little calculation shows

N Ay mz 2 n=my mz n,N By mz n=my mz n N_A \,y^m z_2^n = m \, y^m z^n, \quad \qquad N_B\, y^m z^n = m \,y^m z^n

so the eigenvalue of N AN_A is the number of AA‘s, while the eigenvalue of N BN_B is the number of BB’s. This is why they’re called number operators.

As a consequence, the eigenvalue of the operator N A+2N BN_A + 2N_B is the number of AA‘s plus twice the number of BB’s:

(N A+2N B)y mz n=(m+2n)y mz n (N_A + 2N_B) \, y^m z^n = (m + 2n) \, y^m z^n

Let’s call this operator OO, since it’s so important:

O=N A+2N B O = N_A + 2N_B

If you think about it, the spaces L kL_k we saw a minute ago are precisely the eigenspaces of this operator:

L k={Ψ:OΨ=kΨ} L_k = \{ \Psi : \; O \Psi = k \Psi \}

As we’ve seen, solutions of the master equation that start in one of these eigenspaces will stay there. This lets take some techniques that are very familiar in quantum mechanics, and apply them to this stochastic situation.

First of all, time evolution as described by the master equation is given by the operators exp(tH)\exp(t H). In other words,

ddtΨ(t)=HΨ(t)andΨ(0)=ΦΨ(t)=exp(tH)Φ \displaystyle{ \frac{d}{d t} \Psi(t) } = H \Psi(t) \; and \; \Psi(0) = \Phi \quad \Rightarrow \quad \Psi(t) = \exp(t H) \Phi

Thus if Φ\Phi is an eigenvector of OO, so is exp(tH)Φ\exp(t H) \Phi, with the same eigenvalue. In other words,

OΦ=kΦ O \Phi = k \Phi

implies

Oexp(tH)Φ=kexp(tH)Φ=exp(tH)OΦ O \exp(t H) \Phi = k \exp(t H) \Phi = \exp(t H) O \Phi

But since we can choose a basis consisting of eigenvectors of OO, we must have

Oexp(tH)=exp(tH)O O \exp(t H) = \exp(t H) O

or, throwing caution to the winds and differentiating:

OH=HO O H = H O

So, as we’d expect from Noether’s theorem, our conserved quantity commutes with the Hamiltonian! This in turn implies that HH commutes with any polynomial in OO, which in turn suggests that

exp(sO)H=Hexp(sO) \exp(s O) H = H \exp(s O)

and also

exp(sO)exp(tH)=exp(tH)exp(sO) \exp(s O) \exp(t H) = \exp(t H) \exp(s O)

The last equation says that OO generates a 1-parameter family of ‘symmetries’ exp(sO)\exp(s O): operators that commute with time evolution. But what do these symmetries actually do? Since

Oy mz n=(m+2n)y mz n O y^m z^n = (m + 2n) y^m z^n

we have

exp(sO)y mz n=e s(m+2n)y mz n \exp(s O) y^m z^n = e^{s(m + 2n)}\, y^m z^n

So, this symmetry takes any probability distribution ψ m,n\psi_{m,n} and multiplies it by e s(m+2n)e^{s(m + 2n)}.

In other words, our symmetry multiplies the relative probability of finding our container of gas in a given state by a factor of e se^s for each AA atom, and by a factor of e 2se^{2s} for each BB molecule. It might not seem obvious that this operation commutes with time evolution! However, experts on chemical reaction theory are familiar with this fact.

Finally, a couple of technical points. Starting where we said ‘throwing caution to the winds’, our treatment has not been rigorous, since OO and HH are unbounded operators, and these must be handled with caution. Nonetheless, all the commutation relations we wrote down are true.

It’s also true that exp(sO)\exp(s O) is unbounded for positive ss. It’s bounded for negative ss, but even then doesn’t map probability distributions to probability distributions. However, it does map any nonzero vector Ψ\Psi with ψ m,n0\psi_{m,n} \ge 0 to a vector exp(sO)Ψ\exp(s O) \Psi with the same properties. So, we can just normalize this vector and get a probability distribution. This normalization is why we introduced the concept of relative probabilities.

The Anderson-Craciun-Kurtz theorem

Now we’ll actually find solutions of the master equation in closed form. To understand this final section, you really do need to remember some things we’ve discussed earlier. Last time we considered the same chemical reaction network we’re studying today, but we looked at its rate equation. Our notation for the rate constants was different then—sorry—but if we call those constants α\alpha and β\beta as we’re doing now, the rate equation looks like this:

ddtx 1=2αx 22βx 1 2 \displaystyle{ \frac{d}{d t} x_1 = 2 \alpha x_2 - 2 \beta x_1^2}
ddtx 2=αx 2+βx 1 2 \displaystyle{ \frac{d}{d t} x_2 = - \alpha x_2 + \beta x_1^2 }

This describes how the number of AA‘s and BB’s changes in the limit where there are lots of them and we can treat them as varying continuously, in a deterministic way. The number AA’s is x 1x_1, and the number of BB’s is x 2x_2.

We saw that the quantity

x 1+2x 2 x_1 + 2 x_2

is conserved, just as now we’re seeing that N A+2N BN_A + 2 N_B is conserved. We saw that the rate equation has one equilibrium solution for each choice of x 1+2x 2x_1 + 2 x_2. And we saw that these equilibrium solutions obey

x 1 2x 2=αβ \frac{x_1^2}{x_2} = \frac{\alpha}{\beta}

Now the Anderson-Craciun-Kurtz theorem, introduced in Part 9, is a powerful result that gets equilibrium solution of the master equation from equilibrium solutions of the rate equation. It only applies to equilibrium solutions that are ‘complex balanced’, but that’s okay:

Puzzle 2. Show that the equilibrium solutions of the rate equation for the chemical reaction network

are complex balanced.

So, given any equilibrium solution (x 1,x 2)(x_1,x_2) of our rate equation, we can hit it with the Anderson-Craciun-Kurtz theorem and get an equilibrium solution of the master equation! And it looks like this:

Ψ=e (x 1+x 2) m,n0x 1 mx 2 nm!n!y mz n \displaystyle{ \Psi = e^{-(x_1 + x_2)} \, \sum_{m,n \ge 0} \frac{x_1^m x_2^n} {m! n! } \, y^m z^n }

In this solution, the probability distribution

ψ m,n=e (x 1+x 2)x 1 mx 2 nm!n! \psi_{m,n} = e^{-(x_1 + x_2)} \, \frac{x_1^m x_2^n} {m! n! }

is a product of Poisson distributions. The factor in front is there to make the numbers ψ m,n\psi_{m,n} add up to one. And remember, x 1,x 2x_1, x_2 are any nonnegative numbers with x 1 2/x 2=α/βx_1^2/x_2 = \alpha/\beta.

So, using the Anderson-Craciun-Kurtz theorem, we get an explicit closed-form solution of the horrid equation

α(m+2)(m+1)ψ m+2,n1αm(m1)ψ m,n+β(n+1)ψ m2,n+1βnψ m,n=0 \alpha (m+2)(m+1)\psi_{m+2,n-1}- \alpha m(m-1) \psi_{m,n} + \beta (n+1) \psi_{m-2,n+1} - \beta n \psi_{m,n} = 0

with very little work! This is why this theorem is so nice. And of course we’re looking at a very simple reaction network: for more complicated ones it becomes even better to use this theorem to avoid painful calculations.

category: blog