The Azimuth Project
Price Equation derivation and gripes

This page has been superseded by my thesis.


Biological evolution is the change over generations of the genetic composition of populations due to natural factors, typically including significant randomness. Describing this mathematically, and developing quantitative tools to predict what might evolve under which conditions, is a great challenge. One place to begin is by describing, in a nice way, a population’s change in genetic character from one generation to the next. By “a nice way”, we mean that we’d like to be able to attribute changes to the appropriate influences. What changes are due to random mutations creating new variations, for example, and what changes are due to natural selection winnowing out varieties which cannot survive in their environment?

We can make a crude measure of a population’s genetic composition by counting how many organisms in the population have a certain gene of interest. We can express this amount as a fraction of the total population, saying, for example, “The frequency of gene A in this population is 0.22.”


In this section, we use the notation of van Veelen (2005), which unfortunately does not seem to be freely available online.

We consider two populations, $S_1$ and $S_2$. All the offspring of organisms in $S_1$ belong to $S_2$, and all the parents of organisms in $S_2$ are in $S_1$. We write $N$ for the size of population $S_1$. For an individual $i \in S_1$, the frequency of gene A is

$$q_i = \frac{g_i}{l_z},$$

where $g_i$ is the number of A-type genes carried by individual $i$ and $l_z$ is the zygotic ploidy. The frequency of gene A in population $S_1$ is

$$Q_1 = \frac{\sum_{i\in S_1} g_i}{l_z N} = \frac{\sum_i q_i}{N}.$$
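As a small illustration of these two definitions, the following sketch computes the individual frequencies and the population frequency from a list of gene counts. The counts, the choice of a diploid population, and the function names are hypothetical and serve only as an example.

```python
def individual_frequencies(g, l_z):
    """q_i = g_i / l_z for each individual in S_1."""
    return [g_i / l_z for g_i in g]


def population_frequency(g, l_z):
    """Q_1 = (sum of g_i) / (l_z * N)."""
    return sum(g) / (l_z * len(g))


# Hypothetical diploid population (l_z = 2) of N = 5 individuals;
# g[i] is the number of A-type genes carried by individual i.
g = [2, 1, 0, 1, 0]
l_z = 2

q = individual_frequencies(g, l_z)
Q1 = population_frequency(g, l_z)
print(q, Q1)
```

Averaging the entries of `q` gives the same number as `Q1`, which is the second equality in the definition of $Q_1$.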

We want to relate $Q_2$ and $Q_1$. One simple way to do so is to take their difference:

$$\Delta Q = Q_2 - Q_1.$$

We can write $Q_2$ as

$$Q_2 = \frac{\sum_i g_i'}{l_g\sum_i z_i},$$

where $l_g$ is the gametic ploidy, $z_i$ is the number of successful gametes from individual $i$, and $g_i'$ is the number of A-type genes in the set of all successful gametes from individual $i$. The proportion of A-type genes in that set is

$$q_i' = \frac{g_i'}{z_i l_g}.$$

From this,

$$Q_2 = \frac{\sum_i z_i l_g q_i'}{l_g \sum_i z_i} = \frac{\sum_i z_i q_i'}{\sum_i z_i}.$$

Substituting this and the expression for $Q_1$ into $\Delta Q = Q_2 - Q_1$ gives

$$\Delta Q = \frac{\sum_i z_i q_i'}{\sum_i z_i} - \frac{1}{N}\sum_i q_i.$$

We’d like our expression for the change in $Q$ to be written in terms of the changes in the individual $q_i$, so we subtract and add the term $\sum_i z_i q_i / \sum_i z_i$:

$$\Delta Q = \frac{\sum_i z_i (q_i' - q_i)}{\sum_i z_i} + \frac{\sum_i z_i q_i}{\sum_i z_i} - \frac{1}{N}\sum_i q_i.$$

Next, we gather the last two terms over a common denominator:

$$\Delta Q = \frac{\sum_i z_i (q_i' - q_i)}{\sum_i z_i} + \frac{\sum_i z_i q_i - \frac{1}{N}\sum_i q_i\sum_j z_j}{\sum_i z_i}.$$

Now, we factor an $N$ out of the latter term:

$$\Delta Q = \frac{\sum_i z_i (q_i' - q_i)}{\sum_i z_i} + \frac{N}{\sum_i z_i} \left[\frac{1}{N}\sum_i z_i q_i - \frac{1}{N^2}\sum_i q_i\sum_j z_j\right].$$

We rearrange this just a bit to yield the Price Equation:

$$\boxed{\Delta Q = \frac{N}{\sum_i z_i} \left[\frac{1}{N}\sum_i z_i q_i - \left(\frac{1}{N}\sum_i q_i\right)\left(\frac{1}{N}\sum_j z_j\right)\right] + \frac{\sum_i z_i (q_i' - q_i)}{\sum_i z_i}.}$$

This is just an algebraic identity: we took the compositions of the two populations as given, and we wrote a fancy expression for the change of gene frequency between them. We have not said anything about dynamics from which this change could be derived, nor have we made any claims about what changes are more probable than others.
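For comparison with other presentations, the identity can be rewritten in the covariance form in which the Price Equation is usually quoted. The shorthand $\bar{z}$, $\operatorname{Cov}$ and $\operatorname{E}$ below is an assumption of this sketch and does not appear in the derivation above. These symbols are plain averages over the list of individuals in $S_1$, not statements about an underlying probability model.

```latex
% Assumed shorthand: averages over the N individuals of S_1
\bar{z} = \frac{1}{N}\sum_i z_i,
\qquad
\operatorname{Cov}(z, q) = \frac{1}{N}\sum_i z_i q_i
  - \left(\frac{1}{N}\sum_i q_i\right)\left(\frac{1}{N}\sum_j z_j\right),
\qquad
\operatorname{E}(z\,\Delta q) = \frac{1}{N}\sum_i z_i (q_i' - q_i).

% Since N / \sum_i z_i = 1 / \bar{z}, the identity becomes
\Delta Q = \frac{\operatorname{Cov}(z, q)}{\bar{z}}
         + \frac{\operatorname{E}(z\,\Delta q)}{\bar{z}}.
```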

Van Veelen et al. (2012) make the point in the following way:
[W]hat is most important is that we realize that the numerical input of the Price equation is a list of numbers. It is a list that concerns two generations, and which tracks who is whose offspring. But whatever it reflects, it is crucial to realize that the point of departure is nothing but a list of numbers. This list of numbers is used twice. First we use it to compute the frequencies of the gene under consideration in generations 1 and 2, respectively, and subtract the latter from the former. This amounts to the change in gene frequency. Then we use the same list to compute a few other, slightly more complex quantities. The essence of the Price equation is that these quantities also add up to the change in gene frequency. One way of computing the change in frequency therefore can be rewritten as the other and vice versa. What they are, therefore, is nothing but two equivalent ways to compute the change in gene frequency, given a list of numbers concerning genes in two subsequent generations [$\ldots$] Whether this particular second generation is likely to follow the first or not, the two ways of computing the change in frequency return the same number.
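The point of the quotation can be checked numerically. The sketch below takes an arbitrary list of numbers and computes the change in gene frequency in both ways. The function name `price_terms`, the random data, and the ploidies $l_z = 2$, $l_g = 1$ are assumptions made for the illustration. No evolutionary dynamics are modelled.

```python
import random


def price_terms(q, z, q_prime):
    """Return (direct, first_term, second_term) for one parent generation.

    q[i]       : frequency of gene A in parent i
    z[i]       : number of successful gametes from parent i
    q_prime[i] : frequency of gene A among those gametes
    """
    n = len(q)
    total_z = sum(z)
    if total_z == 0:
        raise ValueError("no successful gametes, so Q_2 is undefined")

    # First way: compute Q_1 and Q_2 directly and take the difference.
    Q1 = sum(q) / n
    Q2 = sum(zi * qpi for zi, qpi in zip(z, q_prime)) / total_z
    direct = Q2 - Q1

    # Second way: the two terms on the right-hand side of the identity.
    mean_zq = sum(zi * qi for zi, qi in zip(z, q)) / n
    mean_q = sum(q) / n
    mean_z = total_z / n
    first_term = (n / total_z) * (mean_zq - mean_q * mean_z)
    second_term = sum(
        zi * (qpi - qi) for zi, qi, qpi in zip(z, q, q_prime)
    ) / total_z
    return direct, first_term, second_term


# Hypothetical data: an arbitrary list of numbers with no dynamics behind it.
rng = random.Random(0)
n, l_z, l_g = 50, 2, 1
q = [rng.randint(0, l_z) / l_z for _ in range(n)]
z = [rng.randint(0, 4) for _ in range(n)]
z[0] = max(z[0], 1)  # guarantee at least one successful gamete overall
# q_i' is undefined when z_i = 0; any placeholder works there,
# because q_i' only ever appears multiplied by z_i.
q_prime = [
    rng.randint(0, zi * l_g) / (zi * l_g) if zi > 0 else 0.0 for zi in z
]

direct, first_term, second_term = price_terms(q, z, q_prime)
print(direct, first_term + second_term)
```

The two printed numbers agree up to floating-point rounding whatever data are supplied, which is exactly why the identity alone carries no information about which second generation is likely.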

To make a physics analogy, what we have done is like starting with Newton’s second law, $\vec{F} = m\vec{a}$, and writing it as

$$m\vec{a} = m\vec{a}.$$

We could then rewrite the $\vec{a}$ vectors in some elaborate way. For example, we could write one side of the equation in Cartesian coordinates and the other in spherical coordinates, giving some complicated formulas involving trigonometric functions all over the place. These formulas would be true, in the sense that Euclidean geometry is true, but they would contain no physics. In some circumstances, they might be useful, but we could not wring value out of them without some extra assumptions about the dynamics at work.

From the Price Identity to Hamilton’s Rule

(leaving in telegraphic form for now)

Donors of effort increase the number of successful gametes produced by the recipient, at the expense of their own. We parameterize this in the following way: denote by $c$ a donor’s decrease in successful gametes of its own, and denote by $b$ the increase in successful gametes of the recipient. We idealize interactions as pairwise events, and so we keep track of them using matrices. The first index, $i$, denotes an individual in population $S_1$. The second index, $\alpha$, ranges over the occasions on which interactions can take place.

$$\Delta Q = \frac{N}{\sum_i z_i} \left[\frac{1}{N}\sum_i z_i q_i - \left(\frac{1}{N}\sum_i q_i\right)\left(\frac{1}{N}\sum_j z_j\right)\right] + \frac{\sum_i z_i (q_i' - q_i)}{\sum_i z_i}.$$

We can be slightly more general and allow each individual to have their own ploidy, $l_i$. So, instead of using the population size $N$, we use $\sum_i l_i$. Following the literature, we calculate the number of successful gametes per haploid set, $w_i = z_i / l_i$.

$$\Delta Q = \frac{\sum_i l_i}{\sum_i l_i w_i} \left[\frac{\sum_i l_i w_i q_i}{\sum_i l_i} - \left(\frac{\sum_i l_i w_i}{\sum_i l_i}\right)\left(\frac{\sum_i l_i q_i}{\sum_i l_i}\right)\right] + \frac{\sum_i l_i w_i (q_i' - q_i)}{\sum_i l_i w_i}.$$

We now make two assumptions:

  1. The second term in this form of the Price identity is negligible.
  2. The fitnesses $z_i$ can be written
    $$z_i = l_i w_i = f_i + b \sum_\alpha S_{i\alpha} - c\sum_\alpha Q_{i\alpha}.$$

    Here, $\sum_\alpha S_{i\alpha}$ is the total number of times individual $i$ received a benefit, and $\sum_\alpha Q_{i\alpha}$ is the number of times individual $i$ incurred a cost.

We introduce the abbreviation

$$\bar{q} = \frac{\sum_i l_i q_i}{\sum_i l_i}.$$

Dropping the last term of $\Delta Q$, the remaining term simplifies to

$$\Delta Q = \frac{\sum_i l_i w_i (q_i - \bar{q})}{\sum_i l_i w_i}.$$

Substituting in our chosen form for $l_i w_i$, and assuming that the $f_i$ term contributes nothing, that is, $\sum_i f_i (q_i - \bar{q}) = 0$ (which holds, for example, when $f_i$ is proportional to $l_i$), we arrive after some algebra at the following:

$$\Delta Q = \left(\frac{\sum_{i,\alpha} Q_{i\alpha}(q_i - \bar{q})}{\sum_i l_i w_i}\right) \left[\left(\frac{\sum_{i,\alpha} S_{i\alpha}(q_i - \bar{q})}{\sum_{i,\alpha} Q_{i\alpha}(q_i - \bar{q})}\right) b - c \right].$$

The quantity in square brackets has the form of Hamilton's condition, if we identify the quotient multiplying $b$ as a measure of assortment:

$$r = \frac{\sum_{i,\alpha} S_{i\alpha}(q_i - \bar{q})}{\sum_{i,\alpha} Q_{i\alpha}(q_i - \bar{q})}.$$
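The algebra leading to this form can be checked on a small hand-made example. The sketch below builds the matrices $S_{i\alpha}$ and $Q_{i\alpha}$ from a list of (donor, recipient) pairs, computes the first term of the ploidy-weighted Price identity, and compares it with the factored expression involving $r b - c$. The data, the function name `hamilton_check`, and the choice $f_i = f_0 l_i$ (which makes the $f_i$ contribution vanish) are assumptions of the illustration.

```python
def hamilton_check(l, q, pairs, b, c, f0):
    """Return (price_first_term, hamilton_form, r) for a toy population.

    l[i]  : ploidy of individual i
    q[i]  : frequency of gene A in individual i
    pairs : one (donor, recipient) tuple per interaction occasion alpha
    f0    : assumed baseline gametes per haploid set, so f_i = f0 * l[i]
    """
    n, m = len(l), len(pairs)
    S = [[0] * m for _ in range(n)]  # S[i][alpha] = 1 if i received a benefit
    Q = [[0] * m for _ in range(n)]  # Q[i][alpha] = 1 if i incurred a cost
    for alpha, (donor, recipient) in enumerate(pairs):
        Q[donor][alpha] = 1
        S[recipient][alpha] = 1

    # Assumption 2: z_i = l_i w_i = f_i + b * sum_alpha S - c * sum_alpha Q.
    z = [f0 * l[i] + b * sum(S[i]) - c * sum(Q[i]) for i in range(n)]
    total_l, total_z = sum(l), sum(z)
    q_bar = sum(li * qi for li, qi in zip(l, q)) / total_l

    # First term of the ploidy-weighted Price identity.
    mean_zq = sum(zi * qi for zi, qi in zip(z, q)) / total_l
    price_first_term = (total_l / total_z) * (
        mean_zq - (total_z / total_l) * q_bar
    )

    # Factored form: (sum Q (q - q_bar) / sum l w) * (r * b - c).
    cost_sum = sum(sum(Q[i]) * (q[i] - q_bar) for i in range(n))
    benefit_sum = sum(sum(S[i]) * (q[i] - q_bar) for i in range(n))
    r = benefit_sum / cost_sum  # requires cost_sum != 0
    hamilton_form = (cost_sum / total_z) * (r * b - c)
    return price_first_term, hamilton_form, r


# Hypothetical population: three diploids and one haploid.
l = [2, 2, 2, 1]
q = [1.0, 0.5, 0.0, 1.0]
pairs = [(0, 1), (0, 3), (1, 2)]  # (donor, recipient) on each occasion

price_first_term, hamilton_form, r = hamilton_check(
    l, q, pairs, b=2.0, c=1.0, f0=3.0
)
print(price_first_term, hamilton_form, r)
```

If $f_i$ is instead chosen to covary with $q_i$, the two printed values generally differ, which shows that the factored form relies on the baseline term dropping out.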


category: blog