The Azimuth Project
Power law

Idea

A dependent variable is said to follow a power law if it has the form $C v^\beta$, where $v$ is the independent variable and $C$ and $\beta$ are constants. The term is unfortunately also sometimes used as an abbreviation of “power law distribution”, i.e., a probability distribution which follows a power law. It is also sometimes used when what is really meant is just a heavy-tailed distribution.

Details

The basic form of a power law relating some variable $v$ to an observed quantity $A(v)$ is

$$A(v) = C v^\beta$$

where, in practice, when modelling real data the equality becomes an approximation. One particularly common variety of power law is where $A$ denotes frequency of occurrence. This is one model which has the property that “bigger things are rarer” (for $\beta < 0$) whilst retaining significant mass in the tail. Another property is scale invariance: the ratio of the $A$ values for a pair of $v$ values depends only on the ratio of the $v$ values and not on their exact values.
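The scale invariance property follows directly from the basic form. As a short worked derivation, where the rescaling factor $k > 0$ and the points $v_1$, $v_2$ are symbols introduced here purely for illustration:

```latex
\[
\frac{A(k v_1)}{A(k v_2)}
  = \frac{C (k v_1)^\beta}{C (k v_2)^\beta}
  = \left(\frac{v_1}{v_2}\right)^{\beta}
  = \frac{A(v_1)}{A(v_2)}
\]
```

The factor $k$ cancels, so rescaling both $v$ values by the same amount leaves the ratio of the $A$ values unchanged.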

Examples of kinds of curves corresponding to power laws are shown below (with a linear function for comparison).

[Figure: plot of typical power law curves]

A power law distribution contrasts with other models such as a Gaussian distribution (which has a large proportion of the population close to the mean) or a uniform distribution (where any value in the distribution is equally “typical”).

Generating processes

One reason it is interesting to know whether a function is a power law is that various sophisticated processes from physics and mathematics are known to generate power laws. Thus finding a power law in a new field may motivate the development of new causal models using elements from these known models. However, there are two important caveats:

  1. Many quite different processes generate power laws.

  2. From a prioritization perspective, it is important to be sure that the empirical fit obtained is strong statistical evidence that the data really follow a power law (see the section Finding/validating power laws).

Arguably, it is more promising to start with a generating process that can be validated directly and then determine if this process gives rise to a power law.

Similarly, if the only “use” of the fitted power law is to argue that it is important to know the distribution is heavy tailed, this can be inferred directly without the fitted function as an intermediary. Likewise, if the function is to be used for black-box prediction, it is arguable that non-parametric density estimation is a better approach.

Finding/validating power laws

Many quantities arising from complex processes, such as ecological ones, can appear to empirically fit power laws reasonably well. However, it is important to note that many other analytic characterisations generate sample data which produce the same kind of curve, particularly when dealing with discrete phenomena with an upper-bound cut-off on the samples obtained. Rigorously validating a “proposed” power law is therefore difficult.

In particular, it appears that some of the fitting procedures used in the literature are very problematic. After converting the data into the double logarithmic $\log A$-$\log v$ representation, these are:

  1. Data binning to reduce the data volume: Although binning at this stage is slightly better than binning before log-transformation, considering data set sizes and modern computing power there is no need to do this, and it adds significantly to the error in the results. In addition, doing this “artificially inflates” the $R^2$ measure of the fit.

  2. Using some variety of least squares to fit a line through the remaining points: This is very dubious since almost all the modelling assumptions of least squares fitting do not apply in this set-up. Even when the data is generated by a power law, poor parameter estimates result.

  3. Using $R^2$ as a goodness of fit criterion for the power law: Other measures of the goodness of fit of the fitted power law against the original data are much more discriminating than the $R^2$ value of the least squares line fit.
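For orientation, the following minimal sketch shows why the double logarithmic representation is tempting: on noise-free data generated exactly by $A(v) = C v^\beta$, the points lie on a straight line with slope $\beta$ and intercept $\log C$. The function name `loglog_line_fit` and the values of `C` and `beta` are illustrative assumptions. With real sampled data the objections listed above apply, so this is not a recommended fitting method.

```python
import math


def loglog_line_fit(vs, As):
    """Ordinary least-squares line through the points (log v, log A).

    Returns (slope, intercept). Illustrative helper, not a recommended estimator.
    """
    xs = [math.log(v) for v in vs]
    ys = [math.log(a) for a in As]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Assumed, noise-free example values (not taken from any data set).
C, beta = 3.0, -1.5
vs = [1.0, 2.0, 4.0, 8.0, 16.0]
As = [C * v ** beta for v in vs]

slope, intercept = loglog_line_fit(vs, As)
# For exact power-law data the slope recovers beta and exp(intercept) recovers C,
# up to floating-point rounding.
```

The exact recovery here depends entirely on the data being noise free. It says nothing about how the procedure behaves on binned or sampled data.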

In contrast to this approach, better statistical fitting techniques exist, based upon applying a maximum likelihood estimator to the full dataset. These are perfectly feasible computationally and have been known for 60 years. There are also techniques for more convincing comparison against other simple candidate distributions (e.g., log-normal) to determine which is the better fit.
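As a minimal sketch of the maximum likelihood approach for the continuous case, assume a power law density $p(x) \propto x^{-\alpha}$ for $x \ge x_{\min}$ with $\alpha > 1$. In the notation above this corresponds to $\beta = -\alpha$. The standard estimator is then $\hat\alpha = 1 + n / \sum_i \ln(x_i / x_{\min})$. The function names, the sample size, and the choice to treat `x_min` as known are assumptions made for illustration; in practice the lower cut-off usually has to be estimated as well.

```python
import math
import random


def sample_power_law(n, alpha, x_min, rng):
    """Draw n samples from p(x) proportional to x**(-alpha), x >= x_min (alpha > 1),
    using inverse transform sampling."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def mle_exponent(xs, x_min):
    """Maximum likelihood estimate of alpha for a continuous power law tail.

    Uses every observation >= x_min, with no binning. Returns (alpha_hat, std_err),
    where std_err is the usual large-sample approximation (alpha_hat - 1) / sqrt(n).
    """
    tail = [x for x in xs if x >= x_min]
    n = len(tail)
    alpha_hat = 1.0 + n / sum(math.log(x / x_min) for x in tail)
    std_err = (alpha_hat - 1.0) / math.sqrt(n)
    return alpha_hat, std_err


# Synthetic data with an assumed true exponent, seeded for reproducibility.
rng = random.Random(0)
xs = sample_power_law(50_000, alpha=2.5, x_min=1.0, rng=rng)
alpha_hat, std_err = mle_exponent(xs, x_min=1.0)
print(f"alpha_hat = {alpha_hat:.3f} +/- {std_err:.3f}")
```

The estimator works on the raw observations, so no information is lost to binning, and the cost is a single pass over the data.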

The referenced work of Clauset, Shalizi and Newman describes both these problems and their solutions. It also analyses 24 instances of claimed power laws in the scientific literature and finds truly compelling evidence for a power law in only one case.

Examples

Some typical examples of properties which appear to be heavy tailed and have been claimed to be power laws are:

| $v$ | $A(v)$ |
|---|---|
| city population | frequency |
| monetary wealth | frequency |
| stellar mass | frequency |
| animal mass | metabolic rate |
| flying animal body mass | optimal cruising speed |

References