The Pareto front (or Pareto frontier) is a framework for partially evaluting a set of “actions” with multi-dimensional outputs assuming a very weak “desirability” partial ordering which only applies only when one processes is better (or at least as good) for all the outputs. It is useful for reducing a set of candidates prior to further analysis.
Suppose we’re trying to acheive something which is can be parameterised by a vector $\vec{v}=(v_0,\dots,v_N)$ of real-valued output variables, where for each variable a lower value is “better”. (Note that by negating “obvious variables” we can convert to a problem where for each variable a lower value is better than a larger value. Also clearly the whole framework can be used uniformly exchanging lower for higher in comparisons.)
As an example, consider a very, very simplified view of “improving the environment”. This might have the following variables
var | interpretation |
---|---|
$v_0$ | money expended |
$v_1$ | decrease in average human standard of living |
$v_2$ | expected number of tornadoes |
$v_3$ | number of species going extinct |
… | … |
For any particular action we can assign it a value for each of these variables (or at least a value estimated from a model).
It would be generally agreed that for each variable a lower value is better, but the framework hasn’t assigned relative importances to each variable (e.g., we haven’t specified how much money it is worth spending to save 1 species, or even that this value is constant as the number of species saved increases). Just about the only ordering that can be done without further specification is
Then given a set $S$ of possible actions (which might be a process, a technology, etc) one can discard any actions which are strictly dominated by another element of $S$ to get a reduced set of actions $S'$ which is called the Pareto front. This name is inspired by the way that, in $N$-dimensions, the remaining points are the “outer shell” of $S$ with the discarded points on the interior. This is shown in $2$-D in the following plot:
(Image from wikipedia page referenced below.) Another description of the Pareto front is as the set of minimal elements under the partial order $\prec$.
The intuitive meaning of the Pareto front is that elements which are not on the front are never the best choice, because there’s some element on the front which is at least as good on every variable. In contrast, elements which are on the front could be the best choice, depending how the user decides to trade-off the weights of the various variables.
(Note that this page describes a small subset of the ideas and mathematics based on Pareto efficiency which are relevant to Azimuth. Further details can be found on the wikipedia page.)
The obvious way of approaching dealing with multidimensional outputs would be to define a scalar score function $s:\mathbb{R}^N \rightarrow \mathbb{R}$ and then use a standard optimisation technique to find the best element of $S$ (which is possibly non-unique). This has two problems:
If multiple interested parties are performing the analysis, they are likely to disagree on the precise form of $s$, blocking any progress.
Even with only one analyser, it is often difficult to formalise scalar mapping that reflects all the different “intuitive values”. In particular, experience in other fields of use is that often the optimal $s(\vec{v})$ will be at some extreme point of the space that is, upon reflection of the “intuitive values”, actually a poor choice. For example, given a numerical weighting between money saved and species saved, a scalar optimisation function may have a minimum of no money saved and no species saved. In contrast, the Pareto front gives a reduced set of possible candidates which can be analysed further.
Visualising the Pareto front is obviously very informative, but for greater than 3 dimensions this isn’t directly possible. Are there good ways to project to a viewable dimension preserving useful features?
The Pareto front is one way to make “partial decisions” based on weak ordering: are there other ways that are also useful for decision making.
There are other mathematical optimisation problems, e.g., linear programming, with other attractive properties. Are there any connections/extensions to other problems?