Blog - Quantum superposition

This page is a blog article in progress, written by Piotr Migdał. It is a continuation of the Quantum network theory series at Azimuth Blog. To read discussions of this post as it was being written, visit the Azimuth Forum. To see the final polished article, go to the Azimuth Blog.

This post introduces quantum superposition and the role of its coherence. A following post will introduce (classical) community detection and how it can be applied to cluster “quantum” subsystems of a large system, for instance a light-harvesting complex.

For classical community detection:

- I will introduce the idea (it will be even easier, and more visual, than this quantum intro),
- motivate Newman’s modularity,
- show some examples,
- provide an interactive simulation (a very preliminary version here).

For quantum community detection (based on arXiv:1310.663, in PRX):

- introduce coherence between sites as a tool for community detection,
- show that it is something qualitatively different from classical community detection (e.g. not only links, but also phases matter),
- show LHCII as an example.

To comment on this blog post series, go to the discussion on the Azimuth Forum.

*guest post by Piotr Migdał*

In this blog post I will introduce some basics of quantum mechanics, with the emphasis on why a particle being in a few places at once behaves measurably differently from a particle whose position we just don’t know. It’s a kind of continuation of the “Quantum Network Theory” series (Part 1, Part 2) by Tomi Johnson about our work in Jake Biamonte’s group at the ISI Foundation in Turin. My goal is to explain quantum community detection. Before that, I need to introduce the relevant basics of quantum mechanics, and of classical community detection.

But before I start, let me introduce myself, as it’s my first post to Azimuth.

I just finished my quantum optics theory Ph.D. in Maciej Lewenstein’s group at The Institute of Photonic Sciences in Castelldefels, a beach near Barcelona. My scientific interests range from quantum physics, through complex networks, to data-driven approaches... to pretty much anything, and now I work as a data science freelancer. I do enjoy doing data visualizations (for example of relations between topics in mathematics), I am a big fan of Rényi entropy (largely thanks to Azimuth) and a believer in open science. If you think that there are too many off-topic side projects, you are absolutely right!

In my opinion, quantum mechanics is easy. Based on my gifted education experience it takes roughly 9 intense hours to introduce entanglement to students having only a very basic linear algebra background. Even more - I believe that it is possible to get familiar with quantum mechanics only by playing with it - so I am developing a Quantum Game!

In quantum mechanics a particle can be in a few places at once. It sounds strange. So strange, that some pioneers of quantum mechanics (including, famously, Albert Einstein) didn’t want to believe in it: not because of any disagreement with experiment, not because of any lack of mathematical beauty, just because it didn’t fit their philosophical view of physics.

It went further: in the Soviet Union the idea that an electron can be in many places (resonance bonds) was considered to oppose materialism. Later, in California, hippies investigated quantum mechanics as a basis for parapsychology, which, arguably, gave birth to the field of quantum information.

As Griffiths put it in his *Introduction to Quantum Mechanics* (Chapter 4.4.1):

To the layman, the philosopher, or the classical physicist, a statement of the form “this particle doesn’t have a well-defined position”

`[...]`

sounds vague, incompetent, or (worst of all) profound. It is none of these.

In this guest blog post I will try to show that not only can a particle be in many places at once, but also that if it were not in many places at once then it would cause problems. That is, phenomena as fundamental as atoms forming chemical bonds, or a particle moving in a vacuum, require it.

As in many other cases, the simplest non-trivial case is perfect for explaining the idea, as it covers the most important phenomena, while being easy to analyze, visualize and comprehend. Quantum mechanics is not an exception, so let us start with a system of two states.

Let us study a simplified model of the hydrogen molecular ion $H_2^+$, that is, a system of two protons and one electron (see Feynman Lectures on Physics, Vol. III, Chapter 10.1). Since the protons are heavy and slow, we treat them as fixed. We focus on the electron moving in the electric field created by protons.

In quantum mechanics each possible configuration is weighted by a complex number called a ‘probability amplitude’. In particular, the state of the electron can be described as a complex, two-dimensional vector:

$|\psi\rangle =
\begin{bmatrix}
\alpha \\ \beta
\end{bmatrix}.$

Note that

$|\psi\rangle = \alpha \begin{bmatrix}
1 \\ 0
\end{bmatrix} + \beta \begin{bmatrix}
0 \\ 1
\end{bmatrix}.$

So, we say the electron is in a ‘linear combination’ or ‘superposition’ of the two states

$|1\rangle =
\begin{bmatrix}
1 \\ 0
\end{bmatrix},$

(where it’s near the first proton) and the state

$|2\rangle =
\begin{bmatrix}
0 \\ 1
\end{bmatrix}.$

(where it’s near the second proton). The complex numbers $\alpha$ and $\beta$ are amplitudes of being in the respective states.

Why do we denote unit vectors in strange brackets looking like

$| something \rangle$ ?

Well, this is called Dirac notation (or bra-ket notation) and it is immensely useful in quantum mechanics. We won’t go into it in detail here; merely note that $| \cdot \rangle$ stands for a column vector and $\langle \cdot |$ stands for a row vector, while $\psi$ is a traditional symbol for a quantum state.

Amplitudes can be thought of as ‘square roots’ of probabilities. We can force the electron to localize by performing a classical measurement, for example by moving the protons apart and measuring which of them has neutral charge (because it is bound to the electron). Then, we get probability $|\alpha|^2$ of finding it near the first proton and $|\beta|^2$ of finding it near the second. So, we require that

$|\alpha|^2 + |\beta|^2 = 1.$

Note that as amplitudes are complex, for a given probability there are many possible amplitudes. For example

$1 = |1|^2 = |-1|^2 = |i|^2 = \left| \tfrac{1+i}{\sqrt{2}} \right|^2 = ...,$

where $i$ is the imaginary unit (fulfilling $i^2 = -1$).
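The statement is easy to check numerically. Here is a minimal sketch in Python with NumPy (the language choice and the variable names are assumptions made for illustration, not part of the original text):

```python
import numpy as np

# Four different complex amplitudes that all correspond to probability 1.
amplitudes = [1, -1, 1j, (1 + 1j) / np.sqrt(2)]
probabilities = [abs(a) ** 2 for a in amplitudes]

# A normalized two-state vector with equal probabilities and a relative phase.
alpha, beta = 1 / np.sqrt(2), 1j / np.sqrt(2)
total = abs(alpha) ** 2 + abs(beta) ** 2

print(probabilities)  # each entry equals 1 up to rounding
print(total)          # normalization: |alpha|^2 + |beta|^2
```

The probabilities do not distinguish these amplitudes; only the phase does, which is why the phase will matter for the energy.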

We will now show that the electron ‘wants’ to be spread out. Electrons don’t really have desires, so this is physics slang for saying that the electron will have less energy if its probability of being near the first proton is equal to its probability of being near the second proton: namely, 50%.

In quantum mechanics, a Hamiltonian is a matrix that describes the relation between the energy and evolution (i.e. how the state changes in time). The energy of any state $| \psi \rangle$ is

$E = \langle \psi | H | \psi \rangle,$

where $\langle \psi |$ is $| \psi\rangle$ after transposition and complex conjugation (i.e. changing the imaginary unit $i$ into $-i$).

For the electron in the $H_2^+$ molecule the Hamiltonian can be written as the following $2 \times 2$ matrix with real, positive entries:

$H =
\begin{bmatrix}
E_0 & \Delta \\
\Delta & E_0
\end{bmatrix},$

where $E_0$ is the energy of the electron being either in state $|1\rangle$ or state $|2\rangle$, and $\Delta$ is the ‘tunneling amplitude’ (meaning that it describes how easy it is for the electron to move from the neighborhood of one proton to that of the other).

The energy of a given state is given by a quantity called the ‘expectation value’ of $H$ on the electron state $|\psi\rangle$:

$E = \langle \psi | H | \psi \rangle \equiv
\begin{bmatrix}
\alpha^* & \beta^*
\end{bmatrix}
\begin{bmatrix}
E_0 & \Delta \\
\Delta & E_0
\end{bmatrix}
\begin{bmatrix}
\alpha \\ \beta
\end{bmatrix}.$

The star symbol denotes the complex conjugation. If you are unfamiliar with complex numbers, just work with real numbers on which this operation does nothing.
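The expectation value can be evaluated directly as a row vector times a matrix times a column vector. The sketch below assumes Python with NumPy; the values of `E0` and `DELTA` and the function name `energy` are illustrative choices, not taken from the text.

```python
import numpy as np

E0, DELTA = 1.0, 0.25  # illustrative numbers only

H = np.array([[E0, DELTA],
              [DELTA, E0]])

def energy(psi, hamiltonian):
    """Return <psi|H|psi> for a normalized state vector psi."""
    psi = np.asarray(psi, dtype=complex)
    # np.vdot conjugates its first argument, which plays the role of <psi|.
    return np.vdot(psi, hamiltonian @ psi).real

# Same probabilities (0.36 and 0.64), different relative phase.
energy_real_phase = energy([0.6, 0.8], H)
energy_imag_phase = energy([0.6, 0.8j], H)
print(energy_real_phase, energy_imag_phase)
```

The two states give the same measurement probabilities, yet their energies differ, because the term proportional to $\Delta$ depends on the relative phase of $\alpha$ and $\beta$.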

**Exercise 1.** Find $\alpha$ and $\beta$ with

$|\alpha|^2 + |\beta|^2 = 1$

that minimize or maximize the expectation value of energy $\langle \psi | H | \psi \rangle$ for

$|\psi\rangle =
\begin{bmatrix}
\alpha \\ \beta
\end{bmatrix}.$

**Exercise 2.** What’s the expectation value of the energy for the states $| 1 \rangle$ and $| 2 \rangle$?

Or if you are lazy, just read the answer! It is straightforward to check that

$E = (\alpha^* \alpha + \beta^* \beta) E_0 + (\alpha^* \beta + \beta^* \alpha) \Delta.$

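The coefficient of $E_0$ is 1 by normalization. The coefficient of $\Delta$ is $\alpha^* \beta + \beta^* \alpha = 2\,\mathrm{Re}(\alpha^* \beta)$, which lies between $-1$ and $1$ because $2|\alpha||\beta| \leq |\alpha|^2 + |\beta|^2 = 1$. So the minimal energy is $E_0 - \Delta$ and the maximal one is $E_0 + \Delta$. The states achieving these energies are spread out: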

$| \psi_- \rangle =
\begin{bmatrix}
1/\sqrt{2} \\ -1/\sqrt{2}
\end{bmatrix},
\quad \text{with} \quad
\quad E = E_0 - \Delta,$

and

$| \psi_+ \rangle =
\begin{bmatrix}
1/\sqrt{2} \\ 1/\sqrt{2}
\end{bmatrix},
\quad \text{with} \quad
\quad E = E_0 + \Delta.$

The energies of these states are below and above the energies of being in one state $E_0$, respectively, and $\Delta$ says how much.
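The same answer can be obtained by diagonalizing $H$ numerically. This is a sketch assuming Python with NumPy and illustrative values for `E0` and `DELTA`; note that a numerical eigenvector is only defined up to an overall sign, so the check looks at relative signs.

```python
import numpy as np

E0, DELTA = 1.0, 0.25  # illustrative numbers only
H = np.array([[E0, DELTA],
              [DELTA, E0]])

# eigh is intended for Hermitian matrices; eigenvalues are returned in ascending order.
energies, states = np.linalg.eigh(H)

ground = states[:, 0]   # eigenvector for the lowest energy
excited = states[:, 1]  # eigenvector for the highest energy

print(energies)
print(ground, excited)
```

In both eigenvectors the components have magnitude $1/\sqrt{2}$; in the lower-energy one they have opposite signs, in the higher-energy one the same sign.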

So, the electron is ‘happier’ (electrons don’t have moods either) to be in the state $|\psi_-\rangle$ than to be localized near only one of the protons. In other words (and this is Chemistry 101), atoms like to share electrons, and this bonds them. Also, they like to share electrons in a particular and symmetric way.

For reference, the state $|\psi_+ \rangle$ is called the antibonding state. If the electron is in this state, the atoms will be repelled from each other, and so much for the molecule!

How can we tell the difference between an electron being in a superposition of two states, and just not knowing its ‘real’ position? Well, first we need to devise a way to describe probabilistic mixtures.

It looks simple - if we have an electron in the state $|1\rangle$ or $|2\rangle$ with probability $1/2$ each, we may be tempted to write

$|\psi\rangle = \tfrac{1}{\sqrt{2}} |1\rangle + \tfrac{1}{\sqrt{2}} |2\rangle.$

As we are getting the right probabilities, it looks legit. But there is something strange about the energy. We have obtained the state $|\psi_+\rangle$, with $E=E_0+\Delta$, while mixing states, each with the energy $E=E_0$!

Moreover, we could have used different amplitudes with $|\alpha|^2=|\beta|^2=1/2$, each choice giving a different energy. So, we need a description that does not force us to guess amplitudes. All in all, we used quotation marks for ‘square roots’ for a reason!
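To see how much the guess matters, one can scan the relative phase while keeping both probabilities at $1/2$. This is a sketch assuming Python with NumPy; the parametrization $|\psi\rangle = (|1\rangle + e^{i\varphi}|2\rangle)/\sqrt{2}$ and the values of `E0` and `DELTA` are illustrative choices.

```python
import numpy as np

E0, DELTA = 1.0, 0.25  # illustrative numbers only
H = np.array([[E0, DELTA],
              [DELTA, E0]])

def energy_for_phase(phi):
    """Energy of (|1> + exp(i*phi)|2>)/sqrt(2); probabilities are 1/2 for every phi."""
    psi = np.array([1, np.exp(1j * phi)]) / np.sqrt(2)
    return np.vdot(psi, H @ psi).real

phases = np.linspace(0, 2 * np.pi, 9)
energies = np.array([energy_for_phase(phi) for phi in phases])

for phi, e in zip(phases, energies):
    print(f"phi = {phi:.3f}  E = {e:.4f}")
```

The energy follows $E_0 + \Delta \cos\varphi$, so it sweeps the whole range from $E_0 - \Delta$ to $E_0 + \Delta$ although the probabilities never change.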

It turns out that to describe statistical mixtures we can use density matrices.

For a state described by a vector (so-called pure state):

$| \psi \rangle =
\begin{bmatrix}
\alpha \\ \beta
\end{bmatrix}.$

we create a 2×2 matrix

$\rho = | \psi \rangle \langle \psi |
\equiv
\begin{bmatrix}
\alpha \alpha^* & \alpha \beta^*\\
\beta \alpha^* & \beta \beta^*
\end{bmatrix}.$

On the diagonal we get probabilities ($|\alpha|^2$ and $|\beta|^2$), whereas the off-diagonal terms ($\alpha \beta^*$ and its complex conjugate) are related to the presence of quantum effects. For example, for $|\psi_-\rangle$ we get

$\rho =
\begin{bmatrix}
1/2 & -1/2\\
-1/2 & 1/2
\end{bmatrix}.$

For an electron in the state $|1\rangle$ we get

$\rho =
\begin{bmatrix}
1 & 0\\
0 & 0
\end{bmatrix}.$

To calculate the energy, the recipe is the following:

$E = \mathrm{tr}[H \rho],$

where $\mathrm{tr}$ (trace) is the sum of the diagonal entries. That is, for an $n \times n$ square matrix with entries $A_{ij}$ its trace is $A_{11} + A_{22} + \ldots + A_{nn}$.

**Exercise 3.** Show that this formula for energy, and the previous one, give the same result on pure states.
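A numerical spot check (not a proof) of this exercise can look as follows. It is a sketch assuming Python with NumPy; the test state and the values of `E0` and `DELTA` are arbitrary illustrative choices.

```python
import numpy as np

E0, DELTA = 1.0, 0.25  # illustrative numbers only
H = np.array([[E0, DELTA],
              [DELTA, E0]])

# An arbitrary normalized pure state with a non-trivial relative phase.
psi = np.array([0.6, 0.8 * np.exp(0.3j)])

# Density matrix |psi><psi| as an outer product with the conjugated vector.
rho = np.outer(psi, psi.conj())

energy_from_vector = np.vdot(psi, H @ psi).real
energy_from_trace = np.trace(H @ rho).real

print(energy_from_vector, energy_from_trace)
```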

I advertised that density matrices allow us to mix quantum states. How do they do that? Very simple, just by adding density matrices, multiplied by the respective probabilities:

$\rho = p_1 \rho_1 + p_2 \rho_2 + \ldots + p_n \rho_n.$

It is exactly how we would mix probability vectors. Indeed, the diagonals are probability vectors!

So, let’s say that our co-worker was drunk and we are not sure if (s)he said that the state is $|\psi_-\rangle$ or $|1\rangle$. However, we think that the probabilities are $1/3$ and $2/3$, respectively. We get the density matrix:

$\rho =
\begin{bmatrix}
5/6 & -1/6\\
-1/6 & 1/6
\end{bmatrix}.$

So, how about its energy?

**Exercise 4.** Show that calculating the energy using the density matrix gives the same result as averaging the energy over the component pure states.
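For the drunk co-worker example, the mixing recipe and both ways of computing the energy can be sketched as below. This assumes Python with NumPy; the helper `density_matrix` and the values of `E0` and `DELTA` are illustrative, not from the text.

```python
import numpy as np

E0, DELTA = 1.0, 0.25  # illustrative numbers only
H = np.array([[E0, DELTA],
              [DELTA, E0]])

def density_matrix(psi):
    """Return |psi><psi| for a state vector psi."""
    psi = np.asarray(psi, dtype=complex)
    return np.outer(psi, psi.conj())

psi_minus = np.array([1, -1]) / np.sqrt(2)
state_1 = np.array([1, 0])
p_minus, p_1 = 1 / 3, 2 / 3

# Mix the density matrices with the respective probabilities.
rho = p_minus * density_matrix(psi_minus) + p_1 * density_matrix(state_1)

energy_from_rho = np.trace(H @ rho).real
energy_averaged = p_minus * (E0 - DELTA) + p_1 * E0

print(rho.real)
print(energy_from_rho, energy_averaged)
```

Both numbers equal $E_0 - \Delta/3$, which is one instance of the general statement in the exercise.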

I may have given the impression that the density matrix is an artificial thing, at best a practical trick, and that what we ‘really’ have are pure states (vectors), each with a given probability. If so, the next exercise is for you:

**Exercise 5.** Show that a 50%-50% mixture of $|1\rangle$ and $|2\rangle$ is the same as a 50%-50% mixture of $|\psi_+\rangle$ and $|\psi_-\rangle$.

This is different than statistical mechanics, or statistics, where we can always think about probability distributions as uniquely defined statistical mixtures of possible states. Here, as we see, it can be a bit more tricky.
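The two mixtures from Exercise 5 can be compared numerically. The sketch assumes Python with NumPy, and the helper name `density_matrix` is an illustrative choice.

```python
import numpy as np

def density_matrix(psi):
    """Return |psi><psi| for a state vector psi."""
    psi = np.asarray(psi, dtype=complex)
    return np.outer(psi, psi.conj())

state_1, state_2 = [1, 0], [0, 1]
psi_plus = np.array([1, 1]) / np.sqrt(2)
psi_minus = np.array([1, -1]) / np.sqrt(2)

# 50%-50% mixture of the two localized states.
rho_localized = 0.5 * density_matrix(state_1) + 0.5 * density_matrix(state_2)
# 50%-50% mixture of the two delocalized states.
rho_delocalized = 0.5 * density_matrix(psi_plus) + 0.5 * density_matrix(psi_minus)

print(rho_localized.real)
print(rho_delocalized.real)
```

Both come out as half of the identity matrix, so no measurement can tell which pair of pure states was mixed.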

As we said, for the diagonals things work as for classical probabilities. But there is more: at the same time as adding probabilities we also add the off-diagonal terms, which can add up or cancel out, depending on their signs. This is why mixing quantum states may make them lose their quantum properties.

The value of the off-diagonal term is related to so-called ‘coherence’ between the states $|1\rangle$ and $|2\rangle$. Its value is bounded by the respective probabilities:

$\left| \rho_{12} \right| \leq \sqrt{\rho_{11}\rho_{22}} = \sqrt{p_1 p_2},$

where for pure states we get equality.
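One way to put a number on this is the ratio $|\rho_{12}| / \sqrt{\rho_{11} \rho_{22}}$, which is 1 for pure states and smaller for mixtures. The sketch below assumes Python with NumPy; the function name `coherence_ratio` is a made-up label for this illustration, and the ratio is undefined when one of the probabilities is zero.

```python
import numpy as np

def coherence_ratio(rho):
    """Return |rho_12| / sqrt(rho_11 * rho_22); requires both diagonal entries > 0."""
    rho = np.asarray(rho, dtype=complex)
    return abs(rho[0, 1]) / np.sqrt(rho[0, 0].real * rho[1, 1].real)

# Pure state |psi_->, written as a density matrix.
rho_pure = np.array([[0.5, -0.5],
                     [-0.5, 0.5]])

# Mixture of |psi_-> (probability 1/3) and |1> (probability 2/3).
rho_mixed = np.array([[5 / 6, -1 / 6],
                      [-1 / 6, 1 / 6]])

ratio_pure = coherence_ratio(rho_pure)
ratio_mixed = coherence_ratio(rho_mixed)
print(ratio_pure, ratio_mixed)
```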

If the value is zero, there are no quantum effects between the two positions: this means that the electron is sure to be at one place or the other, though we might be uncertain at which place. This is fundamentally different from a coherent superposition (non-zero $\rho_{12}$), where we are uncertain at which site the particle is, but it can no longer be thought of as being at one site *or* the other; it must be in some way associated with both simultaneously.

**Exercise 6.** For each $c \in [-1,1]$ propose how to obtain a mixed state described by density matrix

$\rho =
\begin{bmatrix}
1/2 & c/2\\
c/2 & 1/2
\end{bmatrix},$

by mixing pure states of your choice.
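Spoiler: one possible answer (certainly not the only one) is to mix $|\psi_+\rangle$ with probability $(1+c)/2$ and $|\psi_-\rangle$ with probability $(1-c)/2$. A sketch that checks this, assuming Python with NumPy and an illustrative function name `mixed_state`:

```python
import numpy as np

def mixed_state(c):
    """Mix |psi_+> and |psi_-> with probabilities (1+c)/2 and (1-c)/2."""
    psi_plus = np.array([1, 1]) / np.sqrt(2)
    psi_minus = np.array([1, -1]) / np.sqrt(2)
    p_plus, p_minus = (1 + c) / 2, (1 - c) / 2
    # Both vectors are real, so no complex conjugation is needed here.
    return p_plus * np.outer(psi_plus, psi_plus) + p_minus * np.outer(psi_minus, psi_minus)

for c in (-1.0, -0.5, 0.0, 0.3, 1.0):
    print(c)
    print(mixed_state(c))
```

The diagonal stays at $1/2$ for every $c$, while the off-diagonal entries come out as $c/2$.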

A similar thing works for position. Instead of a two-level system let’s take a particle in one dimension. The analogue of a state vector is a wavefunction, a complex-valued function on a line:

$\psi(x)$

In this continuous variant, $p(x) = |\psi(x)|^2$ is the probability density of finding the particle at position $x$.

We construct the density matrix (or rather: density operator) in a way that is analogous to what we did for the two-level system:

$\rho(x, x') = \psi(x) \psi^*(x').$

Instead of a 2×2 matrix, it is a complex function of two real variables. The probability density can be described by its diagonal values, i.e. $p(x) = \rho(x,x)$. Again, we may wonder if the particle energetically favors being in many places at once. Well, it does.
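On a computer, the continuous density matrix becomes an ordinary (large) matrix once position is sampled on a grid. The sketch below assumes Python with NumPy; the grid, the Gaussian shape with a plane-wave phase, and all numerical values are illustrative assumptions.

```python
import numpy as np

# Discretize the line; rho(x, x') becomes a 401 x 401 matrix.
x = np.linspace(-10, 10, 401)
dx = x[1] - x[0]

# An illustrative wavefunction: a Gaussian envelope times a plane-wave phase.
sigma, k = 1.0, 0.5
psi = np.exp(-x**2 / (4 * sigma**2)) * np.exp(1j * k * x)
psi /= np.sqrt(np.sum(np.abs(psi) ** 2) * dx)  # normalize so that p(x) integrates to 1

# rho(x, x') = psi(x) psi*(x')
rho = np.outer(psi, psi.conj())

# The diagonal is the probability density p(x) = |psi(x)|^2.
p = np.real(np.diag(rho))
print(np.sum(p) * dx)
```

The phase $e^{ikx}$ is invisible on the diagonal but shows up in the off-diagonal entries, just as the relative sign did in the two-state example.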

PICTURE: Density matrices for a classical and quantum state. They yield the same probability distributions (for positions). However, their off-diagonal values (i.e. $x\neq x'$) are different. The classical state is just a probabilistic mixture of a particle being in a particular place.

What would happen if we had a mixture of perfectly localized particles? Due to Heisenberg’s uncertainty principle we have

$\Delta x \Delta p \geq \frac{\hbar}{2},$

that is, that the product of standard deviations of position and momentum is at least some value.

If we exactly know the position, then the uncertainty of momentum goes to infinity. (The same thing holds if we don’t know position, but it can be known, even in principle. Quantum mechanics couldn’t care less if the particle’s position is known by us, by our friend, by our detector or by a dust particle.)

The Hamiltonian represents energy, and the energy of a free particle in a continuous system is

$H=p^2/(2m),$

where $m$ is its mass, and the velocity is proportional to the momentum ($v=p/m$). So, if the particle is localized,

- the energy is infinite,
- velocities are infinite, so in no time the wavefunction will spread everywhere.

Infinite energies sometimes happen in physics. But if we get infinite velocities we see that there is something wrong. So, a particle needs to be spread out, or ‘delocalized’, to some degree, to have finite energy.
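To make the divergence concrete, here is a worked example under one extra assumption: the wavefunction is a Gaussian with position spread $\sigma$ and zero mean momentum (this specific shape is chosen for illustration only). A Gaussian saturates the uncertainty relation, so

$\psi(x) \propto e^{-x^2/(4\sigma^2)}, \quad \Delta x = \sigma, \quad \Delta p = \frac{\hbar}{2\sigma}, \quad E = \frac{\langle p^2 \rangle}{2m} = \frac{(\Delta p)^2}{2m} = \frac{\hbar^2}{8 m \sigma^2}.$

Halving the width $\sigma$ quadruples the energy, and in the limit $\sigma \to 0$ the energy grows without bound as $1/\sigma^2$.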

As a side note, to consider high energies we would need to employ special relativity. In fact, one cannot localize a massive particle that much, as it will create a soup of particles and antiparticles once the energy related to its momentum uncertainty is as large as the energy related to its mass; see the Darwin term in the fine structure.

Moreover, depending on the degree of its delocalization its behavior is different. For example, a statistical mixture of highly localized particles would spread a lot faster than a state with the same $p(x)$ but derived from a single wavefunction. The density matrix of the former would be in between that of a pure state (a ‘circular’ Gaussian function) and the classical state (a ‘linear’ Gaussian). That is, it would be an ‘oval’ Gaussian, with off-diagonal values being smaller than for the pure state.

Let us look at two Gaussian wavefunctions, with a varying level of coherent superposition between them. That is, each Gaussian is already a superposition, but when we combine the two we let ourselves use a superposition, or a mixture, or something in between. For a perfect superposition of the Gaussians, we would have the density matrix

$\rho(x,x') = \frac{1}{2} \left( \phi(x+\tfrac{d}{2}) + \phi(x-\tfrac{d}{2}) \right) \left( \phi(x'+\tfrac{d}{2}) + \phi(x'-\tfrac{d}{2}) \right),$

where $\phi(x)$ is a normalized Gaussian function. For a statistical mixture between these Gaussians split by a distance of $d$, we would have:

$\rho(x,x') = \frac{1}{2} \phi(x+\tfrac{d}{2}) \phi(x'+\tfrac{d}{2}) + \frac{1}{2} \phi(x-\tfrac{d}{2}) \phi(x'-\tfrac{d}{2}).$

And in general,

$\rho(x,x') = \frac{1}{2} \left( \phi(x+\tfrac{d}{2}) \phi(x'+\tfrac{d}{2}) + \phi(x-\tfrac{d}{2}) \phi(x'-\tfrac{d}{2})\right) + \frac{c}{2} \left( \phi(x+\tfrac{d}{2}) \phi(x'-\tfrac{d}{2}) + \phi(x-\tfrac{d}{2}) \phi(x'+\tfrac{d}{2}) \right)$

for some $|c| \leq 1$.
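The general formula is straightforward to evaluate on a grid. The sketch below assumes Python with NumPy; the function names, the width $\sigma = 1$, and the separation $d = 8$ (chosen large so that the two Gaussians barely overlap and the normalization holds to good accuracy) are illustrative assumptions.

```python
import numpy as np

def gaussian(x, center, sigma=1.0):
    """Real Gaussian wavefunction whose square integrates to 1."""
    norm = (2 * np.pi * sigma**2) ** -0.25
    return norm * np.exp(-(x - center) ** 2 / (4 * sigma**2))

def two_gaussian_rho(x, d, c):
    """Density matrix of two Gaussians a distance d apart, with coherence c."""
    left = gaussian(x, -d / 2)   # phi(x + d/2)
    right = gaussian(x, +d / 2)  # phi(x - d/2)
    populations = np.outer(left, left) + np.outer(right, right)
    coherences = np.outer(left, right) + np.outer(right, left)
    return 0.5 * populations + 0.5 * c * coherences

x = np.linspace(-12, 12, 481)
dx = x[1] - x[0]
d = 8.0

rho_mixture = two_gaussian_rho(x, d, c=0.0)
rho_superposition = two_gaussian_rho(x, d, c=1.0)

# Total probability of the mixture, approximated by a sum over the grid.
trace_mixture = np.trace(rho_mixture) * dx

# Largest difference between the two probability densities (the diagonals).
max_diagonal_gap = np.max(np.abs(np.diag(rho_superposition) - np.diag(rho_mixture)))

# Off-diagonal entry linking the two centers, x = -d/2 and x' = +d/2.
i = np.argmin(np.abs(x + d / 2))
j = np.argmin(np.abs(x - d / 2))
print(trace_mixture, max_diagonal_gap)
print(rho_mixture[i, j], rho_superposition[i, j])
```

The diagonals of the two matrices nearly coincide, so position measurements alone cannot separate them; the difference sits in the off-diagonal block linking the two centers, which is large for $c = 1$ and negligible for $c = 0$.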

PICTURE: Two Gaussian wavefunctions (centered at $-2$ and $+2$) in a coherent superposition with each other (the first and the last plot) and a statistical mixture (the middle plot); the 2nd and 4th plots show intermediate states. The superposition can come with different phases, much like in the hydrogen example. (Color intensity represents absolute value and hue represents phase; here red is for positive numbers and teal for negative ones.)

We have learnt the difference between a quantum superposition and a statistical mixture of states. In particular, while both of these descriptions may give the same probabilities, their predictions for the physical properties of states differ. For example, we need an electron to be delocalized in a specific way to describe chemical bonds; and we need delocalization of any particle to predict its movement.

We used density matrices to express both quantum superposition and (classical) lack of knowledge on the same footing. We have identified their off-diagonal terms as the ones related to quantum coherence.

But what if there were not only two states, but many? So, instead of $\mathrm{H}_2^+$ (we were not even considering the full hydrogen molecule, but only its ionized version), how about an electronic excitation on something bigger? Not even $\mathrm{C}_2\mathrm{H}_5\mathrm{OH}$ or some sugar, but a protein complex!

So, this will be your homework (cf. this homework on topology). Just joking, there will be another blog post.