A full tutorial on QAOA algorithm without leave-it-as-an-exercise-for-reader

This note is a worked introduction to the Quantum Approximate Optimization Algorithm (QAOA), with the algebraic steps written out rather than left as exercises. The focus is the standard Ising-style formulation used for combinatorial optimization, how it connects to adiabatic quantum computation, how the circuit is built, and how to evaluate the operator identities that appear in the simplest derivations.

Table of Contents

The Problem Definition

QAOA is a variational quantum algorithm for approximately solving optimization problems. A classical objective function is encoded into a cost Hamiltonian $C$, and the algorithm prepares a parameterized quantum state whose measurement distribution favors low-cost or high-cost bit strings, depending on the sign convention.

For a binary optimization problem, write a candidate solution as a bit string

\[z = z_1 z_2 \cdots z_n, \qquad z_i \in \{0,1\}.\]

The objective function assigns a real value $C(z)$ to each bit string. In QAOA we promote this classical function to a diagonal quantum operator

\[C |z\rangle = C(z)|z\rangle.\]

The algorithm also uses a mixing Hamiltonian, usually

\[B = \sum_{j=1}^{n} X_j,\]

where $X_j$ is the Pauli $X$ operator acting on qubit $j$. For depth $p$, the QAOA state is

\[|\gamma,\beta\rangle = e^{-i\beta_p B} e^{-i\gamma_p C} \cdots e^{-i\beta_1 B} e^{-i\gamma_1 C} |s\rangle,\]

where

\[|s\rangle = |+\rangle^{\otimes n}\]

is the uniform superposition over all computational basis states. The classical outer loop chooses the angles

\[(\gamma_1,\ldots,\gamma_p,\beta_1, \ldots,\beta_p)\]

to optimize the expectation value

\[F_p(\gamma,\beta)=\langle \gamma,\beta|C|\gamma,\beta\rangle.\]

For a maximization problem, one tries to maximize $F_p$. For a minimization problem, either minimize the same expression or change the sign of the cost Hamiltonian.

Transverse Field Ising Model

Many QAOA examples use an Ising Hamiltonian because binary variables can be represented naturally by Pauli $Z$ operators. For two qubits, a common interaction term is

\[Z_i Z_j.\]

A graph-based cost Hamiltonian often has the form

\[C = \sum_{(i,j)\in E} w_{ij} Z_i Z_j,\]

possibly plus one-local fields

\[\sum_i h_i Z_i.\]

The transverse-field part is the mixer

\[B = \sum_i X_i.\]

So the two alternating pieces in QAOA are:

a diagonal Ising cost evolution $e^{-i\gamma C}$, and
a transverse-field mixing evolution $e^{-i\beta B}$.

The useful point is that many terms commute. All $Z_iZ_j$ terms are diagonal, so they commute with one another. All $X_i$ terms also commute with one another because they act on different qubits. This lets us implement the exponentials as products of one-qubit and two-qubit gates.

Adiabatic Quantum Computation

QAOA can be viewed as a digitized, variational version of adiabatic quantum computation.

In the adiabatic picture, we begin with an easy Hamiltonian whose ground state is easy to prepare, for example

\[H_B = -\sum_i X_i.\]

Its ground state is $|+\rangle^{\otimes n}$. We then slowly change the Hamiltonian toward the problem Hamiltonian $H_C$:

\[H(t) = (1-s(t))H_B + s(t)H_C,\]

where $s(0)=0$ and $s(T)=1$. If the interpolation is slow enough and the spectral gap behaves well, the adiabatic theorem says the state approximately follows the instantaneous ground state.

QAOA replaces this continuous process with alternating pulses:

\[e^{-i\beta B}e^{-i\gamma C}.\]

At large depth $p$, the sequence resembles a Trotterized adiabatic path. At small depth, the angles are treated as free variational parameters and optimized directly.

Circuit Design

For one QAOA layer, the circuit has three parts.

Prepare the initial state:

\[|0\rangle^{\otimes n} \xrightarrow{H^{\otimes n}} |+\rangle^{\otimes n}.\]

Apply the cost unitary:

\[U_C(\gamma)=e^{-i\gamma C}.\]

For an Ising edge $(i,j)$, the two-qubit term is

\[e^{-i\gamma w_{ij} Z_iZ_j}.\]

A standard decomposition is:

CNOT(i, j)
RZ(2 gamma w_ij) on j
CNOT(i, j)

The sign of the angle depends on the convention used by the quantum SDK for $R_Z(\theta)$. Always verify it from the local definition of the gate.

Apply the mixer unitary:

\[U_B(\beta)=e^{-i\beta\sum_i X_i}=\prod_i e^{-i\beta X_i}.\]

Since many SDKs define

\[R_X(\theta)=e^{-i\theta X/2},\]

this is usually implemented as

RX(2 beta) on every qubit

Again, verify the sign and factor of two against the local SDK definition.

The full depth-$p$ circuit repeats the cost and mixer blocks:

H on every qubit
for layer = 1..p:
    apply cost unitary with gamma[layer]
    apply mixer unitary with beta[layer]
measure

Coding

A minimal implementation has three moving parts:

a function that builds the QAOA circuit for given parameters,
a function that estimates the expected cost from measurement counts, and
a classical optimizer that updates the angles.

Pseudocode:

def qaoa_circuit(graph, gammas, betas):
    circuit = QuantumCircuit(len(graph.nodes))

    for q in graph.nodes:
        circuit.h(q)

    for gamma, beta in zip(gammas, betas):
        for i, j, weight in graph.edges(data="weight", default=1.0):
            circuit.cx(i, j)
            circuit.rz(2 * gamma * weight, j)
            circuit.cx(i, j)

        for q in graph.nodes:
            circuit.rx(2 * beta, q)

    circuit.measure_all()
    return circuit

To evaluate a bit string, define the same cost function used to build the Hamiltonian. For example, for an Ising objective:

def spin(bit):
    return 1 if bit == "0" else -1


def ising_cost(bitstring, edges):
    total = 0.0
    for i, j, weight in edges:
        total += weight * spin(bitstring[i]) * spin(bitstring[j])
    return total

Then estimate the expectation value from counts:

def expected_cost(counts, edges):
    shots = sum(counts.values())
    return sum(ising_cost(bits, edges) * count / shots for bits, count in counts.items())

This is deliberately SDK-neutral. In a real implementation, check the bit-ordering convention of the backend or simulator. Some frameworks return bit strings with the visual order reversed relative to qubit indices.

Trivial State Preparation

After several days of thinking and researching, I decided to answer my own question.

N.B. The tensor product symbol $\otimes$ is omitted when there is no risk of confusion, especially when the index is different. In symbols, $A_iB_j = A_i \otimes B_j (i \neq j)$ .

First, for the case $e^{i \beta B} Z_i Z_j e^{-i \beta B}$ , consider only qubit , where $B_e=\sum_{k \in e}{X_k}$ . Since X_i and X_j act on different qubits,

$\begin{equation*}e^{i{(X_i+X_j)}}=e^{i((X_i\otimes I_j)+(I_i \otimes X_j))}=e^{iX_i}e^{iX_j}.\end{equation*}$

Now,

$\begin{equation*} \begin{aligned} e^{-i \beta X} & = e^{-i \beta \begin{bmatrix}0 & 1\\ 1 & 0\end{bmatrix}} \\ & = e^{-i \beta \begin{bmatrix}\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}}\end{bmatrix} \begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix} \begin{bmatrix}\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\end{bmatrix}} \\ & = \begin{bmatrix}\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}}\end{bmatrix} \begin{bmatrix}e^{-i \beta} & 0\\ 0 & e^{-i \beta}\end{bmatrix} \begin{bmatrix}\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\end{bmatrix} \\ & = \begin{bmatrix}\cos(\beta) & -i\sin(\beta)\\ -i\sin(\beta) & \cos(\beta)\end{bmatrix} \\ & = I\cos(\beta)-iX\sin(\beta). \end{aligned} \end{equation*}$

Thus,

$\begin{equation*}\begin{aligned}e^{i \beta X}& = I\cos(\beta)+iX\sin(\beta).\end{aligned}\end{equation*}$

Hence,

$\begin{equation*}\begin{aligned}e^{i \beta X} Z e^{-i \beta X} & = (I\cos(\beta)+iX\sin(\beta))Z(I\cos(\beta)-iX\sin(\beta)) \\& = Z \cos(2\beta) + Y\sin(2\beta).\end{aligned}\end{equation*}$

For the transverse-field Ising Hamiltonian, consider $G_{e=(i,k)}=Z_iZ_k$ .

For $e^{i \gamma Z_iZ_k}$ , it can be evaluated as

$\begin{equation*} \begin{aligned} e^{i \gamma Z_iZ_k} = & e^{i \gamma \begin{bmatrix}1&&&\\&-1&&\\&&-1&\\&&&1\end{bmatrix}} = \begin{bmatrix}e^{i\gamma}&&&\\&e^{-i\gamma}&&\\&&e^{-i\gamma}&\\&&&e^{i\gamma}\end{bmatrix} \\ = & \begin{bmatrix}\cos\gamma+i\sin\gamma &&&\\&\cos\gamma-i\sin\gamma&&\\&&\cos\gamma-i\sin\gamma&\\&&&\cos\gamma+i\sin\gamma\end{bmatrix} \\ = & \cos\gamma\begin{bmatrix}1&&&\\&1&&\\&&1&\\&&&1\end{bmatrix}+i\sin\gamma\begin{bmatrix}1&&&\\&-1&&\\&&-1&\\&&&1\end{bmatrix} \\ = & I_i I_k \cos\gamma + i Z_1Z_2 \sin\gamma. \end{aligned} \end{equation*}$

We can calculate two-qubit operations independently:

$\begin{equation*}Y_i (Z_i Z_k) = (Y_i \otimes I_k)(Z_i \otimes Z_k)=(Y_i Z_i) \otimes (I_kZ_k) = i X_i \otimes Z_k = i X_i Z_k.\end{equation*}$

Numerically, if you cannot convince yourself,

$\begin{equation*}Y_i \otimes I_k = \begin{bmatrix} &&-i&\\&&&-i\\i&&&\\&i&& \end{bmatrix}\end{equation*}$

and

$\begin{equation*}Z_i \otimes Z_k = \begin{bmatrix} 1&&&\\&-1&&\\&&-1&\\&&& 1\end{bmatrix}\end{equation*}$

and

$\begin{equation*}(Y_i \otimes I_k)(Z_i \otimes Z_k) = \begin{bmatrix} &&i&\\&&&-i\\i&&&\\&-i&&\end{bmatrix} = i(X_i \otimes Z_k).\end{equation*}$

Now, for a more specific example,

$\begin{equation*} \begin{aligned} e^{-i \gamma Z_iZ_k} Y_i e^{i \gamma Z_iZ_k} & = (I_i I_k \cos\gamma - i Z_1Z_2 \sin\gamma) (Y_i \otimes I_k) (I_i I_k \cos\gamma + i Z_1Z_2 \sin\gamma) \\ & = Y_i \cos^2\gamma - Y_i\sin^2\gamma - X_i Z_k\sin\gamma\cos\gamma - X_i Z_k \sin\gamma\cos\gamma \\ & = Y_i \cos 2\gamma - X_iZ_k\sin 2\gamma. \end{aligned} \end{equation*}$

Q.E.D.

Reference

Choi J, Kim J. A Tutorial on Quantum Approximate Optimization Algorithm (QAOA): Fundamentals and Applications. In: 2019 International Conference on Information and Communication Technology Convergence (ICTC). IEEE; 2019. doi:10.1109/ictc46691.2019.8939749