Table of Contents

How to implement data set as an oracle and circuits?

In quantum algorithms, a classical data set is usually not loaded into a quantum computer as a table in the ordinary database sense. Instead, the data are exposed through an oracle: a reversible operation that can be queried by the algorithm. The circuit implementation depends on what question the oracle must answer.

A common form is

\[O_f: |x\rangle |y\rangle \mapsto |x\rangle |y \oplus f(x)\rangle,\]

where x is an index or key, y is a target register, and f(x) is the data value or label associated with x. If f(x) is one bit, the target register has one qubit. If f(x) is an m-bit value, the target register has m qubits and the XOR is bitwise.

Another frequently used form is the phase oracle

\[O_f: |x\rangle \mapsto (-1)^{f(x)} |x\rangle.\]

The phase oracle is usually obtained from the bit oracle by preparing the target qubit in |-\rangle = (|0\rangle - |1\rangle)/\sqrt{2} and using phase kickback.

Example: a small binary data set

Suppose the classical data set is

| index x | value f(x) |
| — | — |
| 00 | 0 |
| 01 | 1 |
| 10 | 1 |
| 11 | 0 |

This is the XOR function:

\[f(x_1,x_0)=x_1 \oplus x_0.\]

A bit oracle can be implemented as

\[|x_1x_0\rangle |y\rangle \mapsto |x_1x_0\rangle |y \oplus x_1 \oplus x_0\rangle.\]

The circuit is simply two CNOT gates, one controlled by x_1 and one controlled by x_0, both targeting y:

x1: ──■──────
      │
x0: ──┼──■───
      │  │
y : ──X──X───

This works because every CNOT toggles the target when its control is 1. The target is toggled once for each 1 bit in the index, so the result is the parity of the two index bits.

General method for a finite lookup table

For an arbitrary finite table, the direct construction is:

Use n qubits for the index register, enough to address all rows.
Use m qubits for the output register, enough to store each value.
For every row x = a where output bit j is 1, add a multi-controlled NOT gate targeting output qubit j and controlled on the index register matching a.
If a control must match 0, surround that control line with X gates before and after the multi-controlled operation.

For example, if the table has a row

x = 101, f(x) = 011

then the oracle must toggle the two low output bits when the index is 101. The controls are x2 = 1, x1 = 0, x0 = 1. The 0 control is handled by applying X to x1, using an ordinary all-ones controlled operation, and then applying X to x1 again.

In pseudocode:

for each row a in table:
    for each output bit j where f(a)[j] == 1:
        flip index qubits whose required control value is 0
        apply multi-controlled X from all index qubits to output[j]
        flip those index qubits back

This construction is easy to understand and verify, but it is not always efficient. A table with N rows and m output bits can require up to N * m multi-controlled gates before decomposition into the elementary gate set supported by the target device or simulator.

Reversibility matters

A quantum oracle must be unitary, so it cannot simply erase or overwrite data. This is why the usual oracle writes by XOR:

\[|x\rangle |y\rangle \mapsto |x\rangle |y \oplus f(x)\rangle.\]

If the algorithm needs temporary workspace, compute the value into an ancilla register, use it, and then uncompute it by running the same circuit in reverse. This prevents unwanted entanglement between the answer register and scratch qubits.

The standard pattern is:

compute f(x) into work register
use f(x)
uncompute f(x) by applying the inverse circuit

For classical reversible circuits made from X, CNOT, Toffoli, and multi-controlled X gates, the inverse is often the same gates in reverse order.

Data loading is different from an oracle

Sometimes people ask for a circuit that prepares a quantum state containing the data, such as

\[\sum_x d_x |x\rangle.\]

That is state preparation, not the same as a lookup oracle. State preparation can be expensive for arbitrary data, because the amplitudes must be normalized and encoded by rotations. A lookup oracle answers queries of the form “what is f(x)?” while state preparation creates a superposition whose amplitudes are the data.

The right implementation depends on the algorithm:

For Grover search, one often needs a phase oracle marking good indices.
For arithmetic algorithms, one often needs a reversible circuit computing f(x).
For amplitude encoding, one needs a state-preparation circuit or a QRAM-like assumption.
For small demonstrations, a direct lookup-table circuit is usually sufficient.

Practical checks

For a small oracle, verify the truth table before using it inside a larger algorithm. A simple reproducible check is to simulate every computational basis input and confirm that the output register becomes y XOR f(x).

The checklist is:

Confirm the bit order used by the framework. Qiskit, Cirq, and other tools may display qubits and bitstrings in different orders.
Test every index value x.
Test with the output register initialized to both zero and nonzero values, because the oracle should XOR rather than overwrite.
If ancillas are used, check that they return to |0\rangle after uncomputation.
Decompose the circuit into the backend’s basis gates and inspect the depth and gate count.

For small static data sets, the direct multi-controlled-gate method is the clearest implementation. For larger structured data sets, it is better to implement the function algebraically, using reversible arithmetic and logic. For very large unstructured data, the cost of oracle construction is part of the algorithmic model and should not be hidden behind the word “oracle.”