H₂O Benchmarking: First 8-Qubit Results on the QEncode Leaderboard
We added water to the QEncode benchmark suite. Here's what it takes to simulate H₂O with a [4,4] active space (8 qubits, 105 Pauli terms) and what the VQE results reveal about UCCSD's limits at this scale.
Water is arguably the most important molecule in chemistry. It is also a meaningful step up in complexity from the diatomics and linear triatomics that typically populate quantum chemistry benchmarks. Adding H₂O to the QEncode leaderboard meant moving from 4-qubit problems (H₂, LiH, HF) and the 8-qubit BeH₂ to a bent molecule with lone pairs, two O–H bonds, and a more intricate correlation structure.
Why [4,4] active space?
All-electron H₂O in STO-3G has 10 electrons and 7 spatial orbitals. Running VQE on the full space would require 14 qubits, which is expensive and not yet the focus of our suite. Instead we restrict the calculation to a [4,4] active space: 4 electrons in 4 spatial orbitals, which in the Hartree–Fock reference means two doubly occupied and two unoccupied orbitals. The oxygen 1s core and the remaining occupied orbitals stay frozen. This captures the dominant correlation effects while keeping the circuit to 8 qubits, the same as BeH₂.
The exact FCI ground state energy within this active space is −6.163 Ha. The Hartree–Fock reference is −6.113 Ha, so the correlation energy we're trying to recover is about 50 mHa.
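The numbers in this section follow from simple arithmetic, sketched below. The function names are ours, the qubit count assumes one qubit per spin orbital with no tapering or symmetry reduction, and the 3.54 mHa gap is the UCCSD value from the results table further down.

```python
E_FCI_HA = -6.163      # active-space FCI energy quoted in the text
E_HF_HA = -6.113       # Hartree-Fock reference quoted in the text
UCCSD_GAP_MHA = 3.54   # UCCSD gap from the results table


def qubits_required(n_spatial_orbitals: int) -> int:
    """One qubit per spin orbital, two spin orbitals per spatial orbital."""
    return 2 * n_spatial_orbitals


def correlation_energy_mha(e_hf: float, e_fci: float) -> float:
    """Correlation energy (HF minus FCI) in millihartree."""
    return (e_hf - e_fci) * 1000.0


def fraction_recovered(gap_mha: float, corr_mha: float) -> float:
    """Share of the correlation energy recovered, given the remaining gap to FCI."""
    return 1.0 - gap_mha / corr_mha


corr = correlation_energy_mha(E_HF_HA, E_FCI_HA)
print(qubits_required(7), qubits_required(4))   # full space vs [4,4] active space
print(f"correlation energy: {corr:.1f} mHa")
print(f"recovered by UCCSD: {fraction_recovered(UCCSD_GAP_MHA, corr):.1%}")
```

On these numbers a 3.54 mHa gap corresponds to recovering roughly 93% of the active-space correlation energy.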
The Hamiltonian: 105 Pauli terms
After fermion-to-qubit mapping, the H₂O [4,4] Hamiltonian contains 105 Pauli terms regardless of which mapping you use (Jordan-Wigner, parity, and Bravyi-Kitaev all produce the same term count at this active space size). For comparison, BeH₂ [4,4] also produces around 105 terms, which is consistent with term count being driven mainly by the active space dimensions rather than by the specific molecule.
More terms make each energy evaluation more expensive: at every function call the estimator has to evaluate the expectation value of all 105 Pauli terms and sum them. With 1500–2000 COBYLA iterations per run and StatevectorEstimator handling the inner loop, each UCCSD entry takes approximately 7–10 minutes on a modern CPU, or roughly 0.2–0.4 s per iteration.
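To make the per-evaluation cost concrete, here is a minimal pure-Python sketch of an energy evaluation as a weighted sum of Pauli-string expectation values. It is not how StatevectorEstimator is implemented; the function names, the toy 2-qubit Hamiltonian, and the convention that the leftmost character acts on qubit 0 are all assumptions made for illustration.

```python
from typing import Dict, List


def pauli_expectation(pauli: str, state: List[complex]) -> complex:
    """<state|P|state> for a Pauli string; character q acts on bit q of the basis index."""
    n = len(pauli)
    assert len(state) == 2 ** n, "statevector length must be 2**n"
    total = 0j
    for i, amp in enumerate(state):
        if amp == 0:
            continue
        j = i            # basis state that P maps |i> onto
        phase = 1 + 0j   # phase picked up along the way
        for q, p in enumerate(pauli):
            bit = (i >> q) & 1
            if p == "X":
                j ^= 1 << q
            elif p == "Y":
                j ^= 1 << q
                phase *= 1j if bit == 0 else -1j   # Y|0> = i|1>, Y|1> = -i|0>
            elif p == "Z":
                phase *= -1 if bit else 1
        total += state[j].conjugate() * phase * amp
    return total


def energy(hamiltonian: Dict[str, float], state: List[complex]) -> float:
    """Sum of coefficient * <P> over all Pauli terms: one pass per term."""
    return sum(c * pauli_expectation(p, state) for p, c in hamiltonian.items()).real


# Toy 2-qubit example (made-up coefficients), evaluated on |00>.
toy_h = {"ZI": 0.5, "IZ": -0.25, "XX": 0.1}
print(energy(toy_h, [1 + 0j, 0j, 0j, 0j]))
```

In this naive form the work grows with the number of terms times the statevector length, which is why a 105-term, 8-qubit Hamiltonian costs noticeably more per call than the 4-qubit problems.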
UCCSD results: three certified entries
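We ran UCCSD (reps=1) under all three encodings. All three converged within the certification threshold (gap < 10 mHa). The table also includes the one certified HEA entry, discussed in the next section, for comparison: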
| Mapping | Ansatz | Gap (Ha) | Depth | 2Q gates |
|---|---|---|---|---|
| Jordan-Wigner | uccsd | 3.54 × 10⁻³ | 2512 | 1440 |
| Parity | uccsd | 3.54 × 10⁻³ | 2426 | 1316 |
| Bravyi-Kitaev | uccsd | 3.55 × 10⁻³ | 2458 | 1316 |
| Jordan-Wigner | hea | 7.45 × 10⁻³ | 27 | 56 |
All three UCCSD encodings converge to nearly the same gap (~3.54 mHa). The parity mapping produces the shallowest circuit (depth 2426 vs 2458 for BK and 2512 for JW), and BK matches parity on two-qubit gate count (1316 vs 1440 for JW). If you're hardware-constrained, parity is the best choice for UCCSD on H₂O, though its margin over BK is only about 1% in depth.
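The comparison above can be reproduced directly from the table. The sketch below hard-codes the four rows and computes the certification check and the relative savings; the data layout and helper names are ours, not part of the leaderboard tooling.

```python
CERT_THRESHOLD_HA = 10e-3  # certification threshold quoted in the text (10 mHa)

# Rows copied from the results table.
results = [
    {"mapping": "Jordan-Wigner", "ansatz": "uccsd", "gap": 3.54e-3, "depth": 2512, "two_q": 1440},
    {"mapping": "Parity",        "ansatz": "uccsd", "gap": 3.54e-3, "depth": 2426, "two_q": 1316},
    {"mapping": "Bravyi-Kitaev", "ansatz": "uccsd", "gap": 3.55e-3, "depth": 2458, "two_q": 1316},
    {"mapping": "Jordan-Wigner", "ansatz": "hea",   "gap": 7.45e-3, "depth": 27,   "two_q": 56},
]


def is_certified(row: dict) -> bool:
    """A run is certified when its gap to FCI is below the threshold."""
    return row["gap"] < CERT_THRESHOLD_HA


def relative_saving(baseline: float, candidate: float) -> float:
    """Percentage reduction of candidate relative to baseline."""
    return 100.0 * (baseline - candidate) / baseline


uccsd = {r["mapping"]: r for r in results if r["ansatz"] == "uccsd"}
jw, parity, bk = uccsd["Jordan-Wigner"], uccsd["Parity"], uccsd["Bravyi-Kitaev"]

print(all(is_certified(r) for r in results))
print(f"parity vs JW depth:    {relative_saving(jw['depth'], parity['depth']):.1f}%")
print(f"parity vs JW 2Q gates: {relative_saving(jw['two_q'], parity['two_q']):.1f}%")
print(f"parity vs BK depth:    {relative_saving(bk['depth'], parity['depth']):.1f}%")
```

The savings are a few percent in depth and under ten percent in two-qubit gates, so the choice of mapping matters less here than the choice of ansatz.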
Hardware-efficient ansatz: a split result
We ran the hardware-efficient ansatz (HEA, reps=2) under all three mappings. Only the Jordan-Wigner encoding converged within the certification threshold (gap 7.45 mHa). The Bravyi-Kitaev and parity HEA runs produced gaps of 0.38 Ha and 0.37 Ha respectively, about 100× worse than UCCSD.
This is not a bug. HEA with reps=2 has 48 free parameters but no built-in particle conservation or chemistry-informed structure. A gap of 0.37 Ha is several times larger than the 50 mHa separating Hartree–Fock from FCI, so those two runs finished well above even the Hartree–Fock energy. With this many Pauli terms and a rugged energy landscape, COBYLA with 1500 iterations frequently gets stuck in a local minimum. The JW encoding happened to land in the right basin with our random seed; BK and parity did not.
This suggests that HEA's performance is more encoding-sensitive than UCCSD's, although with a single random seed per encoding we cannot yet separate encoding effects from seed luck. UCCSD carries the physics of the system into the ansatz structure: it knows about electron pairs and excitations. HEA is agnostic, which makes it cheaper to compile but harder to converge reliably.
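The 48 parameters and 56 two-qubit gates quoted for HEA are consistent with one common layout, sketched below as a counting exercise. The layout itself is an assumption on our part: two single-qubit rotations per qubit in each of reps + 1 rotation layers, with all-to-all entanglement between layers. The text does not state which HEA variant was used.

```python
def hea_parameter_count(n_qubits: int, reps: int, rotations_per_qubit: int = 2) -> int:
    """Parameters for reps entangling blocks plus a final rotation layer (assumed layout)."""
    return rotations_per_qubit * n_qubits * (reps + 1)


def full_entanglement_two_qubit_gates(n_qubits: int, reps: int) -> int:
    """Two-qubit gates when every qubit pair is entangled once per repetition (assumed)."""
    return reps * n_qubits * (n_qubits - 1) // 2


print(hea_parameter_count(n_qubits=8, reps=2))                # 48
print(full_entanglement_two_qubit_gates(n_qubits=8, reps=2))  # 56
```

Under that assumption, none of the 48 angles is tied to an excitation or to particle number, which is the structural difference from UCCSD described above.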
How H₂O compares to the rest of the suite
The H₂O UCCSD gap (~3.5 mHa) is roughly the same as BeH₂ UCCSD (~3.6 mHa). Both use a [4,4] active space with 8 qubits, so the ansatz has the same excitation structure and the optimizer faces a problem of the same size; the similar gap is plausibly a property of single-rep UCCSD at this active-space size rather than of either molecule. What differs is the circuit depth: H₂O UCCSD runs ~2500 deep vs BeH₂'s ~1750. Both Hamiltonians have about 105 Pauli terms, and the UCCSD circuit is built from the excitation list rather than from the Hamiltonian, so term count does not explain this difference. It more likely comes from which excitations are kept or from how the circuits were compiled.
By contrast, H₂ and HF achieve gaps in the nanohartree range, which is exact to within the optimizer's convergence tolerance. That's because their [2,2] active spaces are tiny (4 qubits) and hold only two electrons, so singles and doubles already cover every possible excitation and UCCSD with just a few parameters can represent the exact ground state within the active space. H₂O at [4,4] is a genuinely harder optimization problem.
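A rough counting argument illustrates the jump from [2,2] to [4,4]. The sketch below counts spin-conserving UCCSD excitations for a closed-shell reference and compares them with the number of determinants in the same particle sector. The helper names are ours, and the count ignores any further reduction from spatial or spin symmetry, so treat it as a heuristic rather than a statement about the exact parameter counts used in the benchmark.

```python
from math import comb


def uccsd_parameter_count(n_occ: int, n_virt: int) -> int:
    """Spin-conserving singles and doubles; n_occ and n_virt are per-spin orbital counts."""
    singles = 2 * n_occ * n_virt                        # alpha and beta singles
    same_spin = 2 * comb(n_occ, 2) * comb(n_virt, 2)    # alpha-alpha and beta-beta doubles
    opposite_spin = (n_occ * n_virt) ** 2               # alpha-beta doubles
    return singles + same_spin + opposite_spin


def determinant_count(n_orbitals: int, n_occ: int) -> int:
    """Determinants with n_occ alpha and n_occ beta electrons in n_orbitals spatial orbitals."""
    return comb(n_orbitals, n_occ) ** 2


# [2,2]: one occupied and one virtual orbital per spin.
print(uccsd_parameter_count(1, 1), determinant_count(2, 1))   # 3 4
# [4,4]: two occupied and two virtual orbitals per spin.
print(uccsd_parameter_count(2, 2), determinant_count(4, 2))   # 26 36
```

For [2,2] the three parameters match the three free amplitudes of a normalized four-determinant state, whereas for [4,4] the 26 parameters fall short of the 35 needed in general, which is consistent with a residual gap.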
What comes next for H₂O
The 3.5 mHa gap is certified but sits above the chemical accuracy threshold of 1.6 mHa (1 kcal/mol). To close that gap we plan to:
- Run with more optimizer restarts (currently 1, trying 3–5) to escape local minima
- Increase COBYLA iterations from 2000 to 4000–5000
- Benchmark k-UpCCGSD, which stacks k layers of generalized singles and paired doubles and can capture correlation beyond single-rep UCCSD at moderate extra cost
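The restart strategy in the first item can be sketched as follows. This is a toy illustration, not the benchmark code: `toy_energy` is a made-up multi-minimum landscape standing in for the VQE energy, and `local_search` is a simple stochastic hill climb standing in for a COBYLA run.

```python
import math
import random


def toy_energy(params):
    """Hypothetical landscape with several local minima per parameter."""
    return sum(math.sin(3.0 * p) + 0.1 * p * p for p in params)


def local_search(objective, start, rng, max_iter=500, step=0.3):
    """Accept random perturbations only when they lower the objective."""
    best, best_val = list(start), objective(start)
    for _ in range(max_iter):
        trial = [p + rng.gauss(0.0, step) for p in best]
        val = objective(trial)
        if val < best_val:
            best, best_val = trial, val
    return best, best_val


def multi_restart(objective, n_params, n_restarts, seed=0):
    """Run independent local searches from random starts and keep the lowest energy."""
    rng = random.Random(seed)
    runs = []
    for _ in range(n_restarts):
        start = [rng.uniform(-math.pi, math.pi) for _ in range(n_params)]
        runs.append(local_search(objective, start, rng))
    best_run = min(runs, key=lambda run: run[1])
    return best_run, [val for _, val in runs]


(best_params, best_val), all_vals = multi_restart(toy_energy, n_params=4, n_restarts=5, seed=1)
print("per-restart energies:", [round(v, 4) for v in all_vals])
print("best energy:", round(best_val, 4))
```

Keeping the lowest energy is justified because VQE energies are variational: every run gives an upper bound on the ground-state energy, so the minimum over restarts is the tightest bound available.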
H₂O is now live on the QEncode leaderboard under the Best Accuracy, Lowest Cost, and Balanced categories. Filter to H₂O to see the full breakdown by encoding and ansatz.