Benzene on the Leaderboard: QEncode's First Aromatic Molecule
Most drug molecules contain an aromatic ring. Benzene is the simplest of them: six carbons, six hydrogens, and six π electrons in a conjugated ring that Hartree-Fock cannot correctly describe. We added it to Suite v4.2. Here is what the run required and what the result tells us.
When we certified N₂ in Suite v4.1, the challenge was the triple bond — a strongly correlated system where Hartree-Fock orbitals fail and CASSCF is required. Benzene presents the same fundamental challenge through a different structure: six π electrons delocalized across a symmetric ring. The orbitals are not localized, the correlation is spread across the entire molecule, and standard single-reference methods struggle with the same classes of excitations.
The practical motivation is pharmaceutical. The aromatic ring is present in roughly 70% of approved drug molecules, including aspirin, caffeine, ibuprofen, and many antibiotics. A quantum chemistry method that cannot handle aromatic systems can contribute little to drug discovery. Benzene is the entry point to that class of chemistry.
The active space: 6 π electrons in 6 π orbitals
In benzene's [6e, 6o] active space, we include the six π-type molecular orbitals — three bonding (π₁, π₂, π₃) and three antibonding (π₄*, π₅*, π₆*) — along with the six electrons that occupy them at equilibrium. This is the Hückel picture of benzene's aromatic system: the electrons that make the ring stable, reactive, and chemically interesting.
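The Hückel picture above can be checked in a few lines. This is a minimal sketch of the textbook Hückel model for a ring of six p orbitals, not part of the QEncode pipeline; the function name huckel_ring_levels is illustrative. Orbital energies are written as α + xβ with β < 0, so a positive coefficient x means bonding and a negative one means antibonding.

```python
import math


def huckel_ring_levels(n_sites: int) -> list[float]:
    """Return the Hückel coefficients x_k for a ring of n_sites p orbitals.

    Orbital energies are E_k = alpha + x_k * beta with
    x_k = 2 * cos(2 * pi * k / n_sites). Because beta is negative,
    x_k > 0 is bonding and x_k < 0 is antibonding.
    """
    coefficients = (
        round(2.0 * math.cos(2.0 * math.pi * k / n_sites), 10)
        for k in range(n_sites)
    )
    return sorted(coefficients, reverse=True)


levels = huckel_ring_levels(6)
bonding = [x for x in levels if x > 0]
antibonding = [x for x in levels if x < 0]

print(levels)
print(len(bonding), "bonding,", len(antibonding), "antibonding")
```

Six electrons fill the three bonding levels in pairs, which is the closed-shell configuration that the [6e, 6o] active space is built around.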
The active space has the same dimensions as N₂ — both are [6e, 6o] — which means the same qubit count (12 in Jordan-Wigner encoding) and comparable circuit complexity. The structural difference is symmetry: N₂ has linear D∞h symmetry and a concentrated triple bond; benzene has hexagonal D6h symmetry and a delocalized ring. Both require CASSCF orbital optimization.
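The shared dimensions can be made concrete with a short calculation. The sketch below assumes only the standard counting rules: Jordan-Wigner uses one qubit per spin-orbital, and the number of Ms = 0 Slater determinants is a product of two binomial coefficients. The function names are illustrative and do not come from the QEncode code base.

```python
from math import comb


def jw_qubits(n_orbitals: int) -> int:
    """Jordan-Wigner uses one qubit per spin-orbital, two per spatial orbital."""
    return 2 * n_orbitals


def n_determinants(n_electrons: int, n_orbitals: int) -> int:
    """Count Slater determinants with as many alpha as beta electrons as possible."""
    n_alpha = n_electrons // 2
    n_beta = n_electrons - n_alpha
    return comb(n_orbitals, n_alpha) * comb(n_orbitals, n_beta)


print(jw_qubits(6))          # 12
print(n_determinants(6, 6))  # 400
```

Both N₂ and benzene therefore start from the same 12-qubit register and the same 400-determinant space; what differs is the Hamiltonian defined on it.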
Why CASSCF is required
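Hartree-Fock returns canonical molecular orbitals ordered by energy, not by chemical character. In benzene the π levels are interleaved with σ levels of the C-C and C-H framework: the lowest occupied π orbital lies below the highest occupied σ orbitals, and in larger basis sets the π* virtuals sit among σ* and diffuse virtuals. The six orbitals nearest the HOMO-LUMO gap are therefore not a clean representation of the π system. If you build the VQE Hamiltonian from such a set, the circuit has no good starting point: the CASCI minimum is hidden behind a large orbital-basis error.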
CASSCF pre-optimizes the orbital basis to minimize the energy of the active space before any VQE evaluation runs. For benzene, this means the six π orbitals passed to the circuit genuinely represent the aromatic system — they are the orbitals where the interesting chemistry lives.
The run: 12 → 9 qubits, 914 Pauli terms
After CASSCF orbital optimization, we build the qubit Hamiltonian in the Jordan-Wigner encoding. Benzene's [6e, 6o] active space maps to 12 qubits, one per spin-orbital. Z2 symmetry tapering identifies three independent Z2 symmetries of the D6h Hamiltonian and removes one qubit for each, reducing the circuit to 9 qubits. The tapered Hamiltonian has 914 Pauli terms, compared with 378 for N₂, a difference we attribute to benzene's greater structural complexity.
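To illustrate what a Z2 symmetry means here, the sketch below tests whether a candidate Pauli string commutes with every term of a Hamiltonian, which is the condition tapering relies on. It is a minimal illustration, not the QEncode implementation: the helper names are hypothetical, and the 4-qubit term list is a toy example, not the benzene Hamiltonian.

```python
def commutes(p: str, q: str) -> bool:
    """Two Pauli strings commute iff they clash on an even number of qubits.

    A clash is a position where both strings are non-identity and differ.
    """
    if len(p) != len(q):
        raise ValueError("Pauli strings must act on the same number of qubits")
    clashes = sum(1 for a, b in zip(p, q) if a != "I" and b != "I" and a != b)
    return clashes % 2 == 0


def is_symmetry(candidate: str, hamiltonian_terms: list[str]) -> bool:
    """A Pauli string is a symmetry if it commutes with every Hamiltonian term."""
    return all(commutes(candidate, term) for term in hamiltonian_terms)


def tapered_qubits(n_qubits: int, n_symmetries: int) -> int:
    """Each independent Z2 symmetry removes one qubit."""
    return n_qubits - n_symmetries


# Toy term list (coefficients omitted; illustrative only, NOT benzene).
toy_terms = ["ZIII", "IZII", "IIZI", "IIIZ", "ZZII", "IIZZ",
             "XXYY", "YYXX", "XYYX", "YXXY"]

print(is_symmetry("ZZZZ", toy_terms))  # True: total parity is conserved
print(is_symmetry("ZIII", toy_terms))  # False: clashes once with XXYY
print(tapered_qubits(12, 3))           # 9
```

Only the Pauli strings matter for the commutation test, which is why the toy list carries no coefficients.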
We ran the Hardware-Efficient Ansatz (HEA) with 6 repetition layers and 63 parameters, using 30 random restarts at 500 COBYLA iterations each. The circuit uses alternating layers of single-qubit rotations and entangling CNOT gates — no chemistry built in, but a flexible enough structure to explore the energy landscape broadly.
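The parameter count follows from the circuit shape. The sketch below assumes one rotation angle per qubit in each rotation layer and one more rotation layer than there are entangling layers, the convention used by Qiskit's RealAmplitudes circuit among others. The exact gate set of the QEncode ansatz is not stated here, so treat this layout as an assumption that happens to reproduce the 63 parameters quoted above.

```python
def hea_parameter_count(n_qubits: int, reps: int, rotations_per_qubit: int = 1) -> int:
    """Parameters in a layered hardware-efficient ansatz.

    Assumes reps entangling layers sandwiched between reps + 1 rotation
    layers, each applying rotations_per_qubit angles to every qubit.
    """
    return n_qubits * rotations_per_qubit * (reps + 1)


def optimizer_budget(restarts: int, max_iter: int) -> int:
    """Upper bound on optimizer iterations across all random restarts."""
    return restarts * max_iter


print(hea_parameter_count(n_qubits=9, reps=6))      # 63
print(optimizer_budget(restarts=30, max_iter=500))  # 15000
```

Under this assumption the run spends at most 15,000 COBYLA iterations in total on a 63-dimensional parameter space.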
What 91.1 mHa means
The HEA result does not certify — the 91.1 mHa gap is well above our 10 mHa certification threshold. This is expected. HEA with 63 parameters and a gradient-free optimizer is not designed for a 914-term Hamiltonian with strong π correlation. The energy landscape is rugged, the restarts explore it broadly, and the result is the best the general-purpose circuit can find.
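The certification check itself is simple arithmetic. The sketch below is a hedged illustration: the function names are hypothetical, the strict inequality is an assumption, and energies are given relative to the active-space exact energy so that no absolute benzene energy has to be invented.

```python
HARTREE_TO_MHA = 1000.0


def gap_mha(e_vqe: float, e_exact: float) -> float:
    """Gap between the VQE energy and the exact active-space energy, in mHa."""
    return (e_vqe - e_exact) * HARTREE_TO_MHA


def certifies(e_vqe: float, e_exact: float, threshold_mha: float = 10.0) -> bool:
    """True if the gap is below the threshold (strict inequality is an assumption)."""
    return gap_mha(e_vqe, e_exact) < threshold_mha


# Energies in Hartree relative to the exact active-space energy (set to zero).
e_exact = 0.0
e_vqe = 0.0911  # a 91.1 mHa gap, as reported for the benzene HEA run

print(round(gap_mha(e_vqe, e_exact), 1))  # 91.1
print(certifies(e_vqe, e_exact))          # False
```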
The beats_classical flag — which records whether the VQE correlation energy exceeds the CCSD(T) correlation energy — is True for this entry. This is a separate measure from the gap: even at 91 mHa from the active-space exact answer, the VQE circuit captures more electron correlation than the classical gold-standard method on this active space. That reflects how difficult benzene's π correlation is for perturbative methods.
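One way to read the flag is sketched below. It assumes that correlation energy is measured as the energy lowering relative to a shared Hartree-Fock reference; the actual QEncode definition may differ in detail. The function names and the numbers in the example are placeholders, not benchmark data.

```python
def correlation_energy(e_method: float, e_hf: float) -> float:
    """Energy lowering relative to Hartree-Fock, as a positive number (Hartree)."""
    return e_hf - e_method


def beats_classical(e_vqe: float, e_ccsd_t: float, e_hf: float) -> bool:
    """True if VQE recovers more correlation energy than CCSD(T)."""
    return correlation_energy(e_vqe, e_hf) > correlation_energy(e_ccsd_t, e_hf)


# Placeholder energies in Hartree relative to Hartree-Fock (NOT real results).
e_hf = 0.0
e_ccsd_t = -0.050
e_vqe = -0.060

print(beats_classical(e_vqe, e_ccsd_t, e_hf))  # True
```

With a shared Hartree-Fock reference the flag reduces to comparing the two total energies, which is why it can be True even when the gap to the exact active-space energy is large.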
The result is recorded in the Research tab of the leaderboard — honest, reproducible, and informative. Research-tier entries are not failures; they are data points that document the current frontier of what standard ansatz families achieve.
The reproduce command
python scripts/generate_entry_v4.py \
    --molecule benzene --mapping jordan_wigner \
    --ansatz-type hea --ansatz-reps 6 \
    --orbital-opt casscf --multistart 30 \
    --max-iter 500 --out-dir releases/v4/db
What comes next: UCCSD and hydrogen chains
Benzene UCCSD, with approximately 400 parameters (the same scale as N₂), is the next target. A UCCSD certification would make benzene the first aromatic molecule in the suite certified at chemical-accuracy-class precision, and a meaningful milestone for the suite's relevance to pharmaceutical quantum chemistry.
In parallel, Suite v4.3 will add hydrogen chains — H₄, H₆, H₈ — which appear explicitly in the DARPA QB-GSEE target list. Hydrogen chains are cheap to run, systematically scalable, and ideal for testing how benchmark performance degrades as the active space grows.
All entries, including the benzene HEA Research result, are open source, reproducible with a single command, and carry an Ed25519 provenance signature.