Run a Quantum Emulator on Raspberry Pi 5 with the AI HAT+ 2: A Hands-on Lab

2026-02-24

Build a low-cost local quantum dev rig: run lightweight Qiskit/Cirq/PennyLane emulators on Raspberry Pi 5 with AI HAT+ 2 for NPU-accelerated preprocessing.

Turn your Raspberry Pi 5 + AI HAT+ 2 into a low-cost, local quantum dev rig — step-by-step

If you’re a developer or IT pro frustrated by the latency, cost, and vendor lock-in of cloud quantum demos, this hands-on lab shows how to run lightweight quantum emulators locally on a Raspberry Pi 5 equipped with the new AI HAT+ 2. You’ll get a reproducible workflow for running Qiskit, Cirq, and PennyLane demos where the AI HAT+ 2 handles fast pre/post-processing and inference, while the Pi hosts the quantum emulator — an ultra-low-cost edge quantum dev setup that’s practical in 2026.

Why this matters in 2026

By late 2025 and into 2026, the industry moved from cloud-first quantum proofs-of-concept toward hybrid, low-latency workflows. Vendors published lighter classical-accelerator toolchains and ONNX-compatible front-ends, making it feasible to push classical preprocessing and ML feature maps to edge NPUs. At the same time, the Raspberry Pi 5 became a reasonable standalone host for lightweight emulators and developer tooling thanks to its faster SoC and native ARM64 support.

“Edge-first quantum prototyping reduces iteration time for algorithms and lets teams iterate locally before committing to expensive cloud runs.”

This lab shows how to connect those dots: use the AI HAT+ 2 (NPU-backed addon) to accelerate classical parts of hybrid quantum workflows and run quantum emulators locally with Qiskit/Cirq/PennyLane on a Raspberry Pi 5.

What you’ll build and prerequisites

Outcome: a reproducible local quantum dev rig that runs a complete hybrid demo (classical preprocessing on AI HAT+ 2 → variational circuit simulated locally), using Qiskit or PennyLane as the quantum SDK and ONNX Runtime (or vendor runtime) for NPU acceleration.

Hardware

  • Raspberry Pi 5 (8GB recommended)
  • AI HAT+ 2 or compatible NPU HAT for Pi 5 (with vendor runtime that exposes a Python/ONNX execution provider)
  • MicroSD card (32GB+) or USB SSD (recommended for performance)
  • Power supply, keyboard, and network (Ethernet or Wi‑Fi)

Software & tools

  • Raspberry Pi OS 64-bit or other ARM64 Debian 12/Bookworm variant (2025/2026 images work best)
  • Python 3.11+ virtual environment
  • pip, git, build-essential
  • Qiskit, Cirq, PennyLane (we’ll offer choices and fallbacks)
  • ONNX Runtime (ARM64) or vendor SDK for AI HAT+ 2

Step 1 — Prepare your Pi and AI HAT+ 2

Start from a clean 64-bit Raspberry Pi OS image. Use an SSD if you plan to compile C++ libraries; it makes builds much faster.

1.1 Flash OS and update

sudo apt update && sudo apt upgrade -y
sudo apt install -y git build-essential python3-venv python3-pip libffi-dev libssl-dev

1.2 Attach AI HAT+ 2 and install drivers

Follow vendor instructions to mount the HAT. Most HATs expose an SDK that provides a system service and Python bindings. The generic steps look like:

sudo apt install -y ai-hat2-runtime   # vendor package name may differ
sudo systemctl enable ai-hat2.service
sudo systemctl start ai-hat2.service

Verify the runtime and NPU are visible. If the vendor provides an ONNX execution provider, it will typically register as ai_hat2 or similar in ONNX Runtime.

# Example verification (vendor CLI)
ai_hat2-ctl info
# or list ONNX providers in Python
python3 - <<'PY'
import onnxruntime as ort
print(ort.get_all_providers())
PY

Step 2 — Create a reproducible Python environment

Use a venv to avoid polluting the system site packages. These commands assume Python 3.11 is installed.

python3 -m venv ~/quantum-pi-env
source ~/quantum-pi-env/bin/activate
pip install --upgrade pip setuptools wheel

2.1 Install core quantum SDKs

Pick the SDK(s) you want. On ARM64, binary wheels for some packages (e.g., qiskit-aer) may not be available. The pragmatic approach is:

  • Install Qiskit and use its built-in pure-Python simulator (BasicSimulator in Qiskit 1.x, formerly BasicAer) when Aer is not available.
  • Install PennyLane (it ships a pure-Python default simulator) for hybrid circuits.
  • Install Cirq for Google-style circuits and its simulator.

pip install qiskit       # pin the current stable release in a requirements.txt
pip install pennylane
pip install cirq-core

Note: the commands above are deliberately unpinned — check the current stable releases for 2026 and record exact versions in a requirements.txt for reproducibility. If qiskit-aer wheels exist for ARM64 by 2026, you can add pip install qiskit-aer — otherwise see the "Optional" section for compiling from source.

2.2 Install ONNX Runtime or the vendor runtime

ONNX Runtime has grown ARM64 support and custom execution providers by 2026. Install the ARM wheel if available, or install the vendor’s Python package that registers an ONNX provider for the AI HAT+ 2.

# Install ONNX Runtime (example for generic ARM64 wheel)
pip install onnxruntime
# If vendor publishes a package, install that instead, e.g.:
# pip install ai-hat2-onnx

Step 3 — Choose and configure a lightweight quantum emulator

On a Pi 5, choose emulators that are pure-Python or have ARM wheels. These are reliable and fast enough for demos and developer iteration:

  • PennyLane's default.qubit (pure Python, supports gradients)
  • Cirq's Simulator (numpy-based)
  • Qulacs (optimized C++ simulator — check for an ARM wheel; compile if needed)
  • Qiskit's built-in BasicSimulator (formerly BasicAer; fallback for basic circuits)

Install Qulacs if an ARM wheel is available; otherwise fall back to PennyLane/Cirq.

pip install qulacs || echo "Qulacs ARM wheel not found; using PennyLane default.qubit"
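All of the emulators above are, at their core, dense state-vector simulators. As a framework-free illustration of what default.qubit or Cirq's Simulator does under the hood, here is a minimal numpy sketch that prepares a Bell state with a Hadamard and a CNOT, then reads out measurement probabilities (educational only — not a substitute for the SDKs):

```python
import numpy as np

def apply_gate(state, gate, target, n_qubits):
    """Apply a single-qubit gate to `target` of an n-qubit state vector."""
    # Reshape into a (2, 2, ..., 2) tensor and contract the gate on the target axis.
    state = state.reshape([2] * n_qubits)
    state = np.tensordot(gate, state, axes=([1], [target]))
    state = np.moveaxis(state, 0, target)
    return state.reshape(-1)

def apply_cnot(state, control, target, n_qubits):
    """Apply CNOT by flipping the target axis on the control=1 slice."""
    state = state.reshape([2] * n_qubits)
    idx = [slice(None)] * n_qubits
    idx[control] = 1
    sub = state[tuple(idx)]
    # The control axis is removed from the slice, so adjust the target axis index.
    state[tuple(idx)] = np.flip(sub, axis=target if target < control else target - 1)
    return state.reshape(-1)

n = 2
state = np.zeros(2 ** n, dtype=np.complex128)
state[0] = 1.0                          # start in |00>
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
state = apply_gate(state, H, 0, n)      # H on qubit 0
state = apply_cnot(state, 0, 1, n)      # CNOT(0 -> 1): Bell state
probs = np.abs(state) ** 2
print(probs)                            # ~[0.5, 0, 0, 0.5]
```

Every extra qubit doubles the length of `state`, which is exactly why the memory guidance later in this lab caps practical qubit counts.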

Optional: compiling qiskit-aer or qsim on Pi 5

If you need a high-performance C++ simulator like qiskit-aer or qsim, you can compile from source, but expect long builds. Use an SSD and swap tuned. This is optional and for advanced users.

Step 4 — Hybrid demo: ONNX preprocessor + PennyLane circuit

We’ll demonstrate a compact hybrid pipeline: a classical feature-mapping neural network (ONNX) runs on the AI HAT+ 2 to compute a 4-dimensional feature vector from sensor data. That vector becomes rotation angles for a 4-qubit variational circuit simulated locally using PennyLane's default.qubit. This pattern mirrors many hybrid algorithms (QAOA feature maps, VQE with learned encodings).

4.1 Build or obtain an ONNX preprocessor

Create a tiny PyTorch/TensorFlow model locally, export to ONNX, and deploy. For brevity, we’ll assume you already have preprocessor.onnx placed in ~/models.
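If you don't yet have a preprocessor.onnx on hand, a deterministic numpy stand-in lets you exercise the rest of the pipeline first. The fixed random weight matrix and the squash into [0, π) below are hypothetical choices for illustration, not part of any vendor API:

```python
import numpy as np

def fake_preprocessor(raw, n_features=4, seed=7):
    """Stand-in for the ONNX model: a fixed random linear map 16 -> 4,
    squashed into [0, pi) so the outputs work as rotation angles.
    Hypothetical — replace with a real ONNX session in production."""
    rng = np.random.default_rng(seed)      # fixed seed => reproducible weights
    w = rng.standard_normal((raw.shape[-1], n_features)).astype(np.float32)
    z = raw @ w                            # linear feature map
    return (np.pi * (np.tanh(z) + 1) / 2).ravel()  # tanh in (-1,1) -> angle in (0,pi)

raw = np.random.rand(1, 16).astype(np.float32)
angles = fake_preprocessor(raw)
print(angles)                              # 4 angles, each in (0, pi)
```

Once the real model is exported, swap this function out for the ONNX Runtime session call in the demo below; the rest of the circuit code is unchanged.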

4.2 Python demo: run ONNX on AI HAT+ 2 and PennyLane locally

cat > hybrid_demo.py <<'PY'
import os

import numpy as np
import onnxruntime as ort
import pennylane as qml
from pennylane import numpy as pnp

# Load the ONNX model; prefer the HAT's execution provider when it is
# registered (the vendor may expose it under a name like 'AI_HAT2'),
# otherwise fall back to the CPU provider.
model_path = os.path.expanduser('~/models/preprocessor.onnx')
providers = ort.get_available_providers()
print('ONNX providers:', providers)
if 'AI_HAT2' in providers:
    sess = ort.InferenceSession(model_path, providers=['AI_HAT2'])
else:
    sess = ort.InferenceSession(model_path, providers=['CPUExecutionProvider'])

# Dummy sensor input
raw_input = np.random.rand(1, 16).astype(np.float32)
input_name = sess.get_inputs()[0].name
features = sess.run(None, {input_name: raw_input})[0].ravel()
print('Features from ONNX:', features)

# Construct a 4-qubit variational circuit
n_qubits = 4
dev = qml.device('default.qubit', wires=n_qubits)

@qml.qnode(dev, interface='autograd')
def circuit(params, feature_angles):
    # Encode feature angles as rotations
    for i in range(n_qubits):
        qml.RY(feature_angles[i], wires=i)
    # Simple entangling layers
    for i in range(n_qubits-1):
        qml.CNOT(wires=[i, i+1])
    for i in range(n_qubits):
        qml.RY(params[i], wires=i)
    return qml.expval(qml.PauliZ(0))

# Random variational params and run
params = pnp.array([0.1, 0.2, 0.3, 0.4], requires_grad=False)
feature_angles = pnp.array(features[:n_qubits])
res = circuit(params, feature_angles)
print('Circuit result (expectation):', res)
PY

python3 hybrid_demo.py

This example shows the pattern: run heavy or repeatable classical transforms on the NPU (AI HAT+ 2), then pass compact data to the local emulator. On a Pi 5 this dramatically speeds up iterations when the preprocessing is the bottleneck.

Step 5 — Performance tuning and practical tips

Edge deployments require careful tuning. Here are practical knobs:

  • Batching: Send inputs to the ONNX model in batches to amortize runtime overhead on the NPU.
  • Quantization: Use int8 quantized ONNX models for faster NPU throughput if quality allows.
  • Threading: Set environment variables for BLAS/OpenMP to avoid oversubscription: export OMP_NUM_THREADS=4.
  • Swap and tmpfs: Use tmpfs for intermediate files and adjust swap if compiling C++ simulators.
  • Memory limits: Keep per-simulation qubit counts small (8–16 qubits); state-vector simulators scale exponentially with qubit count.
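The exponential-memory point in the last bullet is easy to quantify: a dense complex128 state vector needs 16 bytes per amplitude, i.e. 16 × 2^n bytes for n qubits, so an 8 GB Pi tops out well under 30 qubits before you even account for gate workspace:

```python
def statevector_bytes(n_qubits, bytes_per_amp=16):
    """Memory for a dense complex128 state vector of n qubits."""
    return bytes_per_amp * 2 ** n_qubits

for n in (10, 16, 20, 28):
    print(f"{n:2d} qubits -> {statevector_bytes(n) / 2**20:10.1f} MiB")
# 16 qubits is ~1 MiB; 28 qubits is ~4 GiB — already half of an 8 GB Pi.
```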

Practical hardware hints

  • Use an SSD for builds and swap files.
  • Keep Pi cooled; long simulations and compilations heat up the SoC.
  • If you need more memory for simulators, use the 16GB Raspberry Pi 5 variant; network-backed swap is possible but not recommended.

Step 6 — Troubleshooting & fallbacks

Common issues and quick fixes:

  • ONNX provider not found: Install the vendor ONNX package or use generic ONNX Runtime. Confirm providers with ort.get_all_providers().
  • Qiskit Aer not installable: Use PennyLane default.qubit or Cirq simulator. You can compile Aer from source but expect long build times on Pi 5.
  • Slow inference: Quantize the ONNX model and use batching.
  • Memory errors: Lower qubit count or use density-matrix simulators only for very small systems.

Benchmarks and realistic expectations

Raspberry Pi 5 with AI HAT+ 2 is not a replacement for cloud HPC simulation, but it excels at fast iteration cycles for small circuits and hybrid pipeline development. Expect:

  • Preprocessing latencies in single-digit to low double-digit milliseconds on the NPU for small models (ONNX int8).
  • Local simulation times that scale exponentially — practical qubit ranges: 4–16 for interactive use depending on simulator.
  • Iteration speed improvements when heavy classical computation is offloaded to the AI HAT+ 2.
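A quick way to calibrate these expectations on your own board is to time a single gate application at increasing qubit counts. The sketch below uses a bare numpy tensordot as a rough proxy for per-gate simulator cost — real simulators have more overhead per circuit but the same exponential trend:

```python
import time
import numpy as np

def time_single_gate(n_qubits, reps=20):
    """Rough per-gate cost: apply one 2x2 rotation to qubit 0 of an
    n-qubit state vector `reps` times; return the mean wall time in seconds."""
    g = np.array([[np.cos(0.3), -np.sin(0.3)],
                  [np.sin(0.3),  np.cos(0.3)]], dtype=np.complex128)
    state = np.zeros([2] * n_qubits, dtype=np.complex128)
    state.flat[0] = 1.0
    t0 = time.perf_counter()
    for _ in range(reps):
        # Contracting on axis 0 leaves the qubit ordering intact.
        state = np.tensordot(g, state, axes=([1], [0]))
    return (time.perf_counter() - t0) / reps

for n in (8, 12, 16, 20):
    print(f"{n:2d} qubits: {time_single_gate(n) * 1e3:8.3f} ms/gate")
```

Expect each step of 4 qubits to cost roughly 16x more time and memory; where the numbers become uncomfortable is where your interactive qubit budget ends on that machine.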

Advanced strategies for 2026 and beyond

As of early 2026, these trends are worth adopting:

  • ONNX standardization for hybrid stacks: More quantum toolchains publish ONNX-compatible classical components to ease edge deployment.
  • Edge federated experiments: Use multiple Pi+HAT nodes to parallelize sampling across many small emulators for embarrassingly parallel problems.
  • WASM-based emulators: Emerging WebAssembly quantum emulators are cross-platform and run well on ARM devices; consider them for portable demos.
  • Containerized reproducibility: Use lightweight containers (podman/docker with ARM64 images) to pin runtime stacks for labs and classrooms.

Real-world example: classroom & prototyping use cases

We’ve used this setup in 2025/2026 workshops where students iterate on variational algorithms without cloud credits. Teams run classical model tuning on the AI HAT+ 2 and test circuit changes on the Pi simulator — the local loop reduces turnaround from minutes to seconds, which is crucial for learning and rapid prototyping.

Checklist: quick copy-paste to get started

  1. Flash Pi OS 64-bit and update system.
  2. Attach AI HAT+ 2 and install the vendor runtime.
  3. Create Python venv and install Qiskit/PennyLane/Cirq.
  4. Install ONNX Runtime or vendor ONNX provider.
  5. Place your preprocessor ONNX model in ~/models and run the hybrid demo.

Security and compliance notes

Keep vendor runtime packages up to date and validate ONNX models before deployment. For sensitive code, isolate the Pi on a private network. Edge devices are convenient for prototyping but require standard hardening practices for production use.

Actionable takeaways

  • Local iteration beats remote for early-stage algorithm design: Use the Pi 5 + AI HAT+ 2 lab to shorten the development loop.
  • Offload classical workloads: Use ONNX or vendor runtimes on the HAT to accelerate feature maps and pre/post-processing.
  • Choose simulators for practicality: Prefer PennyLane/Cirq or ARM-native wheels (Qulacs) rather than heavy C++ simulators unless you can compile them.
  • Plan for scale: For larger simulations, incrementally move from local emulation to cloud HPC — but keep the local rig for quick experiments and demos.

Where to go next

Extend this lab by:

  • Deploying a small web UI on the Pi for remote students to submit jobs.
  • Using multiple Pi 5 nodes to parallelize sampling for variational algorithms.
  • Benchmarking different emulators (PennyLane vs Qulacs vs Cirq) and publishing comparative numbers for your environment.

Closing — Try the lab and share results

In 2026, the combination of affordable ARM hardware and specialized NPUs makes local quantum development practical and powerful. The Raspberry Pi 5 + AI HAT+ 2 pattern is a pragmatic way to prototype hybrid quantum-classical workflows without cloud friction. Try the steps above, benchmark your setup, and share a short reproducible report with the community — it helps everyone move faster.

Call to action: Clone the companion repo (link in the project notes), run the hybrid_demo.py on your Pi 5 + AI HAT+ 2, and post your timings or issues on our community board. We’ll publish a follow-up comparing Qulacs vs PennyLane on Pi 5 environments based on your feedback.
