Measured. Not estimated.
Every number below comes from running deliverables/benchmark.py with 3 warm-up passes and 7 timed repetitions on Apple M-series hardware. No simulations. No projections.
[Chart: BloomFilter.add() — time vs. input size]
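The chart above can be reproduced with a loop like the following. This is an illustrative sketch only: the real `BloomFilter` lives in the project, so a minimal stand-in class is used here, and the sizes and hash parameters are arbitrary.

```python
import hashlib
import time


class BloomFilter:
    """Minimal stand-in; NOT the project's implementation."""

    def __init__(self, size: int = 1 << 20, hashes: int = 4):
        self.size = size
        self.hashes = hashes
        self.bits = bytearray(size // 8)

    def add(self, item: str) -> None:
        # Derive `hashes` independent positions from a keyed blake2b digest.
        for i in range(self.hashes):
            h = int.from_bytes(
                hashlib.blake2b(f"{i}:{item}".encode(), digest_size=8).digest(),
                "big",
            ) % self.size
            self.bits[h // 8] |= 1 << (h % 8)


def time_adds(n: int) -> float:
    """Wall time to insert n keys into a fresh filter."""
    bf = BloomFilter()
    start = time.perf_counter()
    for i in range(n):
        bf.add(f"key-{i}")
    return time.perf_counter() - start


for n in (1_000, 10_000, 100_000):
    print(n, time_adds(n))
```

Since each `add()` does a fixed number of hash computations, total time should grow roughly linearly with input size, which is what the chart shows.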
Benchmark methodology
All benchmarks were run using deliverables/benchmark.py on an Apple M-series chip (exact model redacted to avoid hardware anchoring). Each benchmark uses 3 warm-up passes followed by 7 timed repetitions. The median of the 7 timed repetitions is reported.
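The warm-up/median scheme can be sketched as a small harness. The function name `bench` is illustrative, not the actual API of `deliverables/benchmark.py`.

```python
import statistics
import time


def bench(fn, *, warmup: int = 3, reps: int = 7) -> float:
    """Run fn() warmup times untimed, then return the median of reps timed runs."""
    for _ in range(warmup):
        fn()  # prime caches, JIT-ish effects, allocator state
    times = []
    for _ in range(reps):
        start = time.perf_counter()
        fn()
        times.append(time.perf_counter() - start)
    return statistics.median(times)
```

The median is used rather than the mean so that a single run inflated by OS scheduling noise does not skew the reported figure.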
The _save() disk write is patched out of the ConfidenceCalibrator benchmark to isolate mathematical computation cost from I/O variance. All other benchmarks reflect end-to-end wall time.
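Patching out a disk write for a benchmark can be done with `unittest.mock.patch.object`. The class body below is a toy stand-in with the same shape as the real `ConfidenceCalibrator` (a `_save()` hook called from an update method); the update math is invented for illustration.

```python
from unittest.mock import patch


class ConfidenceCalibrator:
    """Toy stand-in; only the _save() call pattern mirrors the real class."""

    def __init__(self):
        self.weights = [0.5]

    def _save(self) -> None:
        # In the real class this writes to disk; the benchmark patches it out.
        raise IOError("disk write; should never run under the benchmark")

    def update(self, outcome: float) -> None:
        # Invented calibration step: nudge each weight toward the outcome.
        self.weights = [w + 0.1 * (outcome - w) for w in self.weights]
        self._save()


cal = ConfidenceCalibrator()
with patch.object(ConfidenceCalibrator, "_save", lambda self: None):
    for _ in range(1000):
        cal.update(1.0)
```

Inside the `with` block the update math runs at full speed while the I/O path is a no-op, so the timed figure reflects computation only.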
The BFS graph benchmark uses a degree-4 ring graph as a representative topology. Real-world context memory stores may have different topologies, but BFS's O(V+E) complexity holds regardless.
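For concreteness, a degree-4 ring graph links each vertex i to i±1 and i±2 (mod V), and BFS over it touches each vertex and edge once. This is a sketch of the topology described above, not the benchmark's actual graph-construction code.

```python
from collections import deque


def ring_graph(v: int) -> dict[int, list[int]]:
    """Degree-4 ring: vertex i is adjacent to i-2, i-1, i+1, i+2 (mod v)."""
    return {
        i: [(i + d) % v for d in (-2, -1, 1, 2)]
        for i in range(v)
    }


def bfs(graph: dict[int, list[int]], start: int = 0) -> dict[int, int]:
    """Standard BFS: each vertex enqueued once, each edge scanned once -> O(V+E)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


g = ring_graph(1000)
d = bfs(g)
```

Because every vertex has exactly four edges here, E = 2V and the benchmark's runtime grows linearly in V; a denser topology would shift the constant but not the O(V+E) bound.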
LLM inference latency is not benchmarked here — it is network- and provider-dependent. See the LLM Routing section on the home page for observed figures.