Proven systems at production speed

Verified is no longer the slow option.

On the metric production operators actually lose sleep over β€” the tail latency β€” it is the faster one.

We build systems software whose correctness is machine-checked, theorem by theorem: a reverse proxy and HTTP stack in Lean 4 on ~700 lines of trusted C, and a Unicode security layer whose central guarantee is proven for every possible input, not tested on some. Every claim below traces to a commit. The public work, you can re-verify yourself.

Latency under load: p50 β†’ p99 preliminary Β· two-run stable

fd-passing data-forward nginx
1 ms 2 4 6 8 10 p50 (median) p99 (tail) 10.8 ms 6.9 ms 2.96 ms

The tail is the story. nginx β€” hand-tuned C refined over two decades β€” fans out to an 8Γ— spread from median to p99. The proven-Lean fd-passing proxy stays nearly flat at ~1.9Γ—. nginx keeps the median (1.28 ms vs 1.53 ms); the verified stack wins the tail by ~3.6Γ—, on 1.7Γ— the per-core throughput β€” while serving that load on 2.4 of 14 cores, so it is nowhere near saturation. Caveat, plainly: two stable runs on isolated cores with origin and load generator sharing the box β€” rps/core is the robust metric, not absolute throughput. The multi-run, dedicated-hardware, config-published benchmark is in progress; these numbers move only to firmer ground.


What is shipped today

The ledger

Three artifacts, audited. Each fact carries its receipt β€” a commit hash, or a public repository you can clone right now.

Verified transport

An HTTP/1.1 + h2c reverse proxy & server, in Lean 4

Every protocol decision β€” framing, HPACK, connection lifecycle, keep-alive, pipelining, streaming, proxying β€” is made by pure state machines carrying machine-checked safety theorems. The machines cannot perform I/O; the I/O shell cannot make protocol decisions. The only unverified code is the syscall shim.

  • Trusted C~700 lines β€” io_uring proactor at the syscall boundary
  • Proxy vs nginx1.7Γ— per-core throughput Β· 3.6Γ— lower p99 Β· preliminary, two-run
  • Upstreamproven H/2 to real origins via ALPN Β· H1↔H2 translation Β· verified stream demux
  • As origin3.4M req/s (HTTP/1.1, pipelined, loopback, pinned) Β· 270K single-core
  • HTTP/21.7Γ— over HTTP/1.1 cross-host (7.5 ms RTT, 100% success / 5M req)
  • Hygienezero sorry in the HTTP-path proofs Β· curl, nghttp, h2load interop
commit Β· b647ed2 commit Β· e779abd continuity Β· private, under audit

Verified distribution β€” the strongest mathematics in the stack

An inter-core mesh runtime, proven transparent

Sharding computation across cores is proven to change nothing observable. mesh_faithful is a kernel-checked, fully general theorem: distributing any machine pipeline over any lossless codec produces identical outputs for every input stream. The message codec beneath it is universally proven β€” via bv_decide, with zero native_decide in that chain.

  • Theoremmesh_faithful β€” kernel-checked, universally quantified
  • Codecβˆ€-proven roundtrip (bv_decide), native-UInt shifts
  • Throughput7M msg/s Β· proven codec at ~21% of the memory-copy floor
commit Β· 26e96ad Machine/Mesh.lean continuity Β· private

Verified semantics β€” public & re-checkable

unicode-lean: a proven text-security layer

The full UAX-15 normalization pipeline (all four forms), UTS-39 identifier security, Bidi, IDNA, and 26 adversarial detectors β€” invisible-character covert channels, homoglyphs, Trojan Source, mixed-script spoofing β€” against byte-pinned, SHA-256-verified Unicode 17.0.0 tables. The centerpiece: the NFC quick-check carries a universally quantified soundness theorem. For every possible string, if the fast path says β€œalready normalized,” that is mathematically true. Test suites cannot certify the absence of a bug class. This does.

  • SoundnessquickCheck_sound β€” βˆ€ strings, by structural induction
  • Detectors26 families Β· UCD 17.0.0, SHA-pinned
  • Hygienezero sorry / axiom / unsafe across 203 files

The reverse-proxy comparison, in full

Nothing hidden

The complete table, including the case where we don't win. The data-forward variant trails nginx on per-core efficiency; the fd-passing variant leads on both. Publishing the loss is the point β€” it is what makes the win believable.

Reverse proxy Β· 14 isolated Zen5 cores Β· two-run stable Β· 100% 2xx Β· varied-Host keep-alive load
Contender req/s cores rps / core p50 p99 p50 β†’ p99
origin-direct ceiling β€” no proxy 739Kβ€”β€”0.73 ms7.0 msβ€”
continuity Β· fd-passing efficiency + tail champion Β· 2.4 of 14 cores 518K2.4216K1.53 ms2.96 ms1.9Γ—
continuity Β· data-forward trails nginx per-core 481K4.6105K1.97 ms6.9 ms3.5Γ—
nginx Β· 8 workers wins the median β€” 1.28 ms 436K3.45127K1.28 ms10.8 ms8.4Γ—

Ryzen 9 9950X3D (Zen5), 14 isolated cores, OS noise quarantined, verified Lean 4 + io_uring. Two stable runs. The load generator and origin share the box β€” so req/s totals are a floor (fd-passing serves 518K on 2.4 cores, nowhere near saturated), and rps/core is the robust metric. Incumbent tuning and a multi-run protocol on dedicated, separated hardware are in progress before this is treated as settled.


The trust story

One paragraph, one command

Every theorem's assumptions are enumerable by a single command. There is no test suite in the trust base, no vendor library, and no proof that depends on anything you cannot check yourself. The whole of it fits here:

The trusted computing base is the Lean kernel, one well-defined evaluation axiom β€” Lean.ofReduceBool, the same axiom carried by the SAT-backed tactics used in production verification at AWS β€” the pinned Unicode tables, and the ~700-line C shim. That is the entire surface. It is smaller than the trust base of any production Unicode implementation or any unverified proxy, and unlike theirs, it is auditable by an outside party without asking us for anything.

Why the pieces fit

The full arc of a trust decision

Every security decision that touches the network runs the same path. We have a proof at each arrow β€” a diagram no one else can draw.

continuity

Bytes arrive

A proven transport parses frames and drives the connection through machine-checked state machines β€” no request smuggling, no protocol-state exploit, because the machine cannot enter a state the theorems forbid.

continuity codecs

Bytes become text

Wire codecs decode headers and payloads with proven round-trips β€” the layers agree with themselves by construction, not by convention.

unicode-lean

Text becomes a decision

Normalization, identity, and threat verdicts are computed by proven semantics β€” the confusable is caught, the invisible payload is seen, the canonical form is provably canonical, before anything downstream trusts it.


Try it Β· verify it

Don't trust us. Check.

The public artifact re-checks from source with an independent kernel checker. This is the whole thesis: the correctness argument is the same for you as it was for us.

$ git clone https://github.com/jpyxal-straylight/unicode-lean $ cd unicode-lean && lake build $ lake env lean4checker # independent kernel re-check βœ“ every theorem re-verified Β· no trust in our toolchain required

Confusable & covert-channel detector

Illustrative preview

This browser preview illustrates the kind of verdict the detectors produce. The production verdict is computed by the verified Lean artifact behind a hosted oracle β€” the same proven code, not this JavaScript sketch. The live oracle is in progress.


What's next β€” not yet shipped

Roadmap

Each item builds on a proven foundation rather than an open research question. Stated in the future tense, deliberately.

TLS serving next

LibreSSL termination beneath the proven stack. The client path exists today; the server accept path closes the β€œsecure transport” story.

Cross-host mesh over RDMA next

mesh_faithful already guarantees correctness wherever the channel is lossless and ordered. RDMA reliable connections provide exactly that in hardware β€” making this a transport build with failure-handling as the open design item, not a theory gap. We intend to extend the mesh over RDMA interconnects (RoCE / InfiniBand, e.g. NVIDIA ConnectX) via libibverbs.

The verified text-security gateway converging

The transport half already exists and beats nginx. The remaining work wires the proven Unicode detectors inline to screen every request at line rate β€” aimed first at LLM API security, where a false β€œclean” verdict from an unverified scanner is itself the vulnerability.

The ICU differential oracle planned

Our proven normalization as ground truth, run at mesh scale against the implementation the entire internet depends on. Every disagreement is a bug report or a spec clarification.