Leaderboard

Public ranking of how well documentation explains itself to LLMs. Editions ship quarterly. Every row pins to a reproducible methodology version so scores remain comparable across runs.

Methodology v1.0 | Dispute policy | Run on your docs

First edition — seed rows

Placeholder seed rows are shown while the first-edition evaluation corpus is queued. Once Specshift Cloud writebacks begin, live scores replace these placeholders automatically.

Rank	Target	Top suite	Methodology	Score
01	—	retrieval	v1	—
02	—	agent	v1	—
03	—	structure	v1	—
04	—	drift	v1	—

Disputes are public. Submit one and the whole record (submission, ruling, reasoning, and chain hash) is posted at /leaderboard/disputes/[id]. No silent corrections. No silent rejections. Each ruling carries an audit-chain hash; once Specshift Cloud anchors a published edition, those hashes are written to write-once storage so a reviewer can verify that the public record matches the original.
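
This page does not specify how the audit chain is computed. As a rough sketch of how a reviewer might check a public record against anchored hashes, the example below assumes SHA-256 over canonical JSON, with each hash covering the previous one. The field names, the GENESIS value, and the functions record_hash, build_chain, and verify_chain are all hypothetical and do not describe Specshift's actual scheme.

```python
import hashlib
import json

# Hypothetical starting value for the chain (assumption, not from the page).
GENESIS = "0" * 64


def record_hash(prev_hash: str, record: dict) -> str:
    """Hash one dispute record together with the previous link's hash."""
    # Canonical JSON (sorted keys, fixed separators) so that identical
    # records always serialize, and therefore hash, identically.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def build_chain(records: list[dict]) -> list[str]:
    """Return one hash per record; each hash depends on all earlier records."""
    hashes = []
    prev = GENESIS
    for record in records:
        prev = record_hash(prev, record)
        hashes.append(prev)
    return hashes


def verify_chain(records: list[dict], anchored_hashes: list[str]) -> bool:
    """Recompute the chain from the public records and compare to the anchored hashes."""
    return build_chain(records) == list(anchored_hashes)


# Illustrative records only; the field names are assumptions.
public_records = [
    {"id": "example-1", "submission": "score disputed", "ruling": "upheld", "reasoning": "rerun matched"},
    {"id": "example-2", "submission": "suite disputed", "ruling": "rejected", "reasoning": "out of scope"},
]
anchored = build_chain(public_records)
```

Under these assumptions, editing any field of an earlier record changes its hash and every hash after it, so verify_chain(public_records, anchored) would return False for a silently corrected record.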