Leaderboard

Public ranking of how well documentation explains itself to LLMs. Editions ship quarterly. The first edition seeds 50 placeholder rows while live evals queue up — once Specshift Cloud’s writeback lands, every row pins to a reproducible methodology version.

First edition — seeded

Rows render once the seed corpus eval completes. Until then, the grid below is a structural placeholder. Past editions stay frozen at the methodology version they ran against — see the dispute policy for what does and does not get amended.

Rank | Target | Top suite | Methodology | Score
01 | - | retrieval | v1 | -
02 | - | agent | v1 | -
03 | - | structure | v1 | -
04 | - | drift | v1 | -
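
As a rough illustration of the grid above, a row pinned to a methodology version could be modeled as an immutable record. This is a minimal sketch, not Specshift's actual schema: the class name `LeaderboardRow`, its field names, and the reading of the placeholder cells (blank Target and Score, suite name in the third column) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)  # frozen: a published row cannot be mutated in place
class LeaderboardRow:
    """Hypothetical row shape; field names are assumptions, not a real schema."""
    rank: int
    target: Optional[str]      # None while the seed corpus eval is pending
    top_suite: str
    methodology: str           # methodology version the row is pinned to
    score: Optional[float]     # None until the eval completes

    @property
    def is_placeholder(self) -> bool:
        # A row is structural only until both target and score are filled in.
        return self.target is None or self.score is None


# The four placeholder rows shown in the grid, as read from the page.
SEED_ROWS = [
    LeaderboardRow(1, None, "retrieval", "v1", None),
    LeaderboardRow(2, None, "agent", "v1", None),
    LeaderboardRow(3, None, "structure", "v1", None),
    LeaderboardRow(4, None, "drift", "v1", None),
]
```

Making the record frozen mirrors the rule that past editions stay fixed at the methodology version they ran against: an amendment would produce a new row rather than edit an old one.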

Disputes are public. Submit one and the whole record — submission, ruling, reasoning, and chain hash — gets posted at /leaderboard/disputes/[id]. No silent corrections. No silent rejections. Hash anchors land in S3 with Object Lock so a reviewer can verify the public record matches the original.
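
One way a reviewer might check a chain hash is sketched below. The page does not specify the hashing scheme, so everything here is an assumption for illustration: SHA-256, canonical JSON serialization, an all-zero genesis value, and the names `DisputeRecord`, `chain_hash`, `build_chain`, and `verify_chain`. Fetching the anchored hashes from S3 is left out; the sketch takes them as a plain list.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import List, Sequence

GENESIS = "0" * 64  # assumed starting value for the first link in the chain


@dataclass(frozen=True)
class DisputeRecord:
    """Hypothetical shape of a public dispute record."""
    dispute_id: str
    submission: str
    ruling: str
    reasoning: str


def chain_hash(prev_hash: str, record: DisputeRecord) -> str:
    # Serialize with sorted keys and fixed separators so the same record
    # always produces the same bytes, then bind it to the previous hash.
    payload = json.dumps(asdict(record), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def build_chain(records: Sequence[DisputeRecord]) -> List[str]:
    hashes: List[str] = []
    prev = GENESIS
    for record in records:
        prev = chain_hash(prev, record)
        hashes.append(prev)
    return hashes


def verify_chain(records: Sequence[DisputeRecord], anchored: Sequence[str]) -> bool:
    # The public record matches the original only if every recomputed
    # hash equals the anchored hash at the same position.
    return build_chain(records) == list(anchored)
```

Because each hash covers the previous one, changing any earlier record changes every later hash, so a single mismatch against the anchored values is enough to show that the published record was altered.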