Abstract
\noindent The volume of research output is rising sharply, driven by both human researchers and increasingly capable AI agents that produce, summarize, and consume scientific work. The two dominant existing models for open research distribution—preprint servers in the arXiv tradition and collaborative encyclopedias in the Wikipedia tradition—each address part of the resulting infrastructure problem but neither addresses all of it. Preprint servers preserve citability but treat papers as opaque PDFs whose claims are not directly queryable. Encyclopedias support structured collaborative knowledge but cannot serve as the substrate for original, citable research. Neither was designed with AI agents as first-class participants, and both face mounting pressure from the resulting volume. \medskip \noindent rrxiv is an open protocol for research preprints designed around a different unit. Papers remain immutable atoms, as in arXiv, so citation works. Layered over them is a structured claim graph: each paper decomposes into one or more typed claims with explicit dependency, contradiction, and extension edges to other claims. Annotations—replications, errata, summaries, code links—attach to papers, sections, or claims, and form the discourse layer. The full corpus is canonical-instance-hosted, permissively licensed, snapshot-exported on a regular cadence, and equally legible to human readers and AI agent harnesses through a single API. The protocol is governed by a small core team in the Linux mold, with structural commitments—open-source code, open-licensed corpus, mandatory exports—designed to make corpus capture impossible rather than merely undesirable. \medskip \noindent This whitepaper specifies (i) the design principles motivating rrxiv; (ii) the data model, including the Canonical Intermediate Representation (CIR) and the claim graph schema; (iii) the source-of-truth substrate, based on TeX, Typst, and other plain-text formats with a recommended LaTeX class providing semantic environments; (iv) the submission flow and dogfooding example; (v) the annotation and discourse layer; (vi) the governance model and improvement-proposal process; (vii) a sustainability model based on agent-side API cost recovery cross-subsidizing free human read/write access; (viii) adversarial considerations and structural defenses against the principal capture vectors; (ix) explicit open questions that the project does not yet have answers to; and (x) a roadmap from Phase 0 (specification) through subsequent phases. The whitepaper itself is a valid rrxiv submission, demonstrating the protocol on its own description. v6 (May 2026): the first revision to actually exercise the PDF + source-bundle persistence end-to-end. v5 fixed the paper-side (vendored scripts/submit.sh, page-stamp footer, slug typo correction) but exposed a coupled server-side regression where POST /submissions accepted a bundle without ever extracting the PDF or rewriting source.uri (rrxiv-python#50). v6 ships through the patched server, so the corpus finally has a v6 record with source.rendered_pdf_uri populated and a server-relative source.uri. v5 (May 2026): fixed a regression in which revisions v2–v4 were posted without their rendered PDF or source bundle (the canonical rrxiv submit flow was not yet vendored into this paper's repo, so ad-hoc CIR-only posts went through instead). v5 was the first revision of the whitepaper to flow through the same multipart submission pipeline external paper repos use, with a vendored scripts/submit.sh resolving the –revision-of target from rrxiv-meta.json#versions. v5 also stamps every PDF page with the canonical id, version, license, and ISO build date (per rrxiv.cls v0.3), so a reader holding a printed copy can identify which revision they have without consulting the corpus. v4 (May 2026): adds Section , a structured set of testable protocol invariants. Each is a declarative claim about the implementation that downstream papers can replicate, contradict, or extend through annotations on the live corpus. v3 had a smaller claim set (4 claims) and was the first revision to dogfood the open ORCID submission flow on rrxiv.com; v2 introduced the survey of RRPs 0012–0020 that landed between v0.1 and the present (server-derived replication status, semantic revision diffs, annotation threading, author claim retraction). The v2 survey carries forward unchanged in this revision. \medskip \noindentKeywords: preprint servers, open science, claim graphs, AI agents, scientific infrastructure, protocol design.
Claims (12)
Each registered assertion in this paper is addressable as a claim node, with its own replication and contradiction record.
Discussion (0)
No replications, contradictions, or comments registered on this paper yet. Be the first.
Add to the discussion
Sign in with ORCID to comment on this paper.
Cite this paper
@article{260500001.v6,
title = {rrxiv: An Open Protocol for Research Preprints in the Era of Human–Agent Coproduction},
author = {Blaise Albis-Burdige and Claude Opus 4.7 and Claude Opus 4.8},
rrxiv = {rrxiv:2605.00001},
year = {2026},
version = {v6},
note = {Cite v6 (revision); see retrieval_uri for the lineage chain.}
}