TensorWasm
cve-disclosure-dry-run
Manual procedure for rehearsing the CVE disclosure pipeline end-to-end
without an actual vulnerability. Not an alert; satisfies the v0.5
security-workstream commitment in
PATH-TO-V1.md ("CVE disclosure pipeline
exercised at least once — intentional rehearsal, not a real CVE —
before v0.5") and exercises every step the Security committee will
take when a real report eventually lands.
Severity: manual (scheduled by the Security committee on the cadence below; may also be invoked ad-hoc when membership changes or when a real report arrives after a long dormancy).
This runbook is a procedure runbook — it does not follow the
nine-section alert template in README.md "Runbook
contract". It sits alongside rollback.md,
oncall-paging.md, and
disaster-recovery.md as the fourth manual
playbook. It overlaps most with disaster-recovery.md: both rehearse
a high-consequence, low-frequency response. Where DR rehearses the
operational pipeline, this runbook rehearses the disclosure pipeline.
When to run this runbook
Run a dry-run when any of the following is true:
- Quarterly cadence. Mandatory once per calendar quarter, regardless of whether a real report has arrived. If a quarter ends with no dry-run on file under security/dry-runs/, the next non-security release is held until one completes; this mirrors the under-staffed-committee rule in GOVERNANCE.md.
- Membership change. Within 14 days of the Security committee gaining or losing a member (per MAINTAINERS.md and the onboarding flow in GOVERNANCE.md). New members walk the pipeline before the next real report; departing members are confirmed off embargo channels.
- Dormancy break. If a real embargoed report arrives and the pipeline has not been exercised (real or dry-run) in the prior 30 days, the on-call committee member walks this runbook in parallel with real triage. A 30-day-cold pipeline is the most common failure mode for security response programs, and more dangerous than a slightly stale runbook.
Outside these triggers, do not run a dry-run; they cost committee
time and clutter the security/dry-runs/ history.
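The three triggers above can be sketched as a small check. This is an illustrative sketch only: the function name, its parameters, and the returned labels are hypothetical, and it assumes the committee can supply the date of the last dry-run, the last membership change, and the last time the pipeline was exercised.

```python
from datetime import date, timedelta
from typing import Optional

MEMBERSHIP_DEADLINE = timedelta(days=14)  # membership-change trigger
DORMANCY_LIMIT = timedelta(days=30)       # dormancy-break trigger


def quarter(d: date) -> tuple[int, int]:
    """Return (year, quarter) for a calendar date."""
    return d.year, (d.month - 1) // 3 + 1


def due_triggers(
    today: date,
    last_dry_run: Optional[date],
    membership_changed_on: Optional[date] = None,
    last_pipeline_exercise: Optional[date] = None,
    real_report_arrived: bool = False,
) -> list[str]:
    """List which runbook triggers currently call for a dry-run."""
    triggers = []
    # Quarterly cadence: no dry-run on file for the current calendar quarter.
    if last_dry_run is None or quarter(last_dry_run) != quarter(today):
        triggers.append("quarterly")
    # Membership change: no dry-run has happened since the roster changed.
    if membership_changed_on is not None and (
        last_dry_run is None or last_dry_run < membership_changed_on
    ):
        overdue = today - membership_changed_on > MEMBERSHIP_DEADLINE
        triggers.append("membership-change (overdue)" if overdue else "membership-change")
    # Dormancy break: a real report arrived and the pipeline is 30+ days cold.
    if real_report_arrived and (
        last_pipeline_exercise is None
        or today - last_pipeline_exercise > DORMANCY_LIMIT
    ):
        triggers.append("dormancy-break")
    return triggers
```

An empty list corresponds to the "do not run a dry-run" case.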
Scope
The dry-run rehearses every link of the pipeline that runs on human time, not the code-shaped artefacts that depend on a real vulnerability existing.
In scope (rehearsed):
- Triage (the 72-hour acknowledgement commitment from SECURITY.md and the standing commitments in GOVERNANCE.md).
- CVSS scoring against the rubric the committee will actually use on a real report.
- Embargo notification to the design-partner list and the named operator backups in MAINTAINERS.md.
- Fix-development sequencing (without merging real code; see the synthetic-bug rule below).
- Coordinated disclosure (the cadence and channel handoff from embargo to public).
- Backport application against the SECURITY.md W3.5 branch model: identifying every release-vN.x branch the synthetic patch would land on, enumerating cherry-pick targets, confirming the branch ACLs still allow the committee to push.
- Public announcement (the GitHub Security Advisory wording, the release-notes entry shape, the CHANGELOG.md line under "Operator-visible behaviour change").
- CVE record filing (drafting the CNA submission text, even if not actually filing it).
Out of scope (not rehearsed):
- An actual code change. The synthetic patch is described in prose and (optionally) sketched as a draft PR on a private throwaway branch; it must never land on main or a release-vN.x branch. Required marker text in every artefact: "do not implement — cve-disclosure-dry-run synthetic bug".
- Filing a real CVE with the assigning CNA. The submission text is drafted to prove the form fits, then destroyed.
- Publishing a real GitHub Security Advisory: use draft state and close without publishing.
- Sending the embargo notification to anyone outside the participant list (the committee + design partners who have opted into rehearsals).
The scope split keeps the rehearsal cheap enough to run quarterly. Side effects that compound (a real CVE id, a real advisory, a real backport commit) erode trust in the next real report.
Roles
Every dry-run staffs four roles. The Security committee owns the
rotation; assignments are recorded at the top of the
security/dry-runs/YYYY-MM-DD.md file before the rehearsal starts.
- Reporter. A committee member acting as outside researcher. Drafts the synthetic report, sends it to security@craton.com.ar from a personal address, then goes silent on pipeline channels until coordinated-disclosure time. The role should rotate so every committee member rehearses the outside-researcher posture at least annually; two consecutive dry-runs with the same Reporter is a process smell to flag in the postmortem.
- Triage owner. Next committee member on the rotation. Acknowledges the report, drafts CVSS, opens the private GitHub Security Advisory, and is the primary point of contact for the synthetic reporter. This role most resembles the real-CVE first responder, so use it to shake out muscle-memory pieces: the acknowledgement template, the Advisory form, the embargo-list addresses.
- Fix owner. Third committee member. Writes the synthetic patch description, enumerates backport targets per W3.5, walks the embargo list through the proposed remediation. Does not merge anything: the patch lives only in the embargo email and optionally a draft PR with the do-not-implement banner.
- Comms owner. Lead maintainer (per GOVERNANCE.md) or a written delegate. Owns outward-facing artefacts (Advisory text, release-notes entry, CHANGELOG.md line, announcement). Equally responsible for not publishing any of them: the rehearsal ends with comms artefacts staged but withheld.
If the committee has fewer than four members, Comms may overlap with Triage (both run on the same 72-hour clock) but Reporter and Fix owner must stay separate. A single-person rehearsal is a self-review of the runbook, not a dry-run, and is recorded as such.
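The staffing rules above can be checked mechanically before T+0. The sketch below is illustrative: the role keys and the returned labels are hypothetical names, and the only overlap it accepts is the documented Comms-Triage one.

```python
ROLES = ("reporter", "triage", "fix", "comms")


def classify_staffing(assignments: dict[str, str]) -> str:
    """Classify a role assignment against the runbook's staffing rules."""
    missing = [role for role in ROLES if not assignments.get(role)]
    if missing:
        return "cannot-staff"
    people = {assignments[role] for role in ROLES}
    if len(people) == 1:
        # One person holding every role is a self-review, not a dry-run.
        return "self-review"
    if len(people) == 4:
        return "dry-run"
    # Three people: only Comms and Triage may share a person, which also
    # guarantees Reporter and Fix owner stay separate.
    if len(people) == 3 and assignments["comms"] == assignments["triage"]:
        return "dry-run (Comms-Triage overlap)"
    return "cannot-staff"
```

For example, `classify_staffing({"reporter": "ana", "triage": "ben", "fix": "cy", "comms": "ben"})` reports the documented overlap, which should then be recorded in the rehearsal file.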
Tabletop scenario
Every dry-run uses a canned synthetic vulnerability. Using the same template makes pipeline metrics comparable across rehearsals; the scenario is realistic enough to exercise the form fields and the CVSS rubric without naming a real code path.
Synthetic vuln (do not implement). A bounds error in the tensor-wasm-snapshot CRC32 mismatch handler causes 16 bytes of the zstd decoder's internal state to be returned to the caller in the error message instead of a generic "checksum failed" string. The leak is reachable from any caller of tensor-wasm restore <path> with a crafted snapshot file. The bytes leaked are zstd internal state, not tenant memory, but in practice they include pointers to guest-controlled buffers and the layout of the decoder's lookup tables. CVSS 3.1: roughly 5.3 (Medium), vector AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:N/A:N: network-unreachable, requires a user to restore an attacker-supplied snapshot, leaks confidential decoder state.
This vuln is fake. The CRC32 mismatch handler in
crates/tensor-wasm-snapshot/src/reader.rs returns a generic error
and has never leaked decoder state. The scenario is constructed to
be plausible enough to exercise the CVSS rubric honestly (no trivial
"10.0 RCE"), bounded enough that the Fix owner can describe the patch
in two paragraphs without writing real code, and obviously fake on
inspection so anyone reading the embargo email after cleanup can tell
at a glance that no real change was needed.
Later dry-runs may vary the scenario (different crate area, severity
band, or a coordinated-with-upstream scenario where the bug is in
wasmtime or cudarc) but the "do not implement" banner is
non-negotiable and the scenario must remain inspectable as fake.
Procedure with target timings
Timings below are rehearsal targets. They are compressed relative to
the real-CVE timelines in
SECURITY.md and
GOVERNANCE.md but
preserve the shape: acknowledge-fast, deliberate-fix,
coordinated-disclose, with a backport-aware mid-cycle checkpoint. The 90-day
fix-or-workaround commitment is compressed to one calendar week so
the pipeline fits an end-to-end rehearsal. Real-report timings are
not compressed; they remain whatever SECURITY.md states.
T+0 — Reporter submits
The Reporter sends the synthetic-vuln email to
security@craton.com.ar. Subject line must begin with [DRY-RUN]
so the rehearsal cannot be confused with a real report. Body
includes: the synthetic-vuln text verbatim from
Tabletop scenario; a note that this is a
dry-run; the Reporter's chosen disclosure preference (credit by name
vs anonymous) — pick one per rehearsal so both branches get
exercised over a year. Reporter then stays silent on pipeline
channels until T+7d; the rehearsal also tests how the inbox handles
a quiet reporter.
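A minimal pre-send check for the Reporter's email might look like the sketch below. The function name and the two preference labels are hypothetical. The banner is matched as two fragments on the assumption that mail clients may rewrite the punctuation between them.

```python
# The two halves of the required do-not-implement marker text.
BANNER_PARTS = ("do not implement", "cve-disclosure-dry-run synthetic bug")
# Hypothetical labels for the two disclosure preferences named in the runbook.
PREFERENCES = {"credit-by-name", "anonymous"}


def report_problems(subject: str, body: str, preference: str) -> list[str]:
    """Return the reasons a synthetic report is not ready to send."""
    problems = []
    if not subject.startswith("[DRY-RUN]"):
        problems.append("subject must begin with [DRY-RUN]")
    lowered = body.lower()
    for part in BANNER_PARTS:
        if part not in lowered:
            problems.append(f"body is missing banner text: {part!r}")
    if preference not in PREFERENCES:
        problems.append("pick exactly one disclosure preference")
    return problems
```

An empty list means the email carries the subject prefix, the banner, and one disclosure preference.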
T+1h — Triage owner acknowledges
Reply from security@craton.com.ar with the standard
acknowledgement template (matches what the committee will send a
real reporter): confirm receipt with timestamp, name the Triage
owner as primary contact, state the 72-hour SLO and 90-day
fix-or-workaround commitment from
GOVERNANCE.md, ask
for reproducer / version range / deployment shape.
In parallel, Triage owner:
- Drafts the CVSS 3.1 vector and score. Target: within 0.5 of the canned 5.3 — if a rehearsal scores the canned scenario above 7.0 or below 4.0 the committee's CVSS calibration is the finding, not the score.
- Opens a private GitHub Security Advisory with the synthetic-vuln
description and the do-not-implement banner. Lives in
draft state, closed during postmortem at T+10d.
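To make the calibration check concrete, the sketch below implements the CVSS 3.1 base-score formula (metric weights and the Roundup rule as published in the FIRST specification) and compares a drafted score against the canned figure. Under that formula the canned vector works out to 5.5, which sits inside the 0.5 tolerance around 5.3. The `calibration` helper and its labels are hypothetical, including the "off-target" label for scores that miss the tolerance but stay between 4.0 and 7.0.

```python
import math

AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(x: float) -> float:
    """CVSS 3.1 Roundup: smallest one-decimal value >= x, float-safe."""
    i = round(x * 100000)
    if i % 10000 == 0:
        return i / 100000.0
    return (math.floor(i / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS 3.1 base score for a vector such as 'AV:L/AC:L/...'."""
    m = dict(part.split(":") for part in vector.split("/"))
    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * pr * UI[m["UI"]]
    if impact <= 0:
        return 0.0
    total = 1.08 * (impact + exploitability) if changed else impact + exploitability
    return roundup(min(total, 10))


def calibration(observed: float, canned: float = 5.3) -> str:
    """Compare a drafted score with the canned score (labels are hypothetical)."""
    if observed > 7.0 or observed < 4.0:
        return "calibration-finding"
    return "on-target" if abs(observed - canned) <= 0.5 else "off-target"
```

Calling `calibration(base_score("AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:N/A:N"))` yields "on-target" for the canned scenario.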
T+24h — Embargo notification
Triage owner (with Comms review) sends the embargo notification to:
- Every active member of the Security committee (per MAINTAINERS.md).
- Named operator backups in MAINTAINERS.md for the affected crate area (here: tensor-wasm-snapshot).
- The W3.6 SBOM consumer list (design partners and downstream operators opted into pre-disclosure notifications, kept next to the SBOM publication tooling). An empty list is itself the finding for this rehearsal.
The template names bug class, affected versions, proposed embargo date, and disclosure preference. No patch yet; that lands at T+72h.
T+72h — Fix owner circulates a synthetic patch
A one- or two-paragraph fix description to the embargo list. For
the canned scenario: "wrap the CRC32 mismatch error in a generic
SnapshotError::ChecksumFailed and stop reading from the zstd
decoder's internal buffer after the mismatch; no caller-visible
bytes change." Attach the do-not-implement banner and state this is
the rehearsal's synthetic patch.
Optionally sketch as a draft PR against a private throwaway branch
named dry-run/YYYY-MM-DD/synthetic-snapshot-leak, closed unmerged
at T+10d. Do not push to any release-vN.x line and do not merge
to main even temporarily; the absence of the synthetic commit
in git log is the disclosure-day verification that no real change
leaked.
T+5d — Pre-disclosure backport verification
Fix owner enumerates the maintenance branches the synthetic patch
would land on under W3.5 in
SECURITY.md:
- List every release-vN.x branch currently within its 12-month window.
- For each, confirm the file the synthetic patch would touch (crates/tensor-wasm-snapshot/src/reader.rs for the canned scenario) exists, the function signature still matches, and the committee has push rights.
- Confirm GitHub branch-protection rules on each release-vN.x allow the committee's security-hotfix workflow; missing workflow is itself a finding.
This checkpoint is the part of the pipeline most likely to surface drift: maintenance branches with lost backport reviewers, broken CI, or silently-changed protection rules. Surface those here and not on disclosure day.
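The first enumeration step can be sketched as a date filter. This is a sketch under stated assumptions: it treats the 12-month window as starting on each branch's first release date, which SECURITY.md may define differently, and the branch names and dates in the usage example are made up.

```python
from datetime import date


def one_year_before(d: date) -> date:
    """Same calendar day one year earlier (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year - 1)
    except ValueError:
        return d.replace(year=d.year - 1, day=28)


def in_window(branches: dict[str, date], today: date) -> list[str]:
    """Return branches whose assumed window start is under 12 months old."""
    cutoff = one_year_before(today)
    return sorted(name for name, released in branches.items() if released > cutoff)
```

With hypothetical data, `in_window({"release-v0.3.x": date(2024, 5, 1), "release-v0.4.x": date(2024, 9, 15)}, date(2025, 6, 1))` keeps only release-v0.4.x. The file, signature, push-rights, and branch-protection checks still have to be walked for each branch it returns.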
T+7d — Coordinated disclosure
Comms owner stages (but does not publish) every public artefact:
- GitHub Security Advisory text, in private draft.
- Release-notes entry against the next-release section of CHANGELOG.md.
- The "Operator-visible behaviour change" line per SLO.md §9.
- CVE record submission text, ready to send to the assigning CNA.
- Maintainer-list announcement and (if applicable) social post.
Triage owner schedules a synthetic "release window" with Comms — the rehearsal does not cut a real release, but the mechanics of coordinating an embargoed cherry-pick + tag + push are walked through and recorded. Reporter is brought back in to confirm credit attribution.
T+10d — Postmortem
Committee meets (sync or async) within three days of T+7d. Outputs:
- security/dry-runs/YYYY-MM-DD.md (template under Where to record results) committed.
- Private draft Security Advisory closed without publishing.
- Draft PR (if any) closed unmerged. Committee verifies git log --all -- crates/tensor-wasm-snapshot/src/reader.rs shows no new commits between T+0 and T+10d.
- Every artefact carrying the do-not-implement banner deleted or archived under security/dry-runs/ with the banner intact.
- At least one actionable improvement filed as a public issue tagged security-dry-run-followup.
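The git log verification above can be scripted so the result lands in the postmortem record. The sketch below is one possible wrapper, with hypothetical function names. It bounds the log with git's `--since` and `--until` options, which filter on commit date, so a commit with a back-dated author date is still caught.

```python
import subprocess


def git_log_cmd(path: str, since: str, until: str) -> list[str]:
    """Build a git log invocation listing commit hashes that touch `path`."""
    return [
        "git", "log", "--all",
        f"--since={since}", f"--until={until}",
        "--format=%H", "--", path,
    ]


def commits_touching(path: str, since: str, until: str, repo: str = ".") -> list[str]:
    """Run the command in `repo` and return the matching commit hashes."""
    result = subprocess.run(
        git_log_cmd(path, since, until),
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return result.stdout.split()
```

An empty list from `commits_touching("crates/tensor-wasm-snapshot/src/reader.rs", t0, t10)` is the green result, where `t0` and `t10` stand for the rehearsal's T+0 and T+10d dates in any format git accepts.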
Verification checklist
Before declaring complete, Triage owner walks this list. Every item is green or red; a red item is a postmortem finding, not a reason to silently retry.
- All four roles staffed at T+0 and held by the same person throughout (no quiet handoffs).
- Acknowledgement at T+1h within 2x (by T+2h). The real SLO is 72 hours; the tighter rehearsal target catches inbox latency.
- Embargo notification at T+24h within 2x (by T+48h), reached every name on the design-partner + crate-backup list with no NDR bounces.
- Synthetic patch at T+72h reviewable by every embargo-list recipient (one ack each).
- Backport enumeration at T+5d covered every in-window release-vN.x with no surprises (no lost ACLs, stale protection rules, or missing CI workflow).
- Disclosure artefacts at T+7d staged and reviewed but none published: GitHub Advisory remains draft, no new tag, no new release, no new CHANGELOG.md line on main, no commit on any release-vN.x from any committee member between T+0 and T+10d.
- Postmortem produced at least one actionable issue and one security/dry-runs/YYYY-MM-DD.md record committed.
A rehearsal in which any "no publish" item failed is not a
successful dry-run — it is an incident, escalated per
oncall-paging.md. The leak of a synthetic
advisory is recoverable but must be acknowledged publicly under the
same standards as a real disclosure breach.
What constitutes a successful rehearsal
A dry-run is successful when all four hold:
- All four roles staffed by distinct people (or three distinct people with the documented Comms-Triage overlap, overlap recorded).
- Each timing target met within 2x. The 2x cushion exists because rehearsals run on volunteer time; consistently missing targets by 3x or more is a sign the committee is under-staffed and feeds the governance escalation in When the dry-run fails.
- Backport enumeration at T+5d left no surprise uncaptured (a maintenance branch the committee did not know was in window, a branch they cannot push to, a branch with red CI, a code path that no longer matches the canned scenario). Surprises do not fail the rehearsal, since finding them is the point, but uncaptured surprises do.
- Postmortem produced at least one actionable improvement with an owner and a target date. Zero follow-ups means either the team did not look hard or is in steady state; both deserve a sentence in the record explaining which.
A rehearsal that misses any of the four is recorded honestly: a
dry-run with findings is still a successful run-of-the-runbook; a
dry-run that nobody finished is a status: incomplete entry.
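The 2x rule can be applied mechanically to the observed timestamps. The sketch below is illustrative: the step keys and status labels are hypothetical, and it reads "within 2x" as elapsed time since T+0 divided by the target offset, which matches the T+1h by T+2h and T+24h by T+48h examples in the verification checklist.

```python
from datetime import datetime, timedelta

# Rehearsal targets, as offsets from T+0.
TARGETS = {
    "acknowledgement": timedelta(hours=1),
    "embargo-notification": timedelta(hours=24),
    "synthetic-patch": timedelta(hours=72),
    "backport-verification": timedelta(days=5),
    "disclosure": timedelta(days=7),
    "postmortem": timedelta(days=10),
}


def grade_timings(t0: datetime, observed: dict[str, datetime]) -> dict[str, str]:
    """Mark each step green (within 2x of target), red, or missing."""
    grades = {}
    for step, target in TARGETS.items():
        if step not in observed:
            grades[step] = "missing"
            continue
        # Dividing two timedeltas gives a plain float ratio.
        ratio = (observed[step] - t0) / target
        grades[step] = "green" if ratio <= 2 else "red"
    return grades
```

A red or missing entry is a postmortem finding to record, not a reason to rerun the step.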
Where to record results
Every dry-run produces security/dry-runs/YYYY-MM-DD.md (the T+0
date, not the postmortem date), committed to the repository on
main. The directory holds rehearsal logs, not vulnerability
records — the public-repo location is deliberate so anyone evaluating
the project can gauge how seriously it takes the disclosure pipeline.
Required sections:
- Participants. Four roles with names.
- Scenario used. Pointer to the canned scenario or variation, with the do-not-implement banner reproduced.
- Timings observed. T+0, T+1h, T+24h, T+72h, T+5d, T+7d, T+10d, actual vs target.
- Findings. Red items from the verification checklist plus backport-enumeration surprises.
- Follow-ups filed. Public issue links, one per actionable improvement.
- Confirmation of no publication. Short paragraph asserting that every "no publish" item from the verification checklist held, signed by Comms owner.
Sensitive material (embargo-list addresses, drafts that accidentally named real CVEs, the contents of the synthetic embargo email) stays on the committee's private channel; the public record captures the shape of the rehearsal, not raw contents.
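A small lint for the record file could look like the sketch below. It assumes each required section title sits on its own line, optionally prefixed with Markdown heading markers; the repository's actual record template may differ. Both function names are hypothetical.

```python
from datetime import date

REQUIRED_SECTIONS = (
    "Participants",
    "Scenario used",
    "Timings observed",
    "Findings",
    "Follow-ups filed",
    "Confirmation of no publication",
)


def record_path(t0: date) -> str:
    """Path of the rehearsal record, named after the T+0 date."""
    return f"security/dry-runs/{t0.isoformat()}.md"


def missing_sections(record_text: str) -> list[str]:
    """Return required section titles that do not appear as a line of their own."""
    titles = {
        line.lstrip("#").strip().rstrip(".").lower()
        for line in record_text.splitlines()
        if line.strip()
    }
    return [s for s in REQUIRED_SECTIONS if s.lower() not in titles]
```

Running `missing_sections` on the draft record before committing catches a skipped section while the participants are still available to fill it in.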
When the dry-run fails
Three distinct failure modes, three different remediations.
- Roles cannot be staffed. Fewer than three committee members available in the quarter, or Reporter and Fix owner cannot be separated. This is a governance signal, not a process problem. The lead maintainer (per GOVERNANCE.md) opens an amendment RFC against GOVERNANCE.md proposing one of: enlarging the committee, redistributing responsibilities, or reducing cadence to match capacity. The committee does not silently skip dry-runs; an honest cadence reduction in writing beats a paper one in practice.
- Timings missed by more than 2x. The pipeline exists but the committee cannot drive it on schedule. Capture the slowest step in the postmortem; common slow steps are inbox monitoring and embargo-list curation. Both are fixable without governance changes.
- A publication leaked. A draft Advisory was published, a synthetic patch landed on a real branch, a CHANGELOG line went out. This is the dry-run-as-incident case from the Verification checklist. Escalate per oncall-paging.md, publish a short public retraction naming the rehearsal, and treat the next real report with extra care. A leaked rehearsal validates the need for the runbook, not the opposite.
Related
- SECURITY.md: disclosure contract and W3.5 backport policy this runbook stress-tests (does not modify).
- GOVERNANCE.md: committee standing commitments and embargo-discipline rules exercised here.
- MAINTAINERS.md: current committee roster (dry-run roles draw from here) and crate-backup contacts (embargo-list step).
- PATH-TO-V1.md: v0.5 security-workstream commitment this runbook discharges.
- disaster-recovery.md: sibling procedure runbook (W3.7) for low-frequency, high-consequence rehearsals; shares staffing pattern and postmortem discipline. A combined DR + disclosure tabletop is a reasonable extension once both have been run separately a few times.
- rollback.md, oncall-paging.md: sibling procedure runbooks; oncall-paging is the escalation path for a leaked rehearsal.
- README.md: runbook directory contract (W2.6); this file is named there as the fourth procedure runbook.
- CHANGELOG.md: where a real disclosure lands publicly; the dry-run stages but does not publish.