Hardproof quality report pipeline (draft)

This is a draft outline for the first quality report pipeline. It is scaffolding for methodology and reproducible aggregation, not a public proof page yet.

Data flow (proposed)

  1. Corpus selection: a checked-in manifest describing which targets are in-scope, with explicit exclusions and rationale.
  2. Run manifests: pinned tool versions, baseline versions, and deterministic scan inputs (including trust inputs when required).
  3. Execution: invoke hardproof corpus run to produce a stable per-target directory of scan artifacts.
  4. Aggregation: an offline script/notebook that computes summary tables and charts from corpus outputs without scraping logs.
  5. Publication: a report viewer page that embeds the methodology appendix and links to reproducible artifacts.
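As a rough illustration of the aggregation step, the sketch below assumes a hypothetical output layout in which hardproof corpus run writes one summary.json per target directory, with a score_status field of "publishable", "partial", or "insufficient". The file name, field name, and directory layout are assumptions for illustration, not the tool's actual artifact format.

```python
import json
from collections import Counter
from pathlib import Path

def aggregate(corpus_dir: str) -> dict:
    """Fold per-target scan artifacts into one summary table.

    Hypothetical layout: <corpus_dir>/<target>/summary.json, each
    containing a "score_status" field. Reads only structured artifacts,
    never scrapes logs, so the result is deterministic for a given run.
    """
    counts = Counter()
    for summary_path in sorted(Path(corpus_dir).glob("*/summary.json")):
        with summary_path.open() as f:
            summary = json.load(f)
        # Targets with no recorded status count as insufficient evidence.
        counts[summary.get("score_status", "insufficient")] += 1
    return dict(counts)
```

Keeping the aggregation to a pure fold over checked-in artifacts is what lets the published tables be re-derived offline from the same corpus outputs.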

Methodology appendix (sections)

  • Score truth: publishable vs partial vs insufficient, and what evidence is required before a number is treated as publishable.
  • Confidence: which signals are estimate-grade (usage metrics) and which probes are bounded smoke signals (performance), including any confidence markers and sample counts.
  • Fairness: what comparisons are and are not meaningful across different server classes and transports.
  • Determinism: what makes corpus runs reproducible and what is intentionally excluded (non-deterministic load testing, LLM judging).
  • Interpretation: how to interpret warnings vs failures vs partial scores, and what follow-up evidence is expected before making stronger claims.
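To make the score-truth taxonomy above concrete, here is a minimal sketch of how a publishable / partial / insufficient decision might be encoded. The rule (all required evidence present, some, or none) and the parameter names are illustrative assumptions; the actual criteria belong to the methodology appendix.

```python
def score_truth(evidence_count: int, required_evidence: int) -> str:
    """Classify a score by how much of its required evidence is present.

    Hypothetical rule, for illustration only:
      - all required evidence present  -> "publishable"
      - some evidence, but not all     -> "partial"
      - no evidence at all             -> "insufficient"
    """
    if evidence_count >= required_evidence:
        return "publishable"
    if evidence_count > 0:
        return "partial"
    return "insufficient"
```

Encoding the taxonomy as a single pure function makes the publishable/partial boundary auditable: the report can state exactly which evidence threshold a published number cleared.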