Benchmarks
GIAB HG002 benchmark pack
Full reports with configs, container hashes, and cost breakdowns. No cherry-picked samples. Use this pack to validate numbers or rerun the harness yourself.
What's inside
- HG002 30× WGS runs on A100 80GB, plus CPU 64-core baseline.
- Configs, Nextflow wrappers, and container digests.
- Cost table by instance type and pricing model.
- Verification summary with VeriBiota checks.
Want to see the deliverable format first? Download an example proof bundle (SHA256).
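If you download the example bundle, it is worth checking it against the published SHA256 before opening it. The following is a minimal sketch, not part of the pack: `verify_sha256` is a hypothetical helper, and the file name and digest in the usage comment are placeholders you would replace with the values shown next to the download.

```shell
#!/bin/sh
# Sketch: compare a file's SHA256 against an expected digest.
# verify_sha256 is a hypothetical helper, not something shipped in the pack.
verify_sha256() {
  file="$1"
  expected="$2"

  # Fail early if the file is missing or unreadable.
  if [ ! -r "$file" ]; then
    echo "cannot read: $file" >&2
    return 2
  fi

  # Both tools print "<digest>  <filename>"; keep only the digest.
  if command -v sha256sum >/dev/null 2>&1; then
    actual="$(sha256sum "$file" | awk '{print $1}')"
  else
    actual="$(shasum -a 256 "$file" | awk '{print $1}')"
  fi

  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file" >&2
    return 1
  fi
}

# Usage (placeholders, replace with the real file name and published digest):
# verify_sha256 proof_bundle.tar.gz "<published sha256>"
```

A non-zero exit status means the bundle should not be trusted: 1 for a digest mismatch, 2 for an unreadable file.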
Hardware and setup
GPU runs: cloud A100 80GB. CPU baseline: 64 vCPUs running tuned BWA/GATK. Storage: object storage for reads, ephemeral SSD for scratch. Containers are pinned by digest.
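Pinning by digest means each image reference ends in `@sha256:<64 hex characters>` rather than a mutable tag. A small check like the one below can flag references that are not pinned. This is a sketch under assumptions: `is_digest_pinned` is a hypothetical helper, and the image names in the usage comment are invented for illustration, not taken from the pack.

```shell
#!/bin/sh
# Sketch: succeed only if an image reference is pinned to a sha256 digest.
# is_digest_pinned is a hypothetical helper, not part of the pack.
is_digest_pinned() {
  # A pinned reference ends with "@sha256:" followed by exactly 64 lowercase hex chars.
  printf '%s\n' "$1" | grep -Eq '@sha256:[0-9a-f]{64}$'
}

# Usage (image names are illustrative only):
# is_digest_pinned "registry.example.org/wgs@sha256:<64 hex chars>" && echo pinned
# is_digest_pinned "registry.example.org/wgs:latest" || echo "not pinned"
```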
| Resource | Runs | Cost | Notes |
|---|---|---|---|
| A100 80GB (cloud) | 1 × HG002 30× WGS | ~$25–$70 (spot) / ~$70–$160 (on‑demand) | Compute-only range; line-item breakdown in the full pack |
| CPU 64-core baseline | 1 × HG002 30× WGS | ~$45–$140 | Compute-only range for tuned BWA/GATK baseline |
Cost numbers are presented as ranges because cloud pricing varies by region and purchasing model. The full benchmark pack includes exact wall-time, instance configuration, and line-item cost inputs used to compute per-run totals.
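For a compute-only figure, the per-run total reduces to wall time multiplied by the hourly instance rate. The sketch below shows that arithmetic. The `run_cost` helper is hypothetical, and the wall time and rates are made-up placeholders, not measurements or prices from this benchmark; substitute the values from the full pack and your provider's price list.

```shell
#!/bin/sh
# Sketch: compute-only cost of one run = wall-clock hours x hourly rate (USD).
# run_cost is a hypothetical helper, not part of the pack.
run_cost() {
  # $1 = wall time in hours, $2 = instance price in USD per hour
  awk -v hours="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", hours * rate }'
}

# Placeholder inputs for illustration only (not benchmark results).
wall_hours=1.5
spot_rate=4
ondemand_rate=12

echo "spot:      \$$(run_cost "$wall_hours" "$spot_rate")"
echo "on-demand: \$$(run_cost "$wall_hours" "$ondemand_rate")"
```

Running the same wall time through a low and a high hourly rate is one way to arrive at a range rather than a single number.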
Reproduce locally
Fetch the HG002 reads from the public GIAB FTP site, then run them through the provided Nextflow wrapper pointed at OGN.
    # download
    curl -O ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/AshkenazimTrio/HG002/NA24385_son/NIST_HG002_HiSeq300x_fastq/HG002_R1.fastq.gz
    curl -O ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/AshkenazimTrio/HG002/NA24385_son/NIST_HG002_HiSeq300x_fastq/HG002_R2.fastq.gz

    # submit to OGN
    ogn submit \
      --reads HG002_R1.fastq.gz \
      --reads HG002_R2.fastq.gz \
      --ref s3://refs/grch38_noalt.fa \
      --pipeline wgs30 \
      --export vcf,metrics.json \
      --verify veribiota:giab-hg002
We use the same harness internally for HG002 regression testing and multi-platform benchmarks.