Docs · OGN platform
GPU-native genomics operating system
From raw reads to GIAB-validated variant calls in a continuous GPU pipeline. This is the control surface for the engine: CLI, pipelines, benchmarks, and deployment runbooks.
CUDA 12+Hopper · AmpereGIAB-validated flowsSchemas stable
Viewing
Public benchmark page
OGN Benchmark Contract v1.1 (Public Summary)
- Panels: GIAB core (HG001-HG007) smoke slice, multi-platform HG002 smoke, CHM13 perf stress
- Reference/Truth: GRCh38 + GIAB v4.2.1 for accuracy; CHM13v2.0 for perf stress
- Determinism: fixed manifests, seeds, and env.json (hardware, drivers, container, git tag)
- CI gating: accuracy and performance on every PR/push; gold snapshots stored in-repo
- Reproducibility: validated across CPU/GPU targets; env + metrics published with releases
- Contract files: metrics_spec.json, metrics_perf_spec.json, thresholds_*.json, gold TSVs under benchmarks/gold
- Badge: CLI and UI show “Verified by OGN Benchmark Contract v1.1 ✓” with hover text citing GIAB/CHM13 panels
GPU-native ingest contract (OGX chr20)
- Scope: chr20 OGX micro-bundle + overrides; validates Omega chain overrides, ingest performance, and OGC stamping.
- Parity: device ingest path is alignment-identical to the host path (MapperDeviceIngestEquiv GPU test).
- Perf gate: h2d throughput and wall-time thresholds calibrated on RTX 5070 and enforced in CI (benchmarks/thresholds_ingest_ogx_chr20.json).
- Provenance: ingest metrics flow from mapper logs → bench TSV → OGC manifests, so every capsule carries real h2d bytes/gbps and wall time.
- Status is reported alongside GIAB in CI/release notes; failures block merges/releases just like the accuracy/perf panels.
Coming next (v1.2 target): HG002 multi-platform consistency
- Illumina 2x150, PacBio HiFi, and ONT R10.4.1 on HG002
- Metrics: per-platform recall/precision plus cross-platform genotype concordance in GIAB confident regions
- Will ship with thresholds, gold TSVs/JSON, and a CI gate like the other panels
To ship a release, OGN must pass this contract. Releases list the contract version passed and the environments used. Public site targets: ogn.bio/bench, ogn.bio/benchmark-contract, ogn.bio/verified.
How to reproduce (public buckets)
- Grab the manifests under
benchmarks/smoke/(GIAB chr20 slices) andbenchmarks/smoke_multip/(HG002 multi-platform smoke). - Inputs are publicly hosted in GIAB/NCBI buckets; see
benchmarks/configs/benchmarks.yamlfor URLs. - Run
python tools/ogn_bench.py bench smokeand compare with the in-repo gold TSVs.
Public microbench contract (v1.0)
- Purpose: expose the low-level knobs reviewers ask about (FM lookup, S3 jitter, a single GPU kernel timing harness, a CPU reference baseline).
- Contract:
bench/micro/manifest.yaml→ JSONL records (schemaogn/bench/micro/v1). - Golden + thresholds:
bench/micro/gold/baseline.jsonl,bench/micro/thresholds.json. - Run:
./ogn bench micro --build-missing(ortools/run_bench.shfor smoke+micro in one command).
Three headline numbers (fill + rebaseline on release hardware)
- CPU baseline (SW align_ref, 2048×128×128): ~6.2k alignments/s, ~0.10 GCUPS (
bench/micro/gold/baseline.jsonl). - OGN GPU (SW kernel, 10k×256×256 on RTX 5070): ~1.03M alignments/s, ~67.5 GCUPS (
bench/micro/gold/baseline.jsonl). - FM lookup (synthetic 1 MiB text, 100k×L16 queries, loops=256): see
bench/micro/gold/baseline.jsonl. - Competitor placeholder (Parabricks-style): TBD (tracked in docs; replace with real runs once licensed).
Pipeline perf_guard baseline (chr20_sim32): 520k reads/s, 3.8 GCUPS, 9.2 GB/s ingest (
bench/perf_baseline.json).Cost estimate (procurement-friendly)
After any run, use:
BashPowerShellPython API (coming)
ogn CLIogn estimate-cost <profile>This prints per-run GPU hours, storage, and egress estimates plus a USD breakdown using
config/cost_model.json (overrideable via OGN_*_USD env vars).Example output (illustrative; not a performance claim):
$ ogn estimate-cost illumina_wgs
Profile: illumina_wgs
Run: /path/to/runs/illumina_wgs_hg002/20251216T120000Z
State: succeeded
GPU: NVIDIA A100 (count=1)
Wall: 2592.00s
Estimates (per run):
GPU hours: 0.7200
Storage: 12.4000 GB (12400000000 bytes)
Egress: 0.3000 GB (300000000 bytes; scope=results)
Retention: 30 days
Costs (USD; config/cost_model.json):
GPU compute: $1.7280 @ $2.4000/hr (A100_80GB)
Storage: $0.2852 @ $0.0230/GB-month (scaled to 30d)
Egress: $0.0270 @ $0.0900/GB
Total: $2.0402
System bench contract (HG002)
- HG002 is the end-to-end credibility anchor; we keep the harnesses visible and versioned.
- Harness:
bench/hg002/run_bench.sh(local runner that stages inputs, executes Nextflowpipelines/giab_validation.nf, and writesbench/reports/<timestamp>/). - Contract: accuracy/perf gated against in-repo gold/thresholds (see
bench/hg002/baseline.jsonplusbenchmarks/thresholds_*.jsonandbenchmarks/gold/*). - UI/CLI language stays consistent with microbench: versioned manifest → machine-readable outputs → gold + thresholds → CI gate.