Docs · Omnis platform

Platform, pipeline, and proof docs

Use this docs shell for CLI workflows, architecture, benchmark contracts, and public operating guidance across the Omnis platform.

Benchmark contract (v1.1)
  • Panels: GIAB core (HG001-HG007) smoke slice, multi-platform HG002 smoke, CHM13 perf stress
  • Reference/Truth: GRCh38 + GIAB v4.2.1 for accuracy; CHM13v2.0 for perf stress
  • Determinism: fixed manifests, seeds, and env.json (hardware, drivers, container, git tag)
  • CI gating: accuracy and performance on every PR/push; gold snapshots stored in-repo
  • Reproducibility: validated across CPU/GPU targets; env + metrics published with releases
  • Contract files: metrics_spec.json, metrics_perf_spec.json, thresholds_*.json, gold TSVs under benchmarks/gold
  • Badge: CLI and UI show “Verified by OGN Benchmark Contract v1.1 ✓” with hover text citing GIAB/CHM13 panels
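The determinism bullet hinges on a machine-readable environment record. A minimal sketch of writing one, with illustrative field names only (the real env.json also records hardware, driver, and container details, under whatever schema the harness defines):

```python
import json
import platform
import subprocess

def write_env_record(path="env.json"):
    """Capture a minimal environment snapshot for reproducibility.

    Field names here are illustrative, not the actual env.json schema.
    """
    try:
        git_tag = subprocess.run(
            ["git", "describe", "--tags", "--always"],
            capture_output=True, text=True, check=False,
        ).stdout.strip() or "unknown"
    except OSError:
        git_tag = "unknown"
    record = {
        "hostname": platform.node(),
        "machine": platform.machine(),
        "python": platform.python_version(),
        "git_tag": git_tag,
    }
    with open(path, "w") as f:
        json.dump(record, f, indent=2, sort_keys=True)
    return record
```

Publishing this file alongside metrics is what lets a release be replayed on matching hardware.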

GPU-native ingest contract (OGX chr20)

  • Scope: chr20 OGX micro-bundle + overrides; validates Omega chain overrides, ingest performance, and OGC stamping.
  • Parity: device ingest path is alignment-identical to the host path (MapperDeviceIngestEquiv GPU test).
  • Perf gate: h2d throughput and wall-time thresholds calibrated on RTX 5070 and enforced in CI (benchmarks/thresholds_ingest_ogx_chr20.json).
  • Provenance: ingest metrics flow from mapper logs → bench TSV → OGC manifests, so every capsule carries real h2d bytes/gbps and wall time.
  • Status is reported alongside GIAB in CI/release notes; failures block merges/releases just like the accuracy/perf panels.
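The perf gate above amounts to comparing measured ingest metrics against the thresholds file. A hedged sketch, with key names invented for illustration (the real schema of benchmarks/thresholds_ingest_ogx_chr20.json may differ):

```python
def check_ingest_gate(metrics, thresholds):
    """Return a list of gate failures; an empty list means the gate passes.

    Keys like 'h2d_gbps' and 'min_h2d_gbps' are illustrative, not the
    actual thresholds file schema.
    """
    failures = []
    if metrics["h2d_gbps"] < thresholds["min_h2d_gbps"]:
        failures.append(
            f"h2d {metrics['h2d_gbps']:.2f} GB/s below min "
            f"{thresholds['min_h2d_gbps']:.2f}"
        )
    if metrics["wall_s"] > thresholds["max_wall_s"]:
        failures.append(
            f"wall {metrics['wall_s']:.1f}s above max {thresholds['max_wall_s']:.1f}"
        )
    return failures
```

CI treats a non-empty failure list the same way as an accuracy regression: the merge is blocked.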

Coming next (v1.2 target): HG002 multi-platform consistency

  • Illumina 2x150, PacBio HiFi, and ONT R10.4.1 on HG002
  • Metrics: per-platform recall/precision plus cross-platform genotype concordance in GIAB confident regions
  • Will ship with thresholds, gold TSVs/JSON, and a CI gate like the other panels
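Cross-platform genotype concordance reduces to the fraction of confident-region sites where two platforms call the same genotype. A simplified sketch of the metric (real harnesses operate on VCFs and handle multi-allelic sites; this only illustrates the definition):

```python
def genotype_concordance(calls_a, calls_b, confident_sites):
    """Fraction of confident sites, called on both platforms, with equal genotypes.

    calls_a/calls_b map a site key (e.g. ('chr20', 1234)) to a genotype
    string like '0/1'; confident_sites is the GIAB confident-region set.
    """
    shared = [s for s in confident_sites if s in calls_a and s in calls_b]
    if not shared:
        return 0.0
    agree = sum(1 for s in shared if calls_a[s] == calls_b[s])
    return agree / len(shared)
```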

To ship a release, OGN must pass this contract. Releases list the contract version passed and the environments used. Public site targets: ogn.bio/bench, ogn.bio/benchmark-contract, ogn.bio/verified.

How to reproduce (public buckets)

  • Grab the manifests under benchmarks/smoke/ (GIAB chr20 slices) and benchmarks/smoke_multip/ (HG002 multi-platform smoke).
  • Inputs are publicly hosted in GIAB/NCBI buckets; see benchmarks/configs/benchmarks.yaml for URLs.
  • Run python tools/ogn_bench.py bench smoke and compare with the in-repo gold TSVs.
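Comparing your run against the in-repo gold TSVs boils down to a row-by-row numeric diff with a tolerance. A rough sketch, where the column names and the 1% tolerance are assumptions (the authoritative comparison is the CI gate in tools/ogn_bench.py):

```python
import csv

def diff_against_gold(run_tsv, gold_tsv, rel_tol=0.01):
    """Return (column, run_value, gold_value) triples exceeding rel_tol.

    Assumes both files are tab-separated with a header row; the real
    harness may compare differently.
    """
    def load(path):
        with open(path, newline="") as f:
            return list(csv.DictReader(f, delimiter="\t"))

    diffs = []
    for run_row, gold_row in zip(load(run_tsv), load(gold_tsv)):
        for col, gold_val in gold_row.items():
            try:
                g = float(gold_val)
                r = float(run_row[col])
            except (KeyError, TypeError, ValueError):
                continue  # skip non-numeric or missing columns
            if g != 0 and abs(r - g) / abs(g) > rel_tol:
                diffs.append((col, r, g))
    return diffs
```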

Public microbench contract (v1.0)

  • Purpose: expose the low-level knobs reviewers ask about (FM lookup, S3 jitter, a single GPU kernel timing harness, a CPU reference baseline).
  • Contract: bench/micro/manifest.yaml → JSONL records (schema ogn/bench/micro/v1).
  • Golden + thresholds: bench/micro/gold/baseline.jsonl, bench/micro/thresholds.json.
  • Run: ./ogn bench micro --build-missing (or tools/run_bench.sh for smoke+micro in one command).

Three headline numbers (fill + rebaseline on release hardware)

  • CPU baseline (SW align_ref, 2048×128×128): ~6.2k alignments/s, ~0.10 GCUPS (bench/micro/gold/baseline.jsonl).
  • OGN GPU (SW kernel, 10k×256×256 on RTX 5070): ~1.03M alignments/s, ~67.5 GCUPS (bench/micro/gold/baseline.jsonl).
  • FM lookup (synthetic 1 MiB text, 100k×L16 queries, loops=256): see bench/micro/gold/baseline.jsonl.
  • Competitor placeholder (Parabricks-style): TBD (tracked in docs; replace with real runs once licensed).

Pipeline perf_guard baseline (chr20_sim32): 520k reads/s, 3.8 GCUPS, 9.2 GB/s ingest (bench/perf_baseline.json).
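The GCUPS figures follow directly from throughput times the dynamic-programming matrix size (query length × target length cells per alignment). A quick arithmetic check of the two filled-in rows:

```python
def gcups(alignments_per_s, qlen, tlen):
    """Giga cell updates per second for a qlen x tlen alignment matrix."""
    return alignments_per_s * qlen * tlen / 1e9

# GPU row: ~1.03M alignments/s on 256x256 matrices -> ~67.5 GCUPS
gpu = gcups(1.03e6, 256, 256)
# CPU row: ~6.2k alignments/s on 128x128 matrices -> ~0.10 GCUPS
cpu = gcups(6.2e3, 128, 128)
print(f"GPU: {gpu:.1f} GCUPS, CPU: {cpu:.2f} GCUPS")
# prints: GPU: 67.5 GCUPS, CPU: 0.10 GCUPS
```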

Cost estimate (procurement-friendly)

After any run, use:
```bash
ogn estimate-cost <profile>
```
This prints per-run GPU hours, storage, and egress estimates plus a USD breakdown using config/cost_model.json (overrideable via OGN_*_USD env vars).
Example output (illustrative; not a performance claim):
```text
$ ogn estimate-cost illumina_wgs
Profile: illumina_wgs
Run: /path/to/runs/illumina_wgs_hg002/20251216T120000Z
State: succeeded
GPU: NVIDIA A100 (count=1)
Wall: 2592.00s

Estimates (per run):
  GPU hours:  0.7200
  Storage:   12.4000 GB (12400000000 bytes)
  Egress:    0.3000 GB (300000000 bytes; scope=results)
  Retention: 30 days

Costs (USD; config/cost_model.json):
  GPU compute: $1.7280 @ $2.4000/hr (A100_80GB)
  Storage:    $0.2852 @ $0.0230/GB-month (scaled to 30d)
  Egress:     $0.0270 @ $0.0900/GB
  Total:      $2.0402
```
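The breakdown above can be reproduced arithmetically: GPU hours × hourly rate, storage × monthly rate scaled to the retention window, plus egress. A sketch with the sample output's rates hard-coded (the real values come from config/cost_model.json and the OGN_*_USD overrides):

```python
def estimate_cost_usd(wall_s, gpu_count, storage_gb, egress_gb,
                      gpu_usd_hr=2.40, storage_usd_gb_month=0.023,
                      egress_usd_gb=0.09, retention_days=30):
    """Recompute the USD total from the illustrative output above.

    Treats one month as 30 days, matching the 'scaled to 30d' line.
    """
    gpu_hours = wall_s * gpu_count / 3600.0
    gpu_cost = gpu_hours * gpu_usd_hr
    storage_cost = storage_gb * storage_usd_gb_month * (retention_days / 30.0)
    egress_cost = egress_gb * egress_usd_gb
    return round(gpu_cost + storage_cost + egress_cost, 4)

# Sample run: 2592 s on one A100, 12.4 GB stored, 0.3 GB egress
print(estimate_cost_usd(2592.0, 1, 12.4, 0.3))  # 2.0402
```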

System bench contract (HG002)

  • HG002 is the end-to-end credibility anchor; we keep the harnesses visible and versioned.
  • Harness: bench/hg002/run_bench.sh (local runner that stages inputs, executes Nextflow pipelines/giab_validation.nf, and writes bench/reports/<timestamp>/).
  • Contract: accuracy/perf gated against in-repo gold/thresholds (see bench/hg002/baseline.json plus benchmarks/thresholds_*.json and benchmarks/gold/*).
  • UI/CLI language stays consistent with microbench: versioned manifest → machine-readable outputs → gold + thresholds → CI gate.