Docs · OGN platform

GPU-native genomics operating system

From raw reads to GIAB-validated variant calls in a continuous GPU pipeline. This is the control surface for the engine: CLI, pipelines, benchmarks, and deployment runbooks.

CUDA 12+ · Hopper · Ampere · GIAB-validated flows · Schemas stable

OGN FM: GPU-native search that makes legacy aligners obsolete

Key numbers (chr20, fast query sets, single RTX 5070, single-threaded CPU baselines)
  • Build: FM index for chr20 now builds in ~9.6 s (libdivsufsort, down from >10 min).
  • Query throughput: GPU FM ranges 1.1–9.9×10^3 million queries per second (QPS) across query lengths L = 16 to 1000.
  • Median speedups vs mainstream aligners (Bowtie2, BWA-MEM, Minimap2-sr, HISAT2):
    • BWA-MEM: ~6.3×10^5× faster on GPU; ~26× on CPU fallback.
    • Bowtie2: ~5.1×10^5× faster on GPU; ~27× on CPU.
    • Minimap2-sr: ~3.1×10^5× faster on GPU; ~20× on CPU.
    • HISAT2: ~1.4×10^5× faster on GPU; ~8× on CPU.
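For readers new to FM indexes, the kind of query being benchmarked can be sketched in a few lines of Python. This is a toy illustration of the textbook technique (naive suffix array, BWT, backward search), not OGN's GPU implementation and not libdivsufsort; the function names build_fm_index and fm_locate are hypothetical and exist only for this example.

```python
def build_fm_index(text):
    """Build a toy FM index for `text`: suffix array, C table, occurrence table."""
    text += "$"  # sentinel; sorts before A, C, G, T in ASCII
    # Naive suffix array construction: fine for a demo, far too slow for a chromosome.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps to "$" for suffix 0).
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # C[c] = number of characters in the text strictly smaller than c.
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return sa, C, occ


def fm_locate(pattern, sa, C, occ):
    """Return sorted start positions of exact matches of `pattern` via backward search."""
    lo, hi = 0, len(sa)  # half-open interval of matching suffix-array rows
    for ch in reversed(pattern):
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


index = build_fm_index("ACGTACGTTACG")
hits_acg = fm_locate("ACG", *index)
hits_gtt = fm_locate("GTT", *index)
hits_ttt = fm_locate("TTT", *index)
print(hits_acg, hits_gtt, hits_ttt)
```

Each query costs one table lookup pair per pattern character, independent of reference length, which is why exact-match seeding of this kind lends itself to running many queries in parallel.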
What this means
  • This is not “GPU acceleration.” It is a 5–7 order-of-magnitude discontinuity (median GPU speedups above: 1.4×10^5× to 6.3×10^5×). CPU-era seeding/lookup assumptions no longer apply.
  • Even the CPU FM backend sets a new baseline: 20–1000× over Bowtie2/BWA on the same data, with medians of ~26–27×. Laptop mode is viable and still ahead of legacy tools.
  • Any pipeline currently bottlenecked on seed-search/alignment can collapse that stage from minutes to milliseconds on a single consumer GPU.
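As a quick sanity check on the arithmetic, the median GPU speedups listed under Key numbers can be converted to orders of magnitude, and a one-minute seed stage can be rescaled by the same factor. This sketch uses only the four median figures quoted above; the one-minute baseline is an illustrative assumption, and the rescaling ignores I/O and every stage other than seed search.

```python
import math

# Median GPU speedups quoted under "Key numbers" (chr20).
median_gpu_speedup = {
    "BWA-MEM": 6.3e5,
    "Bowtie2": 5.1e5,
    "Minimap2-sr": 3.1e5,
    "HISAT2": 1.4e5,
}

# Orders of magnitude = log10 of the speedup factor.
orders = {tool: math.log10(x) for tool, x in median_gpu_speedup.items()}

# Assumed baseline for illustration: a seed stage that takes 60 s on the CPU tool.
baseline_ms = 60_000
scaled_ms = {tool: baseline_ms / x for tool, x in median_gpu_speedup.items()}

for tool in median_gpu_speedup:
    print(f"{tool}: {orders[tool]:.1f} orders of magnitude, "
          f"60 s of seeding -> {scaled_ms[tool]:.2f} ms")
```

On these medians the speedups sit between roughly 5.1 and 5.8 orders of magnitude, so one minute of baseline seeding maps to well under a millisecond; the upper end of the 5–7 range presumably comes from individual query lengths rather than the medians.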
Alignment Turbo Mode (drop-in accelerator)
  • Wrap existing aligners’ seed stage with OGN FM; no pipeline changes required.
  • Bowtie2 users get ~5×10^5× seed throughput; BWA users ~6×10^5×; Minimap2 short-read users ~3×10^5×; HISAT2 users ~1×10^5×.
  • Positioning: “Turn your aligner into a GPU-native engine with one call.”
OGN Align (next)
  • Build a full GPU-native aligner on top of FM seeding, GPU SW/DPX Pair-HMM, persistent kernels, and streaming I/O. The seed stage is already 10^5–10^7× faster; the downstream dynamic-programming (DP) stages will follow the same GPU-native design.
Aligner-equivalent GPU (marketing lens)
  • By median seed-query throughput on chr20, one RTX 5070 ≈ 6.3×10^5 single-threaded BWA-MEM CPU cores; ≈ 5.1×10^5 Bowtie2 cores; ≈ 3.1×10^5 Minimap2 cores; ≈ 1.4×10^5 HISAT2 cores.
Artifacts
  • Raw FM results: results/fm_chr20.csv
  • Aligner results: results/aligners_chr20.csv
  • Speedup summary: results/aligners_speedups.csv
  • Plots: results/fm_aligners_qps.png and results/fm_aligners_speedup.png, generated locally by ./scripts/summarize_aligners_chr20.py if matplotlib is installed.
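A median-speedup summary of the kind listed above could be derived from two QPS tables roughly as follows. This is a minimal sketch and not the contents of summarize_aligners_chr20.py: the column names (query_len, qps, tool) are assumptions about the CSV schema, and the inline rows are synthetic placeholders, not measured results.

```python
import csv
import io
from collections import defaultdict
from statistics import median


def median_speedups(fm_rows, aligner_rows):
    """Per aligner, the median over query lengths of fm_qps / aligner_qps."""
    fm_qps = {int(r["query_len"]): float(r["qps"]) for r in fm_rows}
    ratios = defaultdict(list)
    for r in aligner_rows:
        qlen = int(r["query_len"])
        if qlen in fm_qps:  # only compare lengths measured for both
            ratios[r["tool"]].append(fm_qps[qlen] / float(r["qps"]))
    return {tool: median(vals) for tool, vals in ratios.items()}


# Synthetic placeholder data standing in for the two results files
# (hypothetical schema, made-up numbers).
FM_CSV = "query_len,qps\n16,1000\n100,400\n1000,100\n"
ALIGNER_CSV = (
    "tool,query_len,qps\n"
    "toolA,16,10\ntoolA,100,2\ntoolA,1000,1\n"
    "toolB,16,100\ntoolB,100,100\n"
)

fm_rows = list(csv.DictReader(io.StringIO(FM_CSV)))
aligner_rows = list(csv.DictReader(io.StringIO(ALIGNER_CSV)))
summary = median_speedups(fm_rows, aligner_rows)
print(summary)
```

To run it against real files, replace the io.StringIO objects with open("results/fm_chr20.csv", newline="") and open("results/aligners_chr20.csv", newline=""), adjusting the column names to whatever the files actually contain.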
One-liner you can ship
“On GRCh38 chr20, OGN GPU FM delivers 5–7 order-of-magnitude speedups over Bowtie2, BWA-MEM, Minimap2, and HISAT2, with chr20 index builds in ~9.6 seconds. This is GPU-native genomics, not ‘acceleration.’”
FM announcement | OGN documentation | Omnis Genomics