
GPU-native genomics operating system

From raw reads to GIAB-validated variant calls in a continuous GPU pipeline. This is the control surface for the engine: CLI, pipelines, benchmarks, and deployment runbooks.

CUDA 12+ · Hopper · Ampere · GIAB-validated flows · Schemas stable

OGC Experiment Capsule (v1)

Purpose

OGC describes a concrete mapping experiment: which OGX bundle, which reads, which mapper binary/args/overrides, and which outputs. It is the audit/provenance unit for Helix/VeriBiota.

Manifest fields (manifest_v1.schema.json)

| Field | Type | Req | Description |
|---|---|---|---|
| version | int | yes | Schema version (1) |
| id | string | yes | Experiment id (e.g., chr20_10M_omega_gpu_q32) |
| ogx_bundle.path | string | yes | Path to OGX bundle root or manifest |
| ogx_bundle.manifest_sha256 | string | yes | SHA256 of OGX manifest |
| ogx_bundle.chain_params_sha256 | string | yes | SHA256 of chain_params.json |
| ogx_bundle.bundle_id | string | no | bundle_id from manifest (for auditing) |
| inputs.reads_fastq | string | yes | FASTQ path |
| inputs.read_count | int/null | no | Optional read count |
| inputs.read_length | int/null | no | Optional read length |
| mapper.backend | string | yes | "omega" |
| mapper.mode | string | yes | "gpu"/"cpu" |
| mapper.binary | string | yes | Path to mapper binary |
| mapper.git_commit | string | no | Short git sha |
| mapper.arguments | array | no | Command args used |
| mapper.omega_chain.source | string | yes | bundle:env:file:cli order used |
| mapper.omega_chain.fields | array | yes | Override fields applied |
| mapper.omega_chain.effective_hash | string | yes | SHA256 of effective chain config |
| mapper.omega_chain.overrides_path | string | no | Override JSON path if file/cli |
| outputs.alignments | string | yes | Alignment output |
| outputs.stats | string | yes | Stats JSON |
| outputs.logs | string | yes | Log path |
| ingest.reads | int | no | Reads ingested |
| ingest.data_bytes | int | no | Total data bytes (seqs/quals/etc.) |
| ingest.offset_bytes | int | no | Total offset bytes |
| ingest.chunk_count | int | no | H2D copy chunks |
| ingest.h2d_ms | number | no | Host→device time (ms) |
| ingest.h2d_bytes | int | no | Bytes copied host→device |
| ingest.h2d_gbps | number | no | Host→device throughput (GB/s) |
| ingest.wall_seconds | number | no | Ingest wall clock seconds |
| ingest.reads_per_sec | number | no | Derived throughput |
| ingest.bytes_per_sec | number | no | Derived throughput |
| metrics.version | int | no | Metrics schema version (1) |
| metrics.reads | int | no | Reads processed |
| metrics.runtime_ms | number | no | Total runtime in ms |
| metrics.qps | number | no | Reads per second |
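
As a rough illustration of the field list above, the sketch below checks that a manifest carries every field marked as required. It assumes the dotted field names map to nested JSON objects, which the table suggests but does not state. The helper `missing_required` and every value in `example` are hypothetical placeholders; authoritative validation should go through manifest_v1.schema.json.

```python
# Required field paths, copied from the rows marked "yes" in the field table.
REQUIRED = [
    "version", "id",
    "ogx_bundle.path", "ogx_bundle.manifest_sha256",
    "ogx_bundle.chain_params_sha256",
    "inputs.reads_fastq",
    "mapper.backend", "mapper.mode", "mapper.binary",
    "mapper.omega_chain.source", "mapper.omega_chain.fields",
    "mapper.omega_chain.effective_hash",
    "outputs.alignments", "outputs.stats", "outputs.logs",
]


def missing_required(manifest: dict) -> list[str]:
    """Return dotted paths of required fields that are absent (hypothetical helper)."""
    missing = []
    for path in REQUIRED:
        node = manifest
        for key in path.split("."):
            if not isinstance(node, dict) or key not in node:
                missing.append(path)
                break
            node = node[key]
    return missing


# Placeholder manifest: paths and hashes are invented for illustration only.
example = {
    "version": 1,
    "id": "chr20_10M_omega_gpu_q32",
    "ogx_bundle": {
        "path": "bundles/example_ogx",
        "manifest_sha256": "0" * 64,
        "chain_params_sha256": "0" * 64,
    },
    "inputs": {"reads_fastq": "reads/example.fastq", "read_count": None},
    "mapper": {
        "backend": "omega",
        "mode": "gpu",
        "binary": "bin/example_mapper",
        "omega_chain": {
            "source": "bundle",
            "fields": [],
            "effective_hash": "0" * 64,
        },
    },
    "outputs": {
        "alignments": "out/example.sam",
        "stats": "out/example_stats.json",
        "logs": "out/example.log",
    },
}

print(missing_required(example))
```

An empty list means all required fields are present; the check says nothing about types or value constraints.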

How it references OGX

OGC stores the OGX bundle path plus SHA256 fingerprints for manifest.json and chain_params.json. OGX content stays external; a consumer recomputes the two hashes to confirm that the bundle on disk is the one used in the run.
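
A verifier could recheck those fingerprints along the following lines. This is a minimal sketch, assuming both files sit directly under the bundle root and that the stored values are lowercase hex digests of the raw file bytes; neither detail is confirmed by this page. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Hex SHA256 of a file, read in 1 MiB chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_ogx_fingerprints(ogc: dict, bundle_root: Path) -> dict[str, bool]:
    """Compare on-disk OGX files against the hashes recorded in an OGC manifest.

    Assumes manifest.json and chain_params.json live directly in bundle_root.
    """
    ref = ogc["ogx_bundle"]
    return {
        "manifest.json":
            sha256_file(bundle_root / "manifest.json") == ref["manifest_sha256"],
        "chain_params.json":
            sha256_file(bundle_root / "chain_params.json") == ref["chain_params_sha256"],
    }
```

Because ogx_bundle.path may point at either the bundle root or the manifest itself, the sketch takes the root as an explicit argument rather than guessing which form was recorded.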

Overrides provenance

  • mapper.omega_chain.source captures the winning order (bundle:env:file:cli).
  • mapper.omega_chain.fields lists overridden chain fields.
  • mapper.omega_chain.effective_hash is a stable SHA256 of the effective chain parameters after applying overrides.
  • mapper.omega_chain.overrides_path records the file used for file/cli modes.
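
The provenance fields above could be derived roughly as follows. This sketch makes three assumptions that the page does not confirm: later sources in the bundle:env:file:cli order override earlier ones, the source string lists only the layers that contributed values, and the hash is taken over canonical JSON (sorted keys, compact separators). The parameter names `max_gap` and `min_score` are invented for the usage example.

```python
import hashlib
import json

# Assumed precedence, lowest to highest, following bundle:env:file:cli.
LAYER_ORDER = ["bundle", "env", "file", "cli"]


def resolve_chain(layers: dict[str, dict]) -> tuple[dict, list[str], str]:
    """Merge override layers; return (effective params, overridden fields, source)."""
    effective = dict(layers.get("bundle", {}))
    overridden = set()
    applied = ["bundle"]
    for name in LAYER_ORDER[1:]:
        layer = layers.get(name) or {}
        if layer:
            applied.append(name)
        for key, value in layer.items():
            effective[key] = value  # later layer wins (assumption)
            overridden.add(key)
    return effective, sorted(overridden), ":".join(applied)


def effective_hash(params: dict) -> str:
    """SHA256 over canonical JSON, so key order does not change the hash."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical chain parameters, for illustration only.
params, fields, source = resolve_chain({
    "bundle": {"max_gap": 100, "min_score": 30},
    "cli": {"min_score": 40},
})
print(source, fields, effective_hash(params))
```

Hashing a canonical serialization is what makes the value "stable": two runs that end up with the same effective parameters produce the same hash regardless of which source supplied each value.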

Examples

  • Micro: tests/data/ogc/micro_ogc.json
  • Benchmark: chr20 example produced by scripts/bench_omega_chr20_ogx.py when run with --regenerate-ogc.

Intended consumers

  • Helix ingest and Studio UI (status chips, hover details)
  • VeriBiota-style verifiers (replay/attest a run)
  • CI artifacts for reproducibility

Ingest semantics (OGX + FASTQ/SAM)

  • ingest.bytes counts bundle files (OGX) plus any read source if present.
  • ingest.read_bytes optionally captures streamed read payload (0 when absent).
  • ingest.wall_seconds spans from ingest start to first device batch ready.
  • ingest.h2d_ms measures GPU upload time of staged buffers/indexes.
  • ingest.h2d_bytes measures bytes copied host→device for OGX index or staged read buffers.

Mapper logs emit ingest-summary lines; bench tooling copies these fields into the OGC manifest.
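
The derived ingest fields can be reproduced from the raw counters for a sanity check. The formulas below are assumptions, not taken from the mapper or bench tooling: rates are divided by ingest.wall_seconds, bytes_per_sec uses ingest.data_bytes, and GB/s means 10^9 bytes per second. The helper name is hypothetical.

```python
def derive_ingest_rates(ingest: dict) -> dict:
    """Return a copy of the ingest block with derived throughput fields filled in.

    Fields are only derived when their inputs are present and non-zero.
    """
    out = dict(ingest)
    wall = ingest.get("wall_seconds") or 0
    if wall > 0:
        if "reads" in ingest:
            out["reads_per_sec"] = ingest["reads"] / wall
        if "data_bytes" in ingest:
            # Assumption: bytes_per_sec is based on data_bytes alone.
            out["bytes_per_sec"] = ingest["data_bytes"] / wall
    h2d_ms = ingest.get("h2d_ms") or 0
    if h2d_ms > 0 and "h2d_bytes" in ingest:
        # Assumption: 1 GB = 1e9 bytes; h2d_ms is converted to seconds.
        out["h2d_gbps"] = ingest["h2d_bytes"] / 1e9 / (h2d_ms / 1e3)
    return out


# Invented numbers, for illustration only.
sample = derive_ingest_rates({
    "reads": 1000,
    "data_bytes": 300_000,
    "wall_seconds": 2.0,
    "h2d_bytes": 2_000_000_000,
    "h2d_ms": 1000.0,
})
print(sample["reads_per_sec"], sample["bytes_per_sec"], sample["h2d_gbps"])
```

Comparing recomputed values against the ones stored in a manifest is a cheap consistency check, provided the tolerances account for rounding in the logged fields.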

Notes for auditors

  • Treat OGC as the complete, immutable capsule for a mapping run: bundle fingerprints, override lattice, effective chain hash, ingest telemetry, and headline bench metrics (qps/runtime/read count).
  • Reserve extra fields for upcoming extensions (variant caller profile, FM version, GPU arch) to avoid breaking the format later.