Docs · OGN platform
GPU-native genomics operating system
From raw reads to GIAB-validated variant calls in a continuous GPU pipeline. This is the control surface for the engine: CLI, pipelines, benchmarks, and deployment runbooks.
CUDA 12+ · Hopper · Ampere · GIAB-validated flows · Schemas stable
OGC Experiment Capsule (v1)
Purpose
OGC describes a concrete mapping experiment: which OGX bundle, which reads, which mapper binary/args/overrides, and which outputs. It is the audit/provenance unit for Helix/VeriBiota.
Manifest fields (manifest_v1.schema.json)
| Field | Type | Req | Description |
|---|---|---|---|
| version | int | yes | Schema version (1) |
| id | string | yes | Experiment id (e.g., chr20_10M_omega_gpu_q32) |
| ogx_bundle.path | string | yes | Path to OGX bundle root or manifest |
| ogx_bundle.manifest_sha256 | string | yes | SHA256 of OGX manifest |
| ogx_bundle.chain_params_sha256 | string | yes | SHA256 of chain_params.json |
| ogx_bundle.bundle_id | string | no | bundle_id from manifest (for auditing) |
| inputs.reads_fastq | string | yes | FASTQ path |
| inputs.read_count | int/null | no | Optional count |
| inputs.read_length | int/null | no | Optional read length |
| mapper.backend | string | yes | "omega" |
| mapper.mode | string | yes | "gpu"/"cpu" |
| mapper.binary | string | yes | Path to mapper binary |
| mapper.git_commit | string | no | Short git sha |
| mapper.arguments | array | no | Command args used |
| mapper.omega_chain.source | string | yes | Winning override order (bundle:env:file:cli) |
| mapper.omega_chain.fields | array | yes | Override fields applied |
| mapper.omega_chain.effective_hash | string | yes | SHA256 of effective chain config |
| mapper.omega_chain.overrides_path | string | no | Override JSON path if file/cli |
| outputs.alignments | string | yes | Alignment output |
| outputs.stats | string | yes | Stats JSON |
| outputs.logs | string | yes | Log path |
| ingest.reads | int | no | Reads ingested |
| ingest.data_bytes | int | no | Total data bytes (seqs/quals/etc.) |
| ingest.offset_bytes | int | no | Total offset bytes |
| ingest.chunk_count | int | no | H2D copy chunks |
| ingest.h2d_ms | number | no | Host→device millis |
| ingest.h2d_bytes | int | no | Bytes copied host→device |
| ingest.h2d_gbps | number | no | Host→device throughput (GB/s) |
| ingest.wall_seconds | number | no | Ingest wall clock seconds |
| ingest.reads_per_sec | number | no | Derived throughput |
| ingest.bytes_per_sec | number | no | Derived throughput |
| metrics.version | int | no | Metrics schema version (1) |
| metrics.reads | int | no | Reads processed |
| metrics.runtime_ms | number | no | Total runtime in ms |
| metrics.qps | number | no | Reads per second |
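As a rough illustration of the table above, the sketch below builds a minimal manifest and checks that every field marked "yes" in the Req column is present. It assumes the dotted field names map to nested JSON objects, which the table implies but does not state. The helper name `missing_required` and all paths and hashes in `example_manifest` are hypothetical placeholders, not real artifacts or part of the OGC tooling.

```python
import json

# Dotted paths marked "yes" in the Req column of the field table.
REQUIRED_FIELDS = [
    "version",
    "id",
    "ogx_bundle.path",
    "ogx_bundle.manifest_sha256",
    "ogx_bundle.chain_params_sha256",
    "inputs.reads_fastq",
    "mapper.backend",
    "mapper.mode",
    "mapper.binary",
    "mapper.omega_chain.source",
    "mapper.omega_chain.fields",
    "mapper.omega_chain.effective_hash",
    "outputs.alignments",
    "outputs.stats",
    "outputs.logs",
]


def missing_required(manifest):
    """Return the dotted paths of required fields absent from the manifest."""
    missing = []
    for dotted in REQUIRED_FIELDS:
        node = manifest
        for key in dotted.split("."):
            # Assumption: dotted names correspond to nested JSON objects.
            if not isinstance(node, dict) or key not in node:
                missing.append(dotted)
                break
            node = node[key]
    return missing


# Hypothetical manifest: every path and hash below is a placeholder.
example_manifest = {
    "version": 1,
    "id": "chr20_10M_omega_gpu_q32",
    "ogx_bundle": {
        "path": "bundles/chr20.ogx",
        "manifest_sha256": "0" * 64,
        "chain_params_sha256": "0" * 64,
    },
    "inputs": {"reads_fastq": "reads/chr20_10M.fastq", "read_count": None},
    "mapper": {
        "backend": "omega",
        "mode": "gpu",
        "binary": "bin/omega_mapper",
        "omega_chain": {
            "source": "bundle:env:file:cli",
            "fields": [],
            "effective_hash": "0" * 64,
        },
    },
    "outputs": {
        "alignments": "out/chr20.sam",
        "stats": "out/stats.json",
        "logs": "out/run.log",
    },
}

print(json.dumps(missing_required(example_manifest)))
```

A check like this is only a presence test; validating types and value constraints would still be the job of manifest_v1.schema.json.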
How it references OGX
OGC stores the OGX bundle path plus SHA256 fingerprints for manifest.json and chain_params.json. OGX content stays external; the hashes guarantee integrity.
Overrides provenance
- mapper.omega_chain.source captures the winning order (bundle:env:file:cli).
- mapper.omega_chain.fields lists overridden chain fields.
- mapper.omega_chain.effective_hash is a stable SHA256 of the effective chain parameters after applying overrides.
- mapper.omega_chain.overrides_path records the file used for file/cli modes.
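One way to picture how a stable effective hash could be produced is sketched below. It is not the mapper's actual implementation: the page does not specify the canonicalization, so this sketch assumes sorted-key, compact JSON encoded as UTF-8, and assumes that later layers in bundle:env:file:cli order win. The function name `effective_chain_hash` and the parameter names `max_gap` and `min_score` are hypothetical.

```python
import hashlib
import json


def effective_chain_hash(layers):
    """Merge override layers in order (later wins) and hash the result.

    `layers` is a list of dicts ordered as bundle, env, file, cli.
    Returns (effective parameters, overridden field names, SHA256 hex digest).
    """
    effective = {}
    overridden = set()
    for index, layer in enumerate(layers):
        for key, value in layer.items():
            if index > 0:
                # Any field set outside the bundle layer counts as an override.
                overridden.add(key)
            effective[key] = value
    # Assumed canonical form: sorted keys, no whitespace, UTF-8 bytes.
    canonical = json.dumps(effective, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return effective, sorted(overridden), digest


# Hypothetical chain parameters for illustration only.
bundle = {"max_gap": 100, "min_score": 30}
env = {}
file_overrides = {"min_score": 35}
cli = {"min_score": 40}

effective, fields, digest = effective_chain_hash([bundle, env, file_overrides, cli])
print(effective, fields, digest)
```

Sorting keys before hashing is what makes the digest independent of the order in which parameters were written, so two runs with the same effective configuration produce the same hash regardless of which layer supplied each value.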
Example
- Micro: tests/data/ogc/micro_ogc.json
- (Benchmark) chr20 example produced by scripts/bench_omega_chr20_ogx.py (when run with --regenerate-ogc).
Intended consumers
- Helix ingest and Studio UI (status chips, hover details)
- VeriBiota-style verifiers (replay/attest a run)
- CI artifacts for reproducibility
Ingest semantics (OGX + FASTQ/SAM)
- ingest.bytes counts bundle files (OGX) plus any read source if present.
- ingest.read_bytes optionally captures streamed read payload (0 when absent).
- ingest.wall_seconds spans from ingest start to first device batch ready.
- ingest.h2d_ms measures GPU upload time of staged buffers/indexes.
- ingest.h2d_bytes measures bytes copied host→device for OGX index or staged read buffers.

Mapper logs emit ingest-summary lines; bench tooling copies these fields into the OGC manifest.
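The derived ingest fields can be illustrated with a short sketch. The formulas here are assumptions, since the page only labels these fields as derived: reads_per_sec is taken as reads divided by wall_seconds, bytes_per_sec as data_bytes divided by wall_seconds, and h2d_gbps as h2d_bytes over the upload time with 1 GB = 10^9 bytes. The helper name `derive_ingest_rates` is hypothetical and the sample numbers are made up.

```python
def derive_ingest_rates(ingest):
    """Return a copy of the ingest block with derived throughput fields added.

    Assumed formulas (not confirmed by the spec):
      reads_per_sec = reads / wall_seconds
      bytes_per_sec = data_bytes / wall_seconds
      h2d_gbps      = h2d_bytes / (h2d_ms / 1000) / 1e9
    """
    out = dict(ingest)
    wall = ingest.get("wall_seconds")
    if wall:  # skip when missing or zero to avoid division by zero
        if "reads" in ingest:
            out["reads_per_sec"] = ingest["reads"] / wall
        if "data_bytes" in ingest:
            out["bytes_per_sec"] = ingest["data_bytes"] / wall
    h2d_ms = ingest.get("h2d_ms")
    if h2d_ms and "h2d_bytes" in ingest:
        # Multiply before dividing to convert milliseconds to seconds.
        out["h2d_gbps"] = ingest["h2d_bytes"] * 1000.0 / h2d_ms / 1e9
    return out


# Made-up sample values for illustration.
sample = {
    "reads": 1_000_000,
    "data_bytes": 300_000_000,
    "wall_seconds": 2.0,
    "h2d_ms": 100.0,
    "h2d_bytes": 2_000_000_000,
}
print(derive_ingest_rates(sample))
```

Fields that cannot be derived (for example when wall_seconds is absent) are simply left out, which matches their optional status in the manifest.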
Notes for auditors
- Treat OGC as the complete, immutable capsule for a mapping run: bundle fingerprints, override lattice, effective chain hash, ingest telemetry, and headline bench metrics (qps/runtime/read count).
- Reserve extra fields for upcoming extensions (variant caller profile, FM version, GPU arch) to avoid breaking the format later.
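An auditor-side check of the bundle fingerprints might look like the sketch below. It assumes the hashes are lowercase hex SHA256 digests of the raw file bytes, and that manifest.json and chain_params.json sit in the bundle root (or next to the manifest when ogx_bundle.path points at the manifest file). Neither detail is confirmed by this page, and the helper names `sha256_file` and `verify_bundle` are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_file(path):
    """Hash a file's raw bytes in 1 MiB blocks and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_bundle(ogc):
    """Compare the recorded OGX fingerprints against the files on disk."""
    bundle = ogc["ogx_bundle"]
    root = Path(bundle["path"])
    if root.is_file():
        # Assumption: a manifest path implies chain_params.json is alongside it.
        root = root.parent
    return {
        "manifest": sha256_file(root / "manifest.json") == bundle["manifest_sha256"],
        "chain_params": sha256_file(root / "chain_params.json")
        == bundle["chain_params_sha256"],
    }


# Self-contained demo using a throwaway directory as a stand-in bundle.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "manifest.json").write_text('{"bundle_id": "demo"}')
    (root / "chain_params.json").write_text('{"min_score": 30}')
    ogc = {
        "ogx_bundle": {
            "path": str(root),
            "manifest_sha256": sha256_file(root / "manifest.json"),
            "chain_params_sha256": sha256_file(root / "chain_params.json"),
        }
    }
    print(verify_bundle(ogc))  # both checks pass on the untouched bundle
    (root / "chain_params.json").write_text('{"min_score": 31}')
    print(verify_bundle(ogc))  # chain_params check fails after tampering
```

Because the capsule stores only the path and the fingerprints, a check of this kind is what ties a recorded run back to the exact bundle contents it used.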