
GPU-native genomics operating system

From raw reads to GIAB-validated variant calls in a continuous GPU pipeline. This is the control surface for the engine: CLI, pipelines, benchmarks, and deployment runbooks.

CUDA 12+ | Hopper · Ampere | GIAB-validated flows | Schemas stable

OGX Bundle Specification (v1)

Purpose

OGX encapsulates the GPU-ready OmegaAlign index for a single contig/region. It is the contract between bundle builders and runtime mappers.

Manifest (manifest.json) fields

Field | Type | Req | Example | Notes
version | int | yes | 1 | Schema version
bundle_id | string | yes | chr20_10M | Human readable identifier
contig.name | string | yes | chr20 | Single contig/region name
contig.length | int | yes | 10000000 | Bases in contig
files.graph | path | yes | graph/graph.ogx | GPU-ready graph
files.graph_sha256 | path | yes | graph/graph.ogx.sha256 | sha256sum sidecar
files.seeds | path | yes | seeds/shard_0000.ogx | Primary seed shard
files.seeds_sha256 | path | yes | seeds/shard_0000.ogx.sha256 | sha256sum sidecar
files.paths | path | yes | paths/paths.ogx | Path lookup table
files.paths_sha256 | path | yes | paths/paths.ogx.sha256 | sha256sum sidecar
files.ref_fasta | path | yes | ref.fasta | Reference sequence
files.ref_fai | path | yes | ref.fasta.fai | FASTA index
files.chain_params | path | yes | chain_params.json | Chain/WFA defaults
seeds_header.kind | string | yes | strobe | Seed generator kind
seeds_header.k/s/strobe_w/strobe_t | int | yes | 25/15/80/3 | Syncmer + randstrobe knobs
seeds_header.records | int | yes | 123456 | Seed record count
build.tool | string | yes | ogn_ogx_builder | Builder binary
build.tool_version | string | yes | 0.3.0 | Builder semver/sha
build.source_fasta | string | yes | chr20.fa | Provenance
build.source_gfa | string | yes | chr20.gfa | Provenance
build.timestamp_utc | string | yes | 2025-11-18T04:05:06Z | ISO8601
metadata.notes | string | no | free text | Freeform annotations
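
The field table can be checked mechanically. The following is a minimal sketch, not the OGN validator: it assumes dotted field names map to nested JSON objects, that path-typed values are stored as strings, and that seeds_header.k/s/strobe_w/strobe_t are four separate integer keys. REQUIRED_FIELDS, lookup, and check_manifest are illustrative names, and only the fields listed in the table are covered.

```python
# Required manifest fields and their JSON types, transcribed from the table.
# Assumptions: dotted names denote nested objects, "path" values are strings,
# and seeds_header.k/s/strobe_w/strobe_t are four separate integer keys.
REQUIRED_FIELDS = {
    "version": int, "bundle_id": str,
    "contig.name": str, "contig.length": int,
    "seeds_header.kind": str,
}
for key in ("graph", "seeds", "paths"):
    REQUIRED_FIELDS[f"files.{key}"] = str
    REQUIRED_FIELDS[f"files.{key}_sha256"] = str
for key in ("ref_fasta", "ref_fai", "chain_params"):
    REQUIRED_FIELDS[f"files.{key}"] = str
for key in ("k", "s", "strobe_w", "strobe_t", "records"):
    REQUIRED_FIELDS[f"seeds_header.{key}"] = int
for key in ("tool", "tool_version", "source_fasta", "source_gfa", "timestamp_utc"):
    REQUIRED_FIELDS[f"build.{key}"] = str


def lookup(doc, dotted):
    """Walk a dotted path through nested dicts; raise KeyError if absent."""
    node = doc
    for part in dotted.split("."):
        if not isinstance(node, dict) or part not in node:
            raise KeyError(dotted)
        node = node[part]
    return node


def check_manifest(manifest):
    """Return a list of problems; an empty list means every field passed."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        try:
            value = lookup(manifest, field)
        except KeyError:
            problems.append(f"missing required field: {field}")
            continue
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"{field}: expected {expected.__name__}")
    return problems


# Example manifest assembled from the "Example" column of the table.
EXAMPLE = {
    "version": 1,
    "bundle_id": "chr20_10M",
    "contig": {"name": "chr20", "length": 10000000},
    "files": {
        "graph": "graph/graph.ogx", "graph_sha256": "graph/graph.ogx.sha256",
        "seeds": "seeds/shard_0000.ogx", "seeds_sha256": "seeds/shard_0000.ogx.sha256",
        "paths": "paths/paths.ogx", "paths_sha256": "paths/paths.ogx.sha256",
        "ref_fasta": "ref.fasta", "ref_fai": "ref.fasta.fai",
        "chain_params": "chain_params.json",
    },
    "seeds_header": {"kind": "strobe", "k": 25, "s": 15, "strobe_w": 80,
                     "strobe_t": 3, "records": 123456},
    "build": {"tool": "ogn_ogx_builder", "tool_version": "0.3.0",
              "source_fasta": "chr20.fa", "source_gfa": "chr20.gfa",
              "timestamp_utc": "2025-11-18T04:05:06Z"},
}

print(check_manifest(EXAMPLE))  # prints []
```

Returning a list of problems rather than raising on the first one lets a builder report every missing or mistyped field in a single pass.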

Chain params (chain_params.json)

{
  "version": 1,
  "omega_chain": {
    "max_chain_gap": 5000,
    "max_dist": 2500,
    "max_chains": 32,
    "min_seed_score": 10,
    "max_seed_skip": 25,
    "diag_band": 1024
  },
  "scoring": {"match": 2, "mismatch": -4, "gap_open": -8, "gap_extend": -1}
}

Field | Type | Req | Overridable
version | int | yes | no
omega_chain.max_chain_gap | int | yes | yes
omega_chain.max_dist | int | yes | yes
omega_chain.max_chains | int | yes | yes
omega_chain.min_seed_score | int | yes | yes
omega_chain.max_seed_skip | int | yes | yes
omega_chain.diag_band | int | yes | yes
scoring.match | int | yes | yes
scoring.mismatch | int | yes | yes
scoring.gap_open | int | yes | yes
scoring.gap_extend | int | yes | yes

Non-overridable (“baked in”): seeds header, graph/seeds/paths checksums, bundle_id, contig name/length, and builder provenance. Override JSON that attempts to set unknown keys, mixes conflicting values (e.g., max_chain_gap at root vs omega_chain.max_chain_gap with different numbers), or contains no valid fields is rejected.
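
The three rejection rules above (unknown keys, conflicting values, no valid fields) can be sketched as follows. This is an illustration under stated assumptions, not the loader's actual code: parse_overrides is a hypothetical name, and the sketch assumes that a chain key written at the root is shorthand for the same key under omega_chain, which is how the max_chain_gap example reads.

```python
import json

# Overridable fields, taken from the chain_params.json field table.
OMEGA_CHAIN_FIELDS = {"max_chain_gap", "max_dist", "max_chains",
                      "min_seed_score", "max_seed_skip", "diag_band"}
SCORING_FIELDS = {"match", "mismatch", "gap_open", "gap_extend"}
SECTIONS = {"omega_chain": OMEGA_CHAIN_FIELDS, "scoring": SCORING_FIELDS}


def parse_overrides(text):
    """Parse override JSON into a flat {"section.field": int} dict.

    Raises ValueError for malformed JSON, unknown keys, conflicting
    values, or an override that sets no valid field.
    """
    try:
        doc = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"malformed override JSON: {exc}") from exc
    if not isinstance(doc, dict):
        raise ValueError("override JSON must be an object")

    flat = {}

    def put(name, value):
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{name}: expected int")
        # The same field given twice with different numbers is a conflict.
        if name in flat and flat[name] != value:
            raise ValueError(f"conflicting values for {name}")
        flat[name] = value

    for key, value in doc.items():
        if key in SECTIONS:
            if not isinstance(value, dict):
                raise ValueError(f"{key} must be an object")
            for sub, subval in value.items():
                if sub not in SECTIONS[key]:
                    raise ValueError(f"unknown override key: {key}.{sub}")
                put(f"{key}.{sub}", subval)
        elif key in OMEGA_CHAIN_FIELDS:
            # Assumption: a root-level chain key is shorthand for omega_chain.<key>.
            put(f"omega_chain.{key}", value)
        else:
            raise ValueError(f"unknown override key: {key}")

    if not flat:
        raise ValueError("override JSON contains no valid fields")
    return flat


print(parse_overrides('{"max_chains": 16, "scoring": {"match": 3}}'))
# prints {'omega_chain.max_chains': 16, 'scoring.match': 3}
```

Flattening every key to a single dotted name before comparing is what makes the root-versus-nested conflict detectable with one dictionary lookup.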

File layout

bundle_root/
  manifest.json
  chain_params.json
  ref.fasta
  ref.fasta.fai
  graph/graph.ogx(.sha256)
  seeds/shard_0000.ogx(.sha256)
  paths/paths.ogx(.sha256)
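
Each .ogx payload in the layout above travels with a .sha256 sidecar, so a consumer can verify a bundle before loading it. The sketch below assumes the sidecars use the output format of the sha256sum tool (hex digest first, then the file name); sha256_of, verify_sidecar, and verify_bundle are hypothetical helper names, not OGN APIs.

```python
import hashlib
from pathlib import Path

# Payload files that carry a .sha256 sidecar in the layout above.
PAYLOADS = ("graph/graph.ogx", "seeds/shard_0000.ogx", "paths/paths.ogx")


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large shards are never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sidecar(data_path):
    """Return True if the file's digest matches the first token of <file>.sha256."""
    data_path = Path(data_path)
    sidecar = data_path.with_name(data_path.name + ".sha256")
    tokens = sidecar.read_text().split()
    if not tokens:
        return False  # empty sidecar: nothing to compare against
    return sha256_of(data_path) == tokens[0].lower()


def verify_bundle(bundle_root):
    """Map each payload's relative path to its verification result."""
    root = Path(bundle_root)
    return {rel: verify_sidecar(root / rel) for rel in PAYLOADS}
```

Calling verify_bundle("bundle_root") on a bundle laid out as shown returns a dict with one boolean per payload; a missing payload or sidecar surfaces as a FileNotFoundError.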

Runtime override precedence

Layer | Source | Notes
bundle | manifest + chain_params.json | always present
env | OGN_OMEGA_CHAIN_OVERRIDES | strict JSON; malformed ⇒ hard error
file | OGN_OMEGA_CHAIN_OVERRIDES_FILE or --omega-chain-json | file must exist and contain at least one valid field
cli | inline OmegaChainOverrides struct or --omega-chain-json | highest precedence

Layers are applied in the order listed above, and later layers overwrite earlier ones. Conflicting keys raise an error. ogn_ogx_load emits a single summary line such as:
[omega] omega.chain.source=bundle:env:file:cli fields=max_chains,max_dist hash=ab12...
Malformed JSON, empty objects, or conflicting keys abort the load, giving CI a hard failure mode for precedence violations.
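
The layering can be sketched as a left-to-right merge. This is a simplified model under stated assumptions: each layer has already been parsed into a flat dict, the fields= list names the keys touched by any override layer, and the hash is a truncated SHA-256 over canonical JSON. The real hash scheme is not documented here, and merge_layers and summary_line are illustrative names.

```python
import hashlib
import json

# Precedence order from the table: later layers win.
LAYER_ORDER = ("bundle", "env", "file", "cli")


def merge_layers(layers):
    """Merge flat per-layer dicts; a layer set to None is treated as absent."""
    if not layers.get("bundle"):
        raise ValueError("bundle layer is always required")
    effective, sources, overridden = {}, [], set()
    for name in LAYER_ORDER:
        values = layers.get(name)
        if values is None:
            continue
        if name != "bundle":
            if not values:
                raise ValueError(f"{name} layer is an empty object")
            unknown = sorted(set(values) - set(effective))
            if unknown:
                raise ValueError(f"{name} layer sets unknown keys: {unknown}")
            overridden.update(values)
        sources.append(name)
        effective.update(values)  # later layers overwrite earlier ones
    return effective, sources, sorted(overridden)


def summary_line(effective, sources, fields):
    """Format a line shaped like the example log line in the text."""
    # Assumption: hash = first 8 hex chars of SHA-256 over canonical JSON.
    canonical = json.dumps(effective, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode()).hexdigest()[:8]
    return (f"[omega] omega.chain.source={':'.join(sources)} "
            f"fields={','.join(fields)} hash={digest}")


layers = {
    "bundle": {"max_chains": 32, "max_dist": 2500, "diag_band": 1024},
    "env": {"max_chains": 16},
    "file": None,  # no override file supplied
    "cli": {"max_chains": 8, "max_dist": 1000},
}
effective, sources, fields = merge_layers(layers)
print(effective)  # prints {'max_chains': 8, 'max_dist': 1000, 'diag_band': 1024}
print(summary_line(effective, sources, fields))
```

Because the bundle layer is merged first, every later layer can be checked against the set of known keys before it is allowed to overwrite anything.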

OGX ingest semantics (for OGC and benchmarks)

  • ingest.bytes = sum of manifest.json, chain_params.json, graph, seeds, paths, ref_fasta, ref_fai, and associated sidecar .sha256 files.
  • ingest.read_bytes = optional additional bytes if a read source is streamed alongside the bundle (0 for pure bundle loads).
  • ingest.wall_ms = time from OGX load start until bundle is parsed and the Omega index is uploaded to device.
  • ingest.h2d_ms = host→device time for uploading the Omega index.
  • ingest.h2d_bytes = number of bytes copied to device for the OGX index (keys/offsets/counts/hits/contig tables).
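  • ingest.h2d_gbps = host→device throughput in GB/s, derived from h2d_bytes and h2d_ms (the raw quotient h2d_bytes / h2d_ms is bytes per millisecond and must be scaled to GB/s).

These numbers are logged as ingest-summary source=ogx ... and flow into OGC manifests via the bench script.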
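
As a worked example of the arithmetic behind these metrics, the sketch below sums file sizes into ingest.bytes and converts bytes per millisecond into GB/s. It assumes decimal gigabytes (1 GB = 10^9 bytes), which the spec does not state; the function names and all numbers are hypothetical.

```python
def ingest_bytes(file_sizes):
    """Sum the on-disk sizes of every bundle file, sidecars included."""
    return sum(file_sizes.values())


def h2d_gbps(h2d_bytes, h2d_ms):
    """Host-to-device throughput in GB/s (assumption: 1 GB = 1e9 bytes)."""
    if h2d_ms <= 0:
        raise ValueError("h2d_ms must be positive")
    # bytes/ms * 1e3 = bytes/s, and bytes/s / 1e9 = GB/s, so divide by 1e6 overall.
    return h2d_bytes / h2d_ms / 1e6


# Hypothetical sizes in bytes, for illustration only.
sizes = {"manifest.json": 2048, "chain_params.json": 512, "graph/graph.ogx": 1_000_000}
print(ingest_bytes(sizes))                 # prints 1002560
print(h2d_gbps(2_000_000_000, 100.0))      # prints 20.0
```

In the second call, 2 × 10^9 bytes copied in 100 ms is 2 × 10^10 bytes per second, which is 20 GB/s.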

Manifest contract (v1)

  • Required blocks: version, bundle_id, contig {name, length}, reference (id, build, coordinate_space), files (graph/seeds/paths/ref_fasta/ref_fai/chain_params + sha256 sidecars), seeds_header, and build.
  • Optional: allowed_overrides enumerates which chain fields may be overridden at runtime. Unknown override keys are rejected by the schema.

Examples

  • Micro: tests/data/ogx/micro.ogx.json/manifest.json
  • Chr20: data/ogx/chr20_10M.ogx.json/manifest.json