PatentChecker
Platform quickstart (control plane + runners)
This is the operational “how to use it” guide for the headless PatentChecker platform: programs, watchlists, runners, drift, webhooks, triage, and retention.
PatentChecker is not a “run a command, get an answer” tool. It’s a continuously running monitoring system that produces verifiable, immutable evidence bundles and drift events you can review and disposition.
Mental model
PatentChecker has four moving parts:
- Control plane (SaaS): Programs, watchlists, runs, drift events, receipts, retention, webhooks.
- Runner (hosted or customer VPC): Executes the job, produces deterministic artifacts, uploads by digest, finalizes the run.
- Scheduler + outbox: Creates runs on schedule and delivers signed webhooks.
- Triage API: Humans review drift, assign, disposition, and download/export evidence.
Everything important is an artifact with digests. The UI (later) is just a view onto those immutable artifacts.
Request flow (high level)
program (active corpus snapshot)
-> watchlist (schedule + query ref + retention policy)
-> run (immutable record, pinned corpus snapshot)
-> job (leased to a runner)
-> runner executes + uploads artifacts (deduped by sha256)
-> run finalized (idempotent)
-> drift event created (new vs old run)
-> webhook delivered (signed, at-least-once)
-> human triage (assign + dispositions)
-> retention pins evidence until drift is closed, then purges with a receipt
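Every artifact in this flow is content-addressed, so identical bytes always resolve to the same identifier. A minimal sketch of how such a digest is computed (the platform's exact canonicalization rules are not specified here, so treat this as an illustration):

```python
import hashlib

def artifact_digest(data: bytes) -> str:
    """Content-address an artifact: sha256 over the raw bytes, hex-encoded."""
    return "sha256:" + hashlib.sha256(data).hexdigest()

# Identical bytes always yield the identical digest, which is what makes
# dedupe-safe uploads and immutable, verifiable evidence bundles possible.
d = artifact_digest(b'{"hits": []}')
```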
Where the code lives (local dev)
You will typically interact with three repos:
- ~/patentchecker: the trust anchor (artifact contracts + verifier + drift diff engine)
- ~/patentchecker-platform: the control plane + workers + reference runner
- ~/patentchecker-adapters: search adapters (BLAST/DIAMOND/etc.), depending on your deployment
This doc assumes you are running the platform from ~/patentchecker-platform.
0) Run the stack locally
From the platform repo:
cd ~/patentchecker-platform
docker compose -f docker-compose.dev.yml up --build
This starts:
- control plane API (Fastify)
- Postgres
- MinIO (S3-compatible object storage)
- scheduler worker
- outbox worker (webhooks)
- retention worker
The default dev auth token is the CONTROL_PLANE_API_KEY from the compose file (often devkey).
1) Create a program (unit of value)
Programs are the unit you sell and measure: one program per target / modality / team.
export BASE_URL="http://localhost:3000"
export TOKEN="devkey"
curl -sS -X POST "$BASE_URL/v1/programs" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-H "Idempotency-Key: prog-kras-001" \
-d '{"name":"KRAS-G12D Program","description":"Primary oncology program"}'
Save the returned program_id.
2) Register + activate a corpus snapshot (what was checked)
Every run is pinned to exactly one corpus snapshot digest. The scheduler uses the program’s active corpus snapshot when creating runs (unless you override it on a run-now request).
Register a snapshot (minimal provenance example for source_type=customer):
curl -sS -X POST "$BASE_URL/v1/programs/<program_id>/corpus-snapshots" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-H "Idempotency-Key: cs-kras-001" \
-d '{
"corpus_snapshot_digest":"sha256:<64>",
"source_type":"customer",
"manifest":{
"jurisdictions":["US"],
"sequence_types":["protein"],
"made_public_until":"2025-01-01"
}
}'
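One way to produce a corpus_snapshot_digest value is to hash a canonical serialization of the snapshot manifest. The platform's actual digest specification may differ, so this is only an illustration of the canonicalization idea:

```python
import hashlib
import json

def corpus_snapshot_digest(manifest: dict) -> str:
    # Canonicalize: sorted keys, no insignificant whitespace, so the same
    # logical manifest always hashes to the same digest regardless of key order.
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode()).hexdigest()

digest = corpus_snapshot_digest({
    "jurisdictions": ["US"],
    "sequence_types": ["protein"],
    "made_public_until": "2025-01-01",
})
```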
Activate it for the program (what new runs will use by default):
curl -sS -X POST "$BASE_URL/v1/programs/<program_id>/corpus-snapshots/activate" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-H "Idempotency-Key: cs-kras-activate-001" \
-d '{"corpus_snapshot_digest":"sha256:<64>","reason":"baseline"}'
3) Create a watchlist (what to monitor)
Watchlists define:
- runner target (hosted vs VPC runner group)
- schedule (interval for now)
- retention policy (what evidence is kept / purged)
- query input reference (where your sequence payload lives)
Example (customer-bucket pointer):
curl -sS -X POST "$BASE_URL/v1/programs/<program_id>/watchlists" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-H "Idempotency-Key: wl-kras-protein-001" \
-d '{
"name":"KRAS protein watch",
"enabled":true,
"runner_target":{"kind":"hosted"},
"schedule":{"kind":"interval","interval_seconds":86400},
"retention_policy":{"retention_enabled":true,"keep_last_n":10,"keep_days":90,"legal_hold":false},
"query_input_ref":{"kind":"customer_bucket_pointer","uri":"s3://customer-bucket/watchlists/kras.json"}
}'
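The interval schedule above can be read as "a new run becomes due interval_seconds after the last one". A sketch of that computation (assumed semantics; the scheduler's exact catch-up behavior for watchlists that have fallen behind is not specified here):

```python
def next_run_at(last_run_at: int, interval_seconds: int, now: int) -> int:
    """Return the next due time as a Unix timestamp.

    If the watchlist has fallen behind schedule, the run is due immediately.
    """
    due = last_run_at + interval_seconds
    return due if due > now else now
```

For a daily watchlist (interval_seconds=86400) last run at t=0, the next run is due at t=86400; if "now" is already past that, it is due immediately.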
Save the returned watchlist_id.
4) Trigger a run (run-now) or wait for the scheduler
The scheduler creates runs automatically based on the watchlist schedule.
To force a run now:
curl -sS -X POST "$BASE_URL/v1/watchlists/<watchlist_id>/runs" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-H "Idempotency-Key: run-now-001" \
-d '{}'
This creates:
- a run row (system-of-record)
- a job row (leaseable unit for a runner)
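The Idempotency-Key header makes run-now safe to retry: repeating the same request returns the original run instead of creating a duplicate. A toy in-memory sketch of that server-side contract (the real control plane persists keys durably; this is illustrative only):

```python
import uuid

_seen: dict[str, dict] = {}  # idempotency key -> previously created resource

def create_run(idempotency_key: str, watchlist_id: str) -> dict:
    """Create a run once per idempotency key; replays return the same row."""
    if idempotency_key in _seen:
        return _seen[idempotency_key]
    run = {"run_id": str(uuid.uuid4()), "watchlist_id": watchlist_id}
    _seen[idempotency_key] = run
    return run

first = create_run("run-now-001", "wl_123")
retry = create_run("run-now-001", "wl_123")  # network retry: same run back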
5) Execute the job with a runner
Option A: Reference runner (dev)
In dev, use the platform’s reference runner to prove the end-to-end lifecycle:
cd ~/patentchecker-platform
npm ci
npm run run:reference-runner
The runner:
- pulls a job lease
- heartbeats until completion
- produces a deterministic run directory + bundle
- uploads artifacts by sha256 (dedupe-safe)
- finalizes the run (idempotent)
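The digest-addressed upload step can be sketched like this (a toy in-memory store standing in for MinIO/S3; the real runner streams files through the control plane's upload API):

```python
import hashlib

class ArtifactStore:
    """Content-addressed store: uploading the same bytes twice is a no-op."""

    def __init__(self):
        self._blobs: dict[str, bytes] = {}
        self.uploads = 0

    def put(self, data: bytes) -> str:
        digest = "sha256:" + hashlib.sha256(data).hexdigest()
        if digest not in self._blobs:  # dedupe: skip bytes we already hold
            self._blobs[digest] = data
            self.uploads += 1
        return digest

store = ArtifactStore()
d1 = store.put(b"run bundle bytes")
d2 = store.put(b"run bundle bytes")  # retried upload: no second write
```

This is why a runner can safely retry uploads after a crash: re-sending the same artifact changes nothing.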
Option B: Hosted runner (prod)
Same protocol, but deployed as a long-running service in your cloud.
Option C: Customer VPC runner (prod upsell)
Same runner container, but deployed in customer infra. The control plane schedules jobs and receives only receipts/artifacts; customer secrets remain in the VPC.
6) Inspect runs and download evidence
List runs:
curl -sS "$BASE_URL/v1/runs?watchlist_id=<watchlist_id>&limit=50" \
-H "Authorization: Bearer $TOKEN"
Download the run bundle (presigned URL):
curl -sS "$BASE_URL/v1/runs/<run_id>/bundle" \
-H "Authorization: Bearer $TOKEN"
Get the bundle manifest inline:
curl -sS "$BASE_URL/v1/runs/<run_id>/bundle/manifest" \
-H "Authorization: Bearer $TOKEN"
Bundle-truth viewer index (UI should follow only returned links; never compose URLs client-side):
curl -sS "$BASE_URL/v1/runs/<run_id>/view-index" \
-H "Authorization: Bearer $TOKEN"
7) Drift is created automatically (what humans review)
When a run finalizes, the platform compares it to the prior finalized run for the same watchlist and creates a drift event.
List drift events for a program:
curl -sS "$BASE_URL/v1/programs/<program_id>/drift-events?state=new&limit=50" \
-H "Authorization: Bearer $TOKEN"
Get one drift event:
curl -sS "$BASE_URL/v1/drift-events/<drift_event_id>" \
-H "Authorization: Bearer $TOKEN"
Download evidence for a drift event (both new + old bundles):
curl -sS "$BASE_URL/v1/drift-events/<drift_event_id>/bundle" \
-H "Authorization: Bearer $TOKEN"
Explain (stub, but stable pointers):
curl -sS "$BASE_URL/v1/drift-events/<drift_event_id>/explain" \
-H "Authorization: Bearer $TOKEN"
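Conceptually, a drift event is a set difference between the new run's results and the prior run's, keyed by a stable hit identity. The real diff engine lives in ~/patentchecker; a toy sketch of the idea (the "patent_id" key is a hypothetical field name):

```python
def diff_hits(old: list[dict], new: list[dict], key: str = "patent_id") -> dict:
    """Compare two runs' hit lists and report what appeared/disappeared."""
    old_keys = {h[key] for h in old}
    new_keys = {h[key] for h in new}
    return {
        "added": sorted(new_keys - old_keys),    # new exposure since last run
        "removed": sorted(old_keys - new_keys),  # hits that dropped out
    }

drift = diff_hits(
    old=[{"patent_id": "US111"}, {"patent_id": "US222"}],
    new=[{"patent_id": "US222"}, {"patent_id": "US333"}],
)
```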
8) Webhooks (how teams “feel” the system without UI)
Register a webhook endpoint:
curl -sS -X POST "$BASE_URL/v1/webhook-endpoints" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-H "Idempotency-Key: wh-001" \
-d '{"url":"https://example.com/webhooks/patentchecker"}'
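A consumer for these deliveries needs two things: signature verification and dedupe by event_id. A minimal sketch, assuming the signature arrives as a hex-encoded HMAC-SHA256 of the raw request body (the header name, encoding, and secret format here are assumptions; check the endpoint registration response for the real secret and scheme):

```python
import hashlib
import hmac

SECRET = b"whsec_example"   # hypothetical secret returned at registration
_handled: set[str] = set()  # event_ids we have already processed

def verify(raw_body: bytes, signature_hex: str) -> bool:
    expected = hmac.new(SECRET, raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison to avoid timing side channels.
    return hmac.compare_digest(expected, signature_hex)

def handle(event: dict) -> bool:
    """Return True if processed, False if it was a duplicate delivery."""
    if event["event_id"] in _handled:  # at-least-once delivery => dedupe
        return False
    _handled.add(event["event_id"])
    return True
```

In production, persist handled event_ids durably; an in-memory set loses its dedupe state on restart.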
Events are delivered at-least-once and signed with an HMAC secret. Design consumers to dedupe by event_id.
9) Triage primitives (assignment + dispositions)
Assign a drift event:
curl -sS -X POST "$BASE_URL/v1/drift-events/<drift_event_id>/assign" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-H "Idempotency-Key: assign-001" \
-d '{"assigned_to_user_id":"user_demo"}'
Create a disposition (append-only):
curl -sS -X POST "$BASE_URL/v1/drift-events/<drift_event_id>/dispositions" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-H "Idempotency-Key: disp-001" \
-d '{"label":"relevant","reason_code":"triage","comment":"Overlaps our KRAS program; counsel review"}'
State semantics (must stay consistent across API + retention):
- new: no dispositions
- open: latest disposition is non-terminal (e.g. needs_review, escalate)
- closed: latest disposition is terminal (e.g. relevant, not_relevant)
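These semantics reduce to "state is a function of the latest disposition". A sketch (the terminal/non-terminal label sets below mirror the examples above and may not be exhaustive):

```python
TERMINAL = {"relevant", "not_relevant"}

def drift_state(dispositions: list[str]) -> str:
    """Derive drift-event state from its append-only disposition log."""
    if not dispositions:
        return "new"
    return "closed" if dispositions[-1] in TERMINAL else "open"
```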
10) Retention (evidence purge with receipts)
Retention is designed to be safe and auditable:
- Evidence for runs referenced by any non-closed drift event is pinned (both new + old sides).
- When evidence is purged, the run/drift history remains, and the system writes a retention deletion receipt artifact.
- Evidence never “silently disappears”.
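The pinning rule can be sketched as: a run's evidence is purgeable only when no non-closed drift event references the run on either side and the retention policy allows it. A simplified model (keep_days and the keep_days/keep_last_n interplay are omitted; these are assumed semantics, not the platform's exact retention algorithm):

```python
def purgeable_runs(runs: list[str], drift_events: list[dict],
                   keep_last_n: int, legal_hold: bool = False) -> list[str]:
    """runs: run ids newest-first; drift_events: dicts with state,
    new_run_id, old_run_id. Returns ids whose evidence may be purged."""
    if legal_hold:
        return []                                  # legal hold pins everything
    pinned = set(runs[:keep_last_n])               # policy: keep most recent N
    for ev in drift_events:
        if ev["state"] != "closed":                # new/open drift pins both sides
            pinned.add(ev["new_run_id"])
            pinned.add(ev["old_run_id"])
    return [r for r in runs if r not in pinned]

candidates = purgeable_runs(
    runs=["r3", "r2", "r1"],
    drift_events=[{"state": "open", "new_run_id": "r1", "old_run_id": "r0"}],
    keep_last_n=1,
)
```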
After evidence is purged, bundle/explain endpoints degrade deterministically:
- HTTP 410 Gone
- error code EVIDENCE_PURGED
- details.retention_deletion_receipt_artifact_id points to the receipt artifact
When a retention deletion receipt exists, you can fetch its viewer link:
curl -sS "$BASE_URL/v1/runs/<run_id>/retention-deletion-receipt/view" \
-H "Authorization: Bearer $TOKEN"
What to do next as a user (dogfood)
If you’re dogfooding internally:
- Stand up one real protein watchlist.
- Wire webhooks to Slack.
- Review drift weekly and disposition everything.
- Track: alerts/week, time-to-triage, false positives.
That’s the shortest path to real signal before building UI or scaling corpus ingestion.