Sigilix does not rank findings on a numeric 1–5 confidence scale. A number is just the model’s opinion about its own opinion. Instead, every posted finding carries a proof-tier receipt that records how strongly it was witnessed — and a finding that cannot be witnessed at all does not post. The principle: the model’s job is interpretation; the deterministic layer is the witness. A finding is only as believable as the evidence the deterministic layer can attach to it.

Proof-tier receipts

Every finding that survives to the PR carries one of three tier pills:
| Tier | What it means | How it’s earned |
| --- | --- | --- |
| VERIFIED | Checked by execution or backed by a signed receipt | A deterministic check ran (secret scan, AST rule, executed/refuted assertion) and the finding was confirmed against ground truth |
| GROUNDED | Anchored to cited code / evidence in the diff | The finding quotes specific code that exists at the cited location; the grounding gate matched the quote to the source |
| MODEL | Model judgment, no external witness | A specialist’s reasoning the deterministic layer could not independently confirm; still useful, but explicitly labeled as opinion |
The tier is a receipt, not a score. It tells the reader why they should believe a finding, not how confident the model claims to be. A VERIFIED secret leak and a MODEL-tier naming suggestion are both legitimate; the pill lets you triage at a glance.
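The tier table can be read as a "strongest available witness wins" rule. The following is a minimal sketch of that reading, not Sigilix's implementation: the names `ProofTier`, `assign_tier`, `check_confirmed`, and `quote_matched` are hypothetical and exist only for illustration.

```python
from enum import Enum


class ProofTier(Enum):
    """The three receipt tiers named in the table above."""
    VERIFIED = "VERIFIED"
    GROUNDED = "GROUNDED"
    MODEL = "MODEL"


def assign_tier(check_confirmed: bool, quote_matched: bool) -> ProofTier:
    """Return the strongest receipt the evidence supports (hypothetical helper).

    check_confirmed: a deterministic check ran and confirmed the finding.
    quote_matched:   the grounding gate matched the cited quote to the source.
    """
    if check_confirmed:
        return ProofTier.VERIFIED
    if quote_matched:
        return ProofTier.GROUNDED
    # No external witness: the finding is labeled as model opinion.
    return ProofTier.MODEL
```

Note that the result describes the evidence only. It says nothing about how serious the finding is.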

The grounding gate

Between a specialist producing a finding and that finding posting, Harmonia runs the grounding gate: the finding must be anchored to the code it claims to be about.
| Check | Outcome |
| --- | --- |
| Finding’s path:line exists in the diff | Eligible to anchor |
| Finding’s path:line is hallucinated | Dropped |
| Cited code quote matches the source (whitespace-normalized) | Earns GROUNDED |
| Cited code quote paraphrases / doesn’t match | Demoted to MODEL or dropped |
| Claimed unsafe pattern actually present | Confirmed; eligible for VERIFIED if a check ran |
| Claimed pattern absent | Suppressed before it reaches you |
The gate is what makes Sigilix’s reviews trustworthy: a finding that survives it is one whose subject genuinely exists in your code. Findings the model invented — an “unused” function that is called, a “missing” guard that is present — are caught here and never posted.
Grounding is content-matched, not line-counted. Models routinely miscount line numbers but quote code verbatim, so the gate matches the quoted code against the source to anchor the finding precisely — including to deleted lines (side: LEFT) when the finding’s subject is removed code.
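Content matching of this kind can be sketched in a few lines. The example below is an illustration under stated assumptions, not the gate's actual code: `normalize` and `anchor_quote` are hypothetical names, the quote is assumed to fit on a single source line, and "whitespace-normalized" is taken to mean collapsing runs of whitespace to one space.

```python
import re
from typing import Optional


def normalize(text: str) -> str:
    """Collapse every run of whitespace to one space and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def anchor_quote(quote: str, source_lines: list[str]) -> Optional[int]:
    """Return the 1-based number of the first line containing the quote.

    The line number the model claimed is deliberately ignored: the anchor
    comes from where the quoted code is actually found. Returns None when
    the quote is empty or does not appear, i.e. it cannot be grounded.
    """
    needle = normalize(quote)
    if not needle:
        return None
    for number, line in enumerate(source_lines, start=1):
        if needle in normalize(line):
            return number
    return None


source = [
    "def load(path):",
    "    data  =  open(path).read()",
    "    return data",
]
# A verbatim quote anchors to line 2 even if the model cited another line.
print(anchor_quote("data = open(path).read()", source))  # 2
# A paraphrase does not match, so it cannot earn GROUNDED.
print(anchor_quote("data = read_file(path)", source))    # None
```

To anchor findings about removed code, the same search would run over the deleted lines of the diff instead of the added ones.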

Agreement still escalates severity

Multiple specialists flagging the same code is a strong signal, and it raises severity (independent of proof tier):
| Inputs | Severity |
| --- | --- |
| 1 specialist, low confidence | Info |
| 1 specialist, high confidence | Warning |
| 2+ specialists agree | Warning or Critical (by category) |
| Specialist + deterministic check confirms | Critical |
Severity (Info / Warning / Critical) is the categorical label the reader sees. The proof tier is the separate witness pill. A finding can be Critical-and-MODEL (high stakes, model-only evidence) or Info-and-VERIFIED (a confirmed nit) — the two axes are orthogonal.
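The escalation table can be expressed as a small decision function. This is a sketch only: `escalate` and its parameters are hypothetical, and the `critical_category` flag stands in for the "by category" rule, since which categories escalate to Critical is not specified here.

```python
def escalate(
    specialists: int,
    high_confidence: bool,
    deterministic_confirmed: bool,
    critical_category: bool,
) -> str:
    """Map agreement signals to a severity label (hypothetical helper).

    specialists:             number of specialists that flagged the same code.
    high_confidence:         used only when a single specialist flagged it.
    deterministic_confirmed: a deterministic check confirmed the finding.
    critical_category:       ASSUMPTION, a stand-in for the unspecified
                             per-category rule in the "2+ specialists" row.
    """
    if specialists < 1:
        raise ValueError("a finding needs at least one specialist")
    if deterministic_confirmed:
        return "Critical"
    if specialists >= 2:
        return "Critical" if critical_category else "Warning"
    return "Warning" if high_confidence else "Info"
```

The function returns severity only. The proof tier is computed separately, which is why a Critical finding can still carry a MODEL pill.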

The verdict

The verdict follows from severity, with the grounding gate having already removed unwitnessed Critical claims:
  • Any Critical that cleared the grounding gate → Request changes
  • Otherwise → Approve
A Critical finding the gate could not ground is demoted or dropped before it can block a merge — so the verdict is never decided by a hallucination.
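As a rough illustration of the rule, assuming each finding is reduced to a severity label plus a flag recording whether it cleared the grounding gate (both hypothetical representations):

```python
from typing import Iterable, Tuple


def verdict(findings: Iterable[Tuple[str, bool]]) -> str:
    """Decide the review verdict from (severity, cleared_gate) pairs.

    Only a Critical finding that cleared the grounding gate can block.
    An ungrounded Critical is treated as already demoted or dropped.
    """
    blocking = any(
        severity == "Critical" and cleared_gate
        for severity, cleared_gate in findings
    )
    return "Request changes" if blocking else "Approve"
```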

Advisory cap

To prevent comment dilution, Harmonia surfaces the top few low-severity (Info / nit) findings inline and aggregates the rest into a single “advisory nits” line in the summary. The aggregated findings are recorded in telemetry but not posted individually. Set profile to "assertive" if you want low-severity findings surfaced more aggressively.
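A minimal sketch of such a cap, assuming the nits arrive already ranked. The function name, the default `cap` of 3, and the wording of the summary line are all hypothetical; the actual number of inline findings is not stated here.

```python
from typing import List, Optional, Tuple


def apply_advisory_cap(
    ranked_nits: List[str], cap: int = 3
) -> Tuple[List[str], Optional[str]]:
    """Split low-severity findings into inline comments and one summary line.

    ranked_nits: low-severity findings, best first (ranking is assumed done).
    cap:         ASSUMPTION, how many are surfaced inline.
    Returns the inline findings and a summary line, or None if nothing
    was held back.
    """
    if cap < 0:
        raise ValueError("cap must be non-negative")
    inline = ranked_nits[:cap]
    held_back = ranked_nits[cap:]
    summary = f"{len(held_back)} advisory nits not shown inline" if held_back else None
    return inline, summary
```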

Tuning what posts

The proof-tier and grounding logic is fixed — it’s the believability guarantee, not a knob. What you can tune via sigilix.json:
  • profile: "chill" (default, only must-fix) vs "assertive" (include nits + style). Shifts what each specialist flags in the first place.
  • rules.<role>: Per-specialist prompt addendum. Influences what each specialist chooses to flag.
  • deterministicChecks: Regex rules whose severity you set explicitly. These post at the severity you declare and earn a deterministic (VERIFIED-class) receipt; they bypass model judgment entirely.
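To illustrate why a regex rule needs no model judgment, here is a sketch of how such a check could be evaluated. It does not reproduce the sigilix.json schema: the `RegexCheck` class, its field names, and the shape of the returned findings are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class RegexCheck:
    """A hypothetical stand-in for one deterministic rule."""
    name: str
    pattern: str
    severity: str  # declared by the user, not inferred by a model


def run_checks(
    checks: List[RegexCheck], added_lines: List[Tuple[int, str]]
) -> List[Dict[str, object]]:
    """Apply every rule to every (line_number, text) pair from the diff.

    A match is its own witness, so each finding is stamped VERIFIED and
    keeps the severity the rule declared.
    """
    findings: List[Dict[str, object]] = []
    for check in checks:
        compiled = re.compile(check.pattern)
        for line_number, text in added_lines:
            if compiled.search(text):
                findings.append({
                    "rule": check.name,
                    "line": line_number,
                    "severity": check.severity,
                    "tier": "VERIFIED",
                })
    return findings
```

For example, a rule with pattern `AKIA[0-9A-Z]{16}` and declared severity Critical would flag any added line containing a string of that shape, regardless of what the specialists report.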

Telemetry

Every finding’s lifecycle is recorded: which proof tier it earned, whether the grounding gate matched / demoted / dropped it, any agreement escalation, final severity, and whether it was posted, suppressed, or aggregated. Telemetry is internal — used for monitoring and prompt-engineering.
