
How it all fits together

Server AI Hub is the platform. Δ UI is the chat interface. Corpus is the retrieval and knowledge-graph backend. HIFF / GRIFF-Δ is the hallucination-scoring engine underneath. Each piece is a discrete product that stands on its own — but together they form a vertically integrated stack that solves the trustworthy-LLM problem end-to-end.

The stack at a glance

Each box is one of our products. Anyone can buy the Hub alone and run their own LLM stack on it; anyone can buy Δ UI as a managed product with the rest of the stack pre-wired. The interesting story is what happens when you have all four.

The four products

Server AI Hub — the platform

The managed, pre-configured infrastructure layer. Turns an NVIDIA GPU server into a Mac-style desktop: containers as buttons, observability already on, backups and antivirus self-scheduling, networking that just works. Everything else in the stack runs as containers on top of the Hub.

Closed source. Sold as a license per box (Standard) or per seat + per box (Enterprise).

Corpus + Graphs — the retrieval backend

The data plane. Ingest documents (PDF, web, code, structured), embed them through our hybrid dense + sparse + ColBERT pipeline (BGE-M3 stack), and index them in Qdrant for vector search and in a NetworkX-backed graph layer for relationships. Queries return passages + KG hops + provenance, ready to feed a generator.

Hop-bounded graph reasoning and entropy-bounded web augmentation are part of the same retrieval surface — the same query gets answered by static docs, the knowledge graph, and (optionally, with confidence gates) the live web.
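The three retrievers (dense, sparse, late-interaction) each produce their own ranking, so the results have to be fused into one list. The document doesn't specify the fusion method; one standard, illustrative approach is reciprocal-rank fusion (RRF), sketched here:

```python
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked passage lists (e.g. dense, sparse, ColBERT)
    into one ranking. `ranked_lists` holds lists of passage ids, best
    first. Standard RRF: score(p) = sum over lists of 1 / (k + rank).
    An illustrative sketch, not the shipped fusion logic."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, passage_id in enumerate(ranking, start=1):
            scores[passage_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

# Toy rankings from three hypothetical retrievers.
dense   = ["p3", "p1", "p7"]
sparse  = ["p1", "p3", "p9"]
colbert = ["p1", "p7", "p3"]
fused = reciprocal_rank_fusion([dense, sparse, colbert])
```

RRF's appeal for a hybrid stack like this is that it needs no score normalisation across retrievers — only ranks — so dense cosine similarities and sparse BM25-style weights never have to live on the same scale.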

Closed source. Ships as the griff-api + graph-explorer + bge-m3-sparse-embedder containers in the Hub.

HIFF / GRIFF-Δ — the science

HIFF — the Hallucination-Invariant Fidelity Framework — is the IP-protected ensemble methodology that turns a retrieval-grounded LLM response into a calibrated, per-claim fidelity score. GRIFF-Δ is the product name for the same engine when it ships in Δ UI.

The engine combines several techniques, all patent pending:

  • A 79-feature always-include whitelist spanning four families: direct token-prob signals, centroid distance, KNN coverage, and calibration features. Mean honest-test MCC ~0.79 on HaluEval-30K.
  • A 5-voter Panel-of-Judges stage-2 router (minicheck + qwen3-1.7B + qwen3-4B + Bespoke-MC + HEM-NLI) using a 6-rung deliberation flow: atomic decomposition, MARCH blinding, ReConcile deliberation, Dawid-Skene aggregation, Trust-or-Escalate, and per-claim Δ scoring.
  • A Chief Justice escalation pattern: when the panel disagrees beyond a calibrated threshold (under 5% of queries), the case escalates to a cloud DeepSeek tier-3 oracle. This bounds cost while keeping accuracy.
  • A contamination audit suite (A13 / A14 / A15) that detects whether a model's confidence on a given query is real or memorised from training data.
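Two of the named stages — Dawid-Skene aggregation and Trust-or-Escalate — are published techniques that can be sketched for binary supported/hallucinated verdicts. Everything below (the accuracy model, the uniform class prior, the ambiguity band) is a simplified illustration of the textbook algorithm, not the patented engine:

```python
import numpy as np

def dawid_skene(votes, n_iter=50):
    """Minimal binary Dawid-Skene EM over panel verdicts.
    `votes` is an (n_claims, n_voters) 0/1 array (1 = claim supported).
    Returns a posterior P(supported) per claim, learning each voter's
    reliability jointly with the labels. Uniform class prior assumed."""
    post = votes.mean(axis=1).astype(float)  # init with majority vote
    for _ in range(n_iter):
        # M-step: each voter's estimated accuracy on each true class.
        acc1 = (votes * post[:, None]).sum(0) / (post.sum() + 1e-9)
        acc0 = ((1 - votes) * (1 - post)[:, None]).sum(0) / ((1 - post).sum() + 1e-9)
        # E-step: per-claim likelihood of the observed vote pattern.
        like1 = np.prod(np.where(votes == 1, acc1, 1 - acc1), axis=1)
        like0 = np.prod(np.where(votes == 0, acc0, 1 - acc0), axis=1)
        post = like1 / (like1 + like0 + 1e-12)
    return post

def needs_chief_justice(post, band=(0.35, 0.65)):
    """Trust-or-Escalate: claims whose posterior stays inside the
    ambiguity band go to the tier-3 oracle. The band is hypothetical;
    the real threshold is calibrated."""
    return [i for i, p in enumerate(post) if band[0] < p < band[1]]

votes = np.array([[1, 1, 1, 0, 1],   # 4/5 voters say supported
                  [0, 0, 1, 0, 0],   # 4/5 voters say hallucinated
                  [1, 1, 1, 1, 1]])  # unanimous
post = dawid_skene(votes)
```

The point of Dawid-Skene over simple majority vote is that a consistently unreliable voter gets down-weighted automatically, so a 3-2 split where the 2 are the panel's strongest judges can still resolve the "right" way.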

Validated on HaluEval-30K and RAGTruth-OOS. Stage-2 LLM router adds ~+0.15 MCC on top of stage-1 features.

Closed source. Patent applications filed.

Δ UI — the chat interface

The user-facing product. A next-generation LLM chat UI where every prompt, every response, and every thread receives a quality score at inference time — no other LLM interface does this. Hallucinations are flagged inline on each chat response, before the user has to wonder whether to trust the answer.

  • Per-token H / J / Δ flags — every span of generated text is annotated with a Hallucination score (HIFF), a Judgment score (panel-of-judges), and a Δ confidence (delta vs the retrieval evidence). Hover any span to see the evidence and the score breakdown.
  • Quality Card per response — a small summary panel at the end of each answer: % grounded, % uncertain, sources cited, confidence band.
  • Thread Score — a rolling per-conversation fidelity score so the user knows whether the thread has drifted into speculation.
  • Source-by-source attribution — every claim is back-linked to the Corpus passage(s) that supported it.
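A rough sketch of how a Quality Card might roll up from per-claim scores. The field names, thresholds, and schema here are hypothetical illustrations, not the product's data model:

```python
from dataclasses import dataclass

@dataclass
class ClaimScore:
    """Per-span annotation as rendered in Δ UI: H (hallucination),
    J (judgment), and Δ (confidence vs retrieval evidence).
    Illustrative field names, not the shipped schema."""
    text: str
    h: float        # HIFF hallucination score: 0 = grounded, 1 = hallucinated
    j: float        # panel-of-judges verdict probability
    delta: float    # confidence delta vs the retrieval evidence
    sources: list   # ids of the Corpus passages backing this claim

def quality_card(claims, grounded_max_h=0.2, uncertain_band=(0.2, 0.6)):
    """Summarise one response: % grounded, % uncertain, sources cited.
    Thresholds are made up for the sketch; real ones would be calibrated."""
    n = len(claims)
    grounded = sum(c.h <= grounded_max_h for c in claims)
    uncertain = sum(uncertain_band[0] < c.h <= uncertain_band[1] for c in claims)
    sources = sorted({s for c in claims for s in c.sources})
    return {
        "pct_grounded": round(100 * grounded / n, 1),
        "pct_uncertain": round(100 * uncertain / n, 1),
        "sources": sources,
    }

claims = [
    ClaimScore("The Hub runs containers.", 0.1, 0.9, 0.8, ["doc1"]),
    ClaimScore("Backups are hourly.",      0.5, 0.6, 0.3, ["doc2"]),
    ClaimScore("It also brews coffee.",    0.9, 0.2, 0.1, []),
]
card = quality_card(claims)
```

The Thread Score described above would then be a rolling aggregate of these cards across the conversation — e.g. an exponentially weighted mean of `pct_grounded`, so recent drift into speculation shows up quickly.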

Δ UI is the only LLM interface (we believe, at time of writing) that shows you not just what the model said, but how confident the system is in each piece of it, scored at the moment of generation. Competes with Cohere North, Contextual, and Glean — at the high end of the trustworthy-LLM category.

Closed source. Sold as a license, often bundled with Hub + Corpus.

How a query flows through the stack

A user types a question into Δ UI.

  1. Δ UI sends the question to Corpus. Corpus runs the hybrid retrieval — dense (BGE-M3), sparse (BGE-M3-sparse), late-interaction (ColBERT) — over the indexed corpus, traverses the knowledge graph for related entities, optionally hits the live web within an entropy budget. Returns ranked passages + KG hops + provenance.
  2. Δ UI assembles the prompt and calls the generator (vLLM-served local model, or routed cloud model via LiteLLM). The LLM produces a draft response with token-level log-probabilities.
  3. GRIFF-Δ (HIFF engine) scores the draft. Stage 1 computes the 79 always-include features per claim. Stage 2 routes uncertain claims to the 5-voter Panel-of-Judges. Stage 3 escalates the panel's outliers to Chief Justice (cloud DeepSeek) — fires on under 5% of claims.
  4. Δ UI renders the response. Each claim is wrapped with its H / J / Δ flags; the Quality Card is computed; the Thread Score updates. The user sees not just the answer but the system's confidence in every part of it.
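The four steps above form a simple linear pipeline. A minimal orchestration sketch, where the four callables stand in for Corpus, the generator, GRIFF-Δ, and Δ UI — the signatures are illustrative, not the real product APIs:

```python
def answer(question, retrieve, generate, score, render):
    """One pass through the stack, step numbers matching the flow above."""
    evidence = retrieve(question)         # 1. Corpus: passages + KG hops + provenance
    draft = generate(question, evidence)  # 2. vLLM / LiteLLM: draft + token log-probs
    scores = score(draft, evidence)       # 3. GRIFF-Δ: per-claim fidelity scores
    return render(draft, scores)          # 4. Δ UI: flags, Quality Card, Thread Score

# Trivial stubs just to show the shape of the data flow.
result = answer(
    "What is the Hub?",
    retrieve=lambda q: ["passage-1"],
    generate=lambda q, ev: "The Hub is the platform.",
    score=lambda draft, ev: {"pct_grounded": 100.0},
    render=lambda draft, scores: (draft, scores),
)
```

Because each stage only consumes the previous stage's output, any one of them can be swapped (a different generator, a different scorer) without touching the others — which is presumably what lets the same GRIFF-Δ engine score both local vLLM models and routed cloud models.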

All four steps happen on infrastructure managed by Server AI Hub — containers, observability, backups, antivirus, networking all handled by the platform layer underneath.

What's patent pending

Filed under 6ix Labs IP:

  • The HIFF framework — the 79-feature always-include set + the four-family feature architecture + the calibration normalization approach.
  • The 5-voter Panel-of-Judges with the 6-rung deliberation flow (atomic decomp → MARCH blinding → ReConcile → Dawid-Skene → Trust-or-Escalate → per-claim Δ).
  • The Chief Justice escalation pattern (panel-then-oracle with cost-bounded fire rate).
  • The contamination audit triad (A13 / A14 / A15).
  • The per-claim Δ scoring rendering pattern (showing system confidence inline at token / span granularity).

Branding (Server AI Hub, Δ UI, GRIFF, HIFF) is trademarked: the product brands are public, while the technical methods are patent-pending IP.

Buy any one — get the rest free of integration cost

Customers can buy any one product in the stack. But the meaningful unlock is the bundle: a Hub running Δ UI, fed by Corpus, scored by GRIFF-Δ, is the only configuration where the entire trustworthy-LLM story is end-to-end ours. Buy the Hub for the infrastructure win; buy Δ UI for the trust win; buy them together and integration is a non-issue.

Extending the platform

The Hub is a closed, single-vendor ecosystem. Two routes exist to add capabilities beyond what ships, both detailed in Extending the Hub:

  • Tier 1 — 6ix Labs catalog: signed packages we build in-house (alternative vector stores, model servers, dev tools, workflow engines). Mix of bundled and paid add-ons. Demand-prioritised.
  • Tier 2 — Enterprise API: a documented API + SDK on a subscription. Enterprise customers integrate their own internal services without us as the bottleneck. Per-seat or per-call service fee.

We don't accept third-party submissions and don't run a community catalog. That's deliberate.