Preprint·2026·9 min read·Updated Jun 2026·DOI pending

Governance-First AI for
Failure-Mode Control

MAVS-GC: regulated consensus over always-on specialists for safer behaviour when evidence becomes uncertain, contradictory, corrupted, or unstable.

Saif Malik·MAVS Research Program·InfernusReal·ssaifmalikk@gmail.com
TLDR

Separate specialist prediction from output governance. Every specialist evaluates every input; diagnostics raise red flags; severity is aggregated; contextual weights and bounded mitigation shape a governed acceptance threshold; and the final decision passes through an auditable consensus trace with a hard veto. The claim is not universal accuracy but failure-mode control: under corruption and specialist failure, governed consensus cuts unsafe acceptance by roughly 20× to 200× relative to the tested baselines (up to ~149× versus ensemble-like aggregation and ~202× versus a single model) while preserving accuracy and stability.

01

Problem

Modern AI systems are typically optimised to maximise accuracy under clean conditions. But real deployments are not clean: evidence becomes uncertain, contradictory, corrupted, or unstable, and individual specialists can fail silently. In these regimes the dangerous outcome is not a wrong prediction; it is an unsafe acceptance [1]: confidently admitting an input that should have been rejected.

[1] An unsafe acceptance is admitting an input that should have been rejected. It is the failure mode safety-critical systems care about most, and the one accuracy alone does not measure.

Static ensembles and routing-based Mixture-of-Experts inherit this weakness because acceptance is a fixed threshold applied after model scoring. In a controlled false-positive trap, mean aggregation accepted 100% of unsafe cases and static weighted aggregation accepted 85%. The decision rule itself, not the detectors, was the failure point.
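To make the failure concrete, here is a minimal sketch of a fixed-threshold mean aggregator. The scores, the 0.5 threshold, and the function name are illustrative assumptions, not values or code from the experiments. The point is structural: the diagnostic flag has no path into the decision.

```python
from statistics import mean


def mean_aggregate_accept(scores, threshold=0.5):
    """Accept when the mean specialist score clears a fixed threshold.

    Nothing about evidence quality enters this rule, so a diagnostic
    red flag cannot influence the outcome.
    """
    return mean(scores) >= threshold


# Hypothetical trap: specialists are confidently wrong on a corrupted input.
scores = [0.91, 0.88, 0.86]   # assumed values, for illustration only
corruption_flagged = True     # a detector fired, but the rule never reads it

accepted = mean_aggregate_accept(scores)
print(accepted)  # True: the unsafe input is admitted despite the flag
```

Under these assumptions the detector did its job and the input was still accepted, which is the sense in which the decision rule, not the detector, is the failure point.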

02

Method

MAVS-GC elevates governance into a first-class computational object. A system is the tuple $M = (X, \Phi, F, G, A, W, P, \Theta, \Pi)$: a shared feature map $\Phi$, a set of always-on specialists $F$ [2], a diagnostic system $G$, a severity aggregator $A$, an influence rebalancer $W$, bounded mitigation $P$, a threshold map $\Theta$, and a decision rule $\Pi$.

[2] All-speak evaluation: every specialist scores every input. There is no router that can silently exclude a relevant specialist, a key difference from Mixture-of-Experts.

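[Figure 1 diagram. Main path: input x → Φ (x ↦ φ) → all-speak specialists F (f₁ → r₁, f₂ → r₂, …, f_n → r_n) → consensus R = Σ wᵢrᵢ → decision Π (accept, or reject · veto). Governance block: G → z → a (severity), P → m (mitigation), θ = θ₀ + λa − δm, gate a < τ_hard; it supplies θ and the veto to Π.]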
Figure 1. The MAVS-GC pipeline. Input x is mapped to features φ and scored by all specialists into a governed consensus R. A separate governance block turns diagnostics into severity a, mitigation m, and a threshold θ; the final decision Π accepts only when consensus clears θ and severity stays below the hard veto.

Specialists emit calibrated scores $s_i \in [0,1]$, converted to supports $r_i = 2s_i - 1$. Diagnostics produce a severity $a = A(z)$ and mitigation $m$, which move a governed threshold.

Governed threshold

$$\theta = \Theta(a,m) = \theta_0 + \lambda a - \delta m$$

Consensus

$$R(x) = \sum_i w_i\, r_i$$

Decision

$$\Pi(R,\theta,a) = \mathbb{1}[\,a < \tau_{\text{hard}}\,]\cdot \mathbb{1}[\,R \geq \theta\,]$$

Acceptance therefore requires two things at once: severity must stay below a hard veto, and governed consensus must clear the governed threshold. Every run emits an auditable trace $(r, w, z, a, m, \theta, \tau_{\text{hard}}, R, \Pi)$, so a decision can always be reconstructed and explained.
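The three formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation: the function name `govern`, the `Trace` container, and every numeric value are assumptions chosen for the example. Severity `a` and mitigation `m` are taken as given, so the diagnostic vector z is left out of the trace.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trace:
    """Record of one governed decision (diagnostic vector z omitted here)."""
    r: tuple
    w: tuple
    a: float
    m: float
    theta: float
    tau_hard: float
    R: float
    decision: int


def govern(scores, weights, a, m, theta0, lam, delta, tau_hard):
    # Supports: map calibrated scores in [0, 1] to r_i = 2*s_i - 1.
    r = tuple(2.0 * s - 1.0 for s in scores)
    # Consensus: R = sum_i w_i * r_i.
    R = sum(w_i * r_i for w_i, r_i in zip(weights, r))
    # Governed threshold: theta = theta0 + lam*a - delta*m.
    theta = theta0 + lam * a - delta * m
    # Decision: accept only if severity is below the veto AND R clears theta.
    decision = int(a < tau_hard and R >= theta)
    return Trace(r, tuple(weights), a, m, theta, tau_hard, R, decision)


# Hypothetical parameters, for illustration only.
params = dict(theta0=0.2, lam=0.5, delta=0.1, tau_hard=0.8)
scores, weights = [0.9, 0.8, 0.85], [0.4, 0.3, 0.3]

ok = govern(scores, weights, a=0.1, m=0.0, **params)
vetoed = govern(scores, weights, a=0.9, m=0.0, **params)

print(ok.decision, round(ok.R, 2), round(ok.theta, 2))              # 1 0.71 0.25
print(vetoed.decision, round(vetoed.R, 2), round(vetoed.theta, 2))  # 0 0.71 0.65
```

In the second call the consensus still clears the raised threshold (0.71 ≥ 0.65), yet the input is rejected because severity exceeds the hard veto. That is the case the product of the two indicators is there to catch.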

Auditable trace

Computation of (r, w, a, m, θ, τ_hard, R, Π)

| specialist | rᵢ | wᵢ | wᵢ·rᵢ |
| --- | --- | --- | --- |
| f1 | 0.82 | 0.34 | 0.279 |
| f2 | 0.74 | 0.33 | 0.244 |
| f3 | 0.68 | 0.33 | 0.224 |

severity a = 0.05
mitigation m = 0.20
θ = θ₀ + λa − δm = 0.065
τ_hard = 0.80
R = 0.747 ≥ θ (0.065)
Π = 1 · ACCEPT

Specialists agree, diagnostics are quiet. Consensus clears a low governed threshold — the input is accepted.

Figure 2. An auditable trace for a scenario in which the input is accepted. In other scenarios, elevated severity raises the threshold (rejecting borderline inputs) and the hard veto overrides an otherwise-positive consensus.
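The numbers in the Figure 2 trace can be checked by hand. The trace reports θ but not the underlying θ₀, λ, and δ, so the sketch below takes θ = 0.065 as given rather than deriving it.

```latex
\begin{align*}
R &= \sum_i w_i r_i = (0.34)(0.82) + (0.33)(0.74) + (0.33)(0.68) \\
  &= 0.2788 + 0.2442 + 0.2244 = 0.7474 \approx 0.747 \\
\Pi &= \mathbb{1}[\,0.05 < 0.80\,] \cdot \mathbb{1}[\,0.747 \geq 0.065\,] = 1 \cdot 1 = 1
\end{align*}
```

Both indicators are satisfied, which matches the reported ACCEPT.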
03

Key intuition

Normal behaviour is easy to govern from ordinary evidence; abnormal, adverse behaviour is not. Because governance is explicit and monotone in severity [3], higher diagnostic severity can never make acceptance easier; it can only make it harder. Mitigation is bounded and lives inside the decision rule, so it can nudge a borderline case but can never override the hard veto.

[3] Monotone safety: increasing diagnostic severity can only raise the acceptance threshold, never lower it. Safety is a structural property of the decision rule, not a learned habit.
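The monotonicity claim follows directly from the form of the threshold and decision rule. The short derivation below rests on assumptions not stated explicitly in the text: that λ ≥ 0 and δ ≥ 0, that mitigation is bounded by some m_max (a symbol introduced here), and that the consensus R and mitigation m are held fixed while severity varies.

```latex
\begin{align*}
\theta(a, m) &= \theta_0 + \lambda a - \delta m,
  \qquad \lambda \ge 0,\ \delta \ge 0,\ 0 \le m \le m_{\max} \\
\frac{\partial \theta}{\partial a} &= \lambda \ge 0 \\
a_1 \le a_2 \;&\Rightarrow\; \theta(a_2, m) - \theta(a_1, m) = \lambda\,(a_2 - a_1) \ge 0 \\
&\Rightarrow\; \mathbb{1}[R \ge \theta(a_2, m)] \le \mathbb{1}[R \ge \theta(a_1, m)]
  \quad\text{and}\quad
  \mathbb{1}[a_2 < \tau_{\text{hard}}] \le \mathbb{1}[a_1 < \tau_{\text{hard}}] \\
&\Rightarrow\; \Pi(R, \theta(a_2, m), a_2) \le \Pi(R, \theta(a_1, m), a_1) \\
\theta(a, m) &\ge \theta_0 + \lambda a - \delta\, m_{\max},
  \qquad a \ge \tau_{\text{hard}} \;\Rightarrow\; \Pi = 0 \ \text{for every } m
\end{align*}
```

The last line covers both halves of the mitigation claim: the threshold can drop by at most δ·m_max, and no value of m changes the outcome once the veto indicator is zero.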

The consequence is a clean separation of concerns: intelligence generation (the specialists) and intelligence governance (A, W, P, Θ, Π) become independent. Acceptance behaviour can be retuned — more cautious, more permissive, differently audited — without retraining a single specialist.
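A small sketch of what this separation could look like in practice: the specialist outputs are computed once, and only the governance parameters change. The `Governance` class and all numbers are hypothetical, chosen to show one input flipping from accept to reject under a stricter policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Governance:
    """Governance parameters only; no specialist is referenced or retrained."""
    theta0: float
    lam: float
    delta: float
    tau_hard: float

    def decide(self, R, a, m):
        theta = self.theta0 + self.lam * a - self.delta * m
        return int(a < self.tau_hard and R >= theta)


# Specialist outputs, computed once and then frozen (assumed values).
supports = [0.5, 0.3, 0.4]
weights = [0.4, 0.3, 0.3]
R = sum(w * r for w, r in zip(weights, supports))  # approximately 0.41
a, m = 0.4, 0.1

permissive = Governance(theta0=0.1, lam=0.5, delta=0.2, tau_hard=0.8)
cautious = Governance(theta0=0.3, lam=1.0, delta=0.2, tau_hard=0.6)

print(permissive.decide(R, a, m))  # 1: threshold 0.28 is cleared
print(cautious.decide(R, a, m))    # 0: threshold 0.68 is not cleared
```

The same consensus yields different outcomes purely because the governance object changed.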

04

Results

The program spans a formal Foundation Arc, a synthetic validation chapter, and three real-benchmark studies across four tabular datasets: Breast Cancer Wisconsin, Adult Income, Credit Card Fraud, and Bank Marketing. The strongest signal is robustness: under stress, MAVS-GC fails more safely.

Failure behaviour under corruption

Chapter 10B · accuracy vs. unsafe acceptance

| System | Accuracy (higher is better) | Unsafe acceptance (lower is better) |
| --- | --- | --- |
| Pure MAVS-GC (ours) | 89.95% | 1.35% |
| Mean / Veto | 74.31% | 27.29% |
| Single model | 59.46% | 45.42% |

Under specialist-failure corruption, Pure MAVS-GC keeps accuracy high while unsafe acceptance stays near zero — roughly 20× lower than ensemble baselines and 34× lower than a single model.
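The reduction factors quoted here are ratios of unsafe-acceptance rates. The snippet below recomputes them from the Chapter 10B figures reported above; the variable names are illustrative.

```python
# Unsafe-acceptance rates (%) under specialist-failure corruption, as reported.
unsafe_acceptance = {
    "Pure MAVS-GC": 1.35,
    "Mean / Veto": 27.29,
    "Single model": 45.42,
}

ours = unsafe_acceptance["Pure MAVS-GC"]

# Reduction factor = baseline rate divided by the MAVS-GC rate.
factors = {
    name: rate / ours
    for name, rate in unsafe_acceptance.items()
    if name != "Pure MAVS-GC"
}

for name, factor in factors.items():
    print(f"{name}: {factor:.1f}x")
# Mean / Veto: 20.2x
# Single model: 33.6x
```

These round to the ~20× and ~34× figures in the text.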

Figure 3. Accuracy versus unsafe acceptance under corruption (Chapter 10B). Pure MAVS-GC holds high accuracy while keeping unsafe acceptance near zero; baselines degrade sharply. The values shown are for the specialist-failure regime; the high-corruption regime is summarised below.

Specialist failure: 1.35%. Unsafe acceptance for Pure MAVS-GC at 89.95% accuracy, ~20× lower than ensembles and ~34× lower than a single model.

High corruption (≥ 0.6): 0.45%. Unsafe acceptance at 85.30% accuracy, ~149× lower than ensemble-like baselines and ~202× lower than a single model.

Hard-veto compliance: 100%. In synthetic validation, governance reduced unsafe acceptance relative to the 100% (mean aggregation) and 85% (static weighted aggregation) baselines, with zero hard-veto violations.

Clean accuracy (10A): 79 / 288. Competitive but not dominant: positive metric deltas in 79 of 288 comparisons. Governance shifts the error profile, not the ceiling.

Stability under corruption · Chapter 10C

| Metric | Pure MAVS-GC | Baseline |
| --- | --- | --- |
| Prediction stability | 0.971615 | 0.952713 |
| Decision stability | 0.975770 | 0.958762 |
| Consensus stability | 0.979332 | 0.963946 |
| Trace stability | 0.967976 | 0.959693 |
Table 1. Behavioural stability under corruption (Chapter 10C). MAVS-GC preserves prediction, decision, consensus, and trace stability more strongly than the aggregation baseline as corruption increases.

Interpretation: MAVS-GC is a failure-management, robustness, and safety-oriented governance architecture — not a pure accuracy-maximiser. Under stress it rejects more cautiously, suppresses unsafe acceptance, and preserves behavioural consistency.

05

Limitations & future work

The current evaluation is rigorous but bounded. It covers four tabular datasets, a fixed suite of corruption families, a controlled split and audit structure, reproducibility manifests, and verified artifact trails. It does not yet establish production-scale behaviour, LLM-agent behaviour, universal robustness superiority, or cross-domain generalisation beyond the tested benchmark suite.

The most valuable next step is external-scale validation: larger datasets, additional modalities, LLM and agent specialist settings, adversarial expansions, ablation matrices, and independent replication. The open questions are whether the observed failure-management and stability-preservation effects survive at larger scale and in more realistic, safety-critical multi-model systems.

Support sought: research feedback, compute credits, review of experimental design, guidance on scalable evaluation, and collaboration on governance-first evaluation for LLM agents.

06

Citation

If you find this work useful, please cite it as:

@misc{malik2026mavsgc,
  title        = {MAVS-GC: Governance-First AI for Failure-Mode Control},
  author       = {Malik, Saif},
  year         = {2026},
  howpublished = {Preprint, MAVS Research Program},
  url          = {https://github.com/MAVS-RESEARCH}
}