Governance-First AI for Failure-Mode Control
MAVS-GC: regulated consensus over always-on specialists for safer behaviour when evidence becomes uncertain, contradictory, corrupted, or unstable.
Separate specialist prediction from output governance. Every specialist evaluates every input; diagnostics raise red flags; severity is aggregated; contextual weights and bounded mitigation shape a governed acceptance threshold; and the final decision passes through an auditable consensus trace with a hard veto. The claim is not universal accuracy but failure-mode control: under high corruption, governed consensus suppresses unsafe acceptance by up to ~149× versus ensemble-like baselines and ~202× versus a single model, while preserving accuracy and stability.
Problem
Modern AI systems are typically optimised to maximise accuracy under clean conditions. But real deployments are not clean: evidence becomes uncertain, contradictory, corrupted, or unstable, and individual specialists can fail silently. In these regimes the dangerous outcome is not a wrong prediction but an unsafe acceptance: confidently admitting an input that should have been rejected. It is the failure mode safety-critical systems care about most, and the one accuracy alone does not measure.
Static ensembles and routing-based Mixture-of-Experts inherit this weakness because acceptance is a fixed threshold applied after model scoring. In a controlled false-positive trap, mean aggregation accepted 100% of unsafe cases and static weighted aggregation accepted 85%. The decision rule itself, not the detectors, was the failure point.
Method
MAVS-GC elevates governance into a first-class computational object. A system is a tuple of components: a shared feature map, a set of always-on specialists, a diagnostic system, a severity aggregator A, an influence rebalancer W, bounded mitigation P, a threshold map Θ, and a decision rule Π. Evaluation is all-speak: every specialist scores every input, so there is no router that can silently exclude a relevant specialist, a key difference from Mixture-of-Experts.
Specialists emit calibrated scores, which are converted to supports rᵢ. Diagnostics produce a severity a and a mitigation m, which move the governed threshold θ.
Three quantities define the rule: the governed threshold θ, the consensus R, and the decision Π.
Acceptance therefore requires two things at once: severity must stay below a hard veto, and governed consensus must clear the governed threshold. Every run emits an auditable trace (r, w, a, m, θ, τ_hard, R, Π), so a decision can always be reconstructed and explained.
Example computation of the trace:
| specialist | rᵢ | wᵢ | wᵢ·rᵢ |
|---|---|---|---|
| f1 | 0.82 | 0.34 | 0.279 |
| f2 | 0.74 | 0.33 | 0.244 |
| f3 | 0.68 | 0.33 | 0.224 |
Specialists agree, diagnostics are quiet. Consensus clears a low governed threshold — the input is accepted.
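The worked trace above can be reproduced in a few lines of Python. This is a minimal sketch, not the project's implementation. It assumes that consensus R is the weighted sum of supports (consistent with the wᵢ·rᵢ column) and that acceptance requires both a < τ_hard and R ≥ θ. The severity, threshold, and veto values (0.05, 0.60, 0.80) are hypothetical stand-ins for "diagnostics are quiet"; the page does not state them. The names `Trace` and `decide` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trace:
    """Auditable record of one decision (field names are illustrative)."""
    supports: tuple    # r: per-specialist supports
    weights: tuple     # w: contextual weights
    severity: float    # a: aggregated diagnostic severity
    theta: float       # governed acceptance threshold
    tau_hard: float    # hard-veto level on severity
    consensus: float   # R: weighted consensus
    accepted: bool     # final decision


def decide(supports, weights, severity, theta, tau_hard):
    # Assumed consensus form: weighted sum of supports.
    consensus = sum(w * r for w, r in zip(weights, supports))
    # The hard veto is checked first and cannot be overridden by consensus.
    vetoed = severity >= tau_hard
    accepted = (not vetoed) and consensus >= theta
    return Trace(tuple(supports), tuple(weights), severity,
                 theta, tau_hard, consensus, accepted)


# Supports and weights from the table; severity, theta, tau_hard are hypothetical.
trace = decide([0.82, 0.74, 0.68], [0.34, 0.33, 0.33],
               severity=0.05, theta=0.60, tau_hard=0.80)
print(round(trace.consensus, 3), trace.accepted)  # 0.747 True
```

Because the function returns the full record rather than a bare boolean, every input to the decision is available for later audit.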
Key intuition
Normal behaviour is easy to govern from ordinary evidence; abnormal, adverse behaviour is not. Because governance is explicit and monotone in severity, higher diagnostic severity can never make acceptance easier; it can only make it harder. This is monotone safety: increasing diagnostic severity can only raise the acceptance threshold, never lower it, so safety is a structural property of the decision rule, not a learned habit. Mitigation is bounded and lives inside the decision rule, so it can nudge a borderline case but can never override the hard veto.
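The two structural properties described here, monotonicity in severity and bounded mitigation, can be illustrated with a small sketch. The functional form is an assumption made for illustration only: a linear threshold in severity, with mitigation lowering it by at most a fixed cap. The actual threshold map is not given on this page, and all parameter values (`base`, `slope`, `max_mitigation`, `tau_hard`) are hypothetical.

```python
def governed_threshold(severity, mitigation,
                       base=0.5, slope=0.4, max_mitigation=0.1):
    """Hypothetical threshold map: non-decreasing in severity, bounded relief."""
    # Mitigation is clipped to [0, max_mitigation], so its effect is bounded.
    relief = min(max(mitigation, 0.0), max_mitigation)
    # Clipping to [0, 1] preserves monotonicity in severity.
    return min(1.0, max(0.0, base + slope * severity - relief))


def accept(consensus, severity, mitigation, tau_hard=0.8):
    # Hard veto first: no amount of mitigation or consensus can bypass it.
    if severity >= tau_hard:
        return False
    return consensus >= governed_threshold(severity, mitigation)


# For a fixed mitigation, the threshold never falls as severity rises.
severities = [i / 10 for i in range(11)]
thresholds = [governed_threshold(a, mitigation=0.05) for a in severities]
assert thresholds == sorted(thresholds)

print(accept(0.55, severity=0.2, mitigation=0.0))   # False: borderline case rejected
print(accept(0.55, severity=0.2, mitigation=0.1))   # True: bounded mitigation nudges it over
print(accept(1.0, severity=0.9, mitigation=10.0))   # False: the hard veto still applies
```

In this sketch safety does not depend on anything learned: the guarantees follow from the shape of `governed_threshold` and the order of checks in `accept`.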
The consequence is a clean separation of concerns: intelligence generation (the specialists) and intelligence governance (A, W, P, Θ, Π) become independent. Acceptance behaviour can be retuned — more cautious, more permissive, differently audited — without retraining a single specialist.
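As a rough illustration of this separation, the sketch below holds specialist outputs fixed and swaps only the governance settings. It assumes a weighted-sum consensus and a fixed threshold per configuration; the `Governance` class, the support values, and every parameter are hypothetical and not taken from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Governance:
    """Hypothetical governance configuration, independent of the specialists."""
    weights: tuple
    theta: float      # acceptance threshold
    tau_hard: float   # hard-veto level on severity

    def decide(self, supports, severity):
        if severity >= self.tau_hard:
            return False
        consensus = sum(w * r for w, r in zip(self.weights, supports))
        return consensus >= self.theta


# Frozen specialist outputs: nothing below retrains or modifies them.
supports = (0.70, 0.62, 0.58)
severity = 0.30

permissive = Governance(weights=(0.34, 0.33, 0.33), theta=0.55, tau_hard=0.80)
cautious = Governance(weights=(0.34, 0.33, 0.33), theta=0.70, tau_hard=0.60)

print(permissive.decide(supports, severity))  # True
print(cautious.decide(supports, severity))    # False
```

The same supports yield different acceptance behaviour purely through the governance object, which is the retuning path the paragraph above describes.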
Results
The program spans a formal Foundation Arc, a synthetic validation chapter, and three real-benchmark studies across four tabular datasets: Breast Cancer Wisconsin, Adult Income, Credit Card Fraud, and Bank Marketing. The strongest signal is robustness: under stress, MAVS-GC fails more safely.
Chapter 10B · accuracy vs. unsafe acceptance
Under specialist-failure corruption, Pure MAVS-GC keeps accuracy high while unsafe acceptance stays near zero — roughly 20× lower than ensemble baselines and 34× lower than a single model.
| Setting | Headline | Detail |
|---|---|---|
| Specialist failure | 1.35% | Unsafe acceptance for Pure MAVS-GC at 89.95% accuracy; ~20× lower than ensembles, ~34× lower than a single model. |
| High corruption (≥ 0.6) | 0.45% | Unsafe acceptance at 85.30% accuracy; ~149× lower than ensemble-like baselines, ~202× lower than a single model. |
| Hard-veto compliance | 100% | In synthetic validation, governance reduced unsafe acceptance from the 100% (mean aggregation) and 85% (static weighted aggregation) baselines with zero hard-veto violations. |
| Clean accuracy (10A) | 79 / 288 | Competitive but not dominant: positive metric deltas in 79 of 288 comparisons. Governance shifts the error profile, not the ceiling. |
| Metric | Pure MAVS-GC | Baseline |
|---|---|---|
| Prediction stability | 0.971615 | 0.952713 |
| Decision stability | 0.975770 | 0.958762 |
| Consensus stability | 0.979332 | 0.963946 |
| Trace stability | 0.967976 | 0.959693 |
Interpretation: MAVS-GC is a failure-management, robustness, and safety-oriented governance architecture — not a pure accuracy-maximiser. Under stress it rejects more cautiously, suppresses unsafe acceptance, and preserves behavioural consistency.
Limitations & future work
The current evaluation is rigorous but bounded. It covers four tabular datasets, a fixed suite of corruption families, a controlled split and audit structure, reproducibility manifests, and verified artifact trails. It does not yet establish production-scale behaviour, LLM-agent behaviour, universal robustness superiority, or cross-domain generalisation beyond the tested benchmark suite.
The most valuable next step is external-scale validation: larger datasets, additional modalities, LLM and agent specialist settings, adversarial expansions, ablation matrices, and independent replication. The open question is whether the observed failure-management and stability-preservation effects survive at larger scale and in more realistic, safety-critical multi-model systems.
Support sought: research feedback, compute credits, review of experimental design, guidance on scalable evaluation, and collaboration on governance-first evaluation for LLM agents.
Citation
If you find this work useful, please cite it as:
@misc{malik2026mavsgc,
title = {MAVS-GC: Governance-First AI for Failure-Mode Control},
author = {Malik, Saif},
year = {2026},
howpublished = {Preprint, MAVS Research Program},
url = {https://github.com/MAVS-RESEARCH}
}