Policy and CI Gates
Use these policies to move from visibility to enforcement for ML architecture risk.
Strict Policy (Production ML Systems)

    metrics:
      - id: ml_architecture
    policy:
      invariants:
        - metric: ml_architecture.overall_score
          op: ">="
          value: 0.80
          severity: error
          message: "Overall ML architecture health baseline not met"
        - metric: ml_architecture.overall_score_extended
          op: ">="
          value: 0.75
          severity: error
          message: "Extended ML architecture health baseline not met"
        - metric: ml_architecture.train_serve_skew_score
          op: ">="
          value: 0.85
          severity: error
          message: "Train/serve skew controls are insufficient"
        - metric: ml_architecture.reproducibility_score
          op: ">="
          value: 0.80
          severity: error
          message: "Reproducibility controls are insufficient"
        - metric: ml_architecture.data_lineage_integrity_score
          op: ">="
          value: 0.80
          severity: error
          message: "Lineage and artifact immutability baseline not met"
        - metric: ml_architecture.eval_integrity_score
          op: ">="
          value: 0.75
          severity: error
          message: "Evaluation integrity baseline not met"
        - metric: ml_architecture.serving_maturity_score
          op: ">="
          value: 0.75
          severity: error
          message: "Serving maturity baseline not met"
        - metric: ml_architecture.drift_monitoring_score
          op: ">="
          value: 0.75
          severity: error
          message: "Drift monitoring baseline not met"
        - metric: ml_architecture.monitoring_alerting_score
          op: ">="
          value: 0.80
          severity: error
          message: "Monitoring and alerting baseline not met"
        - metric: ml_architecture.attestation_enforcement_score
          op: ">="
          value: 0.75
          severity: error
          message: "Deploy-time attestation enforcement baseline not met"
        - metric: ml_architecture.model_registry_governance_score
          op: ">="
          value: 0.75
          severity: error
          message: "Model registry governance baseline not met"

Pragmatic Policy (Existing/Large Codebases)

    metrics:
      - id: ml_architecture
    policy:
      invariants:
        - metric: ml_architecture.overall_score
          op: ">="
          value: 0.65
          severity: warning
          message: "Raise ML architecture health above minimum baseline"
        - metric: ml_architecture.train_serve_skew_score
          op: ">="
          value: 0.70
          severity: warning
          message: "Reduce train/serve skew risk over time"
        - metric: ml_architecture.reproducibility_score
          op: ">="
          value: 0.65
          severity: warning
          message: "Increase reproducibility coverage"
        - metric: ml_architecture.data_lineage_integrity_score
          op: ">="
          value: 0.60
          severity: warning
          message: "Improve lineage/versioning hygiene"
        - metric: ml_architecture.eval_integrity_score
          op: ">="
          value: 0.55
          severity: warning
          message: "Improve evaluation integrity checks"
        - metric: ml_architecture.serving_maturity_score
          op: ">="
          value: 0.55
          severity: warning
          message: "Improve serving maturity controls"
        - metric: ml_architecture.adversarial_resilience_score
          op: ">="
          value: 0.55
          severity: warning
          message: "Increase adversarial resilience validation over time"

Baseline No-Regression Policy

    metrics:
      - id: ml_architecture
    policy:
      baseline:
        mode: git
        ref: origin/main
      invariants:
        - metric: ml_architecture.overall_score
          op: ">="
          baseline: true
          severity: error
          message: "Overall ML architecture health regressed vs baseline"
        - metric: ml_architecture.train_serve_skew_score
          op: ">="
          baseline: true
          severity: error
          message: "Train/serve skew posture regressed vs baseline"
        - metric: ml_architecture.reproducibility_score
          op: ">="
          baseline: true
          severity: error
          message: "Reproducibility posture regressed vs baseline"
        - metric: ml_architecture.data_lineage_integrity_score
          op: ">="
          baseline: true
          severity: error
          message: "Data lineage posture regressed vs baseline"

CI Command Examples

    # Focused ML architecture run
    arxo analyze --path . --metric ml_architecture

    # AI preset run (includes ml_architecture)
    arxo analyze --path . --preset ai --config arxo.yml --fail-fast

Rollout Guidance
- Start with warning-level thresholds for 1-2 release cycles.
- Fix recurring low-score categories in high-centrality modules first.
- Promote strict gates to error severity once trends stabilize.
- Keep baseline no-regression checks enabled to prevent drift.
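
As a rough illustration of the promotion step, the fragment below sketches a policy partway through a rollout. It assumes only the invariant fields already shown on this page (metric, op, value, severity, message) and reuses the pragmatic thresholds. Which metric stabilizes first is hypothetical, and train/serve skew is picked here only as an example.

```yaml
metrics:
  - id: ml_architecture
policy:
  invariants:
    # Assumed to have stabilized after 1-2 release cycles: promoted to error.
    - metric: ml_architecture.train_serve_skew_score
      op: ">="
      value: 0.70
      severity: error
      message: "Reduce train/serve skew risk over time"
    # Still trending: left at warning so failures stay visible without blocking.
    - metric: ml_architecture.overall_score
      op: ">="
      value: 0.65
      severity: warning
      message: "Raise ML architecture health above minimum baseline"
    - metric: ml_architecture.reproducibility_score
      op: ">="
      value: 0.65
      severity: warning
      message: "Increase reproducibility coverage"
```

Promoting one invariant at a time keeps each change small enough to review, and the thresholds can later be raised toward the strict policy values.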
Gate Strategy: overall_score vs overall_score_extended
Use ml_architecture.overall_score when you need backward-compatible gating focused on core and operational controls.
Use ml_architecture.overall_score_extended when advanced governance/resilience controls are in scope and you want those detectors to influence release gates.
Recommended adoption path:
- Start with ml_architecture.overall_score as required and track ml_architecture.overall_score_extended as warning-only.
- Add explicit invariants for critical additive detectors (for example: attestation enforcement, registry governance, adversarial resilience).
- Promote ml_architecture.overall_score_extended to required once additive-detector trends are stable.
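
The first two steps of this path might look like the sketch below. It reuses the metric names, thresholds, and messages from the strict policy above. Treating "required" as severity error and "warning-only" as severity warning is an assumption, as is the choice of error severity for the explicit attestation invariant.

```yaml
metrics:
  - id: ml_architecture
policy:
  invariants:
    # Step 1: the core score is the required gate.
    - metric: ml_architecture.overall_score
      op: ">="
      value: 0.80
      severity: error
      message: "Overall ML architecture health baseline not met"
    # Step 1: the extended score is tracked but does not block releases yet.
    - metric: ml_architecture.overall_score_extended
      op: ">="
      value: 0.75
      severity: warning
      message: "Extended ML architecture health baseline not met"
    # Step 2: explicit invariant for one critical additive detector.
    - metric: ml_architecture.attestation_enforcement_score
      op: ">="
      value: 0.75
      severity: error
      message: "Deploy-time attestation enforcement baseline not met"
```

For the final step, the only change would be switching the overall_score_extended invariant from warning to error.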