Policy and CI Gates

Use these policies to move from reporting ML architecture risk to enforcing thresholds on it in CI.

Strict Policy (Production ML Systems)

```yaml
metrics:
  - id: ml_architecture

policy:
  invariants:
    # Aggregate scores
    - metric: ml_architecture.overall_score
      op: ">="
      value: 0.80
      severity: error
      message: "Overall ML architecture health baseline not met"
    - metric: ml_architecture.overall_score_extended
      op: ">="
      value: 0.75
      severity: error
      message: "Extended ML architecture health baseline not met"
    # Per-category scores
    - metric: ml_architecture.train_serve_skew_score
      op: ">="
      value: 0.85
      severity: error
      message: "Train/serve skew controls are insufficient"
    - metric: ml_architecture.reproducibility_score
      op: ">="
      value: 0.80
      severity: error
      message: "Reproducibility controls are insufficient"
    - metric: ml_architecture.data_lineage_integrity_score
      op: ">="
      value: 0.80
      severity: error
      message: "Lineage and artifact immutability baseline not met"
    - metric: ml_architecture.eval_integrity_score
      op: ">="
      value: 0.75
      severity: error
      message: "Evaluation integrity baseline not met"
    - metric: ml_architecture.serving_maturity_score
      op: ">="
      value: 0.75
      severity: error
      message: "Serving maturity baseline not met"
    - metric: ml_architecture.drift_monitoring_score
      op: ">="
      value: 0.75
      severity: error
      message: "Drift monitoring baseline not met"
    - metric: ml_architecture.monitoring_alerting_score
      op: ">="
      value: 0.80
      severity: error
      message: "Monitoring and alerting baseline not met"
    - metric: ml_architecture.attestation_enforcement_score
      op: ">="
      value: 0.75
      severity: error
      message: "Deploy-time attestation enforcement baseline not met"
    - metric: ml_architecture.model_registry_governance_score
      op: ">="
      value: 0.75
      severity: error
      message: "Model registry governance baseline not met"
```

Pragmatic Policy (Existing/Large Codebases)

```yaml
metrics:
  - id: ml_architecture

policy:
  invariants:
    # Aggregate score
    - metric: ml_architecture.overall_score
      op: ">="
      value: 0.65
      severity: warning
      message: "Raise ML architecture health above minimum baseline"
    # Per-category scores
    - metric: ml_architecture.train_serve_skew_score
      op: ">="
      value: 0.70
      severity: warning
      message: "Reduce train/serve skew risk over time"
    - metric: ml_architecture.reproducibility_score
      op: ">="
      value: 0.65
      severity: warning
      message: "Increase reproducibility coverage"
    - metric: ml_architecture.data_lineage_integrity_score
      op: ">="
      value: 0.60
      severity: warning
      message: "Improve lineage/versioning hygiene"
    - metric: ml_architecture.eval_integrity_score
      op: ">="
      value: 0.55
      severity: warning
      message: "Improve evaluation integrity checks"
    - metric: ml_architecture.serving_maturity_score
      op: ">="
      value: 0.55
      severity: warning
      message: "Improve serving maturity controls"
    - metric: ml_architecture.adversarial_resilience_score
      op: ">="
      value: 0.55
      severity: warning
      message: "Increase adversarial resilience validation over time"
```

Baseline No-Regression Policy

```yaml
metrics:
  - id: ml_architecture

policy:
  # Baseline values are taken from the origin/main git ref
  baseline:
    mode: git
    ref: origin/main
  invariants:
    # baseline: true compares against the baseline value instead of a fixed threshold
    - metric: ml_architecture.overall_score
      op: ">="
      baseline: true
      severity: error
      message: "Overall ML architecture health regressed vs baseline"
    - metric: ml_architecture.train_serve_skew_score
      op: ">="
      baseline: true
      severity: error
      message: "Train/serve skew posture regressed vs baseline"
    - metric: ml_architecture.reproducibility_score
      op: ">="
      baseline: true
      severity: error
      message: "Reproducibility posture regressed vs baseline"
    - metric: ml_architecture.data_lineage_integrity_score
      op: ">="
      baseline: true
      severity: error
      message: "Data lineage posture regressed vs baseline"
```

CI Command Examples

```bash
# Focused ML architecture run
arxo analyze --path . --metric ml_architecture

# AI preset run (includes ml_architecture)
arxo analyze --path . --preset ai --config arxo.yml --fail-fast
```
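
To show where the second command might sit in a pipeline, here is a minimal sketch of a GitHub Actions workflow. Several things in it are assumptions, not statements from this page: the choice of GitHub Actions as the CI system, that `arxo` is already installed and on `PATH` on the runner, and that the command exits non-zero when an error-severity invariant fails. The workflow and job names are hypothetical.

```yaml
# Hypothetical workflow file, e.g. .github/workflows/ml-architecture-gate.yml
name: ml-architecture-gate

on:
  pull_request:

jobs:
  arxo-gate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          # Fetch full history so origin/main is available
          # if the policy uses a git baseline
          fetch-depth: 0

      # Assumption: arxo is already installed on the runner;
      # this page does not describe an install step.
      - name: Run AI preset with policy gates
        run: arxo analyze --path . --preset ai --config arxo.yml --fail-fast
```

The full-history checkout matters only when the policy compares against a git ref such as `origin/main`; a shallow clone may not contain that ref.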

Rollout Guidance

1. Start with warning-level thresholds for one to two release cycles.
2. Fix recurring low-score categories in high-centrality modules first.
3. Promote strict gates to error severity once score trends stabilize.
4. Keep baseline no-regression checks enabled throughout so that scores cannot slip backward between releases.
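
The first and last points of this guidance can be combined in a single policy: absolute thresholds stay at warning severity while regressions are held at error severity. The sketch below assumes that fixed-`value` and `baseline: true` invariants may coexist in one `invariants` list and that the same metric may appear in both; this page does not confirm either. The threshold and messages are copied from the pragmatic and no-regression policies.

```yaml
metrics:
  - id: ml_architecture

policy:
  baseline:
    mode: git
    ref: origin/main
  invariants:
    # Absolute floor, warning-only during the initial rollout window
    - metric: ml_architecture.overall_score
      op: ">="
      value: 0.65
      severity: warning
      message: "Raise ML architecture health above minimum baseline"
    # No-regression check at error severity
    # (assumption: mixing both invariant kinds is supported)
    - metric: ml_architecture.overall_score
      op: ">="
      baseline: true
      severity: error
      message: "Overall ML architecture health regressed vs baseline"
```

When trends stabilize, promoting the gate would amount to raising `value` toward the strict threshold and switching the first invariant's `severity` to `error`.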

Gate Strategy: `overall_score` vs `overall_score_extended`

Use `ml_architecture.overall_score` when you need backward-compatible gating focused on core and operational controls.

Use `ml_architecture.overall_score_extended` when advanced governance/resilience controls are in scope and you want those detectors to influence release gates.

Recommended adoption path:

1. Start with `ml_architecture.overall_score` as a required gate and track `ml_architecture.overall_score_extended` as warning-only.
2. Add explicit invariants for critical additive detectors (for example: attestation enforcement, registry governance, adversarial resilience).
3. Promote `ml_architecture.overall_score_extended` to required once additive-detector trends are stable.
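
The first two steps of this path might look like the following policy. Treat it as a sketch: mapping "required" to `severity: error` is an assumption, and the thresholds are borrowed from the strict and pragmatic policies above as placeholders, not as recommendations.

```yaml
metrics:
  - id: ml_architecture

policy:
  invariants:
    # Step 1: the core/operational aggregate is the required gate
    - metric: ml_architecture.overall_score
      op: ">="
      value: 0.80
      severity: error
      message: "Overall ML architecture health baseline not met"
    # Step 1: the extended aggregate is tracked as warning-only
    - metric: ml_architecture.overall_score_extended
      op: ">="
      value: 0.75
      severity: warning
      message: "Extended ML architecture health baseline not met"
    # Step 2: explicit invariants for critical additive detectors
    - metric: ml_architecture.attestation_enforcement_score
      op: ">="
      value: 0.75
      severity: error
      message: "Deploy-time attestation enforcement baseline not met"
    - metric: ml_architecture.model_registry_governance_score
      op: ">="
      value: 0.75
      severity: error
      message: "Model registry governance baseline not met"
    - metric: ml_architecture.adversarial_resilience_score
      op: ">="
      value: 0.55
      severity: warning
      message: "Increase adversarial resilience validation over time"
```

The third step would then be a one-line change: switching `severity` on the `overall_score_extended` invariant from `warning` to `error`.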
