Policy and CI Gates

Use these policies to move from reporting agent risks to enforcing limits on them in CI.

Strict Policy (Production Agent Systems)

metrics:
  - id: agent_architecture

policy:
  invariants:
    - metric: agent_architecture.loop_guard_absence
      op: "<="
      value: 0.10
      severity: error
      message: "Agent loops must be strongly bounded"
    - metric: agent_architecture.tool_policy_absence
      op: "<="
      value: 0.10
      severity: error
      message: "Tools must be policy-scoped and controlled"
    - metric: agent_architecture.schema_validation_gap
      op: "<="
      value: 0.15
      severity: error
      message: "Tool I/O must be schema-validated"
    - metric: agent_architecture.agent_eval_absence
      op: "<="
      value: 0.10
      severity: error
      message: "Eval harness must be present for agent flows"
    - metric: agent_architecture.agent_reliability_score
      op: ">="
      value: 80
      severity: error
      message: "Reliability baseline not met"
    - metric: agent_architecture.governance_readiness
      op: ">="
      value: 85
      severity: error
      message: "Governance baseline not met"
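To make the gate semantics concrete, here is a minimal Python sketch of how invariants like the ones above could be evaluated. It is not arxo's implementation. The `Invariant` class, the `check` function, and the measured values are hypothetical, and the sketch assumes an invariant passes when `measured op value` holds.

```python
import operator
from dataclasses import dataclass

# The two comparison operators used in the policy's `op` field.
OPS = {"<=": operator.le, ">=": operator.ge}


@dataclass
class Invariant:
    metric: str
    op: str
    value: float
    severity: str
    message: str


def check(invariants, measured):
    """Return the invariants whose condition `measured op value` does not hold."""
    violations = []
    for inv in invariants:
        actual = measured[inv.metric]
        if not OPS[inv.op](actual, inv.value):
            violations.append(inv)
    return violations


# Two of the strict invariants, copied from the policy above.
strict = [
    Invariant("agent_architecture.loop_guard_absence", "<=", 0.10,
              "error", "Agent loops must be strongly bounded"),
    Invariant("agent_architecture.agent_reliability_score", ">=", 80,
              "error", "Reliability baseline not met"),
]

# Hypothetical measurements, for illustration only.
measured = {
    "agent_architecture.loop_guard_absence": 0.18,
    "agent_architecture.agent_reliability_score": 84,
}

violations = check(strict, measured)
for inv in violations:
    print(f"{inv.severity}: {inv.metric}: {inv.message}")
```

With these sample values only the loop guard invariant is violated, because 0.18 exceeds the 0.10 limit while a reliability score of 84 clears the floor of 80.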

Pragmatic Policy (Existing/Large Codebases)

metrics:
  - id: agent_architecture

policy:
  invariants:
    - metric: agent_architecture.loop_guard_absence
      op: "<="
      value: 0.25
      severity: warning
      message: "Add loop guards to remaining unbounded flows"
    - metric: agent_architecture.tool_policy_absence
      op: "<="
      value: 0.30
      severity: warning
      message: "Increase tool allowlist and scope coverage"
    - metric: agent_architecture.retry_storm_risk
      op: "<="
      value: 0.25
      severity: warning
      message: "Reduce retry storm risk with backoff and limits"
    - metric: agent_architecture.agent_reliability_score
      op: ">="
      value: 70
      severity: warning
      message: "Improve agent reliability score over time"

Baseline No-Regression Policy

metrics:
  - id: agent_architecture

policy:
  baseline:
    mode: git
    ref: origin/main
  invariants:
    - metric: agent_architecture.loop_guard_absence
      op: "<="
      baseline: true
      severity: error
      message: "Loop guard coverage regressed vs baseline"
    - metric: agent_architecture.tool_policy_absence
      op: "<="
      baseline: true
      severity: error
      message: "Tool governance regressed vs baseline"
    - metric: agent_architecture.overall_agent_health
      op: ">="
      baseline: true
      severity: error
      message: "Overall agent health must not regress"
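A no-regression gate compares the current value with the value measured at the baseline ref rather than with a fixed threshold. The following is a minimal sketch of that idea, assuming `baseline: true` means "use the baseline ref's value as the right-hand side of `op`". The `regressions` function and all numbers are hypothetical.

```python
import operator

OPS = {"<=": operator.le, ">=": operator.ge}


def regressions(invariants, current, baseline):
    """Return (metric, message) pairs where `current op baseline` fails."""
    failed = []
    for metric, op, message in invariants:
        if not OPS[op](current[metric], baseline[metric]):
            failed.append((metric, message))
    return failed


invariants = [
    ("agent_architecture.loop_guard_absence", "<=",
     "Loop guard coverage regressed vs baseline"),
    ("agent_architecture.overall_agent_health", ">=",
     "Overall agent health must not regress"),
]

# Hypothetical values: `baseline` as measured at origin/main,
# `current` as measured on the branch under review.
baseline = {
    "agent_architecture.loop_guard_absence": 0.22,
    "agent_architecture.overall_agent_health": 74,
}
current = {
    "agent_architecture.loop_guard_absence": 0.20,
    "agent_architecture.overall_agent_health": 71,
}

failed = regressions(invariants, current, baseline)
for metric, message in failed:
    print(f"error: {metric}: {message}")
```

In this example the loop guard metric improved (0.20 vs 0.22) and passes, while overall health dropped from 74 to 71 and is flagged. A fixed threshold would not catch that drop if 71 were still above the floor.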

CI Command Examples

# Run all AI metrics (includes agent_architecture)
arxo analyze --path . --preset ai

# CI-style run with policy file
arxo analyze --path . --preset ai --config arxo.yml --fail-fast
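If the pipeline is driven from a script rather than a raw shell step, the CI-style command above can be wrapped as sketched below. The flags are copied from that command. The helper names are hypothetical, and the sketch assumes, without confirmation from the tool's documentation, that a nonzero exit code signals a failed gate.

```python
import shutil
import subprocess
import sys


def build_command(path=".", config="arxo.yml"):
    """Assemble the CI-style command shown above as an argv list."""
    return ["arxo", "analyze", "--path", path, "--preset", "ai",
            "--config", config, "--fail-fast"]


def run_gate(path=".", config="arxo.yml"):
    """Run the gate and return its exit code (assumed nonzero on failure)."""
    if shutil.which("arxo") is None:
        print("arxo not found on PATH", file=sys.stderr)
        return 127
    return subprocess.run(build_command(path, config)).returncode


# In a CI script, propagate the result with: sys.exit(run_gate())
```

Passing the command as a list avoids shell quoting issues when the path or config name contains spaces.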

Rollout Guidance

1. Start with warning-level gates for one to two release cycles.
2. Fix the most frequent findings first (loop_guard_absence, tool_policy_absence, agent_eval_absence).
3. Tighten thresholds gradually and switch key gates from warning to error.
4. Keep the no-regression baseline checks enabled to prevent drift.
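One way to tighten thresholds gradually is to plan the intermediate values up front, moving from the pragmatic thresholds to the strict ones in even steps. The sketch below is only an illustration of that planning. `tightening_schedule` is a hypothetical helper, and even spacing is an assumption rather than a recommendation from the tool.

```python
def tightening_schedule(start, target, steps):
    """Evenly spaced thresholds from `start` to `target`, inclusive."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    delta = (target - start) / (steps - 1)
    # Round to two decimals so the values are readable in a policy file.
    return [round(start + i * delta, 2) for i in range(steps)]


# loop_guard_absence: pragmatic 0.25 down to strict 0.10 over four releases.
loop_guard = tightening_schedule(0.25, 0.10, 4)

# agent_reliability_score: pragmatic 70 up to strict 80 over three releases.
reliability = tightening_schedule(70, 80, 3)

print(loop_guard)
print(reliability)
```

Each release then updates the corresponding `value` in the policy file to the next entry in its schedule, and the gate's severity can move to error once the final threshold is met.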