# Policy and CI Gates

Use these profiles to roll out `llm_integration` safely in CI.
## Balanced Policy (Recommended)

```yaml
metrics:
  - id: llm_integration
    policy:
      invariants:
        - metric: llm.pii_leakage_risk
          op: "=="
          value: 0
          severity: error
          message: "PII must not reach prompts without controls"
        - metric: llm.prompt_injection_surface
          op: "<="
          value: 0.2
          severity: error
          message: "Prompt injection exposure is too high"
        - metric: llm.observability_gap
          op: "<="
          value: 0.25
          severity: warning
          message: "Improve observability around LLM calls"
        - metric: llm.fallback_absence
          op: "<="
          value: 0.25
          severity: warning
          message: "Fallback and timeout coverage should be improved"
        - metric: llm.rate_limit_absence
          op: "<="
          value: 0.25
          severity: warning
          message: "Rate limiting/backoff coverage should be improved"
        - metric: llm.structured_output_enforcement_gap
          op: "<="
          value: 0.3
          severity: warning
          message: "Schema enforcement for model outputs is too weak"
        - metric: llm.overall_integration_health
          op: ">="
          value: 0.7
          severity: warning
          message: "Overall LLM architecture health baseline not met"
```

## Strict Policy (Production-Sensitive)
```yaml
metrics:
  - id: llm_integration
    policy:
      invariants:
        - metric: llm.pii_leakage_risk
          op: "=="
          value: 0
          severity: error
          message: "PII must be redacted before LLM prompts"
        - metric: llm.prompt_injection_surface
          op: "<="
          value: 0.1
          severity: error
          message: "Prompt injection surface must be minimized"
        - metric: llm.observability_gap
          op: "<="
          value: 0.15
          severity: error
          message: "LLM call observability coverage is insufficient"
        - metric: llm.fallback_absence
          op: "<="
          value: 0.2
          severity: error
          message: "Fallback strategy is required"
        - metric: llm.rate_limit_absence
          op: "<="
          value: 0.2
          severity: error
          message: "Rate limiting/backoff is required"
        - metric: llm.supply_chain_risk
          op: "<="
          value: 0.2
          severity: error
          message: "Model and tool dependencies must be pinned and verified"
        - metric: llm.data_model_poisoning_exposure
          op: "<="
          value: 0.2
          severity: error
          message: "Ingestion and index paths must resist poisoning"
        - metric: llm.system_prompt_leakage
          op: "<="
          value: 0.1
          severity: error
          message: "System prompts must not leak in logs or responses"
        - metric: llm.mcp_authz_gap
          op: "<="
          value: 0.2
          severity: error
          message: "MCP surfaces require strong authN/authZ"
        - metric: llm.mcp_tool_contract_gap
          op: "<="
          value: 0.2
          severity: error
          message: "MCP tools must enforce explicit input/output contracts"
        - metric: llm.genai_otel_semconv_gap
          op: "<="
          value: 0.3
          severity: error
          message: "GenAI OTel semantic conventions are required"
        - metric: llm.model_rollout_guardrail_gap
          op: "<="
          value: 0.2
          severity: error
          message: "Model rollout guardrails are required"
        - metric: llm.overall_integration_health
          op: ">="
          value: 0.8
          severity: error
          message: "Overall LLM architecture health baseline not met"
```

## Exploratory Policy (Early Adoption)
```yaml
metrics:
  - id: llm_integration
    policy:
      invariants:
        - metric: llm.observability_gap
          op: "<="
          value: 0.35
          severity: warning
          message: "Improve observability coverage"
        - metric: llm.cost_tracking_gap
          op: "<="
          value: 0.35
          severity: warning
          message: "Increase token and cost tracking coverage"
        - metric: llm.eval_harness_absence
          op: "<="
          value: 0.5
          severity: warning
          message: "Add eval coverage for critical call paths"
        - metric: llm.overall_integration_health
          op: ">="
          value: 0.65
          severity: warning
          message: "Incrementally raise LLM architecture health"
```

## Baseline No-Regression Policy
```yaml
metrics:
  - id: llm_integration
    policy:
      baseline:
        mode: git
        ref: origin/main
      invariants:
        - metric: llm.overall_integration_health
          op: ">="
          baseline: true
          severity: error
          message: "Overall LLM architecture health regressed"
        - metric: llm.pii_leakage_risk
          op: "<="
          baseline: true
          severity: error
          message: "PII leakage risk regressed"
        - metric: llm.prompt_injection_surface
          op: "<="
          baseline: true
          severity: error
          message: "Prompt injection risk regressed"
```

## CI Commands
```sh
# Focused metric run
arxo analyze --path . --metric llm_integration --format json

# AI preset run (includes llm_integration)
arxo analyze --path . --preset ai --config arxo.yml --fail-fast
```
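These commands can be wired into any CI system. A minimal sketch assuming GitHub Actions; the install step is a placeholder, since the source does not specify how `arxo` is distributed:

```yaml
# Hypothetical GitHub Actions job — adapt the install step to your
# actual arxo distribution method.
name: llm-policy-gate
on: [pull_request]

jobs:
  llm_integration:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          # Full history so baseline mode (git ref: origin/main) can resolve
          fetch-depth: 0
      - name: Install arxo (placeholder step)
        run: pip install arxo   # assumption — replace with your install method
      - name: Run LLM integration gate
        run: arxo analyze --path . --preset ai --config arxo.yml --fail-fast
```

With `--fail-fast`, the job exits non-zero on the first violated error-severity invariant, which is what makes the policy act as a merge gate.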
## Rollout Sequence

1. Start with the Exploratory or Balanced profile for one to two release cycles.
2. Fix recurring findings in PII, injection, fallback, rate limiting, and structured outputs.
3. Promote key controls to the Strict profile for production-critical services.
4. Keep baseline no-regression checks enabled permanently.
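Promoting a control (step 3) amounts to tightening its threshold and raising its severity within the same invariant schema. For example, the fallback control as it appears in the Balanced versus Strict profiles:

```yaml
# Balanced profile: advisory, looser threshold
- metric: llm.fallback_absence
  op: "<="
  value: 0.25
  severity: warning
  message: "Fallback and timeout coverage should be improved"

# Strict profile: blocking, tighter threshold
- metric: llm.fallback_absence
  op: "<="
  value: 0.2
  severity: error
  message: "Fallback strategy is required"
```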
## Degraded-Analysis Mode

If `llm.blast_radius_available = 0` or `llm.pii_taint_used = 0`, analysis fidelity is reduced.

Recommended temporary posture:

- Keep baseline no-regression gates enabled.
- Keep hard safety gates (`llm.pii_leakage_risk`, `llm.prompt_injection_surface`) with conservative thresholds.
- Use warnings for newly introduced strict keys until diagnostics recover.
- Treat persistent degraded diagnostics as a platform issue and fix parser/call-graph availability.
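This posture can be expressed in the same policy schema as the profiles above. A sketch that keeps the hard safety gates as errors, retains the baseline check, and demotes one strict-only key to a warning (the demoted key and its threshold are illustrative, taken from the Strict profile):

```yaml
metrics:
  - id: llm_integration
    policy:
      baseline:
        mode: git
        ref: origin/main
      invariants:
        # Hard safety gates stay as errors with conservative thresholds
        - metric: llm.pii_leakage_risk
          op: "=="
          value: 0
          severity: error
          message: "PII must be redacted before LLM prompts"
        - metric: llm.prompt_injection_surface
          op: "<="
          value: 0.1
          severity: error
          message: "Prompt injection surface must be minimized"
        # Baseline no-regression gate stays enabled
        - metric: llm.overall_integration_health
          op: ">="
          baseline: true
          severity: error
          message: "Overall LLM architecture health regressed"
        # Newly introduced strict keys run as warnings until diagnostics recover
        - metric: llm.genai_otel_semconv_gap
          op: "<="
          value: 0.3
          severity: warning
          message: "GenAI OTel semantic conventions are required"
```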
Version note: this page is aligned with metric version 1.2.0.