Remediation Playbook
Remediation Playbook
Section titled “Remediation Playbook”Use this playbook to convert llm.* findings into concrete engineering work.
Fix Order (Recommended)
Section titled “Fix Order (Recommended)”- Safety-critical paths
- Reliability and runtime controls
- Governance and rollout controls
- Cost and optimization
1) Safety-Critical Paths
Section titled “1) Safety-Critical Paths”PII Leakage
Section titled “PII Leakage”- Watch:
llm.pii_leakage_risk - Fix: Add PII classification/redaction and explicit prompt field allowlists.
- Validate:
llm.pii_leakage_risk == 0
Prompt Injection and Instruction Boundaries
Section titled “Prompt Injection and Instruction Boundaries”- Watch:
llm.prompt_injection_surface,llm.instruction_boundary_violation - Fix: Separate system/user/tool channels; sanitize or delimit untrusted text.
- Validate: both keys trend downward; confidence remains stable
Insecure Output Handling
Section titled “Insecure Output Handling”- Watch:
llm.insecure_output_handling,llm.structured_output_enforcement_gap - Fix: Enforce strict schema validation before output reaches SQL, shell, UI, or workflow actions.
- Validate: lower output-handling gaps and fewer critical findings
Sensitive Telemetry
Section titled “Sensitive Telemetry”- Watch:
llm.sensitive_info_in_telemetry,llm.system_prompt_leakage - Fix: Redact prompt/response payloads and never log sensitive instruction content.
- Validate: both metrics trend toward
0
2) Reliability and Runtime Controls
Section titled “2) Reliability and Runtime Controls”Fallback, Rate Limits, Streaming, Idempotency
Section titled “Fallback, Rate Limits, Streaming, Idempotency”- Watch:
llm.fallback_absence,llm.rate_limit_absence,llm.streaming_risk,llm.cache_idempotency_gap - Fix: Add bounded retries with jitter, timeout budgets, fallback models, idempotency keys, and response caching.
- Validate: resilience metrics trend down and incident retry/429 rates drop
Context and Consumption Controls
Section titled “Context and Consumption Controls”- Watch:
llm.context_budget_absence,llm.unbounded_consumption,llm.cost_tracking_gap - Fix: enforce token budgets, retrieval limits (
top_k, windowing), request quotas, and cost attribution. - Validate: lower context/cost/consumption gaps
Observability
Section titled “Observability”- Watch:
llm.observability_gap,llm.genai_otel_semconv_gap - Fix: add traces/logs around LLM calls with model, usage, latency, and status fields.
- Validate: lower observability and OTel gaps, better triage speed
3) Governance and Rollout Controls
Section titled “3) Governance and Rollout Controls”Eval and Regression Coverage
Section titled “Eval and Regression Coverage”- Watch:
llm.eval_harness_absence,llm.eval_presence_score,llm.eval_quality_score - Fix: add golden datasets, adversarial and stochastic checks, and CI gating for critical flows.
- Validate: absence decreases; quality sub-scores increase
Versioning and Prompt Governance
Section titled “Versioning and Prompt Governance”- Watch:
llm.model_version_unpinned,llm.prompt_hardcoding_score,llm.template_governance_gap - Fix: pin model versions; move prompts to versioned templates with ownership and changelog.
- Validate: versioning/governance gaps decrease; prompt score increases
Model Rollout Safety
Section titled “Model Rollout Safety”- Watch:
llm.model_rollout_guardrail_gap - Fix: implement shadow/canary rollout, objective eval gates, and automatic rollback triggers.
- Validate: rollout guardrail gap trends toward
0
4) Tooling and Supply Chain Controls
Section titled “4) Tooling and Supply Chain Controls”MCP Authentication and Contracts
Section titled “MCP Authentication and Contracts”- Watch:
llm.mcp_authz_gap,llm.mcp_tool_contract_gap, additive MCP keys (llm.mcp_oauth21_gap,llm.mcp_pkce_gap,llm.mcp_tool_output_schema_gap, etc.) - Fix: require authN/authZ on MCP surfaces and strict runtime schema contracts for tool I/O.
- Validate: aggregate and additive MCP gaps trend down
Supply Chain, Poisoning, and Retrieval Boundaries
Section titled “Supply Chain, Poisoning, and Retrieval Boundaries”- Watch:
llm.supply_chain_risk,llm.data_model_poisoning_exposure,llm.vector_embedding_weakness,llm.embedding_drift_risk - Fix: pin/verify dependencies, validate ingestion pipelines, enforce tenant metadata filters, and align embedding/index versions.
- Validate: risk keys trend toward
0
Data-Quality Check Before Tightening Gates
Section titled “Data-Quality Check Before Tightening Gates”If these diagnostics are degraded, prioritize restoring analysis fidelity first:
llm.blast_radius_available = 0llm.pii_taint_used = 0- high
llm.call_sites_unresolved_count
Version note: this page is aligned with metric version 1.2.0.