
check_llm_integration

Check the health of LLM integrations in your codebase. This tool detects observability gaps, PII leakage risks, cost tracking issues, prompt governance problems, and resilience anti-patterns in code that calls LLM providers (OpenAI, Anthropic, etc.).

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `project_path` | string | Yes | Absolute or relative path to the project root directory |

Returns a JSON summary with LLM integration health scores and findings.

```jsonc
{
  "llm_integration": {
    "health_score": "number",      // 0-1 overall health (1 = perfect)
    "observability_gap": "number", // Count of unobserved LLM calls
    "pii_leakage_risk": "number",  // Count of potential PII leaks to LLM
    "cost_tracking_gap": "number", // Count of calls without token/cost tracking
    "prompt_hardcoding": "number", // Count of hardcoded prompts (should be in a registry)
    "model_coupling": "number",    // Count of direct model dependencies (should use an adapter)
    "fallback_absence": "number"   // Count of calls without fallback/retry logic
  },
  "findings": [
    {
      "title": "string",
      "severity": "error|warning|info",
      "evidence_count": "number",
      "description": "string"
    }
  ],
  "violations_count": "number"
}
```
| Score | Grade | Interpretation |
| --- | --- | --- |
| 0.8 - 1.0 | ✅ Excellent | Production-ready LLM integration |
| 0.6 - 0.8 | ⚠️ Good | Minor issues, review findings |
| 0.4 - 0.6 | ⚠️ Fair | Address observability and cost tracking |
| 0.2 - 0.4 | 🚨 Poor | Major gaps — not production-ready |
| 0.0 - 0.2 | 🚨 Critical | Immediate action required |
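
The grade bands above map directly to a threshold lookup. A minimal sketch (hypothetical helper, not part of the tool's API):

```python
def grade(health_score: float) -> str:
    """Map a 0-1 health score to the grade bands in the table above."""
    if health_score >= 0.8:
        return "Excellent"
    if health_score >= 0.6:
        return "Good"
    if health_score >= 0.4:
        return "Fair"
    if health_score >= 0.2:
        return "Poor"
    return "Critical"
```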

Request:

```json
{
  "project_path": "."
}
```

Response:

```json
{
  "llm_integration": {
    "health_score": 0.87,
    "observability_gap": 0,
    "pii_leakage_risk": 0,
    "cost_tracking_gap": 1,
    "prompt_hardcoding": 2,
    "model_coupling": 0,
    "fallback_absence": 0
  },
  "findings": [
    {
      "title": "Cost tracking gap",
      "severity": "warning",
      "evidence_count": 1,
      "description": "1 LLM call missing token usage logging"
    },
    {
      "title": "Prompt hardcoding",
      "severity": "info",
      "evidence_count": 2,
      "description": "2 inline prompts detected — consider moving to prompt registry"
    }
  ],
  "violations_count": 0
}
```

Request:

```json
{
  "project_path": "/path/to/project"
}
```

Response:

```json
{
  "llm_integration": {
    "health_score": 0.42,
    "observability_gap": 5,
    "pii_leakage_risk": 2,
    "cost_tracking_gap": 8,
    "prompt_hardcoding": 12,
    "model_coupling": 3,
    "fallback_absence": 6
  },
  "findings": [
    {
      "title": "Observability gap",
      "severity": "error",
      "evidence_count": 5,
      "description": "LLM calls without tracing or logging detected"
    },
    {
      "title": "PII leakage risk",
      "severity": "error",
      "evidence_count": 2,
      "description": "User data sent to LLM without redaction"
    },
    {
      "title": "Cost tracking gap",
      "severity": "warning",
      "evidence_count": 8,
      "description": "Missing token usage logging and budget alerts"
    },
    {
      "title": "Fallback absence",
      "severity": "warning",
      "evidence_count": 6,
      "description": "No timeout, retry, or fallback configuration"
    }
  ],
  "violations_count": 2
}
```

Interpretation:

  • Health score 0.42 — needs improvement before production
  • 5 LLM calls lack observability (add tracing/logging)
  • 2 PII leakage risks (add redaction)
  • 8 calls don’t track token usage (add cost monitoring)
  • 6 calls have no fallback logic (add retries/timeouts)

Issue: LLM calls without tracing or structured logging

Fix:

  • Add OpenTelemetry spans around LLM calls
  • Log request/response metadata (model, tokens, latency)
  • Use structured logging libraries (e.g., winston, slog)
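
The same idea in a stdlib-only sketch: wrap the provider call and emit one structured log line per request. The `client.complete` call is a placeholder for your provider SDK; an OpenTelemetry span would wrap the same boundary.

```python
import json
import logging
import time

logger = logging.getLogger("llm")

def call_llm_logged(client, model: str, prompt: str):
    """Call an LLM and log request/response metadata as structured JSON."""
    start = time.monotonic()
    response = client.complete(model=model, prompt=prompt)  # placeholder SDK call
    logger.info(json.dumps({
        "event": "llm.call",
        "model": model,
        "latency_ms": round((time.monotonic() - start) * 1000),
        "prompt_chars": len(prompt),
        "total_tokens": getattr(response, "total_tokens", None),
    }))
    return response
```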

Issue: User data sent to LLM without redaction

Fix:

  • Implement PII detection and redaction before LLM calls
  • Use allowlists for data fields sent to LLMs
  • Add audit logging for all data sent to external LLM providers
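
A minimal redaction pass can run before any text leaves your service. This sketch masks only obvious email/phone patterns; real PII detection needs a dedicated library or service:

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def redact(text: str) -> str:
    """Mask obvious PII patterns before text is sent to an external LLM."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text
```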

Issue: Missing token usage logging and budget monitoring

Fix:

  • Log usage.total_tokens from LLM responses
  • Implement per-user or per-request cost tracking
  • Set up budget alerts (e.g., daily spend limits)
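
A per-user spend tracker can be this small; the flat token price and daily budget below are assumed values for illustration:

```python
from collections import defaultdict

PRICE_PER_1K_TOKENS = 0.002   # assumed flat rate; real pricing varies by model
DAILY_BUDGET_USD = 50.0       # assumed per-user daily limit

spend = defaultdict(float)    # user_id -> spend today (reset daily)

def record_usage(user_id: str, total_tokens: int) -> bool:
    """Accumulate cost from usage.total_tokens; return True if over budget."""
    spend[user_id] += total_tokens / 1000 * PRICE_PER_1K_TOKENS
    return spend[user_id] > DAILY_BUDGET_USD
```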

Issue: Inline prompts make versioning/A-B testing difficult

Fix:

  • Move prompts to a prompt registry or template system
  • Version prompts separately from application code
  • Use prompt management tools (e.g., LangSmith, PromptLayer)
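
In its simplest form, a prompt registry is just prompts kept as versioned data instead of string literals at the call site (a sketch, not a substitute for a real prompt-management tool):

```python
# Versioned prompt registry: prompts live in data, not inline in code,
# so v1 and v2 can be A/B tested without an application deploy.
PROMPTS = {
    ("summarize", "v1"): "Summarize the following text:\n{text}",
    ("summarize", "v2"): "Summarize the text below in three bullet points:\n{text}",
}

def render_prompt(name: str, version: str, **params: str) -> str:
    """Look up a prompt template by (name, version) and fill its parameters."""
    return PROMPTS[(name, version)].format(**params)
```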

Issue: Direct dependencies on specific LLM providers

Fix:

  • Introduce an adapter/interface layer
  • Use provider-agnostic SDKs (e.g., LiteLLM, LangChain)
  • Make model selection configurable
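
An adapter layer can be as thin as one interface plus a factory. The adapter below is a stub (the model name and SDK call are placeholders), but it shows where provider coupling gets isolated:

```python
from typing import Protocol

class LLMClient(Protocol):
    """Provider-agnostic interface; callers never import a vendor SDK."""
    def complete(self, prompt: str) -> str: ...

class OpenAIAdapter:
    def __init__(self, model: str = "gpt-4o-mini"):  # assumed default model
        self.model = model

    def complete(self, prompt: str) -> str:
        # The vendor SDK call would go here; stubbed for illustration.
        raise NotImplementedError

def make_client(provider: str) -> LLMClient:
    """Model selection stays configurable: swap providers without touching callers."""
    if provider == "openai":
        return OpenAIAdapter()
    raise ValueError(f"unknown provider: {provider}")
```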

Issue: No timeout, retry, or fallback configuration

Fix:

  • Add timeouts to LLM calls (e.g., 30s)
  • Implement exponential backoff retry logic
  • Add fallback models or cached responses
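
The three fixes above compose into one wrapper: retry the primary model with exponential backoff, then degrade to a fallback (cheaper model or cached response). A sketch:

```python
import time

def call_with_resilience(primary, fallback, prompt: str,
                         retries: int = 3, base_delay: float = 0.5):
    """Retry `primary` with exponential backoff; fall back after exhausting retries."""
    for attempt in range(retries):
        try:
            # primary() should enforce its own request timeout (e.g. 30s)
            return primary(prompt)
        except Exception:
            time.sleep(base_delay * 2 ** attempt)
    return fallback(prompt)  # cheaper model or cached response
```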

Use this tool as part of an LLM integration audit:

1. check_llm_integration → assess health score
2. If health_score < 0.6:
   a. Review findings and prioritize fixes
   b. Add observability (tracing, logging)
   c. Implement PII redaction
   d. Add cost tracking and budgets
3. Re-run check_llm_integration to verify improvements
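
The audit loop above is easy to automate in CI: parse the tool's JSON output and fail the build below the 0.6 threshold. A sketch using the field names from the schema above:

```python
import json

def gate(report_json: str, threshold: float = 0.6) -> bool:
    """Return True if the LLM integration health score meets the threshold."""
    report = json.loads(report_json)
    score = report["llm_integration"]["health_score"]
    if score < threshold:
        for finding in report.get("findings", []):
            print(f"[{finding['severity']}] {finding['title']}: {finding['description']}")
        return False
    return True
```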

See Workflows: LLM integration audit for a full example.

| Error | Cause | Solution |
| --- | --- | --- |
| `missing required parameter: project_path` | `project_path` not provided | Include `project_path` in the request |
| `llm_integration` metric not in results | No LLM calls detected in codebase | Verify the project uses LLM providers (OpenAI, Anthropic, etc.) |
| `ai` preset may not include it | Engine version mismatch | Update to the latest arxo version |
  • Speed: 5-15 seconds (uses ai preset with 5 AI-related metrics)
  • Caching: Does not use cache (always fresh analysis)
  • Scalability: Handles projects with 1k+ LLM call sites efficiently
```sh
# Check LLM integration health
arxo analyze --preset ai --format json | jq '.results[] | select(.id=="llm_integration")'
```