
ML Architecture

This guide shows an end-to-end workflow for improving ml_architecture health in production ML systems.

  1. Run an ML architecture audit.
  2. Triage findings by detector family and blast radius.
  3. Apply targeted remediation.
  4. Enforce policy gates in CI.
  5. Prevent regressions with baseline checks.

Run the audit with either the focused metric or the AI preset:

# Focused ML architecture metric
arxo analyze --path . --metric ml_architecture --format json

# AI preset (includes ml_architecture)
arxo analyze --path . --preset ai --format json
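
If the report needs to feed later triage or gating steps, the audit can be wrapped in a small script. The sketch below reuses only the flags shown above. It assumes the JSON report is written to stdout as a single document, which this guide does not confirm. `run_audit` and its `runner` parameter are hypothetical names introduced for illustration.

```python
import json
import subprocess

# Same command as the focused audit above, split into arguments.
AUDIT_CMD = [
    "arxo", "analyze",
    "--path", ".",
    "--metric", "ml_architecture",
    "--format", "json",
]


def run_audit(cmd=AUDIT_CMD, runner=subprocess.run):
    """Run the audit command and return the parsed JSON report.

    Assumption: the report is printed to stdout as one JSON document.
    `runner` is injectable so the function can be tested without the CLI.
    """
    result = runner(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)
```

Calling `run_audit()` with the defaults requires the `arxo` binary on `PATH`. With `check=True`, a non-zero exit raises `subprocess.CalledProcessError` rather than returning a partial report.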

Prioritize in this order:

  1. train_serve_skew_score, train_inference_boundary_score
  2. reproducibility_score, data_lineage_integrity_score
  3. eval_integrity_score, data_validation_score, ci_integration_score
  4. serving_maturity_score, drift_monitoring_score, monitoring_alerting_score, model_staleness_score
  5. ab_testing_score, shadow_canary_score, serving_ops_score

After each batch of fixes, re-run the audit and check how ml_architecture.overall_score has moved.
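
The priority order above can be applied mechanically once the scores are available. The following is a minimal sketch, assuming the scores have already been extracted into a flat mapping of metric name to a number where higher is better. That shape, the `triage` function, and the `0.7` threshold are illustrative assumptions, not part of the Arxo report format.

```python
# Priority tiers copied from the ordering in this guide (tier 1 = fix first).
PRIORITY_TIERS = [
    ["train_serve_skew_score", "train_inference_boundary_score"],
    ["reproducibility_score", "data_lineage_integrity_score"],
    ["eval_integrity_score", "data_validation_score", "ci_integration_score"],
    ["serving_maturity_score", "drift_monitoring_score",
     "monitoring_alerting_score", "model_staleness_score"],
    ["ab_testing_score", "shadow_canary_score", "serving_ops_score"],
]


def triage(scores, threshold=0.7):
    """Return (tier, metric, score) tuples for scores below `threshold`.

    Results are ordered by tier first, then by lowest score within a tier.
    `scores` is a hypothetical flat dict; `threshold` is an arbitrary example.
    """
    findings = []
    for tier, names in enumerate(PRIORITY_TIERS, start=1):
        low = [(tier, name, scores[name])
               for name in names
               if name in scores and scores[name] < threshold]
        findings.extend(sorted(low, key=lambda item: item[2]))
    return findings


if __name__ == "__main__":
    example = {
        "serving_ops_score": 0.1,
        "train_serve_skew_score": 0.6,
        "train_inference_boundary_score": 0.4,
        "reproducibility_score": 0.9,
    }
    for tier, name, score in triage(example):
        print(f"tier {tier}: {name} = {score}")
```

Ordering by tier before score means a weak tier-1 metric is always listed ahead of a weaker tier-5 metric, which matches the intent of fixing train/serve issues first.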

  • Architecture track: shared train/serve transforms, boundary cleanup, DAG simplification.
  • Repro/lineage track: seed policy, lock files, immutable dataset/model references.
  • Eval/quality track: leakage-safe splits, data validation contracts, fairness and CI checks.
  • Serving ops track: warmup, health/readiness, drift/staleness monitoring, alerting, canary/A/B rollout.
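
As an illustration of the architecture track, the sketch below shows one way to share a transform between training and serving so that both paths compute features identically. The feature names (`amount`, `age`), the `stats` dictionary, and all function names are hypothetical; in a real system the shared function would live in one module that both pipelines import.

```python
import math


def transform(record, stats):
    """Single feature transform used by BOTH training and serving.

    `stats` holds statistics fitted on training data (hypothetical keys).
    """
    amount = math.log1p(max(record["amount"], 0.0))
    age = (record["age"] - stats["age_mean"]) / stats["age_std"]
    return [amount, age]


def build_training_rows(records, stats):
    """Training path: apply the shared transform to every record."""
    return [transform(record, stats) for record in records]


def serve_features(request, stats):
    """Serving path: apply the same transform to one incoming request."""
    return transform(request, stats)


if __name__ == "__main__":
    stats = {"age_mean": 30.0, "age_std": 10.0}
    record = {"amount": 0.0, "age": 40.0}
    # Both paths must agree for the same input.
    print(build_training_rows([record], stats)[0] == serve_features(record, stats))
```

Keeping the fitted statistics as an explicit argument, rather than recomputing them at serving time, avoids a common source of train/serve skew.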

Use the Remediation Playbook for fix-by-metric guidance.

Use profiles from Policy and CI Gates.

arxo analyze --path . --preset ai --config arxo.yml --fail-fast

Recommended rollout:

  1. Start with warning-level thresholds.
  2. Fix recurring low-score categories in central modules.
  3. Promote critical gates to error once the score trend stabilizes.

To prevent regressions with baseline checks:

  • Enable baseline no-regression checks against origin/main.
  • Start in one critical ML service/workspace.
  • Expand to additional workspaces after score trends are stable.
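
The idea behind a no-regression check can be sketched independently of the tool. The example below assumes two flat score mappings are already available, one from the baseline (for example, a report produced on origin/main) and one from the current change. The mapping shape, the `find_regressions` name, and the `0.01` tolerance are assumptions for illustration and do not describe Arxo's built-in baseline feature.

```python
def find_regressions(baseline, current, tolerance=0.01):
    """Return {metric: (old, new)} for scores that dropped by more than `tolerance`.

    Metrics missing from `current` are skipped here; a stricter policy
    could treat a missing metric as a regression instead.
    """
    regressions = {}
    for name, old in baseline.items():
        new = current.get(name)
        if new is None:
            continue
        if old - new > tolerance:
            regressions[name] = (old, new)
    return regressions


if __name__ == "__main__":
    baseline = {"reproducibility_score": 0.8, "eval_integrity_score": 0.9}
    current = {"reproducibility_score": 0.7, "eval_integrity_score": 0.895}
    for name, (old, new) in find_regressions(baseline, current).items():
        print(f"regression in {name}: {old} -> {new}")
```

A small tolerance keeps the check from failing on score noise. A CI job could exit non-zero whenever the returned mapping is non-empty.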