ML Architecture
The `ml_architecture` metric evaluates architecture quality for machine learning systems across training, evaluation, serving, and MLOps.
Last verified against engine metric version 1.2.0.
Why It Matters
Production ML systems often fail due to architecture gaps rather than model quality:
- Train/serve skew creates silent prediction drift.
- Weak reproducibility and lineage make incidents hard to debug.
- Missing eval and validation controls allow regressions to ship.
- Serving and monitoring gaps increase outage and staleness risk.
ml_architecture surfaces these issues as detector-level scores plus evidence-backed findings.
What It Measures
Core ML Architecture Integrity
- `ml_architecture.train_serve_skew_score`
- `ml_architecture.skew_test_absence_score`
- `ml_architecture.pipeline_complexity_score`
- `ml_architecture.reproducibility_score`
- `ml_architecture.train_inference_boundary_score`
- `ml_architecture.data_lineage_integrity_score`
- `ml_architecture.experiment_isolation_score`
- `ml_architecture.eval_integrity_score`
- `ml_architecture.serving_maturity_score`
Operational Readiness and Controls
- `ml_architecture.drift_monitoring_score`
- `ml_architecture.data_validation_score`
- `ml_architecture.ci_integration_score`
- `ml_architecture.fairness_audit_score`
- `ml_architecture.ab_testing_score`
- `ml_architecture.shadow_canary_score`
- `ml_architecture.monitoring_alerting_score`
- `ml_architecture.model_staleness_score`
- `ml_architecture.serving_ops_score`
Advanced Governance and Resilience Controls
- `ml_architecture.model_validation_gates_score`
- `ml_architecture.calibration_uncertainty_score`
- `ml_architecture.feature_store_consistency_score`
- `ml_architecture.progressive_delivery_analysis_score`
- `ml_architecture.provenance_attestation_score`
- `ml_architecture.responsible_ai_governance_score`
- `ml_architecture.attestation_enforcement_score`
- `ml_architecture.model_registry_governance_score`
- `ml_architecture.lineage_schema_fidelity_score`
- `ml_architecture.adversarial_resilience_score`
- `ml_architecture.post_market_incident_readiness_score`
- `ml_architecture.genai_telemetry_semconv_score`
Composite and Diagnostic Outputs
| Metric Key | Range / Type | Direction |
|---|---|---|
| `ml_architecture.overall_score` | 0..1 | Higher is better |
| `ml_architecture.overall_score_extended` | 0..1 | Higher is better |
| `ml_architecture.gpu_file_count` | Number | Informational |
| `ml_architecture.database_file_count` | Number | Informational |
| `ml_architecture.env_config_file_count` | Number | Informational |
| `ml_architecture.graph.*` | Graph entries | Informational |
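The outputs in this table fall into three kinds: 0..1 scores, informational counts, and graph entries. A minimal sketch of separating them on the consumer side is shown here, assuming results arrive as a flat mapping from metric key to value. That mapping shape and the `split_outputs` helper are hypothetical and are not part of the engine.

```python
from typing import Any

GRAPH_PREFIX = "ml_architecture.graph."


def split_outputs(results: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Group ml_architecture outputs into scores, counts, and graph entries.

    Assumes `results` is a flat {metric_key: value} mapping (hypothetical shape).
    """
    groups: dict[str, dict[str, Any]] = {"scores": {}, "counts": {}, "graph": {}}
    for key, value in results.items():
        if key.startswith(GRAPH_PREFIX):
            groups["graph"][key] = value
        elif key.endswith("_file_count"):
            groups["counts"][key] = value
        elif key.endswith("_score") or key.endswith("_score_extended"):
            # Covers detector scores plus both composite scores.
            groups["scores"][key] = value
    return groups
```

Grouping by key suffix keeps the helper independent of the full detector list, so new `*_score` keys would land in the scores group without code changes.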
No ML Files Detected
If no training or serving ML files are detected:
- detector scores are emitted as neutral (`0.5`)
- `ml_architecture.overall_score` and `ml_architecture.overall_score_extended` are both `0.5`
- diagnostic counts (`ml_architecture.gpu_file_count`, `ml_architecture.database_file_count`, `ml_architecture.env_config_file_count`) are emitted as `0`
- no findings are emitted in this neutral mode
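A consumer may want to tell this neutral output apart from a real score of 0.5, for example before comparing against a threshold such as 0.70. The sketch below is a heuristic only, assuming results are available as a flat mapping from metric key to value. The mapping shape and the `looks_neutral` helper are hypothetical, and a repository with ML files could in principle produce the same values.

```python
from typing import Any

NEUTRAL_SCORE = 0.5

OVERALL_KEYS = (
    "ml_architecture.overall_score",
    "ml_architecture.overall_score_extended",
)
COUNT_KEYS = (
    "ml_architecture.gpu_file_count",
    "ml_architecture.database_file_count",
    "ml_architecture.env_config_file_count",
)


def looks_neutral(results: dict[str, Any]) -> bool:
    """Return True when the output matches the documented neutral pattern."""
    # Both composite scores must be present and exactly neutral.
    overall_neutral = all(results.get(k) == NEUTRAL_SCORE for k in OVERALL_KEYS)
    # All three diagnostic counts must be present and zero.
    counts_zero = all(results.get(k) == 0 for k in COUNT_KEYS)
    # Every detector-level score that is present must also be neutral.
    detectors_neutral = all(
        v == NEUTRAL_SCORE for k, v in results.items() if k.endswith("_score")
    )
    return overall_neutral and counts_zero and detectors_neutral
```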
Config Quick Reference
    metrics:
      - id: ml_architecture
        enabled: true

Policy Quick Start

    metrics:
      - id: ml_architecture

    policy:
      invariants:
        - metric: ml_architecture.overall_score
          op: ">="
          value: 0.70
          message: "ML architecture health baseline not met"
        - metric: ml_architecture.overall_score_extended
          op: ">="
          value: 0.70
          message: "Extended ML architecture health baseline not met"
        - metric: ml_architecture.train_serve_skew_score
          op: ">="
          value: 0.75
          message: "Train/serve skew controls are insufficient"
        - metric: ml_architecture.reproducibility_score
          op: ">="
          value: 0.70
          message: "Reproducibility baseline not met"

For production-ready profiles, see Policy and CI Gates.
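To make the invariant semantics concrete, here is a minimal sketch of how the four invariants could be checked against a set of metric values. It is an illustration, not the engine's evaluator: the `check_invariants` helper, the flat results mapping, and the treatment of a missing metric as a failure are all assumptions, and only the `>=` operator is taken from the policy shown here.

```python
import operator
from typing import Any, Callable

# Only ">=" appears in the policy above; "<=" is included as an assumption.
OPS: dict[str, Callable[[Any, Any], bool]] = {
    ">=": operator.ge,
    "<=": operator.le,
}

# The same invariants as the quick-start policy, written as plain dicts.
INVARIANTS = [
    {"metric": "ml_architecture.overall_score", "op": ">=", "value": 0.70,
     "message": "ML architecture health baseline not met"},
    {"metric": "ml_architecture.overall_score_extended", "op": ">=", "value": 0.70,
     "message": "Extended ML architecture health baseline not met"},
    {"metric": "ml_architecture.train_serve_skew_score", "op": ">=", "value": 0.75,
     "message": "Train/serve skew controls are insufficient"},
    {"metric": "ml_architecture.reproducibility_score", "op": ">=", "value": 0.70,
     "message": "Reproducibility baseline not met"},
]


def check_invariants(results: dict[str, float], invariants: list[dict]) -> list[str]:
    """Return the messages of all invariants that do not hold."""
    failures = []
    for inv in invariants:
        actual = results.get(inv["metric"])
        compare = OPS[inv["op"]]
        # Assumption: a metric missing from the results counts as a failure.
        if actual is None or not compare(actual, inv["value"]):
            failures.append(inv["message"])
    return failures


example = {
    "ml_architecture.overall_score": 0.82,
    "ml_architecture.overall_score_extended": 0.74,
    "ml_architecture.train_serve_skew_score": 0.60,
    "ml_architecture.reproducibility_score": 0.71,
}
failed = check_invariants(example, INVARIANTS)
```

With these example values, only the train/serve skew invariant fails, because 0.60 is below its 0.75 threshold while the other three scores meet theirs.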
Runtime and ID Compatibility
- Documentation route: `/metrics/ml-architecture`
- Stable metric ID: `ml_architecture`