Scoring and Keys
This page documents the scoring behavior and emitted metric keys for ml_architecture, aligned to:
- crates/arxo-engine/src/metrics/ai_observability/ml_architecture/plugin.rs
- crates/arxo-engine/src/metrics/ai_observability/ml_architecture/ui/health_summary.rs
Overall Score Formulas
ml_architecture.overall_score is confidence-weighted and clamped to 0..1.
For each detector score s_i with detector confidence c_i:
effective_conf_i = max(c_i, 0.35)
For each detector group G:
group_score(G) = sum(s_i * effective_conf_i) / sum(effective_conf_i)
Overall:
overall_score = clamp(0.20 * group1 + 0.35 * group2 + 0.45 * group3, 0.0, 1.0)
Group membership:
- group1 (0.20): train_serve_skew_risk, skew_test_absence, train_inference_boundary
- group2 (0.35): pipeline_complexity, reproducibility, data_lineage_integrity, experiment_isolation, eval_integrity
- group3 (0.45): serving_maturity, drift_monitoring, data_validation, ci_integration, monitoring_alerting, model_staleness, serving_ops, shadow_canary, ab_testing, fairness_audit
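The formulas above can be sketched in Python as follows. This is a minimal illustration rather than the code in plugin.rs: the function names and the (score, confidence) tuple shape are hypothetical, and the 0.5 fallback for an empty group is an assumption that this page does not state.

```python
from typing import Iterable, Tuple

MIN_CONFIDENCE = 0.35  # floor applied to every detector confidence


def group_score(detectors: Iterable[Tuple[float, float]]) -> float:
    """Confidence-weighted mean over (score, confidence) pairs."""
    weighted_sum = 0.0
    weight_total = 0.0
    for score, confidence in detectors:
        effective_conf = max(confidence, MIN_CONFIDENCE)
        weighted_sum += score * effective_conf
        weight_total += effective_conf
    # Assumption: an empty group falls back to the neutral score 0.5.
    return weighted_sum / weight_total if weight_total > 0 else 0.5


def overall_score(group1: float, group2: float, group3: float) -> float:
    """Weighted sum of the three base group scores, clamped to 0..1."""
    raw = 0.20 * group1 + 0.35 * group2 + 0.45 * group3
    return min(max(raw, 0.0), 1.0)
```

For example, `group_score([(0.8, 0.9), (0.4, 0.1)])` treats the second detector as if its confidence were 0.35, giving (0.72 + 0.14) / 1.25 = 0.688. The floor means a low-confidence detector is down-weighted but never ignored.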
ml_architecture.overall_score_extended keeps the same base groups and adds two additive groups:
- group4 (0.20): model_validation_gates, calibration_uncertainty, feature_store_consistency, progressive_delivery_analysis, provenance_attestation, responsible_ai_governance
- group5 (0.20): attestation_enforcement, model_registry_governance, lineage_schema_fidelity, adversarial_resilience, post_market_incident_readiness, genai_telemetry_semconv
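This page does not spell out how the two additive groups are combined with the base groups. One plausible reading, sketched below, is that their weighted scores are added to the base weighted sum before clamping to 0..1. Treat both that reading and the function name as assumptions; plugin.rs is the authority.

```python
from typing import Sequence

BASE_WEIGHTS = (0.20, 0.35, 0.45)   # group1, group2, group3
EXTRA_WEIGHTS = (0.20, 0.20)        # group4, group5


def overall_score_extended(base_groups: Sequence[float],
                           extra_groups: Sequence[float]) -> float:
    """Assumed form: base weighted sum plus additive groups, clamped to 0..1."""
    if len(base_groups) != len(BASE_WEIGHTS):
        raise ValueError("expected three base group scores")
    if len(extra_groups) != len(EXTRA_WEIGHTS):
        raise ValueError("expected two additive group scores")
    raw = sum(w * g for w, g in zip(BASE_WEIGHTS, base_groups))
    raw += sum(w * g for w, g in zip(EXTRA_WEIGHTS, extra_groups))
    # The five weights sum to 1.40, so the clamp is what keeps the result in 0..1.
    return min(max(raw, 0.0), 1.0)
```

Under this reading, base group scores of 0.6 each and additive group scores of 0.5 each give 0.6 + 0.2 = 0.8, and strong scores across all five groups saturate at 1.0.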
No-ML-Files Behavior
If no training/serving ML files are detected:
- detector scores are neutral (0.5)
- ml_architecture.overall_score = 0.5
- ml_architecture.overall_score_extended = 0.5
- diagnostic file counts are emitted as 0
- findings are not emitted in this neutral mode
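The neutral mode can be pictured as a fixed payload. The sketch below builds one as a flat dictionary; the dictionary shape and the helper name are assumptions made for illustration, while the key names come from the key contract on this page.

```python
from typing import Dict, Iterable, Union

NEUTRAL_SCORE = 0.5

FILE_COUNT_KEYS = (
    "ml_architecture.gpu_file_count",
    "ml_architecture.database_file_count",
    "ml_architecture.env_config_file_count",
)


def neutral_metrics(detector_score_keys: Iterable[str]) -> Dict[str, Union[int, float]]:
    """Hypothetical helper: the values described for the no-ML-files case."""
    metrics: Dict[str, Union[int, float]] = {
        key: NEUTRAL_SCORE for key in detector_score_keys
    }
    metrics["ml_architecture.overall_score"] = NEUTRAL_SCORE
    metrics["ml_architecture.overall_score_extended"] = NEUTRAL_SCORE
    for key in FILE_COUNT_KEYS:
        metrics[key] = 0
    # No findings are attached in neutral mode, so none are modeled here.
    return metrics
```

A consumer that sees every score at exactly 0.5 together with all file counts at 0 is most likely looking at this neutral mode rather than at a genuinely average repository.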
Emitted Key Contract
Detector score keys (0..1, higher is better)
| Metric Key |
|---|
ml_architecture.train_serve_skew_score |
ml_architecture.skew_test_absence_score |
ml_architecture.pipeline_complexity_score |
ml_architecture.reproducibility_score |
ml_architecture.train_inference_boundary_score |
ml_architecture.data_lineage_integrity_score |
ml_architecture.experiment_isolation_score |
ml_architecture.eval_integrity_score |
ml_architecture.serving_maturity_score |
ml_architecture.drift_monitoring_score |
ml_architecture.data_validation_score |
ml_architecture.ci_integration_score |
ml_architecture.fairness_audit_score |
ml_architecture.ab_testing_score |
ml_architecture.shadow_canary_score |
ml_architecture.monitoring_alerting_score |
ml_architecture.model_staleness_score |
ml_architecture.serving_ops_score |
ml_architecture.model_validation_gates_score |
ml_architecture.calibration_uncertainty_score |
ml_architecture.feature_store_consistency_score |
ml_architecture.progressive_delivery_analysis_score |
ml_architecture.provenance_attestation_score |
ml_architecture.responsible_ai_governance_score |
ml_architecture.attestation_enforcement_score |
ml_architecture.model_registry_governance_score |
ml_architecture.lineage_schema_fidelity_score |
ml_architecture.adversarial_resilience_score |
ml_architecture.post_market_incident_readiness_score |
ml_architecture.genai_telemetry_semconv_score |
Composite and diagnostic keys
| Metric Key | Range / Type | Direction |
|---|---|---|
ml_architecture.overall_score | 0..1 | Higher is better |
ml_architecture.overall_score_extended | 0..1 | Higher is better |
ml_architecture.gpu_file_count | Number | Informational |
ml_architecture.database_file_count | Number | Informational |
ml_architecture.env_config_file_count | Number | Informational |
ml_architecture.graph.* | Graph entries | Informational |
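A consumer of these keys may want to check a payload against the documented ranges. The following is a minimal sketch assuming the metrics arrive as a flat mapping from key to value; that shape and the function name are hypothetical, and graph entries are skipped because their structure is not described here.

```python
from numbers import Real
from typing import Any, List, Mapping

PREFIX = "ml_architecture."


def contract_violations(metrics: Mapping[str, Any]) -> List[str]:
    """Return the keys whose values fall outside the documented range or type."""
    violations = []
    for key, value in metrics.items():
        if not key.startswith(PREFIX):
            continue  # not part of this metric's contract
        name = key[len(PREFIX):]
        if name.startswith("graph."):
            continue  # graph entries are informational; shape not checked
        is_score = name.endswith("_score") or name == "overall_score_extended"
        if is_score:
            # Scores are documented as 0..1.
            if not isinstance(value, Real) or not 0.0 <= value <= 1.0:
                violations.append(key)
        elif name.endswith("_file_count"):
            # File counts are documented only as "Number".
            if not isinstance(value, Real):
                violations.append(key)
    return violations
```

An empty list means every recognized key is within its documented range; unrecognized keys are ignored rather than flagged.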
Version Note
This contract is documented against metric version 2.0.0.