Ricci Curvature Metrics
Overview
Ricci curvature metrics provide a geometric perspective on software architecture by measuring the “curvature” of the dependency graph. Negative curvature indicates architectural problems such as bridges between modules, cycles, and hubs.
This implementation is inspired by Perelman’s work on Ricci flow with surgery and applies the analogy to software architecture analysis.
Mathematical Background
Forman-Ricci Curvature (FRC)
Forman-Ricci curvature is a fast approximation of Ricci curvature that can be computed in O(E) time:
    F(e) = 4 - deg(u) - deg(v) + 3 * triangles
Where:
- deg(u) is the undirected degree of node u
- triangles is the number of common neighbors (triangles through edge e)
Interpretation:
- Negative FRC: Edge is a “bridge” between clusters (architectural problem)
- Positive FRC: Edge is within a well-connected module (healthy)
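The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the plugin's implementation: the function name `forman_ricci` and the example graph are hypothetical, and the sketch assumes the dependency graph is given as a list of `(importer, imported)` pairs that are treated as undirected.

```python
from collections import defaultdict

def forman_ricci(edges):
    """Return {(u, v): F(e)} using the undirected view of the dependency graph."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-imports
            adj[u].add(v)
            adj[v].add(u)

    curvature = {}
    for u, v in edges:
        if u == v or (u, v) in curvature or (v, u) in curvature:
            continue  # score each undirected edge once
        triangles = len(adj[u] & adj[v])  # common neighbors of u and v
        curvature[(u, v)] = 4 - len(adj[u]) - len(adj[v]) + 3 * triangles
    return curvature

# Hypothetical graph: two triangles joined by the single edge c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]
frc = forman_ricci(edges)
print(frc[("c", "d")])  # -2: the bridge between the two clusters
print(frc[("a", "b")])  # 3: an edge inside a triangle
```

The only negative edge in this toy graph is the one connecting the two triangles, which matches the "bridge" reading of negative FRC.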
Ollivier-Ricci Curvature (ORC)
Ollivier-Ricci curvature uses optimal transport (Wasserstein-1 distance) for more accurate measurement:
    κ(u,v) = 1 - W₁(m_u, m_v) / d(u,v)
Where:
- W₁ is the Wasserstein-1 distance (Earth Mover’s Distance)
- m_u, m_v are probability distributions on the neighbors of u and v
- d(u,v) is the graph distance
Interpretation:
- Negative ORC: Strong indication of architectural bridge
- More accurate but slower than FRC (O(E·N²) worst case)
Metrics
Core Curvature Metrics
- ricci.frc_neg_share: Share of edges with negative Forman curvature (0-1)
  - < 10%: Healthy boundaries
  - 10-30%: Mixed structure
  - > 30%: High probability of architectural problems
- ricci.frc_bridge_mass: Sum of |κ| for negative FRC edges, normalized
  - < 0.03: Usually good
  - 0.03-0.08: Worth investigating top edges
  - > 0.08: Architecture held together by bridges
- ricci.frc_hotspot_conc: Concentration of negative curvature in top 5% nodes
  - > 0.50: Half of “bridge pain” in small group → refactoring candidates
  - < 0.30: Problem distributed (more systemic)
- ricci.orc_neg_share: Share of edges with negative Ollivier curvature
- ricci.orc_bridge_mass: Sum of |κ| for negative ORC edges
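As a rough sketch of how these three summaries could be derived from per-edge curvature values, consider the following. Two choices here are assumptions, because the text does not specify them: bridge mass is normalized by the total absolute curvature over all edges, and hotspot concentration is the share of per-node negative-curvature load carried by the top 5% of nodes. The function name and the input data are hypothetical.

```python
import math
from collections import defaultdict

def core_curvature_metrics(curvature, top_fraction=0.05):
    """curvature: {(u, v): kappa}. Returns neg_share, bridge_mass, hotspot_conc."""
    if not curvature:
        return {"neg_share": 0.0, "bridge_mass": 0.0, "hotspot_conc": 0.0}

    negative = [k for k in curvature.values() if k < 0]
    neg_share = len(negative) / len(curvature)

    # ASSUMPTION: normalize by the total |kappa| over all edges.
    total_abs = sum(abs(k) for k in curvature.values())
    bridge_mass = sum(-k for k in negative) / total_abs if total_abs else 0.0

    # ASSUMPTION: a node's load is the sum of |kappa| over its negative edges.
    load, nodes = defaultdict(float), set()
    for (u, v), k in curvature.items():
        nodes.update((u, v))
        if k < 0:
            load[u] += -k
            load[v] += -k
    top_n = max(1, math.ceil(top_fraction * len(nodes)))
    ranked = sorted(load.values(), reverse=True)
    total_load = sum(ranked)
    hotspot_conc = sum(ranked[:top_n]) / total_load if total_load else 0.0

    return {"neg_share": neg_share, "bridge_mass": bridge_mass,
            "hotspot_conc": hotspot_conc}

# Hypothetical FRC values for two triangles joined by the bridge c-d.
curvature = {("a", "b"): 3, ("b", "c"): 2, ("a", "c"): 2, ("c", "d"): -2,
             ("d", "e"): 2, ("e", "f"): 3, ("d", "f"): 2}
summary = core_curvature_metrics(curvature)
print(summary)
```

With a different normalization the absolute numbers would change, so the thresholds listed above should only be compared against values produced by the plugin itself.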
Pattern Detection (Canonical Neighborhoods)
Based on Perelman’s theory, we detect five canonical patterns:
CN-1: Neck/Bridge Edge
- Metric: ricci.pattern_neck_count
- Criteria: Negative curvature (κ < -0.3) AND high edge betweenness (> p95)
- Meaning: Thin “neck” connecting two clusters
- Surgery: Interface extraction, Dependency Inversion, Event bus
CN-2: Bidirectional Knot
- Metric: ricci.pattern_knot_count
- Criteria: Part of SCC with size 2-5
- Meaning: Small cycles (mutual dependencies)
- Surgery: Extract types.ts, remove barrel exports
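The CN-2 criterion relies on strongly connected components (SCCs) of the directed import graph. The sketch below finds them with Kosaraju's algorithm and keeps those of size 2 to 5. It is an illustration only: the helper names and the module names are hypothetical, and how the plugin turns these components into `ricci.pattern_knot_count` is not specified here.

```python
from collections import defaultdict

def strongly_connected_components(edges):
    """Kosaraju's algorithm on directed (importer, imported) pairs."""
    graph, reverse, nodes = defaultdict(list), defaultdict(list), set()
    for u, v in edges:
        graph[u].append(v)
        reverse[v].append(u)
        nodes.update((u, v))

    # Pass 1: record nodes in order of DFS completion (iterative DFS).
    order, seen = [], set()
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, neighbors = stack[-1]
            for nxt in neighbors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:  # all neighbors explored
                order.append(node)
                stack.pop()

    # Pass 2: sweep the reversed graph in decreasing completion order.
    components, assigned = [], set()
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component, stack = [], [start]
        while stack:
            node = stack.pop()
            component.append(node)
            for prev in reverse[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    stack.append(prev)
        components.append(component)
    return components

def find_knots(edges, min_size=2, max_size=5):
    """Return SCCs whose size falls in the CN-2 range, each sorted by name."""
    return [sorted(c) for c in strongly_connected_components(edges)
            if min_size <= len(c) <= max_size]

# Hypothetical imports: cart and pricing import each other.
imports = [("cart", "pricing"), ("pricing", "cart"),
           ("cart", "utils"), ("api", "cart")]
knots = find_knots(imports)
print(knots)  # [['cart', 'pricing']]
```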
CN-3: Cap/Terminal Leaf
- Metric: ricci.pattern_cap_violation_rate
- Criteria: Low degree AND imports wrong layers
- Meaning: Leaf nodes that violate layer boundaries
- Surgery: Add adapter layer
CN-4: Horn/Tendril
- Metric: ricci.pattern_horn_count
- Criteria: Node on chain length > 5 with low clustering
- Meaning: Long thin chains (helper -> helper -> helper)
- Surgery: Collapse into single module or facade
CN-5: Hub
- Metric: ricci.pattern_hub_count
- Criteria: High degree (> p95) AND high incident negative curvature
- Meaning: Architecture collapsing onto single node
- Surgery: Split by domain (shared/date, shared/http, etc.)
Spectral Metrics
- ricci.algebraic_connectivity: λ₂ (Fiedler eigenvalue)
  - Small values: Graph can be easily split
  - Large values: Graph is tightly connected
- ricci.conductance_min: Minimum conductance across communities
  - ≤ 0.05: Clear module boundaries
  - 0.05-0.15: Moderate boundaries
  - > 0.15: Blurry boundaries
- ricci.conductance_median: Median conductance
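For reference, the conductance of a community S is the number of edges leaving S divided by the smaller of the two volumes (sum of degrees) on either side of the cut. The sketch below computes it for an unweighted, undirected edge list. The community assignment is assumed to come from some earlier detection step, and the function and variable names are hypothetical.

```python
from statistics import median

def conductance(edges, community):
    """Cut edges divided by min(volume inside, volume outside)."""
    community = set(community)
    cut = vol_in = vol_out = 0
    for u, v in edges:  # each undirected edge is listed once
        u_in, v_in = u in community, v in community
        if u_in != v_in:
            cut += 1
        # Every edge endpoint adds one to the volume of the side it lies on.
        vol_in += u_in + v_in
        vol_out += (not u_in) + (not v_in)
    smaller = min(vol_in, vol_out)
    return cut / smaller if smaller else 0.0

# Hypothetical graph: two triangles joined by the single edge c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]
communities = [{"a", "b", "c"}, {"d", "e", "f"}]

values = [conductance(edges, c) for c in communities]
conductance_min = min(values)
conductance_median = median(values)
print(conductance_min, conductance_median)
```

Here one edge crosses the cut and each side has volume 7, so both communities have conductance 1/7.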
Flow Metrics
- ricci.flow_energy_drop: Energy convergence after flow iterations
  - > 40%: Structure converges, boundaries stable
  - 15-40%: Moderate
  - < 15%: Weak modularity or noisy graph
- ricci.flow_time_to_separation: Iterations until boundaries stabilize
  - ≤ 10: Clear boundaries
  - 10-30: Moderate
  - > 30: Blurry boundaries
- ricci.cut_stability: Bootstrap stability score (0-1)
  - ≥ 0.80: Stable boundaries
  - 0.60-0.80: Tolerable
  - < 0.60: Unstable boundaries
Reduced Change Distance (RCD) Metrics
RCD extends traditional structural dependency weights by incorporating change-coupling from Git history, providing a “cost of change” perspective on the architecture.
Mathematical Foundation:
    cost(e) = α_runtime · w_runtime + α_coupling · w_coupling + α_symbols · w_symbols
Where:
- w_runtime: Structural edge weight (import strength)
- w_coupling: Normalized Git co-change frequency [0, 1]
- w_symbols: Number of symbols used in the import
- α_*: Tunable weights (default: 1.0, 1.0, 0.5)
Core Metrics:
- ricci.rcd_within: Average edge cost within detected communities
  - Lower values: Well-encapsulated modules with low change coupling
  - Higher values: Modules change together frequently (potential for consolidation)
- ricci.rcd_cross: Average edge cost across community boundaries
  - Lower values: Clean boundaries with minimal cross-module coupling
  - Higher values: Leaky boundaries, modules change together despite separation
- ricci.rcd_anisotropy: Ratio of cross-boundary to within-community costs
  - High ratio (cross >> within): Clear boundaries, good separation
  - Low ratio (cross ≈ within): Leaky boundaries, potential misalignment with evolution
  - Ideal: Anisotropy > 1.5 indicates well-defined module boundaries
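The cost formula and the three RCD summaries can be worked through on a small example. The sketch below uses the default weights (1.0, 1.0, 0.5). The tuple layout, the function names, and the community assignment are assumptions made for illustration, and returning infinity when there are no within-community edges is a choice of this sketch, not documented behavior.

```python
def edge_cost(w_runtime, w_coupling, w_symbols, alpha=(1.0, 1.0, 0.5)):
    """cost(e) = a_runtime*w_runtime + a_coupling*w_coupling + a_symbols*w_symbols."""
    a_runtime, a_coupling, a_symbols = alpha
    return a_runtime * w_runtime + a_coupling * w_coupling + a_symbols * w_symbols

def rcd_metrics(edges, community_of, alpha=(1.0, 1.0, 0.5)):
    """edges: (src, dst, w_runtime, w_coupling, w_symbols) tuples."""
    within, cross = [], []
    for src, dst, w_runtime, w_coupling, w_symbols in edges:
        cost = edge_cost(w_runtime, w_coupling, w_symbols, alpha)
        bucket = within if community_of[src] == community_of[dst] else cross
        bucket.append(cost)

    rcd_within = sum(within) / len(within) if within else 0.0
    rcd_cross = sum(cross) / len(cross) if cross else 0.0
    # ASSUMPTION: report infinity when there is no within-community cost.
    anisotropy = rcd_cross / rcd_within if rcd_within else float("inf")
    return {"rcd_within": rcd_within, "rcd_cross": rcd_cross,
            "rcd_anisotropy": anisotropy}

# Hypothetical edges and communities.
community_of = {"a": 0, "b": 0, "c": 0, "d": 1}
edges = [
    ("a", "b", 1.0, 0.50, 2),  # cost 1.0 + 0.50 + 1.0 = 2.50
    ("b", "c", 1.0, 0.25, 1),  # cost 1.0 + 0.25 + 0.5 = 1.75
    ("c", "d", 1.0, 0.75, 4),  # cost 1.0 + 0.75 + 2.0 = 3.75 (crosses the boundary)
]
rcd = rcd_metrics(edges, community_of)
print(rcd)
```

In this example the within-community average is 2.125 and the cross-boundary average is 3.75, giving an anisotropy of about 1.76.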
Interpretation:
RCD metrics reveal the alignment between structural architecture and evolutionary patterns:
- High RCD_CROSS + Low RCD_ANISOTROPY: Files separated structurally but coupled evolutionarily → Consider merging or reducing coupling
- High RCD_WITHIN: High internal coupling → Module is cohesive or too large
- Low RCD_CROSS: Minimal cross-boundary coupling → Good architectural boundaries
Singularity Score Composite
The Singularity Score unifies three dimensions (curvature, betweenness, change coupling) into a single prioritization metric for architectural issues.
Mathematical Foundation:
    S(e) = z_curvature + z_betweenness + z_coupling
Where each component is a z-score (standardized to mean=0, std=1):
- z_curvature: How negative the edge curvature is (from FRC/ORC)
- z_betweenness: How central the edge is in shortest paths
- z_coupling: How frequently the endpoints change together (from Git)
Components:
- Curvature Z-Score: Structural “pain” (negative curvature = bridge)
- Betweenness Z-Score: Information flow criticality (high betweenness = chokepoint)
- Coupling Z-Score: Evolutionary coupling (frequent co-change = hidden dependency)
Interpretation:
- High composite score (> 2.0): Critical architectural issue requiring immediate attention
  - Combines structural, information flow, and evolutionary problems
  - Highest ROI for refactoring efforts
- Moderate score (1.0-2.0): Significant issue worth investigating
  - May have one or two dimensions of concern
- Low score (< 1.0): Minor or no issue
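A minimal sketch of the composite score follows. It assumes that curvature is negated before standardizing (so that more negative curvature yields a higher z-score) and that the population standard deviation is used; neither detail is stated in the text. The input values are invented for illustration, and the output field names mirror those listed under Output.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize to mean 0 and standard deviation 1 (all zeros if constant)."""
    mu, sigma = mean(values), pstdev(values)
    return [(x - mu) / sigma if sigma else 0.0 for x in values]

def singularity_scores(edges):
    """edges: (from, to, curvature, betweenness, coupling) tuples."""
    # ASSUMPTION: negate curvature so that bridges score high.
    z_curv = z_scores([-e[2] for e in edges])
    z_betw = z_scores([e[3] for e in edges])
    z_coup = z_scores([e[4] for e in edges])
    scored = [
        {"from": e[0], "to": e[1],
         "curvature_z": zc, "betweenness_z": zb, "coupling_z": zk,
         "composite": zc + zb + zk}
        for e, zc, zb, zk in zip(edges, z_curv, z_betw, z_coup)
    ]
    return sorted(scored, key=lambda row: row["composite"], reverse=True)

# Hypothetical per-edge inputs.
edges = [
    ("a", "b", 3, 0.1, 0.2),
    ("b", "c", 2, 0.2, 0.1),
    ("c", "d", -2, 0.9, 0.8),  # negative curvature, central, frequently co-changed
    ("d", "e", 2, 0.2, 0.3),
]
ranked = singularity_scores(edges)
print(ranked[0]["from"], ranked[0]["to"], round(ranked[0]["composite"], 2))
```

Because each component is standardized over the same edge set, the composite scores sum to zero; only edges that stand out on several dimensions at once rise above the 2.0 threshold.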
Use Cases:
- Priority ranking: Sort refactoring candidates by composite score
- Hotspot identification: Edges with high scores across all dimensions
- Pattern validation: Cross-validate canonical neighborhood detection with quantitative scores
Output:
Singularity scores are included in the details section of the metric result, with top edges ranked by composite score. Each edge includes:
- Individual z-scores for curvature, betweenness, coupling
- Composite score
- Source and target nodes
Configuration
    metrics:
      - id: ricci_curvature
        enabled: true
        config:
          alpha: 0.5                 # ORC idleness parameter
          flow_iterations: 10        # Ricci flow steps
          flow_step_size: 0.1        # η parameter
          pattern_detection: true    # Enable CN-1 to CN-5
          surgery_suggestions: true
          top_edges: 20              # Number of worst edges to report

          # RCD (Reduced Change Distance) weights
          rcd_alpha_runtime: 1.0     # Weight for structural dependencies
          rcd_alpha_coupling: 1.0    # Weight for Git co-change coupling
          rcd_alpha_symbols: 0.5     # Weight for symbol usage
Output Details
The plugin provides detailed output in the details field:
- top_negative_curvature_edges: Top N edges with worst curvature, including:
  - from, to: Node IDs
  - frc_curvature, orc_curvature: Curvature values
  - surgery_type: Recommended refactoring
- top_hub_nodes: Top 10 hub nodes with severity scores
- surgery_suggestions: Prioritized list of refactoring recommendations, now enhanced with singularity scores for better prioritization
- singularity_scores: Edges ranked by composite score, including:
  - from, to: Node indices
  - curvature_z, betweenness_z, coupling_z: Individual z-scores
  - composite: Combined singularity score
- rcd_metrics: Summary of Reduced Change Distance analysis:
  - rcd_within: Average edge cost within communities
  - rcd_cross: Average edge cost across boundaries
  - rcd_anisotropy: Boundary clarity metric (cross/within ratio)
Performance
- Forman-Ricci: O(E) - Fast, suitable for large graphs
- Ollivier-Ricci: O(E·N²) worst case - Accurate but slower
- Spectral metrics: O(N³) for eigenvalue computation
- Pattern detection: O(E + V) - Fast
- RCD computation: O(E + H) where H is Git history size - Fast, scales with edge count
- Singularity scores: O(E) - Fast, linear in edge count (z-score normalization)
For large graphs (>1000 nodes), consider:
- Using FRC as primary metric
- Approximating ORC with Sinkhorn algorithm
- Using power iteration for λ₂ only
Note that RCD and singularity scores add minimal overhead (~5-10% on top of the base curvature computation).
References
- Ollivier, Y. (2009). “Ricci curvature of Markov chains on metric spaces”
- Forman, R. (2003). “Bochner’s method for cell complexes and combinatorial Ricci curvature”
- Perelman, G. (2002-2003). “Ricci flow with surgery”
- D’Ambros, M., et al. (2012). “On the interplay between structural and logical coupling in software”
- Nagappan, N., et al. (2008). “The influence of organizational structure on software quality”
Interpretation Guide
What Negative Curvature Means
Negative curvature on an edge indicates:
- Bridge: Edge connects two clusters with few other connections
- Layer violation: Edge crosses architectural boundaries incorrectly
- Hub connection: Edge connects to/from a hub node
When to Act
- High NEG_SHARE (>30%): Systematic architectural problems
- High BRIDGE_MASS (>0.08): Architecture held together by bridges
- High HOTSPOT_CONC (>0.50): Concentrated problems, easier to fix
- Pattern detection: Specific refactoring opportunities
Surgery Recommendations
The plugin provides specific surgery suggestions based on detected patterns:
- InterfaceExtraction: Create ports.ts to decouple modules
- DependencyInversion: Apply DIP
- BreakCycle: Extract types.ts without back-imports
- RemoveBarrel: Inline barrel exports causing cycles
- LayerAdapters: Add adapter layer for boundary violations
- CollapseChain: Merge helper chains into single module
- SplitHub: Split hub by domain
Example Usage
    // The plugin is automatically registered and computed
    // Results are available in MetricResult with:
    // - values: HashMap of metric keys to values
    // - details: JSON with top edges, hubs, and surgery suggestions
Best Practices
- Start with FRC: Use Forman-Ricci for initial analysis (fast)
- Deep dive with ORC: Use Ollivier-Ricci for critical edges (accurate)
- Monitor trends: Track NEG_SHARE and BRIDGE_MASS over time
- Prioritize with Singularity Scores: Use composite scores to rank refactoring candidates
  - Focus on edges with composite scores > 2.0 first
  - Singularity scores combine structural, information flow, and evolutionary signals
- Analyze RCD metrics: Check alignment between structure and evolution
  - High RCD_ANISOTROPY (> 1.5): Good module boundaries
  - Low RCD_ANISOTROPY (< 1.0): Potential architectural drift
  - Use RCD to validate that structural boundaries match change patterns
- Cross-validate patterns: Use multiple signals together
  - Negative curvature + high betweenness + high coupling = critical issue
  - Canonical neighborhoods + singularity scores = high-confidence refactoring targets
- Tune RCD weights: Adjust rcd_alpha_* parameters based on your context
  - Emphasize rcd_alpha_coupling if evolutionary coupling is the primary concern
  - Increase rcd_alpha_symbols for fine-grained API usage analysis
Priority 2 Features: Entropy & Feature Analysis
Entropy Metrics (Anti-Gaming Health Indicators)
These metrics are difficult to manipulate locally and provide stable long-term health signals.
ricci.entropy_structural
Shannon entropy of degree distribution (normalized)
- High (> 0.7): Balanced architecture with diverse node roles
- Medium (0.4-0.7): Typical for layered systems
- Low (< 0.4): High concentration - few nodes dominate
Use: Track whether architecture becomes more concentrated over time
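One plausible way to compute a normalized degree entropy is sketched below. The exact definition used by the plugin is not given, so this is an assumption: each node's share of the total degree is treated as a probability, and the Shannon entropy is divided by log(N) so that the result lies in [0, 1]. The function name is hypothetical.

```python
import math

def structural_entropy(degrees):
    """Normalized Shannon entropy of each node's share of the total degree."""
    total, n = sum(degrees), len(degrees)
    if n < 2 or total == 0:
        return 0.0
    # ASSUMPTION: p_i = deg(i) / sum of degrees, normalized by log(N).
    h = -sum((d / total) * math.log(d / total) for d in degrees if d > 0)
    return h / math.log(n)

balanced = structural_entropy([2, 2, 2, 2])  # every node has the same degree
star = structural_entropy([4, 1, 1, 1, 1])   # one hub with four leaves
print(balanced, star)
```

Under this definition a perfectly even degree distribution scores 1.0, and the score falls as a few nodes take a larger share of all connections.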
ricci.entropy_flow
Shannon entropy of betweenness distribution
- High: Traffic distributed - resilient
- Low: Traffic through bottlenecks - fragile
Use: Detect architecture collapsing onto hub nodes
ricci.entropy_curvature
Shannon entropy of edge curvature distribution
- High: Mixed edge types - complex boundaries
- Low: Uniform edges (all healthy or all problematic)
ricci.degree_gini
Gini coefficient of degree distribution (0=equal, 1=concentrated)
- < 0.3: Well-balanced
- 0.3-0.6: Moderate inequality (expected)
- > 0.6: Dominated by few nodes
Thresholds:
- ✅ Good: < 0.4
- ⚠️ Warning: 0.4-0.6
- ❌ Critical: > 0.6
ricci.concentration_top10
Share of total degree held by top 10% nodes
- < 0.3: Distributed
- 0.3-0.5: Moderate
- > 0.5: High concentration
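Both concentration measures can be computed directly from the list of node degrees. The sketch below uses the standard sorted-rank formula for the Gini coefficient. Rounding the top 10% up to at least one node is an assumption of this sketch, and the function names are hypothetical.

```python
import math

def gini(values):
    """Gini coefficient via the sorted-rank formula (0 = equal)."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n

def concentration_top(values, fraction=0.10):
    """Share of the total held by the top `fraction` of nodes."""
    xs = sorted(values, reverse=True)
    total = sum(xs)
    if total == 0:
        return 0.0
    k = max(1, math.ceil(fraction * len(xs)))  # ASSUMPTION: round up, at least one node
    return sum(xs[:k]) / total

# Hypothetical degrees: one hub connected to nine leaves.
degrees = [9, 1, 1, 1, 1, 1, 1, 1, 1, 1]
degree_gini = gini(degrees)
top10 = concentration_top(degrees)
print(round(degree_gini, 3), top10)
```

For this star-shaped graph the Gini coefficient is 0.4 and the single hub holds half of the total degree, which sits at the warning boundary for both metrics.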
Monotonicity Tracking
Track whether metrics improve or degrade over releases. This helps identify:
- Trend violations: When metrics regress
- Architecture health direction: Overall improvement or degradation
- Early warning signals: Detect problems before they become critical
Implementation: Store snapshots of bridge_mass, neg_share, and hub_concentration over time. Compute violations as the number of times metrics increased (got worse) between releases.
Interpretation:
- trend_score = 1.0: All metrics improving
- trend_score = 0.67: 33% of metrics regressing
- trend_score < 0.5: More regressions than improvements - urgent action needed
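The exact formula for `trend_score` is not stated. One reading consistent with the values above is the share of release-to-release comparisons in which a metric did not get worse. The sketch below implements that reading as an assumption, using the three tracked metrics as hypothetical snapshot keys.

```python
def trend_score(snapshots,
                metrics=("bridge_mass", "neg_share", "hub_concentration")):
    """snapshots: list of metric dicts ordered from oldest to newest release."""
    comparisons = violations = 0
    for previous, current in zip(snapshots, snapshots[1:]):
        for name in metrics:
            comparisons += 1
            if current[name] > previous[name]:  # an increase means it got worse
                violations += 1
    # ASSUMPTION: score = 1 - violations / comparisons.
    return 1.0 if comparisons == 0 else 1.0 - violations / comparisons

# Hypothetical snapshots from two releases: only neg_share regressed.
history = [
    {"bridge_mass": 0.05, "neg_share": 0.20, "hub_concentration": 0.40},
    {"bridge_mass": 0.04, "neg_share": 0.25, "hub_concentration": 0.35},
]
score = trend_score(history)
print(round(score, 2))  # 0.67
```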
Feature-Level Root Cause Analysis
When a specific feature/module starts having problems, this workflow identifies:
- What changed architecturally (delta metrics)
- Where the problem is (root cause edges/nodes)
- Why it happened (boundary drifts, new necks)
Key Concepts:
Delta Metrics
- delta_ccr: Change in Cross-Change Rate (files changing with other modules)
  - > 0.1: Boundaries eroding
  - > 0.2: Significant violation
- delta_aniso: Change in anisotropy (cross/within cost ratio)
  - < 0: Boundaries weakening
  - Healthy target: anisotropy > 2.0
- new_necks: Edges that became thin bottlenecks
- delta_hotspot_shift: Change in hub concentration
- new_cycles: New circular dependencies
Root Cause Ranking
Edges and nodes are ranked by a composite score that considers:
- Curvature (structural)
- Betweenness (information flow)
- Co-change coupling (evolution)
Boundary Drifts
Identifies pairs of modules that started co-changing, indicating:
- Wrong boundaries
- Missing abstractions
- Need for interface extraction
See: Full documentation for detailed usage examples and API reference.
Priority 2 Configuration
    metrics:
      - id: ricci_curvature
        enabled: true
        config:
          # ... existing config ...

          # Priority 2: Enable entropy metrics
          compute_entropy: true

          # Priority 2: Feature analysis (optional)
          # feature_analysis:
          #   enabled: true
          #   features:
          #     - name: "checkout"
          #       path_prefix: "src/features/checkout"
          #       problem_date: "2024-03-15T00:00:00Z"

    policy:
      invariants:
        # Anti-gaming entropy thresholds
        - metric: ricci.degree_gini
          op: "<="
          value: 0.6

        - metric: ricci.concentration_top10
          op: "<="
          value: 0.5

        - metric: ricci.entropy_structural
          op: ">="
          value: 0.4
Why Entropy Metrics are “Anti-Gaming”
- Hard to fake locally: Improving entropy requires system-wide changes
- Scale-invariant: Normalized metrics work for any system size
- Multi-dimensional: Can’t optimize one without considering others
- Temporal signal: Monotonicity tracking catches regressions
When to Use Feature Analysis
Use feature-level root cause analysis when:
- A specific feature/screen suddenly becomes problematic
- Bug frequency increases in a module
- Development velocity drops for a feature
- Multiple developers complain about a subsystem
The analysis will pinpoint:
- Which dependencies became problematic
- When the degradation started
- What type of refactoring will help most