Git History

The engine can analyze Git history to compute file churn, co-change graphs, author metrics, and related indices. These are used by metrics such as MSR (Module Survival Risk), change coupling, hotspot score, ownership, and truck factor.

Git history is configured under data.git_history in DataConfig (from arxo-types::config::schema):

data:
  git_history:
    max_commits: 10000        # max commits to analyze (default: 10000 when unset)
    since: "6 months ago"     # start of time range (optional; ISO date or relative)
    until: "now"              # end of time range (optional; default: now)
    enable_cochange: true     # build co-change graph (default: true)
    enable_authorship: true   # build file_authors and coauthor_graph (default: true)
| Option | Type | Description |
| --- | --- | --- |
| max_commits | number (optional) | Maximum number of commits to analyze. Omit for default (e.g. 10000). |
| since | string (optional) | Start of time window (e.g. "2024-01-01", "6 months ago"). Omit for full history. |
| until | string (optional) | End of time window. Omit for “now”. |
| enable_cochange | boolean (optional) | When true, build co-change graph (files changed together). Default: true. Set false for faster runs if no metric needs co-change. |
| enable_authorship | boolean (optional) | When true, build file authors and co-author graph. Default: true. Set false if you don’t need ownership/bus factor. |

Leaving git_history unset or null disables git-based analysis; metrics that depend on it will not have history data.
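
The defaulting rules above can be sketched in a few lines of Rust. This is only an illustration: `GitHistoryOptions`, `ResolvedGitHistory`, and `resolve` are hypothetical names, not the actual types in arxo-types, and the default values are taken from the option descriptions on this page.

```rust
/// Hypothetical mirror of the `data.git_history` options (NOT the real arxo-types struct).
/// Every field is optional, matching the table above.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct GitHistoryOptions {
    pub max_commits: Option<usize>,
    pub since: Option<String>,
    pub until: Option<String>,
    pub enable_cochange: Option<bool>,
    pub enable_authorship: Option<bool>,
}

/// The same options with the documented defaults applied.
#[derive(Debug, Clone, PartialEq)]
pub struct ResolvedGitHistory {
    pub max_commits: usize,
    pub since: Option<String>, // None = full history
    pub until: Option<String>, // None = "now"
    pub enable_cochange: bool,
    pub enable_authorship: bool,
}

/// `None` models an unset or null `git_history` section,
/// which disables git-based analysis entirely.
pub fn resolve(section: Option<&GitHistoryOptions>) -> Option<ResolvedGitHistory> {
    let opts = section?;
    Some(ResolvedGitHistory {
        max_commits: opts.max_commits.unwrap_or(10_000),
        since: opts.since.clone(),
        until: opts.until.clone(),
        enable_cochange: opts.enable_cochange.unwrap_or(true),
        enable_authorship: opts.enable_authorship.unwrap_or(true),
    })
}
```

Under these assumptions, an empty `git_history: {}` section still enables analysis with all defaults, while a missing section yields `None`.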

From the git history, the engine builds (internally) data used by:

  • Churn metrics — lines added/removed per file, commit count.
  • Co-change graph — files that change together (for change coupling, MSR).
  • Author metrics — ownership, bus factor, co-authorship.
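
To make these three kinds of data concrete, here is a minimal sketch of how they could be derived from a list of commits. It is not the engine's implementation: the `Commit` and `Churn` types and the `churn`, `cochange`, and `file_authors` functions are hypothetical, and the real `GitHistory` shape is defined in arxo-types.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical per-commit record (an assumption for illustration only).
pub struct Commit {
    pub author: String,
    /// (path, lines added, lines removed)
    pub files: Vec<(String, u32, u32)>,
}

#[derive(Debug, Default, PartialEq)]
pub struct Churn {
    pub added: u32,
    pub removed: u32,
    pub commits: u32,
}

/// Per-file churn: total lines added/removed and number of commits touching the file.
pub fn churn(commits: &[Commit]) -> BTreeMap<String, Churn> {
    let mut out: BTreeMap<String, Churn> = BTreeMap::new();
    for c in commits {
        for (path, added, removed) in &c.files {
            let entry = out.entry(path.clone()).or_default();
            entry.added += *added;
            entry.removed += *removed;
            entry.commits += 1;
        }
    }
    out
}

/// Co-change edges: for each unordered pair of files, the number of commits touching both.
pub fn cochange(commits: &[Commit]) -> BTreeMap<(String, String), u32> {
    let mut edges = BTreeMap::new();
    for c in commits {
        // Deduplicate and sort paths so each pair has one canonical (a < b) key.
        let unique: BTreeSet<&str> = c.files.iter().map(|(p, _, _)| p.as_str()).collect();
        let paths: Vec<&str> = unique.into_iter().collect();
        for (i, a) in paths.iter().enumerate() {
            for b in &paths[i + 1..] {
                *edges.entry((a.to_string(), b.to_string())).or_insert(0) += 1;
            }
        }
    }
    edges
}

/// Distinct authors per file, the raw input for ownership and bus-factor style metrics.
pub fn file_authors(commits: &[Commit]) -> BTreeMap<String, BTreeSet<String>> {
    let mut out: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
    for c in commits {
        for (path, _, _) in &c.files {
            out.entry(path.clone()).or_default().insert(c.author.clone());
        }
    }
    out
}
```

Note that the pair loop is quadratic in the number of files per commit, which is one reason disabling co-change can speed up runs on repositories with very large commits.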

This data is exposed through the DataStore as git_history() (e.g. GitHistory from arxo-types::data::git_history). Custom metric plugins can access it via MetricContext → DataStore → git_history().

When implementing a metric plugin, your MetricContext provides a DataStore. You can call:

let git_history = data_store.git_history().await?;
// Use churn, co-change, authorship data as needed for your metric.

The exact shape of GitHistory is defined in arxo-types (e.g. file churn, co-change edges, author info). Use it read-only; the engine fills it from the repository at the project path.

  • max_commits limits how far back the engine walks. Use a lower value for very large repos or CI.
  • since / until reduce the window and can speed up analysis.
  • enable_cochange: false or enable_authorship: false reduce work when you don’t need those features.
  • The project path must be (or contain) a Git repository.
  • The engine runs Git commands (e.g. git log) to gather history; ensure Git is available and the repo is readable.
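
As a rough illustration of the points above, the following sketch shows how the documented options might map onto a `git log` invocation, plus a preflight check that Git is available and the path is a repository. The function names `git_log_args` and `is_git_repo` are hypothetical, and the exact commands the engine runs are not documented here; only the Git flags themselves (`--max-count`, `--since`, `--until`, `--numstat`, `rev-parse --git-dir`) are standard.

```rust
use std::process::Command;

/// Builds `git log` arguments roughly matching the documented options.
/// Assumption: the engine's real invocation may differ.
pub fn git_log_args(
    max_commits: Option<usize>,
    since: Option<&str>,
    until: Option<&str>,
) -> Vec<String> {
    let mut args = vec![
        "log".to_string(),
        "--numstat".to_string(),           // per-file lines added/removed
        "--format=%H%x09%an".to_string(),  // commit hash, tab, author name
        format!("--max-count={}", max_commits.unwrap_or(10_000)),
    ];
    if let Some(s) = since {
        args.push(format!("--since={s}"));
    }
    if let Some(u) = until {
        args.push(format!("--until={u}"));
    }
    args
}

/// Returns true only if `git` can be executed and `path` is inside a Git repository.
pub fn is_git_repo(path: &str) -> bool {
    Command::new("git")
        .args(["-C", path, "rev-parse", "--git-dir"])
        .output()
        .map(|out| out.status.success())
        .unwrap_or(false) // git missing from PATH, or could not be spawned
}
```

Running a check like `is_git_repo` before analysis gives a clearer failure than a missing-history result later; a smaller `--max-count` or a narrower `--since` window directly bounds how much output has to be parsed.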