Configuration Reference

The engine is configured via a Config struct (from arxo-types). You can load it from a YAML file or build it programmatically. When using the FFI, pass a JSON-serialized config string.

Top-Level Structure

source:        # Optional: Git repository to clone and analyze
data:          # Analysis data collection (language, graphs, git, telemetry)
architecture:  # Optional: Layer definitions for layering metrics
metrics:       # Optional: List of metric configs (id, enabled, config)
metric_preset: # Optional: Preset name, e.g. "ci", "quick" (replaces metrics list)
policy:        # Optional: Invariants to evaluate (metric, op, value)
report:        # Optional: Output format and file
run_options:   # Optional: Cache control, incremental, quiet

Source (Git)

When analyzing a remote repository:

source:
  git_url: "https://github.com/user/repo"
  git_ref: "main"           # branch, tag, or commit (default: HEAD)
  shallow: true             # shallow clone (default: true)
  cache_ttl_hours: 24       # cache TTL in hours (default: 24)

For local analysis, omit source and pass the project path at runtime.

Data

Controls what data is collected and how.

Language

data:
  language: auto   # typescript | rust | python | java | kotlin | go | cpp | csharp | php | auto

auto (default): Detect from file extensions in the project.
Other values restrict analysis to that language’s files.
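
To illustrate what extension-based detection could look like, here is a minimal sketch. The extension table, the function names (language_for_extension, detect_language), and the "most files wins" rule are all assumptions made for illustration; the engine's actual detection logic is not described here.

```rust
use std::collections::HashMap;
use std::path::Path;

/// Hypothetical extension-to-language table; the engine's real mapping may differ.
fn language_for_extension(ext: &str) -> Option<&'static str> {
    match ext {
        "ts" | "tsx" => Some("typescript"),
        "rs" => Some("rust"),
        "py" => Some("python"),
        "java" => Some("java"),
        "kt" => Some("kotlin"),
        "go" => Some("go"),
        "cpp" | "cc" | "hpp" => Some("cpp"),
        "cs" => Some("csharp"),
        "php" => Some("php"),
        _ => None,
    }
}

/// Returns the language with the most matching files, or None if no file matches.
/// Equal counts are resolved alphabetically so the result is deterministic.
fn detect_language(paths: &[&str]) -> Option<&'static str> {
    let mut counts: HashMap<&'static str, usize> = HashMap::new();
    for p in paths {
        let ext = Path::new(p).extension().and_then(|e| e.to_str());
        if let Some(lang) = ext.and_then(language_for_extension) {
            *counts.entry(lang).or_insert(0) += 1;
        }
    }
    counts
        .into_iter()
        .max_by(|a, b| a.1.cmp(&b.1).then_with(|| b.0.cmp(a.0)))
        .map(|(lang, _)| lang)
}
```

In a mixed project, setting language explicitly avoids depending on whichever heuristic is in use.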

Import Graph

data:
  import_graph:
    group_by: "folder"    # default: "folder"
    group_depth: 2       # default: 2 (depth for folder grouping)
    exclude:             # glob patterns to exclude
      - "node_modules"
      - "dist"
      - "**/*.test.ts"

group_by: How to group file-level nodes (e.g. "folder").
group_depth: Depth used for folder grouping.
exclude: Patterns to skip when building the import graph.
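
As a rough picture of folder grouping, the sketch below maps a file path to a group key by keeping its first group_depth folder components. The helper name group_key and the exact truncation rule are assumptions for illustration, not the engine's documented behavior.

```rust
/// Hypothetical helper: maps a file path to its group key by keeping the
/// first `depth` folder components. Assumes '/'-separated relative paths.
fn group_key(path: &str, depth: usize) -> String {
    let mut parts: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    parts.pop(); // drop the file name, leaving only the folder chain
    parts.truncate(depth);
    if parts.is_empty() {
        ".".to_string() // files at the project root share one group
    } else {
        parts.join("/")
    }
}
```

Under these assumptions, with group_depth: 2 the file src/ui/components/Button.tsx falls into the group src/ui, so a larger depth yields more, smaller groups.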

Call Graph

data:
  call_graph:
    min_confidence: 0.0   # 0.0–1.0; edges below this are dropped (default: 0.0)
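
The effect of min_confidence can be sketched as a simple filter. The CallEdge type and filter_edges function are hypothetical stand-ins; the engine's real call-graph types are not shown in this reference.

```rust
/// Hypothetical edge type; the engine's real call-graph representation may differ.
#[derive(Debug, Clone, PartialEq)]
struct CallEdge {
    caller: String,
    callee: String,
    confidence: f64, // expected range: 0.0 to 1.0
}

/// Keeps edges whose confidence is at least `min_confidence`;
/// edges below the threshold are dropped.
fn filter_edges(edges: Vec<CallEdge>, min_confidence: f64) -> Vec<CallEdge> {
    edges
        .into_iter()
        .filter(|e| e.confidence >= min_confidence)
        .collect()
}
```

With the default of 0.0 nothing is dropped; raising the threshold trades recall for fewer speculative edges.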

Git History

data:
  git_history:
    max_commits: 10000      # max commits to analyze (optional)
    since: "6 months ago"   # start of time range (optional)
    until: "2024-01-01"     # end of time range (optional)
    enable_cochange: true    # build co-change graph (default: true)
    enable_authorship: true # build file_authors / coauthor_graph (default: true)

Set enable_cochange or enable_authorship to false to speed up runs when those metrics are not needed.

Test Coverage

data:
  test_coverage:
    paths: ["coverage/lcov.info", "**/jacoco.xml"]
    format: "auto"   # lcov | jacoco | cobertura | auto (default: auto)
    low_coverage_threshold: 0.5   # 0.0–1.0 (default: 0.5)

Telemetry (OpenTelemetry)

data:
  telemetry:
    source_path: "./traces"
    format: "otel_json"   # otel_json | zipkin_json | jaeger_json
    time_window:          # optional
      start: "2024-01-01T00:00:00Z"
      end: "2024-01-02T00:00:00Z"
    service_name: "my-service"   # optional filter

Language Presets

Predefined exclude patterns per ecosystem:

data:
  language_presets:
    - "node"      # node_modules, dist, etc.
    - "python"    # __pycache__, .venv, etc.
    - "rust"      # target, etc.

Patterns from the selected presets are merged with import_graph.exclude.
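
One plausible reading of "merged" is a de-duplicated union, sketched below. The preset contents in preset_patterns are placeholders based only on the comments above, and effective_excludes is a hypothetical name; the real pattern lists and merge order are defined by the engine.

```rust
/// Hypothetical preset table; the real pattern lists are defined by the engine.
fn preset_patterns(name: &str) -> &'static [&'static str] {
    match name {
        "node" => &["node_modules", "dist"],
        "python" => &["__pycache__", ".venv"],
        "rust" => &["target"],
        _ => &[],
    }
}

/// Unions explicit excludes with preset patterns, keeping first-seen order
/// and dropping duplicates.
fn effective_excludes(explicit: &[&str], presets: &[&str]) -> Vec<String> {
    let mut candidates: Vec<&str> = explicit.to_vec();
    for preset in presets {
        candidates.extend_from_slice(preset_patterns(preset));
    }
    let mut out: Vec<String> = Vec::new();
    for pat in candidates {
        if !out.iter().any(|seen| seen.as_str() == pat) {
            out.push(pat.to_string());
        }
    }
    out
}
```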

Per-Language Options

data:
  languages:
    python:
      type_checker: true   # Jedi type extraction (default: true)
    # typescript: {}       # reserved
    # java: {}             # reserved
    # kotlin: {}           # reserved
    # go: {}               # reserved
    # rust: {}             # reserved

Architecture (Layers)

For layering and dependency rules:

architecture:
  layers:
    - name: "ui"
      paths: ["src/ui/**"]
      allowed_effects: ["log"]
      can_depend_on: ["domain", "infra"]
    - name: "domain"
      paths: ["src/domain/**"]
      allowed_effects: []
      can_depend_on: ["infra"]
    - name: "infra"
      paths: ["src/infra/**"]
      allowed_effects: ["io", "network", "storage"]
      can_depend_on: []

paths: Glob patterns for files in the layer.
allowed_effects: Effect types allowed in that layer.
can_depend_on: Layer names this layer may depend on.
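
The can_depend_on rule can be pictured as a lookup on the source layer. This is a minimal sketch: the Layer struct and dependency_allowed function are hypothetical, and treating same-layer dependencies as always allowed is an assumption rather than documented behavior.

```rust
/// Hypothetical mirror of one `architecture.layers` entry (paths and effects omitted).
struct Layer {
    name: &'static str,
    can_depend_on: &'static [&'static str],
}

/// Returns true if a dependency from layer `from` to layer `to` is permitted.
/// Assumptions: dependencies inside one known layer are always allowed,
/// and unknown layer names are rejected.
fn dependency_allowed(layers: &[Layer], from: &str, to: &str) -> bool {
    let source = match layers.iter().find(|l| l.name == from) {
        Some(layer) => layer,
        None => return false,
    };
    from == to || source.can_depend_on.iter().any(|d| *d == to)
}

/// The three layers from the example configuration above.
fn example_layers() -> Vec<Layer> {
    vec![
        Layer { name: "ui", can_depend_on: &["domain", "infra"] },
        Layer { name: "domain", can_depend_on: &["infra"] },
        Layer { name: "infra", can_depend_on: &[] },
    ]
}
```

With the example layers, ui may import from domain, but an import from domain into ui would count as a layering violation.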

Metrics

Explicit list of metrics (ignored if metric_preset is set):

metrics:
  - id: "scc"
    enabled: true
    config: {}    # optional metric-specific config
  - id: "centrality"
    enabled: true
    config: null

Metric Preset

Use a named preset instead of listing metrics:

metric_preset: "quick"   # or "ci", "full", etc.

When set, the preset’s metric list replaces any metrics array.
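
The precedence rule can be sketched as follows. MetricConfig and effective_metric_ids are hypothetical names, the preset's contents are passed in by the caller because they are not listed in this reference, and skipping entries with enabled: false is an assumption.

```rust
/// Hypothetical mirror of one entry in the `metrics` list (config omitted).
struct MetricConfig {
    id: String,
    enabled: bool,
}

/// Resolves which metric ids run: a preset, when given, replaces the explicit
/// list entirely; otherwise only enabled entries of the explicit list are used.
fn effective_metric_ids(preset_ids: Option<&[&str]>, metrics: &[MetricConfig]) -> Vec<String> {
    match preset_ids {
        Some(ids) => ids.iter().map(|id| id.to_string()).collect(),
        None => metrics
            .iter()
            .filter(|m| m.enabled)
            .map(|m| m.id.clone())
            .collect(),
    }
}
```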

Policy (Invariants)

Rules evaluated after metrics run:

policy:
  invariants:
    - metric: "scc.cycle_count"
      op: "<="
      value: 0
    - metric: "propagation_cost.system.ratio"
      op: "<="
      value: 0.15

Operators: <=, >=, ==, <, > (serialized in YAML/JSON as strings).

Violations appear in the result (e.g. OrchestrationResult.violations).
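
Evaluating an invariant amounts to comparing a metric value against the configured threshold. The sketch below uses a hypothetical Op enum and the helper names parse_op and holds; these are not taken from arxo-types.

```rust
/// Comparison operators accepted in the `op` field of an invariant.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    Le,
    Ge,
    Eq,
    Lt,
    Gt,
}

/// Parses the string form used in YAML/JSON; returns None for anything else.
fn parse_op(s: &str) -> Option<Op> {
    match s {
        "<=" => Some(Op::Le),
        ">=" => Some(Op::Ge),
        "==" => Some(Op::Eq),
        "<" => Some(Op::Lt),
        ">" => Some(Op::Gt),
        _ => None,
    }
}

/// Returns true when `actual op threshold` holds, i.e. the invariant is satisfied.
fn holds(actual: f64, op: Op, threshold: f64) -> bool {
    match op {
        Op::Le => actual <= threshold,
        Op::Ge => actual >= threshold,
        Op::Eq => actual == threshold,
        Op::Lt => actual < threshold,
        Op::Gt => actual > threshold,
    }
}
```

For the first invariant above, a project with no cycles satisfies scc.cycle_count <= 0, while any positive cycle count would be reported as a violation.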

Report

report:
  format: "console"   # console | json | html | sarif | snapshot | msgpack
  file: "./out.json" # optional; used for json/html/etc.
  metric_timings: false   # include per-metric run time in reports
  estimated_timings: false # show estimated run time in console

Run Options

Runtime behavior (often passed when invoking the engine, e.g. via FFI):

run_options:
  disable_cache: false   # turn off cache
  incremental: false     # use incremental parse cache (requires cache enabled)
  quiet: false           # suppress per-metric timings to stderr
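
Because incremental depends on the cache, a caller may want to check the combination before starting a run. This is a sketch only: RunOptions here is a hypothetical mirror of the block above, and whether the engine rejects or silently ignores the conflicting combination is not stated in this reference.

```rust
/// Hypothetical mirror of the `run_options` block; defaults match the values above.
#[derive(Debug, Default, Clone, PartialEq)]
struct RunOptions {
    disable_cache: bool,
    incremental: bool,
    quiet: bool,
}

/// Rejects the one combination the comments above rule out:
/// incremental parsing with the cache turned off.
fn validate_run_options(opts: &RunOptions) -> Result<(), String> {
    if opts.incremental && opts.disable_cache {
        return Err("incremental requires the cache; set disable_cache: false".to_string());
    }
    Ok(())
}
```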

Loading Config (Rust)

use arxo_types::config::{load_config, default_config, Config};

// From file (YAML)
let config = load_config("arxo.yml")?;

// Default (single metric, console output)
let config = default_config();

FFI: JSON Config

When using the C-compatible FFI, serialize Config to JSON and pass it as the config string. The same fields apply. Enum values are lowercase strings (e.g. "language": "typescript"), and operators are passed as strings (e.g. "op": "<=").

Example minimal JSON:

{
  "data": {
    "language": "typescript",
    "import_graph": {
      "group_by": "folder",
      "group_depth": 2,
      "exclude": ["node_modules", "dist"]
    }
  },
  "metric_preset": "quick",
  "report": { "format": "json" }
}
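
Passing a JSON string across a C boundary usually requires a NUL-terminated buffer. The sketch below shows that preparation step with the standard library's CString. The helper names are hypothetical, the JSON is a trimmed version of the example above, and the actual FFI entry point is not shown because its name and signature are not given in this reference.

```rust
use std::ffi::{CString, NulError};

/// A trimmed version of the minimal JSON config, as a Rust string literal.
fn minimal_config_json() -> &'static str {
    r#"{
  "data": { "language": "typescript" },
  "metric_preset": "quick",
  "report": { "format": "json" }
}"#
}

/// Builds a NUL-terminated C string from a JSON config.
/// Fails if the input contains an interior NUL byte.
fn config_to_c_string(json: &str) -> Result<CString, NulError> {
    CString::new(json)
}
```

Calling as_ptr() on the resulting CString yields a pointer that stays valid only while the CString itself is alive, so keep it in scope for the duration of the FFI call.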

Next Steps

Getting Started — Install and first run
Rust API — Use config from Rust
FFI API — Use config from C/Go/Zig
Plugin System — Custom metrics and data access