Language Provider API

You can add support for additional languages by implementing the LanguageProvider trait from arxo-lang-api and registering your provider with the engine’s LanguageRegistry. The engine uses providers to parse imports, exports, calls, and effects, and to optionally resolve modules and post-process call data.

:::info For engine extenders
This page describes internal extension APIs used when building the engine or adding support for new languages. When using the closed-source engine via arxo-loader or the FFI, you cannot register language providers; the public API is load, version, and analyze → JSON. See Rust API and FFI API for the public library API.
:::

  • Trait: LanguageProvider in arxo-lang-api
  • Registry: LanguageRegistry — the engine (or your integration) registers built-in and custom providers
  • Lookup: By file extension (provider_for_extension) or by language id (provider_for_id)
  • Detection: detect_languages scans a directory and returns which providers have files there (with counts and weights)

Cargo.toml dependencies:

[dependencies]
arxo-lang-api = "0.1"
arxo-types = "0.1"
async-trait = "0.1"
anyhow = "1.0"

Identity methods:

fn id(&self) -> &'static str; // e.g. "my_lang"
fn name(&self) -> &'static str; // e.g. "My Language"
fn extensions(&self) -> &'static [&'static str]; // e.g. &["myl", "mylang"]

  • id: Unique identifier; used in config and registry lookup.
  • name: Human-readable name.
  • extensions: File extensions this provider handles (without the dot). The registry maps each extension to this provider.

Parsing methods:

async fn parse_imports(&self, path: &Path, content: &str) -> Result<FileImports>;
async fn parse_exports(&self, path: &Path, content: &str) -> Result<FileExports>;
async fn parse_directory(&self, path: &Path, exclude: &[String]) -> Result<ImportGraph>;
async fn parse_calls(&self, path: &Path, content: &str, node_id: &NodeId) -> Result<CallParseResult>;
fn parse_file_full_sync(&self, path: &Path, content: &str, node_id: &NodeId) -> Result<FullParseResult>;
async fn parse_effects(&self, path: &Path, content: &str) -> Result<Vec<(EffectType, String)>>;

  • parse_imports: Extract import specifiers and local names for one file. Return FileImports (symbols, wildcards, optional package name).
  • parse_exports: Extract exported names and classes, default export, re-exports. Return FileExports.
  • parse_directory: Walk the directory (respecting exclude), parse all relevant files, and build an ImportGraph (nodes = files, edges = imports). Used for initial import-graph construction.
  • parse_calls: Extract call sites from the file. node_id is the file’s NodeId. Return CallParseResult (calls plus optional type/local-definition info).
  • parse_file_full_sync: Synchronous “full file” parse: imports + exports + calls. Default implementation calls the async parsers; you can override for sync-only contexts (e.g. rayon).
  • parse_effects: Detect side-effect kinds (IO, network, storage, log, time, random, mutation, LLM) and return (EffectType, API name) pairs per file.
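
As a rough illustration of what parse_imports has to produce, the sketch below parses a made-up import syntax line by line. The `import ... from "..."` syntax, the simplified `FileImports` and `ImportedSymbol` structs (they mirror only a few of the field names listed on this page), and the `unquote` helper are all assumptions for illustration, not the arxo-lang-api definitions. A real provider would implement the async trait method and most likely use a proper parser.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the real ImportedSymbol (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct ImportedSymbol {
    pub local_name: String,
    pub original_name: String,
    pub source: String,
}

/// Simplified stand-in for the real FileImports (hypothetical shape).
#[derive(Debug, Default, PartialEq)]
pub struct FileImports {
    pub symbols: HashMap<String, ImportedSymbol>, // keyed by local name
    pub wildcard_packages: Vec<String>,
}

/// Strip an optional trailing semicolon and surrounding double quotes.
fn unquote(raw: &str) -> String {
    raw.trim_end_matches(';').trim_matches('"').to_string()
}

/// Parse lines of a made-up syntax:
///   import foo from "pkg"
///   import Foo as Bar from "pkg"
///   import * from "pkg"
pub fn parse_imports(content: &str) -> FileImports {
    let mut out = FileImports::default();
    for line in content.lines() {
        let tokens: Vec<&str> = line.split_whitespace().collect();
        let (original, local, source) = match tokens.as_slice() {
            ["import", "*", "from", src] => {
                out.wildcard_packages.push(unquote(src));
                continue;
            }
            ["import", name, "from", src] => (*name, *name, unquote(src)),
            ["import", orig, "as", local, "from", src] => (*orig, *local, unquote(src)),
            _ => continue, // not an import line in this toy syntax
        };
        out.symbols.insert(
            local.to_string(),
            ImportedSymbol {
                local_name: local.to_string(),
                original_name: original.to_string(),
                source,
            },
        );
    }
    out
}
```

Keying the map by local name matches the "local name → ImportedSymbol" description of FileImports, so a later call-resolution step can look up a call by the name used at the call site.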

FileImports, FileExports, CallParseResult, and FullParseResult come from arxo-lang-api; EffectType comes from arxo_types::data::effects.
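
To make the (EffectType, API name) pairs returned by parse_effects more concrete, here is a minimal sketch that scans file content for known API names. The `EffectKind` enum, its variants, and the API names in `RULES` are hypothetical stand-ins; the real EffectType variants are not shown on this page, and a real provider would match on parsed call expressions rather than substrings.

```rust
/// Hypothetical stand-in for EffectType; variant names are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EffectKind {
    Io,
    Network,
    Log,
    Time,
    Random,
}

/// Made-up API names for an imaginary language, each mapped to an effect kind.
const RULES: &[(&str, EffectKind)] = &[
    ("fs.read", EffectKind::Io),
    ("http.get", EffectKind::Network),
    ("log.info", EffectKind::Log),
    ("clock.now", EffectKind::Time),
    ("rand.next", EffectKind::Random),
];

/// Return one (kind, api_name) pair for each rule whose API name occurs
/// in the content, in the order the rules are listed.
pub fn detect_effects(content: &str) -> Vec<(EffectKind, String)> {
    let mut found = Vec::new();
    for (api, kind) in RULES {
        if content.contains(*api) {
            found.push((*kind, (*api).to_string()));
        }
    }
    found
}
```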

parse_file_full (async) has a default that composes parse_imports, parse_exports, and parse_calls into FullParseResult. Override only if you need different behavior.

fn create_module_resolver(&self, source_root: &Path) -> Option<ModuleResolverKind>;

Return Some(ModuleResolverKind::TypeScript(Box::new(my_resolver))) if your language uses a TypeScript-style module resolver, that is, one that resolves import specifiers to file paths. The engine uses the resolver when building the import and call graphs. The default is None.

TypeScriptModuleResolver:

fn resolve(&self, from_path: &Path, source: &str) -> Option<PathBuf>;
fn prime_stat(&self, path: &Path, is_file: bool) {} // optional cache priming
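
The following sketch shows one way a resolve method could turn a relative import specifier into a file path. It is an assumption-laden illustration: `SketchResolver`, its fields, and the candidate order (exact path, then path plus extension, then an index file) are hypothetical, and it checks an in-memory set of known files instead of the filesystem so the example stays self-contained.

```rust
use std::collections::HashSet;
use std::path::{Component, Path, PathBuf};

/// Hypothetical resolver; not the engine's TypeScriptModuleResolver.
pub struct SketchResolver {
    pub known_files: HashSet<PathBuf>,
    pub extensions: Vec<&'static str>, // without the dot
}

impl SketchResolver {
    /// Resolve a relative specifier such as "./util" or "../lib/core".
    /// Bare specifiers (packages) are out of scope for this sketch.
    pub fn resolve(&self, from_path: &Path, source: &str) -> Option<PathBuf> {
        if !source.starts_with("./") && !source.starts_with("../") {
            return None;
        }
        let base = normalize(&from_path.parent()?.join(source));

        // Candidate order: exact path, path + extension, directory index file.
        let mut candidates = vec![base.clone()];
        for ext in &self.extensions {
            candidates.push(PathBuf::from(format!("{}.{}", base.display(), ext)));
        }
        for ext in &self.extensions {
            candidates.push(base.join(format!("index.{}", ext)));
        }
        candidates
            .into_iter()
            .find(|candidate| self.known_files.contains(candidate))
    }
}

/// Lexically remove "." and ".." components without touching the filesystem.
fn normalize(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for part in path.components() {
        match part {
            Component::CurDir => {}
            Component::ParentDir => {
                out.pop();
            }
            other => out.push(other.as_os_str()),
        }
    }
    out
}
```

A filesystem-backed resolver would replace the `known_files` lookup with stat calls, which is presumably where a cache-priming hook like prime_stat becomes useful.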

Post-processing hook (on LanguageProvider):

fn post_process(
    &self,
    source_path: &Path,
    ctx: &mut dyn LanguagePostProcessContext,
    file_calls: &mut Vec<(NodeId, Vec<CallInfo>)>,
) -> Result<()>;

post_process is called after call extraction so you can refine or resolve calls using LanguagePostProcessContext (e.g. resolve targets, attach type info). The default is a no-op.
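
A post-processing pass of this kind might look roughly like the sketch below, which fills in missing call targets from a lookup. Everything here is a simplified stand-in: `CallInfo` keeps only three of the fields named on this page, file identifiers are plain strings instead of NodeId, and the `PostProcessContext` trait with its `files_defining` method is hypothetical (the real LanguagePostProcessContext methods are not documented here). The rule that a call is precise only when exactly one target is found is also an assumption.

```rust
use std::collections::HashMap;

/// Simplified stand-in for CallInfo.
#[derive(Debug, Clone, PartialEq)]
pub struct CallInfo {
    pub name: String,
    pub is_precise: bool,
    pub target_files: Vec<String>,
}

/// Hypothetical context interface; the real trait may look quite different.
pub trait PostProcessContext {
    fn files_defining(&self, symbol: &str) -> Vec<String>;
}

/// Fill in targets for calls that have none yet.
pub fn post_process(
    ctx: &dyn PostProcessContext,
    file_calls: &mut Vec<(String, Vec<CallInfo>)>,
) {
    for (_file_id, calls) in file_calls.iter_mut() {
        for call in calls.iter_mut() {
            if !call.target_files.is_empty() {
                continue; // already resolved during extraction
            }
            let targets = ctx.files_defining(&call.name);
            call.is_precise = targets.len() == 1;
            call.target_files = targets;
        }
    }
}

/// Map-backed context used to exercise the sketch.
pub struct MapContext(pub HashMap<String, Vec<String>>);

impl PostProcessContext for MapContext {
    fn files_defining(&self, symbol: &str) -> Vec<String> {
        self.0.get(symbol).cloned().unwrap_or_default()
    }
}
```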

fn is_builtin_call(&self, call: &CallInfo) -> bool;
fn default_exclude_patterns(&self) -> &'static [&'static str];
  • is_builtin_call: Return true if the call targets a built-in or stdlib symbol (so the engine can mark it as external). Default: false.
  • default_exclude_patterns: Default globs for this language (e.g. ["node_modules", "target"]). Used when resolving language presets. Default: &[].
  • FileImports: symbols (local name → ImportedSymbol), wildcard_packages, package_name.
  • ImportedSymbol: local_name, original_name, source, is_default, is_namespace.
  • FileExports: exported_names, exported_classes, has_default_export, re-exports, optional package_scope_names.
  • CallInfo: name, edge_type (CallEdgeType), source_file, span, is_precise, target_files, chained/callback info, etc.
  • CallParseResult: Parsed calls plus optional type/local-definition data for the resolver.
  • FullParseResult: imports, exports, calls (from parse_calls).
  • SourceSpan: lo, hi (byte offsets).
  • NodeId: From arxo-types; file path identifier.

The engine (and, optionally, CallResolutionContext) uses these types to build the call graph and entity graph.
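
Because SourceSpan holds byte offsets rather than line and column numbers, recovering the spanned text is a byte-range slice of the file content. The sketch below uses a stand-in `SourceSpan` with `usize` fields (the real field types are an assumption) and two hypothetical helpers, `span_of` and `span_text`.

```rust
/// Stand-in for SourceSpan: half-open byte range [lo, hi) into the file content.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SourceSpan {
    pub lo: usize,
    pub hi: usize,
}

/// Span of the first occurrence of `needle` in `content`, if any.
pub fn span_of(content: &str, needle: &str) -> Option<SourceSpan> {
    let lo = content.find(needle)?;
    Some(SourceSpan { lo, hi: lo + needle.len() })
}

/// Text covered by a span. Returns None instead of panicking when the span
/// is out of bounds or does not fall on UTF-8 character boundaries.
pub fn span_text(content: &str, span: SourceSpan) -> Option<&str> {
    content.get(span.lo..span.hi)
}
```

Using `str::get` rather than indexing keeps a stale or miscomputed span from panicking, which matters when spans are produced by one component and consumed by another.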

use std::sync::Arc;
use arxo_lang_api::{LanguageProvider, LanguageRegistry};

let mut registry = LanguageRegistry::new();
registry.register(Arc::new(MyLangProvider) as Arc<dyn LanguageProvider>);

// Lookup: both return Option<Arc<dyn LanguageProvider>>
let by_extension = registry.provider_for_extension("myl");
let by_id = registry.provider_for_id("my_lang");

// All extensions (iterator over &'static str) and
// all providers (iterator over &Arc<dyn LanguageProvider>)
let extensions = registry.all_extensions();
let providers = registry.providers();

// Detect languages in a directory:
// returns Result<Vec<(Arc<dyn LanguageProvider>, f64, usize)>>
let detected = registry.detect_languages(path, exclude_patterns);

detect_languages walks the directory (respecting exclude patterns), counts files per provider by extension, and returns (provider, weight, file_count) for each language that has files. The engine uses this for language: auto.
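
The counting step described above can be sketched as follows. This version works on a list of paths instead of walking a directory, identifies languages by id string instead of provider handles, treats an exclude pattern as an exact path-segment match, and defines weight as the share of matched files. All of these are assumptions made for the example; the engine's actual glob handling and weight formula are not documented on this page.

```rust
use std::collections::HashMap;
use std::path::Path;

/// Count files per language by extension and return
/// (language id, weight, file_count), largest count first.
pub fn detect_languages(
    files: &[&str],
    ext_to_lang: &HashMap<&str, &'static str>,
    exclude: &[&str],
) -> Vec<(&'static str, f64, usize)> {
    let mut counts: HashMap<&'static str, usize> = HashMap::new();
    for file in files {
        // Skip files that have an excluded directory name as a path segment.
        if exclude.iter().any(|pat| file.split('/').any(|seg| seg == *pat)) {
            continue;
        }
        let ext = match Path::new(*file).extension().and_then(|e| e.to_str()) {
            Some(ext) => ext,
            None => continue,
        };
        if let Some(lang) = ext_to_lang.get(ext) {
            *counts.entry(*lang).or_insert(0) += 1;
        }
    }

    // Assumed weight: fraction of all matched files that belong to the language.
    let total: usize = counts.values().sum();
    let mut out: Vec<(&'static str, f64, usize)> = counts
        .into_iter()
        .map(|(lang, n)| (lang, n as f64 / total as f64, n))
        .collect();
    // Sort by count (descending), then id, so the output is deterministic.
    out.sort_by(|a, b| b.2.cmp(&a.2).then(a.0.cmp(b.0)));
    out
}
```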

The engine implements CallResolutionContext (in arxo-lang-api) to resolve call targets (e.g. import source → NodeId, same-package/same-directory resolution). Your provider does not implement this; it only uses it if you implement post_process and the engine passes a context that implements LanguagePostProcessContext (and possibly CallResolutionContext). Built-in providers (TypeScript, Java, Kotlin, Python, etc.) hook into this for resolution; see arxo-lang-* crates for examples.
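
To give a feel for the two resolution strategies mentioned above (import source → target, then a same-directory fallback), here is a minimal sketch. The function `resolve_call_target`, its parameters, and the use of file paths in place of NodeId are hypothetical; the engine's CallResolutionContext is not reproduced here and may apply different rules.

```rust
use std::collections::{HashMap, HashSet};
use std::path::{Path, PathBuf};

/// Resolve a call by name.
/// 1. If the name was imported, use the file the import resolved to.
/// 2. Otherwise look for a file in the caller's directory that defines it.
pub fn resolve_call_target(
    call_name: &str,
    caller_file: &Path,
    imports: &HashMap<String, PathBuf>, // local name -> resolved file
    definitions: &HashMap<PathBuf, HashSet<String>>, // file -> defined names
) -> Option<PathBuf> {
    if let Some(target) = imports.get(call_name) {
        return Some(target.clone());
    }

    let dir = caller_file.parent()?;
    let mut same_dir: Vec<&PathBuf> = definitions
        .iter()
        .filter(|(file, names)| file.parent() == Some(dir) && names.contains(call_name))
        .map(|(file, _)| file)
        .collect();
    // Pick the lexicographically first match so the result is deterministic.
    same_dir.sort();
    same_dir.first().map(|file| (*file).clone())
}
```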

use std::path::Path;
use std::sync::Arc;

use arxo_lang_api::{LanguageProvider, LanguageRegistry, FileImports, FileExports, CallParseResult, FullParseResult};
use arxo_types::core::types::NodeId;
use arxo_types::data::effects::EffectType;
use arxo_types::data::import_graph::ImportGraph;
use async_trait::async_trait;

struct MyLangProvider;

#[async_trait]
impl LanguageProvider for MyLangProvider {
    fn id(&self) -> &'static str { "my_lang" }
    fn name(&self) -> &'static str { "My Language" }
    fn extensions(&self) -> &'static [&'static str] { &["myl"] }

    async fn parse_imports(&self, _path: &Path, _content: &str) -> anyhow::Result<FileImports> {
        // Parse content and fill FileImports
        Ok(FileImports::new())
    }

    async fn parse_exports(&self, _path: &Path, _content: &str) -> anyhow::Result<FileExports> {
        Ok(FileExports::default())
    }

    async fn parse_directory(&self, _path: &Path, _exclude: &[String]) -> anyhow::Result<ImportGraph> {
        // Walk path, skip excluded patterns, parse files, build ImportGraph
        Ok(ImportGraph::new())
    }

    async fn parse_calls(&self, _path: &Path, _content: &str, _node_id: &NodeId) -> anyhow::Result<CallParseResult> {
        Ok(CallParseResult::default())
    }

    fn parse_file_full_sync(&self, _path: &Path, _content: &str, _node_id: &NodeId) -> anyhow::Result<FullParseResult> {
        // Sync version: e.g. block_on(parse_imports/parse_exports/parse_calls) or implement inline
        Ok(FullParseResult {
            imports: FileImports::new(),
            exports: FileExports::default(),
            calls: CallParseResult::default(),
        })
    }

    async fn parse_effects(&self, _path: &Path, _content: &str) -> anyhow::Result<Vec<(EffectType, String)>> {
        Ok(Vec::new())
    }
}

// Register with engine's registry (engine-specific; see integration docs)
// registry.register(Arc::new(MyLangProvider));

How you register a custom provider depends on how you run the engine:

  • Rust (Orchestrator): The engine typically builds a LanguageRegistry and passes it into the data store or orchestrator. You may need to extend the engine’s registration point (e.g. a hook or a builder that accepts extra providers) if you are embedding the engine. The open-source arxo-lang-* crates show how built-in providers are registered.
  • FFI: Custom providers are not currently exposed via the C FFI; the FFI uses the engine’s built-in set of languages. To support a new language from FFI, you would add it inside the engine and ship an updated binary.
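
If you embed the engine and control the registration point, a builder that accepts extra providers is one possible shape for that hook. The sketch below is self-contained and entirely hypothetical: `Provider`, `Registry`, `BuiltinLang`, and `build_registry` are simplified stand-ins that only imitate the register and lookup calls shown earlier on this page, not the engine's real types.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Stand-in for the identity part of LanguageProvider.
pub trait Provider: Send + Sync {
    fn id(&self) -> &'static str;
    fn extensions(&self) -> &'static [&'static str];
}

/// Stand-in registry indexed by id and by extension.
#[derive(Default)]
pub struct Registry {
    by_id: HashMap<&'static str, Arc<dyn Provider>>,
    by_ext: HashMap<&'static str, Arc<dyn Provider>>,
}

impl Registry {
    pub fn register(&mut self, provider: Arc<dyn Provider>) {
        for ext in provider.extensions() {
            // In this sketch a later registration wins for a shared extension.
            self.by_ext.insert(*ext, Arc::clone(&provider));
        }
        self.by_id.insert(provider.id(), provider);
    }

    pub fn provider_for_extension(&self, ext: &str) -> Option<Arc<dyn Provider>> {
        self.by_ext.get(ext).cloned()
    }

    pub fn provider_for_id(&self, id: &str) -> Option<Arc<dyn Provider>> {
        self.by_id.get(id).cloned()
    }
}

pub struct BuiltinLang;
impl Provider for BuiltinLang {
    fn id(&self) -> &'static str { "builtin" }
    fn extensions(&self) -> &'static [&'static str] { &["blt"] }
}

pub struct MyLangProvider;
impl Provider for MyLangProvider {
    fn id(&self) -> &'static str { "my_lang" }
    fn extensions(&self) -> &'static [&'static str] { &["myl"] }
}

/// Register the built-in set first, then any extra providers from the embedder.
pub fn build_registry(extra: Vec<Arc<dyn Provider>>) -> Registry {
    let mut registry = Registry::default();
    registry.register(Arc::new(BuiltinLang));
    for provider in extra {
        registry.register(provider);
    }
    registry
}
```

Registering extras after the built-ins lets an embedder override a built-in extension mapping in this sketch; whether the real registry allows that is not stated here.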

For full details, see the engine’s API and the arxo-lang-* implementations (e.g. arxo-lang-typescript, arxo-lang-rust).

  • Graph types: What ImportGraph and call-related types look like
  • Configuration: data.language and language_presets
  • Plugin system: Metrics that consume the data produced by providers