Language Provider API
You can add support for additional languages by implementing the LanguageProvider trait from arxo-lang-api and registering your provider with the engine’s LanguageRegistry. The engine uses providers to parse imports, exports, calls, and effects, and optionally to resolve modules and post-process call data.
:::info For engine extenders This page describes internal extension APIs used when building the engine or adding support for new languages. When using the closed-source engine via arxo-loader or the FFI, you cannot register language providers; the public API is load, version, and analyze → JSON. See Rust API and FFI API for the public library API. :::
Overview
- Trait: LanguageProvider in arxo-lang-api
- Registry: LanguageRegistry; the engine (or your integration) registers built-in and custom providers
- Lookup: By file extension (provider_for_extension) or by language id (provider_for_id)
- Detection: detect_languages scans a directory and returns which providers have files there (with counts and weights)
Dependencies
    [dependencies]
    arxo-lang-api = "0.1"
    arxo-types = "0.1"
    async-trait = "0.1"
    anyhow = "1.0"

LanguageProvider Trait
Required: identity and extensions

    fn id(&self) -> &'static str;                     // e.g. "my_lang"
    fn name(&self) -> &'static str;                   // e.g. "My Language"
    fn extensions(&self) -> &'static [&'static str];  // e.g. &["myl", "mylang"]

- id: Unique identifier; used in config and registry lookup.
- name: Human-readable name.
- extensions: File extensions this provider handles (without the dot). The registry maps each extension to this provider.
Required: parsing
    async fn parse_imports(&self, path: &Path, content: &str) -> Result<FileImports>;
    async fn parse_exports(&self, path: &Path, content: &str) -> Result<FileExports>;
    async fn parse_directory(&self, path: &Path, exclude: &[String]) -> Result<ImportGraph>;
    async fn parse_calls(&self, path: &Path, content: &str, node_id: &NodeId) -> Result<CallParseResult>;
    fn parse_file_full_sync(&self, path: &Path, content: &str, node_id: &NodeId) -> Result<FullParseResult>;
    async fn parse_effects(&self, path: &Path, content: &str) -> Result<Vec<(EffectType, String)>>;

- parse_imports: Extract import specifiers and local names for one file. Return FileImports (symbols, wildcards, optional package name).
- parse_exports: Extract exported names and classes, the default export, and re-exports. Return FileExports.
- parse_directory: Walk the directory (respecting exclude), parse all relevant files, and build an ImportGraph (nodes = files, edges = imports). Used for initial import-graph construction.
- parse_calls: Extract call sites from the file. node_id is the file’s NodeId. Return CallParseResult (calls plus optional type/local-definition info).
- parse_file_full_sync: Synchronous “full file” parse: imports + exports + calls. Default implementation calls the async parsers; you can override for sync-only contexts (e.g. rayon).
- parse_effects: Detect side-effect kinds (IO, network, storage, log, time, random, mutation, LLM) and return (EffectType, API name) pairs per file.
Types FileImports, FileExports, CallParseResult, FullParseResult, and EffectType are from arxo-lang-api and arxo-types::data::effects.
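To make the shape of parse_imports more concrete, here is a minimal std-only sketch. It assumes a made-up import syntax (import Name from "pkg" and import Name as Alias from "pkg"), and ImportedSymbol here is a local stand-in that only mirrors the field names documented on this page; it is not the real arxo-lang-api type, and the function name parse_imports_sketch is hypothetical.

```rust
use std::collections::HashMap;

/// Local stand-in mirroring the documented ImportedSymbol fields
/// (an assumption for illustration, not the real arxo-lang-api type).
#[derive(Debug, Clone, PartialEq)]
pub struct ImportedSymbol {
    pub local_name: String,
    pub original_name: String,
    pub source: String,
    pub is_default: bool,
    pub is_namespace: bool,
}

/// Parses a hypothetical line-based import syntax:
///   import Name from "pkg/mod"
///   import Name as Alias from "pkg/mod"
/// Returns a map keyed by the local name; other lines are ignored.
pub fn parse_imports_sketch(content: &str) -> HashMap<String, ImportedSymbol> {
    let mut symbols = HashMap::new();
    for line in content.lines() {
        let tokens: Vec<&str> = line.split_whitespace().collect();
        let (original, local, source) = match tokens.as_slice() {
            ["import", name, "from", src] => (*name, *name, *src),
            ["import", name, "as", alias, "from", src] => (*name, *alias, *src),
            _ => continue,
        };
        symbols.insert(
            local.to_string(),
            ImportedSymbol {
                local_name: local.to_string(),
                original_name: original.to_string(),
                source: source.trim_matches('"').to_string(),
                is_default: false,
                is_namespace: false,
            },
        );
    }
    symbols
}
```

A real provider would use a proper parser for its language and return FileImports from the async trait method; the sketch only shows the local-name to symbol mapping.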
Optional: default implementation
parse_file_full (async) has a default implementation that composes parse_imports, parse_exports, and parse_calls into a FullParseResult. Override it only if you need different behavior.
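The composition pattern can be sketched with a simplified, synchronous trait. Everything here is a stand-in chosen for illustration: ParserSketch, FullParse, KeywordParser, and the String error type are assumptions, not the real async trait or result types.

```rust
/// Simplified stand-in for FullParseResult (assumption, not the real type).
#[derive(Debug, Default, PartialEq)]
pub struct FullParse {
    pub imports: Vec<String>,
    pub exports: Vec<String>,
    pub calls: Vec<String>,
}

pub trait ParserSketch {
    fn parse_imports(&self, content: &str) -> Result<Vec<String>, String>;
    fn parse_exports(&self, content: &str) -> Result<Vec<String>, String>;
    fn parse_calls(&self, content: &str) -> Result<Vec<String>, String>;

    /// Default method: runs the three parsers and bundles the results.
    /// The first error short-circuits via `?`.
    fn parse_file_full(&self, content: &str) -> Result<FullParse, String> {
        Ok(FullParse {
            imports: self.parse_imports(content)?,
            exports: self.parse_exports(content)?,
            calls: self.parse_calls(content)?,
        })
    }
}

/// Toy implementor: collects the text after a leading keyword on each line.
pub struct KeywordParser;

fn words_after(content: &str, keyword: &str) -> Vec<String> {
    content
        .lines()
        .filter_map(|line| line.trim().strip_prefix(keyword))
        .map(|rest| rest.trim().to_string())
        .collect()
}

impl ParserSketch for KeywordParser {
    fn parse_imports(&self, content: &str) -> Result<Vec<String>, String> {
        Ok(words_after(content, "import "))
    }
    fn parse_exports(&self, content: &str) -> Result<Vec<String>, String> {
        Ok(words_after(content, "export "))
    }
    fn parse_calls(&self, content: &str) -> Result<Vec<String>, String> {
        Ok(words_after(content, "call "))
    }
}
```

An implementor that only defines the three required methods gets parse_file_full for free, which is why overriding is rarely needed.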
Optional: module resolution
    fn create_module_resolver(&self, source_root: &Path) -> Option<ModuleResolverKind>;

Return Some(ModuleResolverKind::TypeScript(Box::new(my_resolver))) if your language uses a TypeScript-style module resolver (one that resolves import specifiers to file paths). The engine uses this for resolution when building the import/call graph. Default is None.
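As a rough illustration of what resolving an import specifier to a file path can involve, the sketch below resolves relative specifiers against an in-memory set of known files. RelativeResolver, its constructor, and the candidate order (exact path, path plus extension, index file in a directory) are assumptions made for this example; a real resolver would implement the engine's resolver trait and consult the file system.

```rust
use std::collections::HashSet;
use std::path::{Component, Path, PathBuf};

/// Hypothetical resolver over an in-memory set of known files.
pub struct RelativeResolver {
    known_files: HashSet<PathBuf>,
    extension: &'static str,
}

impl RelativeResolver {
    pub fn new(files: &[&str], extension: &'static str) -> Self {
        RelativeResolver {
            known_files: files.iter().map(|f| PathBuf::from(*f)).collect(),
            extension,
        }
    }

    /// Resolves `source` relative to the importing file `from_path`.
    /// Returns None for non-relative specifiers or when no candidate is known.
    pub fn resolve(&self, from_path: &Path, source: &str) -> Option<PathBuf> {
        if !source.starts_with("./") && !source.starts_with("../") {
            return None;
        }
        let base = normalize(&from_path.parent()?.join(source));

        // Candidate 2: append ".<ext>" without replacing an existing extension.
        let mut with_ext = base.clone().into_os_string();
        with_ext.push(".");
        with_ext.push(self.extension);

        let candidates = vec![
            base.clone(),
            PathBuf::from(with_ext),
            base.join(format!("index.{}", self.extension)),
        ];
        candidates.into_iter().find(|c| self.known_files.contains(c))
    }
}

/// Lexically removes `.` and `..` components (no file system access).
fn normalize(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for component in path.components() {
        match component {
            Component::CurDir => {}
            Component::ParentDir => {
                out.pop();
            }
            other => out.push(other.as_os_str()),
        }
    }
    out
}
```

Keeping the file set in memory makes the lookup deterministic and easy to test; a cache-priming hook would serve a similar purpose by avoiding repeated file system checks.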
TypeScriptModuleResolver:

    fn resolve(&self, from_path: &Path, source: &str) -> Option<PathBuf>;
    fn prime_stat(&self, path: &Path, is_file: bool) {} // optional cache priming

Optional: post-process

    fn post_process(
        &self,
        source_path: &Path,
        ctx: &mut dyn LanguagePostProcessContext,
        file_calls: &mut Vec<(NodeId, Vec<CallInfo>)>,
    ) -> Result<()>;

Called after call extraction so you can refine or resolve calls using LanguagePostProcessContext (e.g. resolve targets, attach type info). Default is a no-op.
Optional: builtins and excludes
    fn is_builtin_call(&self, call: &CallInfo) -> bool;
    fn default_exclude_patterns(&self) -> &'static [&'static str];

- is_builtin_call: Return true if the call targets a built-in or stdlib symbol (so the engine can mark it as external). Default: false.
- default_exclude_patterns: Default globs for this language (e.g. ["node_modules", "target"]). Used when resolving language presets. Default: &[].
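A minimal sketch of both hooks as free functions, under stated assumptions: CallSketch is a stand-in that carries only the call name (the real CallInfo has more fields), the builtin list and the "std." prefix rule are invented for a hypothetical language, and is_excluded treats each pattern as a plain directory name rather than a glob.

```rust
use std::path::Path;

/// Stand-in carrying only the call name (assumption; not the real CallInfo).
pub struct CallSketch {
    pub name: String,
}

/// Hypothetical builtins for an imaginary language.
const BUILTINS: &[&str] = &["print", "len", "range"];

/// A call counts as builtin if it names a known builtin
/// or lives under the (hypothetical) `std.` namespace.
pub fn is_builtin_call(call: &CallSketch) -> bool {
    BUILTINS.contains(&call.name.as_str()) || call.name.starts_with("std.")
}

/// Directories skipped by default for this imaginary language.
pub fn default_exclude_patterns() -> &'static [&'static str] {
    &["build", "vendor"]
}

/// Simplified matcher: a path is excluded if any of its components
/// equals one of the patterns. Real glob matching is out of scope here.
pub fn is_excluded(path: &Path, patterns: &[&str]) -> bool {
    path.components().any(|component| {
        component
            .as_os_str()
            .to_str()
            .map_or(false, |name| patterns.contains(&name))
    })
}
```

Returning static slices keeps both hooks allocation-free, which matters if the engine calls them once per call site or per file.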
Shared Types (arxo-lang-api)
- FileImports: symbols (local name → ImportedSymbol), wildcard_packages, package_name.
- ImportedSymbol: local_name, original_name, source, is_default, is_namespace.
- FileExports: exported_names, exported_classes, has_default_export, re-exports, optional package_scope_names.
- CallInfo: name, edge_type (CallEdgeType), source_file, span, is_precise, target_files, chained/callback info, etc.
- CallParseResult: Parsed calls plus optional type/local-definition data for the resolver.
- FullParseResult: imports, exports, calls (from parse_calls).
- SourceSpan: lo, hi (byte offsets).
- NodeId: From arxo-types; file path identifier.
The engine (and optional CallResolutionContext) use these to build the call graph and entity graph.
LanguageRegistry
    use std::sync::Arc;
    use arxo_lang_api::{LanguageProvider, LanguageRegistry};

    let mut registry = LanguageRegistry::new();
    registry.register(Arc::new(MyLangProvider) as Arc<dyn LanguageProvider>);

    // Lookup
    registry.provider_for_extension("myl") -> Option<Arc<dyn LanguageProvider>>;
    registry.provider_for_id("my_lang") -> Option<Arc<dyn LanguageProvider>>;

    // All extensions and providers
    registry.all_extensions() -> Iterator<Item = &'static str>;
    registry.providers() -> Iterator<Item = &Arc<dyn LanguageProvider>>;

    // Detect languages in a directory
    registry.detect_languages(path, exclude_patterns) -> Result<Vec<(Arc<dyn LanguageProvider>, f64, usize)>>;

The lines with -> show each method's return type; they are signatures, not runnable statements.

detect_languages walks the directory (respecting exclude patterns), counts files per provider by extension, and returns (provider, weight, file_count) for each language that has files. The engine uses this for language: auto.
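The counting step can be sketched over a plain list of paths. This is only an illustration: detect_languages_sketch is a hypothetical name, language ids stand in for providers, and the weight formula (each language's share of all matched files) is an assumption, since this page does not say how the engine computes weights.

```rust
use std::collections::HashMap;
use std::path::Path;

/// Counts files per language id by extension and returns
/// (language id, weight, file count), most files first.
/// Weight = share of matched files (an assumed formula).
pub fn detect_languages_sketch(
    files: &[&Path],
    ext_to_id: &HashMap<&str, &str>,
) -> Vec<(String, f64, usize)> {
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for file in files {
        let ext = file.extension().and_then(|e| e.to_str());
        if let Some(id) = ext.and_then(|e| ext_to_id.get(e)) {
            *counts.entry(*id).or_insert(0) += 1;
        }
    }

    // Files with unknown extensions are not part of the total.
    let total: usize = counts.values().sum();
    let mut result: Vec<(String, f64, usize)> = counts
        .into_iter()
        .map(|(id, n)| (id.to_string(), n as f64 / total as f64, n))
        .collect();

    // Sort by descending file count, then by id for a stable order.
    result.sort_by(|a, b| b.2.cmp(&a.2).then_with(|| a.0.cmp(&b.0)));
    result
}
```

A directory walk that honors exclude patterns would feed the files slice; separating the walk from the counting keeps the counting logic testable without touching disk.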
CallResolutionContext
The engine implements CallResolutionContext (in arxo-lang-api) to resolve call targets (e.g. import source → NodeId, same-package/same-directory resolution). Your provider does not implement this trait; it only uses it if you implement post_process and the engine passes a context that implements LanguagePostProcessContext (and possibly CallResolutionContext). Built-in providers (TypeScript, Java, Kotlin, Python, etc.) hook into this for resolution; see the arxo-lang-* crates for examples.
Minimal Example Skeleton
    use std::path::Path;
    use std::sync::Arc;

    use arxo_lang_api::{LanguageProvider, LanguageRegistry, FileImports, FileExports, CallParseResult, FullParseResult};
    use arxo_types::core::types::NodeId;
    use arxo_types::data::effects::EffectType;
    use arxo_types::data::import_graph::ImportGraph;
    use async_trait::async_trait;

    struct MyLangProvider;

    #[async_trait]
    impl LanguageProvider for MyLangProvider {
        fn id(&self) -> &'static str { "my_lang" }
        fn name(&self) -> &'static str { "My Language" }
        fn extensions(&self) -> &'static [&'static str] { &["myl"] }

        async fn parse_imports(&self, _path: &Path, content: &str) -> anyhow::Result<FileImports> {
            // Parse content and fill FileImports
            Ok(FileImports::new())
        }

        async fn parse_exports(&self, _path: &Path, _content: &str) -> anyhow::Result<FileExports> {
            Ok(FileExports::default())
        }

        async fn parse_directory(&self, path: &Path, exclude: &[String]) -> anyhow::Result<ImportGraph> {
            // Walk path, exclude patterns, parse files, build ImportGraph
            Ok(ImportGraph::new())
        }

        async fn parse_calls(&self, _path: &Path, _content: &str, _node_id: &NodeId) -> anyhow::Result<CallParseResult> {
            Ok(CallParseResult::default())
        }

        fn parse_file_full_sync(&self, path: &Path, content: &str, node_id: &NodeId) -> anyhow::Result<FullParseResult> {
            // Sync version: e.g. block_on(parse_imports/parse_exports/parse_calls) or implement inline
            Ok(FullParseResult {
                imports: FileImports::new(),
                exports: FileExports::default(),
                calls: CallParseResult::default(),
            })
        }

        async fn parse_effects(&self, _path: &Path, _content: &str) -> anyhow::Result<Vec<(EffectType, String)>> {
            Ok(Vec::new())
        }
    }

    // Register with engine's registry (engine-specific; see integration docs)
    // registry.register(Arc::new(MyLangProvider));

Integration with the Engine
How you register a custom provider depends on how you run the engine:

- Rust (Orchestrator): The engine typically builds a LanguageRegistry and passes it into the data store or orchestrator. If you are embedding the engine, you may need to extend its registration point (e.g. a hook or a builder that accepts extra providers). The open-source arxo-lang-* crates show how built-in providers are registered.
- FFI: Custom providers are not currently exposed via the C FFI; the FFI uses the engine’s built-in set of languages. To support a new language from FFI, you would add it inside the engine and ship an updated binary.
For full details, see the engine’s API and the arxo-lang-* implementations (e.g. arxo-lang-typescript, arxo-lang-rust).
Next steps
- Graph types: What ImportGraph and call-related types look like
- Configuration: data.language and language_presets
- Plugin system: Metrics that consume the data produced by providers