Gnoma is an agentic coding assistant written in Go and designed to be provider-agnostic. It integrates with cloud LLM providers (Anthropic, OpenAI, Google Gemini, and Mistral) as well as local backends such as Ollama and llama.cpp. A core feature is its multi-armed bandit router, which selects a model for each prompt based on factors such as capability, declared strengths, latency, and cost. Routing is transparent: the chosen 'arm' is shown in the TUI for every turn.
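
To make the routing idea concrete, here is a minimal sketch of a bandit-style selector. It is not Gnoma's implementation: the source does not say which bandit algorithm is used, so UCB1 stands in here, and the `Arm` type, the `Reward` weights, and the `Select`/`Update` functions are all hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// Arm is a hypothetical routing candidate; the fields are illustrative only.
type Arm struct {
	Name   string
	Vision bool    // declared capability
	Pulls  int     // how many times this arm has been chosen
	Reward float64 // sum of observed rewards, each in [0, 1]
}

// Reward folds quality, latency, and cost into one score clamped to [0, 1].
// The weights are arbitrary assumptions made for this sketch.
func Reward(quality, latencySec, costUSD float64) float64 {
	r := quality - 0.05*latencySec - 2.0*costUSD
	return math.Max(0, math.Min(1, r))
}

// Select returns the index of the eligible arm with the highest UCB1 score,
// or -1 if no arm satisfies the capability filter.
func Select(arms []Arm, needVision bool) int {
	total := 0
	for _, a := range arms {
		total += a.Pulls
	}
	best, bestScore := -1, math.Inf(-1)
	for i, a := range arms {
		if needVision && !a.Vision {
			continue // capability filter comes before any scoring
		}
		if a.Pulls == 0 {
			return i // try every eligible arm at least once
		}
		mean := a.Reward / float64(a.Pulls)
		bonus := math.Sqrt(2 * math.Log(float64(total)) / float64(a.Pulls))
		if s := mean + bonus; s > bestScore {
			best, bestScore = i, s
		}
	}
	return best
}

// Update records the observed reward for the arm that was used.
func Update(a *Arm, reward float64) {
	a.Pulls++
	a.Reward += reward
}

func main() {
	arms := []Arm{
		{Name: "local-small"},
		{Name: "cloud-large", Vision: true},
	}
	for turn := 0; turn < 5; turn++ {
		i := Select(arms, false)
		// Made-up feedback numbers, purely for demonstration.
		r := Reward(0.6, 1.0, 0)
		if arms[i].Name == "cloud-large" {
			r = Reward(0.9, 2.0, 0.01)
		}
		Update(&arms[i], r)
		fmt.Printf("turn %d -> %s (reward %.2f)\n", turn, arms[i].Name, r)
	}
}
```

The capability check runs before scoring so that an arm lacking a required feature is never chosen, however well it has performed on other prompts.
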
Users can set a routing preference: prioritize local models, prioritize cloud models, or let the bandit decide automatically. Gnoma also provides 'Tier 0 SLM routing', in which a small local model handles trivial tasks and more powerful providers are reserved for complex work. For privacy and security, content boundary and secret scanning are applied to all outgoing LLM messages and incoming tool results. Paths are canonicalized, Unicode is sanitized, and a `SafeProvider` boundary prevents incognito-mode data from being stored long-term. Gnoma states a 'no phone home' policy: no analytics or metrics are sent off-machine, although prompts routed to cloud providers are necessarily transmitted to those services.
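
As an illustration of what secret scanning on outgoing messages can look like, the sketch below redacts matches of two regular expressions before text leaves the process. The patterns and the `Redact` function are assumptions made for this example; the source does not describe which detectors Gnoma actually uses, and a real scanner would need a much broader rule set.

```go
package main

import (
	"fmt"
	"regexp"
)

// secretPatterns is an illustrative, deliberately small rule set.
var secretPatterns = []*regexp.Regexp{
	// Shape of an AWS access key ID: "AKIA" followed by 16 uppercase letters or digits.
	regexp.MustCompile(`AKIA[0-9A-Z]{16}`),
	// Generic "name = value" assignments with a suspicious name.
	regexp.MustCompile(`(?i)\b(?:api[_-]?key|token|secret)\s*[:=]\s*\S+`),
}

// Redact replaces every match with a placeholder and reports how many
// replacements were made.
func Redact(s string) (string, int) {
	hits := 0
	for _, p := range secretPatterns {
		s = p.ReplaceAllStringFunc(s, func(string) string {
			hits++
			return "[REDACTED]"
		})
	}
	return s, hits
}

func main() {
	out, n := Redact("deploy with api_key=abc123 please")
	fmt.Println(out, n)
}
```

The same function could be applied to incoming tool results, so that a secret read from disk by a tool is scrubbed before it reaches a provider.
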
Gnoma is extensible through hooks, skills, Model Context Protocol (MCP) servers, and plugins. It also supports vision: images can be supplied via `[Image: /path]` markers or clipboard paste in the TUI, and such prompts are routed to vision-capable models. The tool is distributed as a single static binary with multi-arch container support, and it requires no daemon or runtime dependencies. It is currently pre-1.0 (v0.3.0) and actively maintained; breaking changes may occur in minor versions.
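
The `[Image: /path]` marker can be picked out of a prompt with a small parser. The sketch below is one possible approach and rests on assumptions: the exact marker grammar Gnoma accepts (whitespace handling, escaping, relative paths) is not specified in the source, and `ImagePaths` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"regexp"
)

// imageMarker matches "[Image: <path>]" and captures the path without
// surrounding whitespace. The grammar is an assumption for this sketch.
var imageMarker = regexp.MustCompile(`\[Image:\s*([^\]]+?)\s*\]`)

// ImagePaths returns the paths of all image markers found in the prompt,
// in order of appearance.
func ImagePaths(prompt string) []string {
	var paths []string
	for _, m := range imageMarker.FindAllStringSubmatch(prompt, -1) {
		paths = append(paths, m[1])
	}
	return paths
}

func main() {
	prompt := "Why is this layout broken? [Image: /tmp/screenshot.png]"
	paths := ImagePaths(prompt)
	needVision := len(paths) > 0 // a router could use this as a capability filter
	fmt.Println(paths, needVision)
}
```

A router could treat a non-empty result as a hard requirement and restrict its candidates to vision-capable models for that turn.
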