opencode Setup: Self-Hosted AI Coding Assistant on ARM64
Correction 2026-05-13: the original article claimed opencode is not affected by the Mistral strict-alternation BadRequest bug class. A first-run test against local Mistral showed this is wrong: the auto-title generator sends two consecutive USER messages and gets rejected with HTTP 400. In addition, the opencode config set CLI does not exist (use the JSON config file). Both sections are corrected inline below.
New to self-hosting AI? The Self-Hosted AI: Start Here hub walks the hardware-decision tree, inference-engine choice, and the operational gotchas that bite hardest in the first three months.
Quick Take
- opencode is a Node-based AI coding assistant with three frontends from one config: CLI, Electron desktop app, and opencode serve web mode.
- Provider-agnostic: speaks the OpenAI completions API and points at any local endpoint (SGLang on port 30000, vLLM on 30001, anything that answers /v1/chat/completions).
- Replaces the OpenHands agent layer on this stack, which needed eight published fix articles to stay running on Mistral Small 4 because of structural Microagent-injection bugs.
- The trade-off is losing the Docker-sandbox model of OpenHands. opencode runs in your shell and modifies your files directly. Pair it with per-shell-command approval prompts and the same git discipline that keeps human-typed mistakes from being permanent.
- This article documents the install, the config for a local OpenAI-compatible endpoint, and what changes on day one of running it.
This stack ran OpenHands as the agent layer from April through May. The setup recipe is documented at OpenHands Setup with Mistral-via-SGLang and the BadRequest fix at OpenHands BadRequest Fix. Both articles are still accurate for anyone running OpenHands today: the recipes work and the workarounds hold. What changed is that the bug class kept generating new shapes, and the cost-benefit on this hardware no longer favored keeping OpenHands in the chair.
opencode vs OpenHands at the architecture level
| | OpenHands | opencode |
|---|---|---|
| Runtime | Docker container (sandbox) | Node.js CLI plus TUI |
| Install | docker run --rm ... (multi-GB image, plus runtime sandbox images on first agent action) | npm i -g opencode-ai (~50 MB) |
| Sandbox | full Docker sandbox with its own shell, file system, network namespace | runs in YOUR shell, modifies YOUR files directly |
| Frontends | Web UI only (single browser tab against the container) | CLI plus Desktop App (Electron) plus Web Server mode |
| Provider | OpenAI-compatible plus Anthropic; per-provider plumbing | OpenAI-compatible (any provider that speaks the API) |
| BadRequest bug class | Ships the Mistral strict-alternation bug structurally (#14287) | Narrower variant via auto-title generator (see correction below) |
The flexibility row matters most in practice. opencode runs in three modes from one ~/.opencode/ config. Start a refactor in the desktop app on the couch, drop into the CLI from a terminal to verify a build, open opencode serve to share the session with a second machine. The state is shared, not the process. OpenHands is single-mode by design.
The sandbox row is the cost worth being honest about. OpenHands’ Docker-sandbox model meant a runaway agent could not rm -rf your home directory because the agent did not live there. opencode runs in your actual shell. An over-eager tool call can do real damage to real files. The mitigation is the per-shell-command approval prompt, an explicit allowlist, and the discipline you already apply to keep human-typed mistakes from being permanent (commit early, branch always, never --force without thinking).
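That discipline is cheap to make concrete. A minimal sketch of a pre-session checkpoint (the branch name is illustrative):

# Checkpoint the working tree before handing the shell to the agent
git switch -c agent/session-2026-05-13
git add -A && git commit -m "checkpoint: pre-agent state"

# After the session, review exactly what the agent changed
git diff main...HEAD   # assumes your default branch is main

Anything the agent breaks is then one git restore or one branch delete away from gone.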
Install opencode
Two install paths; both produce the same CLI binary plus a launcher that opens the desktop app:
# Path A: npm global, fast, ARM64-supported
npm i -g opencode-ai@latest
# Path B: Homebrew tap (macOS or Linux with brew)
brew install anomalyco/tap/opencode
# Optional: Electron desktop app via Homebrew cask
brew install --cask opencode-desktop
After install, verify:
opencode --version
opencode --help
The desktop app reads the same ~/.opencode/config.json that the CLI writes. Install the cask only if you actually want the GUI. The CLI alone is enough for terminal-first workflows.
Point opencode at your local inference endpoint
opencode does not ship with a provider preset for self-hosted SGLang or vLLM, but the openai-compatible provider handles them. Config is in ~/.opencode/config.json:
{
"provider": "openai-compatible",
"api_base": "http://localhost:30000/v1",
"api_key": "not-needed-local",
"model": "Mistral-Small-4"
}
Correction 2026-05-13: the opencode config set ... CLI does not exist in 1.14.48, and the real schema is a provider map rather than the flat shape sketched above. The real config path is editing the JSON file directly. Full example with both local providers in one file:
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"local-sglang": {
"npm": "@ai-sdk/openai-compatible",
"name": "Local SGLang Mistral",
"options": {
"baseURL": "http://localhost:30000/v1",
"apiKey": "not-needed-local"
},
"models": {
"Mistral-Small-4": {"name": "Mistral Small 4 (local SGLang)"}
}
},
"local-qwen": {
"npm": "@ai-sdk/openai-compatible",
"name": "Local Qwen3.6 vLLM",
"options": {
"baseURL": "http://localhost:30001/v1",
"apiKey": "not-needed-local"
},
"models": {
"qwen3.6-35b": {"name": "Qwen3.6-35B-A3B (local vLLM)"}
}
}
}
}
The apiKey field is a placeholder. SGLang on a private network does not require authentication, but the OpenAI client library refuses to send a request without a non-empty key. not-needed-local is conventional.
The model name under each provider’s models block must match the --served-model-name the inference server published. Select the active provider/model at run time with opencode run --model local-sglang/Mistral-Small-4 "..." (or pick from the TUI provider switcher).
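To see which names the servers actually published, ask the models endpoint. The launch commands below sketch the matching side; the model paths are illustrative, but --served-model-name and --port are real flags on both servers:

# SGLang: the model key in opencode's config must match --served-model-name
python -m sglang.launch_server --model-path /models/mistral-small-4 \
  --served-model-name Mistral-Small-4 --port 30000

# vLLM: same idea
vllm serve /models/qwen3.6-35b-a3b --served-model-name qwen3.6-35b --port 30001

# Confirm what each server published before blaming opencode for a name mismatch
curl -s http://localhost:30000/v1/models
curl -s http://localhost:30001/v1/models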
Context-length gotcha (Mistral safer-config users)
opencode reserves max_tokens=32000 for completion by default on the build agent. With an 11552-token system prompt plus the 32000-token reserve, total request size is 43552 tokens, which exceeds the Mistral safer-launch context of 32768. Two options:
- Restore the Mistral context to 65536 (revert the --context-length flag) and accept the memory-pressure trade-off.
- Switch to Qwen3.6 (native 262144 context, no overflow at any practical workload).
A per-agent max_tokens override in opencode.json is theoretically a third option, but it has not been validated on this stack yet.
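If it does pan out, the shape would presumably be an agent-level option in the config. The placement and field name below are an assumption, not verified against the 1.14.48 schema:

{
  "$schema": "https://opencode.ai/config.json",
  "agent": {
    "build": {
      "maxTokens": 16000
    }
  }
}

A 16000 reserve would put the total at 11552 + 16000 = 27552 tokens, fitting under the 32768 safer-launch context with about 5200 tokens to spare for conversation history.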
Correction 2026-05-13: opencode has a narrower variant of the BadRequest bug
First run against the local Mistral-Small-4 endpoint hit an HTTP 400 triggered by the auto-title generator. On every new session, opencode sends two consecutive user messages to the model:
{"role": "user", "content": "Generate a title for this conversation:\n"},
{"role": "user", "content": "<the actual user prompt>"}
Mistral 400 response:
After the optional system message, conversation roles must alternate user
and assistant roles except for tool calls and results.
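The rejection is reproducible by hand, which pins it on the chat template rather than on anything opencode-specific. A minimal repro against the local endpoint from the config above:

curl -s http://localhost:30000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "Mistral-Small-4",
    "messages": [
      {"role": "user", "content": "Generate a title for this conversation:\n"},
      {"role": "user", "content": "hello"}
    ]
  }'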
Same bug class as the OpenHands RecallAction injection, narrower scope: only the title generator does it, not every turn. Workarounds:
- Switch to Qwen3.6-35B-A3B (the Qwen3.6 migration endpoint). Qwen3.6’s chat template does not enforce strict alternation, so the second USER passes.
- Sidecar proxy that collapses consecutive USER messages, same pattern as the OpenClaw setup uses (minimal sketch after this list).
- Disable opencode auto-titling if/when the config flag exists (open question, not in 1.14.48 docs).
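The sidecar option is small enough to sketch here. A minimal version, assuming the upstream is SGLang on localhost:30000 and that buffered (non-streaming) responses are acceptable; port 30010 and the filename are arbitrary choices:

# merge_proxy.py -- collapse consecutive user messages before forwarding
import json
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

UPSTREAM = "http://localhost:30000"  # the real inference endpoint

def collapse_users(messages):
    # Merge runs of consecutive "user" messages into one, preserving order.
    out = []
    for msg in messages:
        if out and msg.get("role") == "user" and out[-1].get("role") == "user":
            out[-1]["content"] = str(out[-1]["content"]) + "\n\n" + str(msg["content"])
        else:
            out.append(dict(msg))
    return out

class Proxy(BaseHTTPRequestHandler):
    def do_POST(self):
        raw = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        payload = json.loads(raw)
        if isinstance(payload.get("messages"), list):
            payload["messages"] = collapse_users(payload["messages"])
        req = urllib.request.Request(
            UPSTREAM + self.path,
            data=json.dumps(payload).encode(),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        try:
            resp = urllib.request.urlopen(req)
            status, body = resp.status, resp.read()
        except urllib.error.HTTPError as e:
            status, body = e.code, e.read()  # forward upstream 4xx/5xx verbatim
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 30010), Proxy).serve_forever()

Point the provider baseURL at http://localhost:30010/v1 instead of the real endpoint. Note the sketch buffers the whole response, so it breaks streaming; a production version would pass SSE chunks through.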
The original sales pitch claim “opencode does not inject synthetic USER messages” was based on the architecture overview, not on a first-run test. The test caught what the overview missed.
Use opencode
Start a session against your current directory:
cd /path/to/your/repo
opencode
This drops into the TUI with a chat panel and a file-tree pane. opencode reads the repo, indexes the file structure, and waits for input. Type a task in natural language:
Refactor the error handling in src/api.ts to use the AppError class instead of raw
throw new Error.
opencode plans the edit, shows a diff preview, and prompts before writing files. Approve, deny, or edit the plan inline. The diff is applied to your working tree. No commit, no push, no docker volume copy-out. The change is in your repo as if you had typed it yourself.
For shell commands the agent wants to run (run tests, install a package, check git status), opencode prompts per-command unless the command is on your allowlist. The allowlist is editable in ~/.opencode/allowlist.json. Build it up over time from commands you trust.
Open questions, follow-up article
This article is the install and the why. A day-2 field report follows after running opencode end-to-end against Qwen3.6 for a real coding session, with the open questions:
- Does opencode’s tool-call parser handle Qwen3.6’s qwen3_coder parser format cleanly, or does it expect OpenAI’s function-calling shape and need translation?
- Does the per-shell-command approval prompt fatigue out in practice, or does the allowlist mechanism take care of the 80% case?
- Does session-state sharing CLI ↔ desktop ↔ opencode serve actually work, or is it a marketing claim with caveats?
- Does the loss of the Docker sandbox bite at any point, or is git discipline enough?
Target for the field report: end of week.
Rollback to OpenHands
If opencode does not work out, rollback is five minutes of work. The OpenHands container image was removed but is one docker pull ghcr.io/all-hands-ai/openhands:latest away. The state directory archived at /data/openhands-state.archive-2026-05-13/ (patches, sessions, config.toml) moves back to /data/openhands-state/ with a single mv, and the legacy recreate-openhands.sh lives at /data/scripts/archive/recreate-openhands.sh.2026-05-13. The setup recipe at OpenHands Setup with Mistral-via-SGLang is still accurate.
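Spelled out as commands, straight from the paths above (assumes the archived recreate script still matches the restored state layout):

# Re-fetch the image, restore the archived state, recreate the container
docker pull ghcr.io/all-hands-ai/openhands:latest
mv /data/openhands-state.archive-2026-05-13 /data/openhands-state
bash /data/scripts/archive/recreate-openhands.sh.2026-05-13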
Cross-references
- The OpenHands recipe this article supersedes: OpenHands Setup with Mistral-via-SGLang
- The Mistral BadRequest fix that worked but did not close the bug class: OpenHands BadRequest Fix
- The upstream issue documenting the RecallAction-as-USER bug: All-Hands-AI/OpenHands #14287
- The LLM-stack migration this depends on: Spark Arena Rank 4 Made Me Add Qwen3.6
- The TTS spike running alongside the LLM migration: TTS Spike Day 1: VibeVoice Sample Matrix
What I Am Trying
- opencode CLI plus Electron desktop, single config in ~/.opencode/
- Pointed at local Mistral SGLang on port 30000 today, switching to Qwen3.6-35B-A3B vLLM on port 30001 once the bring-up is confirmed
- No structural Mistral-strict-alternation surface in the agent loop (the auto-title generator is a narrower surface, per the correction above), no docker-sandbox layer
- Git discipline as the only safety net for in-shell agent actions; explicit allowlist for the common shell calls