How to run OpenHands and Aider locally with Mistral Small 4 and Qwen3 Coder Next for reliable, private AI-assisted development.

SOVEREIGN DEV STUDIO v2: Self-Hosted AI Coding Agents That Actually Work

New here? The Self-Hosted AI: Start Here hub article covers the broader stack this service runs inside: the hardware tree, the inference engine choice, the minimum-viable deploy. Read that for context, then come back here for the service-specific details.

OpenHands crashes when Mistral Small 4 throws role-alternation errors in multi-turn sessions.

Quick Take

  • OpenHands handles large codebase refactors better than Aider but needs strict role alternation in prompts.
  • Mistral Small 4 works only if you disable OpenHands’ prompt extensions and match model names exactly.
  • Aider is faster for quick edits but chokes on long contexts without SGLang’s RadixAttention.
  • Both tools share the same local LLM endpoints, so you can switch without restarting the engine.

What OpenHands Actually Does

OpenHands is a multi-turn coding agent that maintains long-running sessions with full repository context. It solves the problem of repeatedly re-sending the same context for each request, which happens when you use stateless APIs like vLLM. For example, when I refactor a trading agent that spans 47 files, OpenHands keeps the entire codebase in memory via SGLang’s RadixAttention, so each follow-up request doesn’t re-parse the files. This reduces latency from 1.2 seconds per request with vLLM to 280 milliseconds with SGLang running on Mistral Small 4.

The agent works by first cloning your repository into a sandboxed workspace. It then runs a loop: user task → plan → code changes → tests → commit. Each iteration uses the same in-memory context, which is why the role-alternation bug with Mistral Small 4 matters. If OpenHands injects an extra user message, the model rejects the request with a BadRequestError. This means that without the fix in config.toml, you cannot use Mistral Small 4 at all.

Deploying OpenHands with Docker

The Docker setup is straightforward but requires three things: the right image, the right ports, and the right volumes. The compose file below pulls the latest OpenHands image, connects it to SGLang on port 8001, and mounts the workspace directory so the agent can read and write files.

  openhands:
    image: docker.all-hands.dev/all-hands-ai/openhands:latest
    platform: linux/arm64
    container_name: openhands
    environment:
      LLM_BASE_URL: http://host.docker.internal:8001/v1
      LLM_MODEL: openai/Intel/Qwen3-Coder-Next-int4-AutoRound
      LLM_API_KEY: not-needed-local
      SANDBOX_RUNTIME_CONTAINER_IMAGE: docker.all-hands.dev/all-hands-ai/runtime:latest
      WORKSPACE_BASE: /data/projects
      OPENHANDS_TELEMETRY: "false"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /data/projects:/opt/workspace_base
      - /data/openhands-state:/.openhands-state
      - /data/projects/shared:/shared:ro
    extra_hosts:
      - host.docker.internal:host-gateway
    ports:
      - "3001:3000"
    restart: unless-stopped

Start it with docker compose up -d openhands. After it’s running, open http://localhost:3001, go to Settings → LLM, and set the provider to OpenAI-Compatible with base URL http://localhost:8000/v1, model openai/Intel/Qwen3-Coder-Next-int4-AutoRound, and no API key. Save and create a new task. In practice, this setup handles 94 tokens per second on Mistral Small 4 and 69 tokens per second on Qwen3 Coder Next, which is enough for interactive coding but not for batch processing.

The Mistral Small 4 Role-Alternation Bug and Its Fix

Mistral Small 4 enforces strict role alternation: system → user → assistant → user → assistant. OpenHands’ default microagent system injects an extra user message to retrieve context, which breaks this alternation. This happens because enable_prompt_extensions = true adds a user message for every task, regardless of other settings. The result is a BadRequestError from SGLang.

The fix has three parts. First, disable prompt extensions in config.toml:

# /data/openhands-state/config.toml
[llm]
model = "openai/Mistral-Small-4"
base_url = "http://host.docker.internal:30000/v1"
api_key = "not-needed-local"
native_tool_calling = true
drop_params = true
modify_params = true

[agent]
enable_prompt_extensions = false

Second, set the model name exactly as SGLang serves it. If SGLang starts with --served-model-name Mistral-Small-4, config.toml must use model = "openai/Mistral-Small-4", not the HuggingFace path. Third, mount config.toml into the container as read-only so OpenHands picks it up immediately.

Without this fix, OpenHands is unusable with Mistral Small 4. In my case, the error appeared after the first multi-turn session, and the diagnosis script showed two consecutive user messages in the session events. Applying the fix resolved it immediately.

Aider: The Terminal Companion for Quick Edits

Aider is a CLI tool that integrates directly with your terminal and git repository. It’s ideal for small changes, one to three files, where you don’t need a full multi-turn session. For example, when I fix a typo in a 200-line Python file, Aider loads the file into memory, streams changes, and commits them automatically if auto-commits are enabled.

Install it with pip install aider-chat --break-system-packages. Configure it in ~/.aider.conf.yml:

model: openai/Intel/Qwen3-Coder-Next-int4-AutoRound
openai-api-base: http://localhost:8001/v1
openai-api-key: not-needed
auto-commits: true
dirty-commits: false
stream: true
map-tokens: 4096

The map-tokens: 4096 setting increases the context window for large codebases, which SGLang handles better than vLLM. To use it, run aider src/agents/polymarket_agent.py from your project directory. Aider streams changes in real time, so you see edits as they happen. For debugging, /run pytest tests/ executes tests directly from the chat, and /undo reverts the last commit if something breaks.

When to Use OpenHands vs. Aider

OpenHands excels at large, multi-file refactors because it keeps the entire repository in memory. Aider is faster for quick edits but struggles with long contexts without SGLang’s RadixAttention. Use OpenHands for new features or debugging with tests, and Aider for small changes or when you’re on the go via SSH. For example, I use OpenHands to refactor a trading agent that spans 47 files, but I use Aider to fix a typo in a single file while traveling on my GX10 ARM64 server.

What I Actually Use

  • OpenHands: for multi-turn coding sessions where I need full repository context and tool calls.
  • Aider: for quick terminal edits and when I’m working remotely over SSH.
  • Mistral Small 4: as the primary coding model because it’s fast and fits in 128 GB RAM when paired with SGLang.

When OpenHands and Aider stop being interchangeable

The original post frames OpenHands and Aider as alternatives that solve the same problem at different points on the deep-vs-fast curve. After enough hours with both, the boundary is sharper than that.

OpenHands is the right choice when the task involves multiple files, the agent needs to plan-then-execute, and the rollback-if-wrong cost is high enough that human review at the diff level adds value. Setup tasks where one wrong systemd config locks you out of the box, or refactors that touch a dozen files in a coordinated way, are the natural fit. The web UI plus the diff-review step is the structural feature that matters here.

Aider is the right choice when the task is a single-file or two-file edit, the loop “ask, see diff, accept or refine” needs to happen in seconds rather than minutes, and the user already knows what the change should look like roughly. Most of the per-article polish-pass work in the blog pipeline is Aider-shaped, not OpenHands-shaped. Quick edits, fast iteration, no web UI overhead.

The Mistral role-alternation bug that the original post documents (enable_prompt_extensions = false for OpenHands) has an Aider-side analog: Aider sometimes builds prompt structures that the SGLang/Mistral combination rejects with the same BadRequestError. The fix on the Aider side is --no-pretty --map-tokens 0 to keep prompt structure simpler. Different config knob, same failure mode, same root cause.

Operationally, the right mental model is that OpenHands and Aider share a model endpoint and an SSH-key-managed Git workflow but otherwise live in separate process trees. They do not coordinate, do not share state, and do not need to. A merge-conflict between an OpenHands-edited file and an Aider-edited file is just a normal Git conflict, resolved with normal Git tooling. That separation of concerns is the load-bearing design choice; trying to make them aware of each other would re-introduce the exact agent-orchestration complexity that the dual-tool setup was meant to avoid.

Stack

OpenHands AI Agent

Self-hosted multi-turn coding agent architecture

6
Storage In-memory context cache
5
Workspace Sandboxed repo clone
4
Model Runtime Mistral Small 4/Qwen3
3
LLM Endpoint SGLang/vLLM server (port 8001)
2
Agent Engine OpenHands core service
1
User Interface Web dashboard (localhost:3001)

Was this worth it? Zap the article.

Value for value, no signup. Sats go straight to the writer.