Fix SGLang 400 errors from Vibe by patching role alternation, reasoning_effort, and tool-call handling without forking the codebase.

Vibe 400 Fix


Alternating Roles Breakage in message_utils.py

SGLang enforces strict role alternation: system? → user → assistant → user → ... and tool-calls must follow the pattern assistant(tool_calls) → tool(result) → assistant. Vibe violates this in three common scenarios, especially when users hit ESC mid-tool-call or hammer the chat without clearing history. The errors manifest as 400 Bad Request responses from SGLang, often with unhelpful messages like "Invalid message role sequence" or "Tool call without tool result".

Broken patterns and their 400 errors:

| Situation | Pattern | Result | Example error |
| --- | --- | --- | --- |
| Long session with multiple user inputs | user → user | 400 | "Invalid message role sequence" |
| ESC without active tool call | assistant(empty) → user | 400 | "Invalid message role sequence" |
| ESC during tool call | assistant(tool_calls) → user (missing tool result) | 400 | "Tool call without tool result" |
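To check a message list against these patterns before sending it, a small validator helps. The sketch below is an approximation of the rules described in this post, not SGLang's actual validator; `find_role_violation` is a hypothetical helper name, and it assumes messages are plain dicts with `role`, `content`, and optional `tool_calls` keys.

```python
def find_role_violation(messages):
    """Return a description of the first sequencing problem, or None if none is found."""
    prev = None
    for i, msg in enumerate(messages):
        role = msg["role"]
        if role == "system" and i != 0:
            return f"message {i}: system message after the start"
        if role == "user" and prev is not None and prev["role"] == "user":
            return f"message {i}: user follows user"
        # A tool result must follow an assistant tool call or another tool result.
        if role == "tool" and (prev is None or not (prev.get("tool_calls") or prev["role"] == "tool")):
            return f"message {i}: tool result without a preceding tool call"
        if prev is not None and prev.get("tool_calls") and role != "tool":
            return f"message {i}: tool call without tool result"
        if role == "assistant" and not msg.get("content") and not msg.get("tool_calls"):
            return f"message {i}: empty assistant message"
        prev = msg
    if prev is not None and prev.get("tool_calls"):
        return "last message: tool call without tool result"
    return None
```

Running it over the three broken patterns from the table flags each of them, while a complete user → assistant(tool_calls) → tool → assistant sequence passes.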

The fix is not a runtime monkey patch but a surgical replacement of merge_consecutive_user_messages() in message_utils.py. The original function aggressively merges consecutive user messages but fails to handle tool-call sequences properly, leaving dangling tool_calls without corresponding tool results. This is particularly problematic with Mistral Small 4, where SGLang’s strict validation rejects any request with incomplete tool-call sequences.

Here’s the patched version. It also handles the edge case where dropping an incomplete tool sequence leaves two user messages adjacent (for example, user → assistant(tool_calls) → user becomes user → user), which would otherwise remain unmerged and trigger 400 errors:

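import pathlib
import re
import sys

TARGET_FILE = pathlib.Path.home() / ".local/share/uv/tools/mistral-vibe/lib/python3.12/site-packages/vibe/core/llm/message_utils.py"
MARKER = "# V4-PATCH-APPLIED"

# Replacement source for merge_consecutive_user_messages().
NEW_FUNC = """def merge_consecutive_user_messages(messages):
    def _merge_pass(msgs):
        out = []
        for msg in msgs:
            # Drop empty assistant turns (left behind by ESC), but keep tool-call turns,
            # which normally carry no content.
            if msg["role"] == "assistant" and not msg.get("content") and not msg.get("tool_calls"):
                continue
            prev = out[-1] if out else None
            # Merge into the previous message so runs of any length collapse.
            # Tool results and tool-call turns are never merged.
            mergeable = (
                prev is not None
                and prev["role"] == msg["role"]
                and msg["role"] != "tool"
                and not prev.get("tool_calls")
                and not msg.get("tool_calls")
            )
            if mergeable:
                parts = [prev.get("content") or "", msg.get("content") or ""]
                out[-1] = {**prev, "content": " ".join(p for p in parts if p)}
            else:
                out.append(msg)
        return out

    def _drop_incomplete_tools(msgs):
        out = []
        for i, msg in enumerate(msgs):
            # An assistant tool call must be followed directly by a tool result.
            if msg.get("tool_calls") and (i + 1 >= len(msgs) or msgs[i + 1]["role"] != "tool"):
                continue
            out.append(msg)
        return out

    msgs = _merge_pass(messages)
    msgs = _drop_incomplete_tools(msgs)
    msgs = _merge_pass(msgs)
    return msgs"""

def apply_patch():
    src = TARGET_FILE.read_text()
    if MARKER in src:
        return  # Already patched
    # Replace the whole function, up to the next top-level def or end of file.
    # A callable replacement keeps re from treating NEW_FUNC as a template.
    patched, count = re.subn(
        r"def merge_consecutive_user_messages\(messages\):.*?(?=\ndef |\Z)",
        lambda _match: NEW_FUNC,
        src,
        count=1,
        flags=re.DOTALL,
    )
    if count == 0:
        sys.exit("merge_consecutive_user_messages() not found; nothing patched")
    TARGET_FILE.write_text(patched + "\n" + MARKER + "\n")

if __name__ == "__main__":
    apply_patch()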

Watch out: The three-pass design is critical. Skipping the second pass, _drop_incomplete_tools(), leaves tool-call sequences broken, and skipping the final merge pass leaves the user messages that the drop made adjacent unmerged. Either way, SGLang will still reject the request. Test with a session that includes tool calls to verify the patch works.
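Before letting a script rewrite an installed file, it can be worth dry-running the regex substitution on an in-memory string. The sketch below does that with the same pattern; `replace_function` and the sample source are hypothetical and exist only for illustration.

```python
import re

PATTERN = r"def merge_consecutive_user_messages\(messages\):.*?(?=\ndef |\Z)"

def replace_function(src, new_func):
    """Return (patched_source, replacement_count) without touching any file."""
    # The lookahead stops at the next top-level "def" (nested defs are indented,
    # so they do not match) or at the end of the source.
    return re.subn(PATTERN, lambda _match: new_func, src, count=1, flags=re.DOTALL)

sample = (
    "import os\n"
    "\n"
    "def merge_consecutive_user_messages(messages):\n"
    "    return messages\n"
    "\n"
    "def other_helper():\n"
    "    return 1\n"
)
replacement = "def merge_consecutive_user_messages(messages):\n    return list(messages)\n"
patched, count = replace_function(sample, replacement)
```

One limitation of this pattern: it only stops at a following top-level `def`. If the next top-level statement is a `class`, an `async def`, or a decorator, the match runs past it, so inspect the patched result before writing it back.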


reasoning_effort Missing in reasoning_adapter.py

SGLang rejects requests that omit reasoning_effort or that carry a value it considers invalid, such as "none" or "low". Vibe’s default config omits reasoning_effort when thinking="off" and sends "low" when thinking="low"; SGLang rejects both, with errors like "reasoning_effort must be one of: high, medium, low" or "Invalid reasoning effort level".

Broken payload example (vibe v1.2.3):

# Original code in reasoning_adapter.py (vibe v1.2.3)
payload = {"model": "Mistral-Small-4", "messages": [...], "temperature": temperature}
if thinking != "off":
    payload["reasoning_effort"] = thinking  # sends e.g. "low"; nothing is sent when thinking == "off"

Fix:

# Patched code (vibe v1.2.3)
payload = {"model": "Mistral-Small-4", "messages": [...], "temperature": 1.0}
payload["reasoning_effort"] = "high"  # Only valid values accepted

Gotcha: Setting temperature to anything other than 1.0 while reasoning_effort="high" triggers SGLang errors like "temperature must be 1.0 when reasoning_effort is high". Lock it to 1.0 or the request fails.
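The two constraints can be enforced in one place. This is a minimal sketch, assuming the request payload is a plain dict; `normalize_for_sglang` is a hypothetical helper and not part of Vibe.

```python
def normalize_for_sglang(payload):
    """Return a copy of payload with the settings this post found SGLang to accept."""
    fixed = dict(payload)  # shallow copy, so the caller's dict is left untouched
    fixed["reasoning_effort"] = "high"
    fixed["temperature"] = 1.0  # must stay at 1.0 while reasoning_effort is "high"
    return fixed
```

Calling it on a payload with temperature 0.2 and no reasoning_effort returns a copy with temperature 1.0 and reasoning_effort "high", leaving every other key as it was.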

Warning: After upgrading to v1.3.0 with uv tool upgrade mistral-vibe, you must re-apply this patch manually. The file lives at: /home/username/.local/share/uv/tools/mistral-vibe/lib/python3.12/site-packages/vibe/core/llm/backend/reasoning_adapter.py


thinking Mapping in mistral.py

Vibe’s MistralBackend maps "low" → "none" for SGLang compatibility, but SGLang actually rejects "none" in some cases with "reasoning_effort must be one of: high, medium, low". This mapping was introduced to handle older SGLang versions but causes failures with Mistral Small 4.

Broken mapping (vibe v1.2.3):

# Original in mistral.py (vibe v1.2.3)
_THINKING_TO_REASONING_EFFORT = {
    "off": "none",
    "low": "none",
    "medium": "high",
    "high": "high",
}

Fix:

# Patched mapping (vibe v1.2.3)
_THINKING_TO_REASONING_EFFORT = {
    "off": "high",
    "low": "high",
    "medium": "high",
    "high": "high",
}

Note: This file is at: /home/username/.local/share/uv/tools/mistral-vibe/lib/python3.12/site-packages/vibe/core/llm/backend/mistral.py

Watch out: If you later switch to the direct Mistral API (not SGLang), revert this mapping, because it is specific to SGLang’s stricter rules. The direct Mistral API accepts "none" and "low", so forcing "high" there only degrades performance.


Upgrading Vibe Without Breaking Fixes

Upgrading to vibe v1.3.0 replaces the patched files, so the patches do not persist. The message_utils.py patch is re-applied automatically on the next vibe start via ~/bin/vibe-patch.py; the other two must be re-applied by hand.

# Upgrade Vibe (to pin v1.3.0 exactly, use: uv tool install mistral-vibe==1.3.0)
uv tool upgrade mistral-vibe

# Re-apply reasoning_adapter.py patch (temperature=1.0 + reasoning_effort="high")
# Re-apply mistral.py mapping (_THINKING_TO_REASONING_EFFORT all → "high")
# message_utils.py patch runs automatically on next vibe start via ~/bin/vibe-patch.py

Warning: If you skip the manual patches, SGLang will reject requests with 400 errors like "Invalid reasoning effort level" or "Tool call without tool result" until you restore them. Always verify patches after upgrades.
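A quick post-upgrade check can confirm that all three files still carry their patches. The sketch below is an assumption-laden example: `default_checks` and `missing_patches` are hypothetical names, and the search strings for the two backend files are taken from the snippets in this post, so adjust them if your patched lines differ.

```python
import pathlib

def default_checks():
    """Pairs of (file, text expected once patched), using the paths from this post."""
    site = pathlib.Path.home() / ".local/share/uv/tools/mistral-vibe/lib/python3.12/site-packages/vibe/core/llm"
    return [
        (site / "message_utils.py", "# V4-PATCH-APPLIED"),
        (site / "backend/reasoning_adapter.py", 'payload["reasoning_effort"] = "high"'),
        (site / "backend/mistral.py", '"low": "high"'),
    ]

def missing_patches(checks, read=lambda path: path.read_text()):
    """Return the paths whose expected patch text is absent or that cannot be read."""
    missing = []
    for path, needle in checks:
        try:
            text = read(path)
        except OSError:
            # A file that is missing or unreadable counts as unpatched.
            missing.append(path)
            continue
        if needle not in text:
            missing.append(path)
    return missing
```

After an upgrade, `missing_patches(default_checks())` returns the files that still need attention; an empty list means every expected string was found. This is a plain text search, so it confirms that the lines are present, not that they are in the right place.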


Diagnosing Requests in Real Time

Check whether your patches are active and whether SGLang accepts a request with the patched settings. The first command looks for the patch marker; the second sends a test request directly to SGLang:

# Verify message_utils.py patch
grep "V4-PATCH-APPLIED" ~/.local/share/uv/tools/mistral-vibe/lib/python3.12/site-packages/vibe/core/llm/message_utils.py

# Send a test request with reasoning_effort and check for reasoning_content
curl -s -X POST http://localhost:30000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"Mistral-Small-4","messages":[{"role":"user","content":"Solve 42*87?"}],"stream":false,"max_tokens":100,"reasoning_effort":"high"}' \
  | python3 -c "import sys,json; d=json.load(sys.stdin); print('reasoning_content:', bool(d['choices'][0]['message'].get('reasoning_content')))"

Note: If reasoning_content is missing, your patches aren’t applied or SGLang rejected the request. Check SGLang logs for "Invalid reasoning effort level" or "Tool call without tool result".
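The same probe can be written in Python with only the standard library, which makes it easier to reuse in a script. This is a sketch under the same assumptions as the curl command (SGLang listening on localhost:30000 with an OpenAI-compatible endpoint); `build_request` and `has_reasoning` are hypothetical helper names.

```python
import json
import urllib.request

def build_request(prompt, url="http://localhost:30000/v1/chat/completions"):
    """Build the POST request that the curl command above sends."""
    body = {
        "model": "Mistral-Small-4",
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
        "max_tokens": 100,
        "reasoning_effort": "high",
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def has_reasoning(response):
    """True if the first choice carries non-empty reasoning_content."""
    choices = response.get("choices") or []
    if not choices:
        return False  # e.g. an error body instead of a completion
    return bool(choices[0].get("message", {}).get("reasoning_content"))
```

To run the probe, pass `build_request("Solve 42*87?")` to `urllib.request.urlopen`, parse the body with `json.load`, and hand the result to `has_reasoning`. Note that `urlopen` raises `urllib.error.HTTPError` on a 400 response, so a rejected request surfaces as an exception rather than as a missing field.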


What I Actually Use

  • Mistral Small 4: Runs locally on GB10 (Blackwell, ARM64) via SGLang v0.2.11 for privacy and performance.
  • SGLang: Handles Mistral Small 4 with strict role alternation and reasoning_effort constraints. Version 0.2.11 enforces "reasoning_effort" validation.
  • Vibe: Coding assistant with OpenAI-compatible API style for seamless integration. Version v1.2.3 with manual patches applied.

Illustration: Vibe 400 Fix. Flow for handling role alternation in message processing: (1) Problem: ESC mid-tool-call breaks role alternation; (2) Diagnosis: SGLang enforces strict role patterns; (3) Broken pattern: user → user or missing tool result; (4) Fix applied: patch merge_consecutive_user_messages(); (5) Result: valid role alternation restored.