JSON config - HAILO_INTERNAL_FAILURE(8)

I got errors every time using chat on my local LLM.

I |2026-04-19 15:17:10 1776608230231926| GenerationThread:New conversation, clearing context and sending 1 messages
[HailoRT] [error] Failed to render prompt from JSON strings: [json.exception.parse_error.101] parse error at line 2, column 0: syntax error while parsing value - invalid string: control character U+000A (LF) must be escaped to \u000A or \n; last read: ‘"### Task:<U+000A>’
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_INTERNAL_FAILURE(8)
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_INTERNAL_FAILURE(8)

Im running Raspbery Pi 5 (trixie os) Pi Hat+ 2 (hailo10h) hailort 5.3.0 , model zoo 5.3.0, hailo drivers 5.3.0, tappas core 5.3.0.

Everything was working on 5.2.0 and 5.1.0

Just got on this problem with last update

Hi @user1006,

This is bug we aware of and working on a fix:

Thanks,

I have the same issue. It seems it happens with different models…

Just to give more context with a specific example:

The model (I tried qwen2.5-coder:1.5b) seems to lose context because the hailo-ollama bridge fails to sanitize/escape control characters (specifically Newlines/LF) when sending the conversation history to the HailoRT backend.

Error Log:

Plaintext

[HailoRT] [error] Failed to render prompt from JSON strings: [json.exception.parse_error.101] parse error at line 2, column 0: syntax error while parsing value - invalid string: control character U+000A (LF) must be escaped to \u000A or \n; last read: '"Sure! Here's a simple \"Hello World\" program in JavaScript:<U+000A>'
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_INTERNAL_FAILURE(8)

Steps to Reproduce:

  1. Chat with any model via Open WebUI.

  2. Receive a response containing a code block or multiple paragraphs (containing \n).

  3. Send a follow-up message.

  4. The hailo-ollama service fails to parse the history JSON, clears the context, and the model “forgets” the previous turns.

It seems the JSON generator in the bridge is not properly escaping \n into \\n before passing it to the HailoRT render prompt function.

1 Like

Thanks @darguez for sharing the information!

In my case, using Open WebUI 0.9.5, this kept occurring after the first response from the device b/c only then did OWUI send the prompt template that’s listed in the RAG section of its settings, which starts with ### Tasks: . Which was confusing since the the error message thus contained something I have never typed following “last read:”.

As @darguez correctly points out, the issue is an improperly escaped unicode LF.

Just as an FYI to those that want to get this working and are comfortable with compiling hailo-ollama from source.

There are two way of fixing this:

  1. A trivial hotfix: Sanitise the message string using C++ regex to replace the newline character (unicode or otherwise) with an escaped newline character.
    The following code, when added to the generate_one function around line 107 in generation_context.cpp will accomplish this:
const std::string commonNewlinePattern(R"(\n)");
std::regex rgx(commonNewlinePattern);
const std::string escapedNewline("\\n");
std::string sanitizedPrompt
for(uint idx(0); idx < messages_to_send.size(); ++idx){
	sanitizedPrompt = std::regex_replace(messages_to_send[idx], rgx, escapedNewline);
	messages_to_send[idx] = sanitizedPrompt
}

along with an #include <regex> with all the other includes.

It’s a very hacky fix since it applies the substitution to the entire JSON string and is trying to fix JSON formatting issues manually which is considered a bad practice. Thus the next one is IMHO preferable:

  1. Using the nlohmann/json library (already used in the code anyway) to automatically sanitise the messages at the time of the assembly of the prompt_json_strings vector in the functions chat and chat_completions in controller.cpp. The following code accomplishes this in chat:
    // Convert messages to JSON strings for structured prompts
    std::vector<std::string> prompt_json_strings;
    nlohmann::json j_string;
    std::string sanitizedMessageContent;
    for (const auto &message : *generation_params->messages) {
        j_string = message->content;
        sanitizedMessageContent = j_string.dump();
        prompt_json_strings.push_back(R"({"role": ")" + message->role + R"(", "content": )" +
                                      sanitizedMessageContent + R"(})");
    }

it extends the existing assembly loop by first storing the message string in a json object
and then letting the library do all of the escaping by serializing the string with .dump().
Note the removal of escape_json_quotes and the removal of two " in the raw string literals surrounding the message.
This is necessary to prevent double leading/trailing quotations resulting in malformed JSON strings (failure on first query) and proper escaping of quotation marks in the message body in the follow-up chats (failure on later queries).

Applying the second version to hailo-ollama 5.3.0 appears to work for me w/o issues now when querying OWUI 0.9.5 on C++ code, with follow-up queries. Tested with

Demonstrate the use of the move-constructor in C++ to avoid memory 
copies when one member of the class is a std::vector.

(no issue with LF) and on

Provide some suggestions on how to sanitise user inputs stored in strings in C++ in a way that they can be safely stored in JSON objects without resulting in parsing errors due to improperly escaped special characters.

Which generates C++ code with quotation marks and does not result in errors during follow-up queries anymore.

2 Likes

Thank you @Tony for the detailed info and the fixes! Do you have some reference documentation to compile hailo-ollama? I would like to try your workaround, because Open WebUI is not usable without fixing the issue…

I’m following the steps described in the Readme for the “Build from source” installation method

I’m installing to /home/<username>/bin/hailo-ollama (e.g. -DCMAKE_PREFIX_PATH=/home/user/bin/hailo-ollama) s.t. it does not conflict with the version installed via the package manager. I’m configuring the build not via plain CMake but via the terminal user interface ccmake for my own convenience.

Since the build defaults to using static libraries for its dependencies the installed hailo-ollama can be run without further modifications via /home/user/bin/hailo-ollama/bin/hailo-ollama. But in my case this fails because the of some improper access permissions.
Since I did not track those down yet I’m just running it as superuser via
sudo /home/user/bin/hailo-ollama/bin/hailo-ollama

Thank you!, I will give it a try. It would be great if HAILO (@Michael) released a new version soon fixing this.

1 Like

Ok, I couldn’t make it work with your snipped in option 1 @Tony , because I’d still had issues with new lines:


HailoRT] [error] Failed to render prompt from JSON strings: [json.exception.parse_error.101] parse error at line 2, column 0: syntax error while parsing value - invalid string: control character U+000A (LF) must be escaped to \u000A or \n; last read: ‘"### Task:<U+000A>’

[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_INTERNAL_FAILURE(8)

So, I asked Gemini and gave me another similar approach that worked fine for me:

generation_context.cpp

// — START SANITIZE HOTFIX —
for(size_t idx = 0; idx < messages_to_send.size(); ++idx) {
    std::string& str = messages_to_send\[idx\];    

    // Scape real line breaks (LF / U+000A)
    size_t pos = 0;

    while ((pos = str.find('\\n', pos)) != std::string::npos) {
        str.replace(pos, 1, "\\\\n");
        pos += 2; // Advance 2 positions to avoid re-evaluating the new escaped '\\n'
    }

    // Escape carriage returns (CR / U+000D) for additional security
    pos = 0;
    while ((pos = str.find('\\r', pos)) != std::string::npos) {
        str.replace(pos, 1, "\\\\r");
        pos += 2;
    }
}
// --- END SANITIZE HOTFIX ---

// Generate using the messages

auto generator_completion = (\*m_llm)->generate(params.generator_params, messages_to_send).expect("Failed to generate");

As I noted: option 1 is not particularly preferable since it modifies the strings via regex. This makes it error-prone. Option 2 using the dedicated JSON library already included by the code is preferable.
The following is the fix that I’m currently running with

1 Like

Thanks @Tony , the fix works fine and there are no issues any more with line breaks.

How do you deal with 2048 tokens context limitation of hailo-ollama models in OWUI 0.9.5 with long chats?

1 Like

Glad I could help!

Unfortunately I haven’t managed to get that big of a chat going yet so I can’t say much about this. My current goal is to use OWUI as an interface between OpenCode and hailo-ollama and try this for agentic coding.

As far as I recall from either some forum post here or GitHub specifications of the models, the context length is fixed per model and can’t be modified.