In my case, using Open WebUI 0.9.5, this kept occurring after the first response from the device, because only then did OWUI send the prompt template that’s listed in the RAG section of its settings, which starts with ### Tasks: . This was confusing, since the error message then contained text following “last read:” that I had never typed.
As @darguez correctly points out, the issue is an improperly escaped Unicode LF.
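To make the failure mode concrete: JSON (RFC 8259) does not allow raw control characters such as LF inside a string literal; they have to be written as escape sequences like \n. Below is a minimal, stdlib-only sketch of a check for that condition. The helper name has_raw_control_char_in_string is hypothetical and not part of hailo-ollama; it only illustrates what a parser trips over and is no substitute for a real JSON parser.

```cpp
#include <string>

// Hypothetical helper (illustration only, not hailo-ollama code):
// returns true if json_text contains a raw control character (< 0x20)
// inside a JSON string literal.
bool has_raw_control_char_in_string(const std::string &json_text) {
    bool in_string = false;  // currently between an opening and closing quote
    bool escaped = false;    // previous character inside the string was a backslash
    for (unsigned char c : json_text) {
        if (!in_string) {
            if (c == '"') in_string = true;
            continue;  // control characters between tokens are just whitespace
        }
        if (c < 0x20) return true;  // raw LF, CR, TAB, ... inside a string
        if (escaped) { escaped = false; continue; }
        if (c == '\\') { escaped = true; continue; }
        if (c == '"') in_string = false;
    }
    return false;
}
```

A message whose content holds a real LF makes the check return true, while the same message with the two characters \ and n in its place returns false. A newline between tokens (outside of any string) is ordinary whitespace and is not flagged.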
Just as an FYI for those who want to get this working and are comfortable compiling hailo-ollama from source.
There are two ways of fixing this:
- A trivial hotfix: Sanitise the message string using a C++ regex to replace the newline character (Unicode or otherwise) with an escaped newline character.
The following code, when added to the generate_one function around line 107 in generation_context.cpp will accomplish this:
const std::string commonNewlinePattern(R"(\n)");  // regex that matches a raw LF
std::regex rgx(commonNewlinePattern);
const std::string escapedNewline("\\n");  // a backslash followed by 'n'
std::string sanitizedPrompt;
for (std::size_t idx = 0; idx < messages_to_send.size(); ++idx) {
    sanitizedPrompt = std::regex_replace(messages_to_send[idx], rgx, escapedNewline);
    messages_to_send[idx] = sanitizedPrompt;
}
along with an #include <regex> with all the other includes.
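For anyone who wants to try the substitution in isolation before touching generation_context.cpp, here is a standalone sketch of the same idea. The function name escape_raw_newlines is hypothetical, and I'm assuming the messages are held in a std::vector<std::string>, as the snippet above suggests.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical standalone version of the hotfix (not hailo-ollama code):
// replaces every raw LF with the two characters '\' and 'n', in place.
void escape_raw_newlines(std::vector<std::string> &messages) {
    static const std::regex raw_newline(R"(\n)");  // ECMAScript \n matches the LF character
    const std::string escaped_newline("\\n");      // literal backslash + 'n'
    for (auto &message : messages) {
        message = std::regex_replace(message, raw_newline, escaped_newline);
    }
}
```

Note that the replacement is blind to JSON structure: an LF that merely separates tokens (for example in a pretty-printed message) is rewritten too, which leaves a stray backslash-n outside of any string.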
This hotfix is very hacky: it applies the substitution to the entire JSON string and tries to fix JSON formatting issues manually, which is considered bad practice. Thus the next one is IMHO preferable:
- Using the nlohmann/json library (already used in the code anyway) to automatically sanitise the messages when the prompt_json_strings vector is assembled in the functions chat and chat_completions in controller.cpp. The following code accomplishes this in chat:
// Convert messages to JSON strings for structured prompts
std::vector<std::string> prompt_json_strings;
nlohmann::json j_string;
std::string sanitizedMessageContent;
for (const auto &message : *generation_params->messages) {
    // Store the raw content as a JSON string value ...
    j_string = message->content;
    // ... and let dump() add the surrounding quotes and all escaping
    sanitizedMessageContent = j_string.dump();
    prompt_json_strings.push_back(R"({"role": ")" + message->role + R"(", "content": )" +
                                  sanitizedMessageContent + R"(})");
}
It extends the existing assembly loop by first storing the message string in a JSON object and then letting the library do all of the escaping by serializing the string with .dump().
Note the removal of escape_json_quotes and the removal of two " in the raw string literals surrounding the message.
This is necessary to prevent doubled leading/trailing quotation marks, which result in malformed JSON strings (failure on the first query), and to ensure that quotation marks in the message body are escaped properly in follow-up chats (failure on later queries).
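To illustrate what the serializer takes off your hands, here is a hand-rolled, stdlib-only sketch of JSON string escaping plus the message assembly. This is not nlohmann/json's implementation, and the names to_json_string and make_message are hypothetical; in the real code I'd still let .dump() do the work. The sketch only shows why the returned value already carries its own surrounding quotes, so the template around it must not add a second pair.

```cpp
#include <cstdio>
#include <string>

// Hypothetical illustration of JSON string escaping (not nlohmann/json code).
// Returns the text wrapped in double quotes, as a serializer would emit it.
std::string to_json_string(const std::string &text) {
    std::string out = "\"";
    for (unsigned char c : text) {
        switch (c) {
            case '"':  out += "\\\""; break;  // quote -> \"
            case '\\': out += "\\\\"; break;  // backslash -> two backslashes
            case '\n': out += "\\n";  break;  // raw LF -> \n
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:
                if (c < 0x20) {
                    // remaining control characters become \u00XX
                    char buf[7];
                    std::snprintf(buf, sizeof(buf), "\\u%04x", static_cast<unsigned>(c));
                    out += buf;
                } else {
                    out += static_cast<char>(c);  // UTF-8 bytes pass through unchanged
                }
        }
    }
    out += '"';
    return out;
}

// Hypothetical stand-in for the assembly step: no quotes around the content
// in the template, because to_json_string() already supplies them.
std::string make_message(const std::string &role, const std::string &content) {
    return R"({"role": ")" + role + R"(", "content": )" + to_json_string(content) + "}";
}
```

If the template kept its own quotes around the content, the result would start with two quotation marks in a row, which is the malformed case described above.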
Applying the second version to hailo-ollama 5.3.0 appears to work for me w/o issues now when querying through OWUI 0.9.5 about C++ code, including follow-up queries. Tested with
Demonstrate the use of the move-constructor in C++ to avoid memory copies when one member of the class is a std::vector.
(no issue with LF) and with
Provide some suggestions on how to sanitise user inputs stored in strings in C++ in a way that they can be safely stored in JSON objects without resulting in parsing errors due to improperly escaped special characters.
The latter generates C++ code with quotation marks and no longer results in errors during follow-up queries.