Hailo-Ollama Server with bigger model support?

Hello All…

I received my Hailo10h chip and got everything compiled and running on my Raspberry Pi 5 under regular old Raspberry Pi OS. Then I wrote a custom “Hailo Assistant” integration for Home Assistant (based on the Ollama integration) that talks to Hailo-Ollama, also running on the Pi 5 and using the Hailo10h chip for inference.

With all that said, I am getting really bad responses from every model on the current list supported by the Hailo-Ollama server. The RPi5 with 8GB memory can easily run larger, more accurate models. How exactly can one of the models listed here be integrated into the Hailo-Ollama server: hailo_model_zoo_genai/docs/MODELS.rst at main · hailo-ai/hailo_model_zoo_genai · GitHub

Is there a workflow for creating a proper manifest.json file so that Hailo-Ollama will see and use a new model? Or do I just copy one and modify it?
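In case it helps frame the question, here is what the copy-and-modify route might look like in practice. This is only a sketch: the field names (`"name"`, `"hef"`) are hypothetical guesses at what a model manifest might contain, not Hailo-Ollama's actual schema, so an existing manifest.json should be inspected first.

```python
import json
from pathlib import Path

def clone_manifest(src: Path, dst: Path, model_name: str, hef_path: str) -> dict:
    """Copy an existing Hailo-Ollama manifest and point it at a new HEF.

    NOTE: "name" and "hef" are assumed, illustrative field names --
    check a real manifest.json for the actual schema before using this.
    """
    manifest = json.loads(src.read_text())
    manifest["name"] = model_name   # assumed field
    manifest["hef"] = hef_path      # assumed field
    dst.write_text(json.dumps(manifest, indent=2))
    return manifest
```

The idea being: keep every field Hailo-Ollama expects, and only retarget the identifiers that name the model and its compiled HEF.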

I’m real close to success on this project. I just need a better model.

Jeff


How about a Hailo-Ollama server that supports tools? Tools support is absolutely necessary for a personal assistant in Home Assistant; without it, I will have to pre-program every command I want to automate into the code.
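For reference, upstream Ollama's `/api/chat` endpoint accepts a `tools` array of JSON-schema function definitions, and a Hailo-Ollama with tools support would presumably need to accept the same shape. A minimal sketch of such a request payload (the `turn_on_light` function is a made-up example, not a real Home Assistant service call):

```python
def build_tools_request(model: str, prompt: str) -> dict:
    """Build an Ollama-style /api/chat request with a tools array.

    Follows upstream Ollama's documented tool-calling request shape;
    the "turn_on_light" function is a hypothetical example.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "turn_on_light",
                "description": "Turn on a light in Home Assistant",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "entity_id": {
                            "type": "string",
                            "description": "Light entity, e.g. light.kitchen",
                        },
                    },
                    "required": ["entity_id"],
                },
            },
        }],
        "stream": False,
    }
```

The payload would be POSTed to the server (upstream Ollama listens on `http://localhost:11434/api/chat`); when the model decides to invoke a tool, its reply carries a `tool_calls` list instead of plain text, which the caller then executes and feeds back.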

Hi @jeff.singleton ,
For the time being, only the models on the supported list work with hailo-ollama. Of all our supported GenAI models (i.e., those compiled to HEF), the ones relevant to the Ollama use case are already on the hailo-ollama supported list…
We do plan in the future to release LoRA fine tuning.

@jeff.singleton We are on it and looking into that. In the meantime, please see the community project at: hailo-ollama tools support - #5 by TheRubbishRaider

@jeff.singleton I want to try and tackle compiling some larger models too; that might be next.

My understanding is that we should be able to compile models to HEF with the Dataflow Compiler, but it doesn’t look like ARM is supported (and it wants a minimum of 16 GB of RAM), so I think that step might have to be handed off to a bigger hosted machine.

The docs say that for GenAI we can only run pre-optimized models, but they also say that we can compile ONNX/TF models to HEF, so 🤷

Hi @TheRubbishRaider , for the time being, compilation is intended only for the regular (non-GenAI) models.

I need to point out an issue with one of the newer models available - Qwen2-1.5B-Instruct-Function-Calling-v1 - which appears to be an attempt at something for home automation? The v1 model is too big for the Hailo10h, so it just won’t load.

Model is here:

https://dev-public.hailo.ai/v5.2.0/blob/Qwen2-1.5B-Instruct-Function-Calling-v1.hef


Hi @jeff.singleton ,

Thanks for calling it out - we appreciate it!
The model is currently not supported by Hailo-Ollama: hailo_model_zoo_genai/docs/MODELS.rst at main · hailo-ai/hailo_model_zoo_genai · GitHub
It should be possible to use it via the API, e.g., with a VDevice object, etc.

Thanks,

Thanks Michael, I’m actually not running it with Hailo-Ollama, but rather with the custom method that @TheRubbishRaider and I have going with tools support.

The output from “hailortcli parse-hef ./Qwen2-1.5B-Instruct-Function-Calling-v1.hef” shows the Hailo10h is supported by the model.

Is it just not in the HailoRT code yet?

Is there anything I can do to help speed things along?

Regards,

Jeff

Hi @jeff.singleton ,

The model should work with something like this: hailo-apps/hailo_apps/python/gen_ai_apps/simple_llm_chat at main · hailo-ai/hailo-apps · GitHub

Then I would try it here: hailo-apps/hailo_apps/python/gen_ai_apps/agent_tools_example at main · hailo-ai/hailo-apps · GitHub

Thanks,