hailo-ollama tools support

Woohoo! I believe I have tool/function calling working the ollama way (with tool calls in the message payload) on the 10H/AI+ 2.

I wanted to go as low-level as possible and strip everything out except the hailort_server binary that gets built when you compile hailort from source. It opens a few ports for RPC.

In order to make this work I (Claude) had to modify the hailort C++ code a bit. I opened a PR targeting hailo-ai/hailort, but if you want to test it out you should be able to build from source from my repo: GitHub - jordanskole/hailort: An open source light-weight and high performance inference framework for Hailo devices

It’s not trivial, but it’s not that difficult either.

Start by cleaning all your old hailo libraries so you can install hailort v5.2.0

You will need to make sure that you build with the server and GenAI flags enabled:

# from /hailort
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release \
    -DHAILO_BUILD_EXAMPLES=ON \
    -DHAILO_BUILD_CLIENT_TOKENIZER=ON \
    -DHAILO_BUILD_HAILORT_SERVER=ON \
    -DHAILO_BUILD_GENAI_SERVER=ON

cmake --build build --config Release -j$(nproc)

and then, to install hailortcli and libhailort:

sudo cmake --install build

That will install the hailortcli binary and the libhailort library and add them to your path.

Then you will need to manually copy hailort_server:

sudo cp build/hailort_server /usr/local/bin/hailort_server

I added a systemd service to start hailort_server on boot.
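For reference, here is a minimal unit file sketch. The unit name, paths, and options below are my own guesses, not something shipped with the hailort repo, so adjust to taste:

```ini
# /etc/systemd/system/hailort_server.service  (hypothetical unit name)
[Unit]
Description=HailoRT RPC server
After=network.target

[Service]
# Assumes the binary was copied to /usr/local/bin as in the step above
ExecStart=/usr/local/bin/hailort_server
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Then enable it with `sudo systemctl daemon-reload && sudo systemctl enable --now hailort_server`.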

You should now have all the Hailo RPC servers running! The next step is to expose an HTTP server that mimics the ollama spec. This is of course already implemented in hailo-ai/hailo_model_zoo_genai, like @Michael said above, but it has some minja / schema-validation middleware that bonks if you include tools in your payload.

I am working through it in node, and you can use this node library GitHub - jordanskole/hailo-node: Node.js client for HailoRT GenAI server if you want to play along at home.
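To give a feel for what the shim needs to speak, here is a sketch of the ollama-style tool-call payloads. The request/response shapes follow the public Ollama /api/chat format; the model name, tool definition, and the little `extractToolCalls` helper are all illustrative and not taken from hailo-node:

```javascript
// A chat request the shim must accept without bonking on the "tools" key.
// Model name and tool definition are hypothetical examples.
const chatRequest = {
  model: "some-genai-model",
  messages: [{ role: "user", content: "What is the weather in Paris?" }],
  tools: [
    {
      type: "function",
      function: {
        name: "get_weather",
        description: "Get the current weather for a city",
        parameters: {
          type: "object",
          properties: { city: { type: "string" } },
          required: ["city"],
        },
      },
    },
  ],
  stream: false,
};

// An assistant message carrying a tool call, per the Ollama chat format:
const assistantMessage = {
  role: "assistant",
  content: "",
  tool_calls: [
    { function: { name: "get_weather", arguments: { city: "Paris" } } },
  ],
};

// Illustrative helper: pull (name, args) pairs out of a response message.
function extractToolCalls(message) {
  return (message.tool_calls ?? []).map((tc) => ({
    name: tc.function.name,
    args: tc.function.arguments,
  }));
}

console.log(extractToolCalls(assistantMessage));
```

The main job of the shim is just to pass `tools` through to the model's chat template and translate whatever the model emits back into that `tool_calls` array.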
