Serving hailo-ollama over the network

Hi everyone,

Can I configure hailo-ollama so that its API is served on the network, or is it only available locally on the Raspberry Pi 5?
I would like to host Open WebUI on a different Pi and point it at the hailo-ollama instance running on the Pi 5.

Yes, hailo-ollama can be served on the network.

I’ve tested the basic commands using two Raspberry Pi devices.

GitHub reference:
GitHub - Hailo Model Zoo GenAI

  1. Start hailo-ollama on the server Raspberry Pi.

  2. On the client Raspberry Pi, you can chat with a model by replacing localhost with the server Pi’s IP address:

# On the server Pi itself (localhost):
curl --silent http://localhost:8000/api/chat \
     -H 'Content-Type: application/json' \
     -d '{"model": "qwen2:1.5b", "messages": [{"role": "user", "content": "Tell me a joke"}]}'

# From the client Pi (replace 192.168.1.x with the server Pi's IP):
curl --silent http://192.168.1.x:8000/api/chat \
     -H 'Content-Type: application/json' \
     -d '{"model": "qwen2:1.5b", "messages": [{"role": "user", "content": "Tell me a joke"}]}'
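The same request can be made programmatically, which is handy for scripting quick checks from the client Pi. A minimal sketch using only the Python standard library, building the same payload as the curl examples (the `192.168.1.x` host is kept as a placeholder to be replaced with the real server IP):

```python
import json
import urllib.request

# Placeholder address -- substitute your server Pi's actual IP.
HOST = "192.168.1.x"
PORT = 8000

payload = {
    "model": "qwen2:1.5b",
    "messages": [{"role": "user", "content": "Tell me a joke"}],
}

req = urllib.request.Request(
    f"http://{HOST}:{PORT}/api/chat",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
print(req.full_url)

# To actually send the request (needs hailo-ollama running on the server Pi):
# with urllib.request.urlopen(req) as resp:
#     for line in resp:  # the chat endpoint streams one JSON object per line
#         print(json.loads(line))
```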

I would expect the WebUI to work the same way over the network. I haven’t tested it yet, but I’ll do so when I have time.

Let me know if you get it running first or if you need any further help.

I tried it and it works.
The problem was that I was using the standard Ollama port (11434), but hailo-ollama listens on a different port (8000 in the examples above), so the base URL has to include it.

Now it seems quite slow in Open WebUI. I'll run some experiments to figure out whether it's the Hailo chip (i.e. token generation itself is slow) or the Pi running Open WebUI.
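One way to separate the two: in the Ollama API, the final chunk of a streamed /api/chat response carries timing fields, so tokens per second can be computed directly from the API, bypassing Open WebUI entirely. Whether hailo-ollama reports these same fields is an assumption (inspect a real response first); the values below are made up for illustration:

```python
import json

# Example final chunk of a streamed /api/chat response in Ollama's format.
# The field values here are invented for illustration.
final_chunk = json.loads("""{
    "model": "qwen2:1.5b",
    "done": true,
    "eval_count": 120,
    "eval_duration": 12000000000
}""")

# In the Ollama API, eval_duration is reported in nanoseconds.
tokens_per_second = final_chunk["eval_count"] / (final_chunk["eval_duration"] / 1e9)
print(f"{tokens_per_second:.1f} tokens/s")  # -> 10.0 tokens/s
```

If curl from the client Pi shows a similarly low rate, the bottleneck is generation on the Hailo side rather than the Pi hosting Open WebUI.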