Llama.cpp server and cli with Hailo 10h Support

All,

I went ahead and did a thing. Claude and I added some high-level API support for the Hailo 10h chip into llama.cpp. Everything is documented in HAILO_README.md.

Disclaimer: You will need to use my fork since I did use AI to write the C code for this integration. The kind people at llama.cpp won’t use this code because of that.

I’m here to answer questions and take suggestions.

Jeff


@jeff.singleton Thanks a lot for this great contribution!


Hey, just took a look at this …

Calling this llama.cpp with Hailo support is very misleading in my opinion. In fact, you can take the code that Claude generated for you, extract the tools/hailo folder from the project, and compile it standalone. The only other things required are two of the vendored libraries, cpp-httplib and json.hpp, but in my opinion this does not qualify as llama.cpp-specific. The resulting code uses none of the llama.cpp infrastructure for handling chat templates, grammar constraining, or anything else. There are zero dependencies on the llama.cpp project specifically. So this could just be a standalone mini inference API server.

Look, I don’t mean to be rude, I just think it is important to be honest about what something is. This is a perfect example of why AI usage policies in projects are important. Just removing AGENTS.md, which contains the instructions meant to guide a user working with an agent toward good results, and telling Claude to go wild is not how to go about this (yes, I did look at the single commit you made) …

Apart from that it is a neat little toy :wink:

Llama.cpp is a standard as well as a server, meaning the API endpoints produced by the llama.cpp server are in fact what make this a supported fork of llama.cpp. All my code does is let llama.cpp use the Hailo 10h chip for inference using the pre-installed HailoRT libraries.

I do not think you know what you are talking about. When someone says they don’t mean to be rude, it is just them being politely rude.

This is not a toy! We actually have some quite serious things in mind for these chips once the code catches up.

No, llama.cpp is not a standard … there is something called the OpenAI API, which is sort of a standard across the AI provider world, and llama.cpp provides a compatible API. I know exactly what I am talking about. I could make the same argument about your competence, but let’s not go there. That is stupid kindergarten behaviour and nobody benefits from it; besides, I have no way of judging your knowledge, just as you have no way of judging mine.
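For reference, an OpenAI-compatible request is just a JSON payload POSTed to the server's `/v1/chat/completions` endpoint. A minimal Python sketch of what such a request looks like (the host, port, and model name here are assumptions, and actually sending it would require a running server):

```python
import json
import urllib.request

# An OpenAI-style chat completions payload, as accepted by an
# OpenAI-compatible server such as llama.cpp's. Model name is illustrative;
# many local servers ignore or echo it.
payload = {
    "model": "local-model",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    "temperature": 0.7,
}

# Build the POST request; 127.0.0.1:8080 is an assumed local server address.
req = urllib.request.Request(
    "http://127.0.0.1:8080/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req) would send it, given a running server.
```

Any client that speaks this shape works against any server that implements it, which is the sense in which the OpenAI API (not llama.cpp itself) acts as the de facto standard.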

Also, if you put something out there and call it something, you should probably make sure that it is what people would expect of such a thing. Don’t act surprised if people are disappointed to find out that it is not. My point is entirely valid: the code Claude wrote is mostly standalone and has nothing to do with most of the llama.cpp codebase. I know because I actually tried it :wink: … Which means that essential features llama.cpp provides are not available, and to me it also means that calling this llama.cpp with Hailo support is misleading. An actual integration would not implement its own separate binary for running an inference server, and it would allow using the entire infrastructure that llama.cpp already has in place for running these models. Compared to the full feature set of llama.cpp, this very much is a toy. (This does not mean it is not useful … but it is very much not llama.cpp.)

I really did not mean to be rude, but tried to provide some constructive criticism:

  1. The resulting code is mostly standalone and has nothing to do with most of the valuable parts of llama.cpp, namely support for different model chat templates and the corresponding parsing functionality, as well as grammar-constrained output and control of inference parameters. → This would be better as its own project; right now it is exactly that, just embedded into the llama.cpp code repository for no particular reason apart from having been generated there.
  2. The reason why this happened seems obvious to me from the commit: the files that are supposed to guide the model toward a more useful integration were simply removed, and they are there for exactly this reason.

I never assumed that you did any of this in bad faith or anything. I just wanted to provide my perspective on the end result. It was meant as feedback, nothing more …

OK, so what you actually are is a critic of other people’s contributions. You can’t or won’t contribute anything useful yourself, so you seek out someone else’s attempts to at least try. I still believe you are missing the specific skill set needed to even be having this conversation, i.e. you don’t know what you are talking about.

Your so-called constructive criticism is just you covering up being rude. I mean, who are you anyway? Nobody to me. Who do you think you are to come at me when you haven’t contributed anything yourself?

I’ve already written more than I should have. What a waste of time you have been.

I am an open source developer and have released and contributed to different kinds of projects. In fact, I have contributed code to projects related to llama.cpp and have also released a project based on llama.cpp … If you did two seconds of research you would be able to find my GitHub profile (hint: same username).

In communication there is always a sender and a receiver. Right now you are not engaging with what I am trying to say in good faith. As the sender I have tried to communicate my point clearly without any kind of personal attacks. But if you want to interpret what I am saying as being rude and ignore any valid criticism, then that is your problem not mine.

Ultimately it does not matter; code does not lie. The result is there for everybody to see and validate as they see fit. And given my background in C++ and C development, I am very confident that I am right about what I am saying.

I don’t need or expect you to do anything, do what you please. My goal was to make clear what the result of what you are calling “llama.cpp” actually is. If you don’t want to take my feedback seriously then that’s that.

For anybody else who got excited that they could port their llama.cpp projects to a Hailo-based device: sorry to disappoint, but this is not what you are looking for.