All,
I went ahead and did a thing. Claude and I added some high-level API support for the Hailo-10H chip to llama.cpp. Everything is documented in HAILO_README.md.
Disclaimer: You will need to use my fork, because I used AI to write the C code for this integration. The kind people at llama.cpp won’t accept the code for that reason.
I’m here to answer questions and take suggestions.
Jeff
