Are there any updates regarding the deployment of large language models (LLMs) on Hailo’s NPU?
Currently, the Dataflow Compiler (DFC) fails to compile such models, and the set of supported models for the NPU remains limited and repetitive. This constraint significantly diminishes the utility of a device that claims 13 or 26 TOPS of performance.
Additionally, I came across a forum post mentioning ongoing work to enable Whisper on Hailo’s platform. Could you please share the timeline or expected release date for this feature?
To run LLMs efficiently, we have designed a new accelerator called Hailo-10H. It has a DDR interface so that it can store large models locally, which frees the host from managing context switching.
It is not yet generally available. Our R&D team is working on all the supporting software it needs.
You can build any model from the layers supported by the Hailo Dataflow Compiler (see the User Guide for details) and run it on the Hailo hardware. The Hailo Model Zoo is just a collection of open-source models for popular AI tasks that many of our customers are interested in. If you have other AI tasks or models in mind, please let us know.
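As a rough illustration of the "supported layers" point, one way to sanity-check a model before attempting a compile is to compare its operator types against the list in the User Guide. The sketch below is only that: `SUPPORTED_OPS` and `unsupported_ops` are hypothetical names, the operator list is a placeholder rather than Hailo's actual list, and nothing here calls the Dataflow Compiler itself.

```python
from typing import Iterable, List, Set

# Hypothetical placeholder: fill this in from the supported-layers table in the
# Dataflow Compiler User Guide. These names are illustrative, not Hailo's list.
SUPPORTED_OPS: Set[str] = {"Conv", "Relu", "MaxPool", "Add", "Concat"}


def unsupported_ops(model_ops: Iterable[str],
                    supported: Set[str] = SUPPORTED_OPS) -> List[str]:
    """Return the sorted, de-duplicated operator types not found in `supported`."""
    return sorted(set(model_ops) - supported)


if __name__ == "__main__":
    # Operator types as they might be read from an exported model graph.
    ops = ["Conv", "Relu", "Conv", "LayerNormalization", "Softmax", "Add"]
    missing = unsupported_ops(ops)
    if missing:
        print("Operators to check against the User Guide:", ", ".join(missing))
    else:
        print("All operators appear in the supported list.")
```

An empty result only means every operator name matched the list you supplied; the compiler may still reject a model for other reasons, such as unsupported parameter combinations.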
I do not have a timeline. If you want to integrate this into a product, please send me a PM with some details about your company, product, and timeline, and I will check with my colleagues what information we can share.