Hello everyone. I have a few questions for you about hailo modules.
I would like to buy several Hailo units (Hailo-8, Hailo-10, or future ones), but first I would like to know whether they are capable of doing what I plan to do with them.
1. Is it possible to run Hailo units on a normal PC (CPU + motherboard + Windows) in a PCIe slot? (With an M.2-to-PCIe adapter, obviously.)
2. Given that PCIe-to-4×-M.2 adapters exist, would a computer then be able to use all 4 modules together in a single LLM application?
3. Given that a motherboard has several PCIe slots, and that multi-GPU is sometimes supported with up to 4 graphics cards, is it possible to have 4 carriers of 4 Hailo modules each working in the same PC?
There is also the subject of PCIe lane multipliers (switches), but that is too hard a subject for my fairly modest level as an AI user.
The final idea is to have a super-pc capable of doing:
16 × 26 = 416 TOPS (Hailo-8)
at only 16 × 2.5 = 40 watts
for only 16 × ~$200 = ~$3,200
All while using CPU RAM instead of VRAM for the LLMs (CPU RAM goes up to roughly 200-300 GB at the moment), so that even the biggest LLMs could be loaded.
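For reference, the totals above can be checked with a couple of lines. This is a back-of-envelope sketch: the 26 TOPS and ~2.5 W typical figures are Hailo's published Hailo-8 specs, while the ~$200 unit price is the poster's own assumption.

```python
# Back-of-envelope totals for 16 Hailo-8 modules.
# Assumptions: 26 TOPS (INT8) and ~2.5 W typical per module
# (Hailo's published figures); ~$200 per module is an assumed price.
NUM_MODULES = 16
TOPS_PER_MODULE = 26      # INT8 TOPS, not FP32 TFLOPS
WATTS_PER_MODULE = 2.5    # typical power draw
PRICE_PER_MODULE = 200    # USD, assumed

total_tops = NUM_MODULES * TOPS_PER_MODULE    # aggregate INT8 throughput
total_watts = NUM_MODULES * WATTS_PER_MODULE  # aggregate typical power
total_cost = NUM_MODULES * PRICE_PER_MODULE   # aggregate price
print(total_tops, total_watts, total_cost)
```

Note that TOPS here is integer (INT8) throughput, which is not directly comparable to a GPU's FP16/FP32 TFLOPS.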
So I could eventually have an imitation of an H100 for much less money once my setup is fully evolved (by 2030).
4. How far can such an idea work?
(Will the Hailo-10 be compatible in such a setup with its future versions?)
I want to use: Ollama, ComfyUI, OpenHands, and Home Assistant.
If the answer to all these questions is yes, then a bonus question: is it still possible to invest in you?
Welcome to the Hailo Community! Your planned PC project sounds fantastic, and we’re here to help address your questions:
1. Compatibility:
Hailo accelerators are compatible with x86/ARM architectures and support both Windows and Linux operating systems.
Using an M.2 to PCIe adapter is a valid and supported method for integrating the modules into your PC.
2 & 3. Multiple Card Configurations:
Yes, you can run multiple Hailo cards together. Many of our clients use configurations with up to 64 cards for various applications.
LLM (Large Language Model) Considerations:
Using 4 Hailo-8 cards with CPU RAM for LLMs is an innovative idea, but this setup hasn’t been tested or validated for such use cases.
For LLM workloads, we recommend the Hailo-10H, which is specifically optimized for these tasks. A single Hailo-10H can handle certain LLMs effectively, and we’ve tested models like LLaMA internally with promising results (though not officially released yet).
I will check with our R&D team to explore the feasibility of running LLMs with 4 Hailo-8 cards and provide updates.
Additional Notes:
Running applications like ComfyUI, OpenHands, or Home Assistant should work seamlessly. However, for LLMs, the Hailo-10H offers the best performance.
For multi-card setups (Hailo-8 or Hailo-10H), ensure your PC has adequate PCIe lanes, sufficient power supply, and proper cooling solutions.
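To see why PCIe lanes are the first thing to check, here is a rough lane-budget sketch. The numbers are assumptions for illustration: each M.2 module is given 4 Gen3 lanes via a bifurcated x16 carrier, and the per-platform lane counts are typical values, not guarantees for any specific CPU.

```python
# Rough PCIe lane budget for 4 carriers x 4 M.2 modules (16 total).
# Assumptions: each M.2 module occupies 4 lanes on a bifurcated
# x4/x4/x4/x4 carrier; platform lane counts are typical, not specs.
MODULES = 16
LANES_PER_MODULE = 4
lanes_needed = MODULES * LANES_PER_MODULE  # 64 lanes

platform_cpu_lanes = {
    "mainstream desktop": 20,   # typical consumer CPU (assumed)
    "HEDT / workstation": 64,   # Threadripper/Xeon class (assumed)
}
for name, lanes in platform_cpu_lanes.items():
    status = "OK" if lanes >= lanes_needed else "not enough"
    print(f"{name}: {status} ({lanes} lanes vs {lanes_needed} needed)")
```

The takeaway: a 16-module build points toward a workstation/server platform (or PCIe switches) rather than a mainstream desktop board, which is exactly the lane-count caveat above.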
Let me know if you have more questions, and I’ll follow up with insights from R&D regarding your specific use case.
Note: For information about investments, please contact our team through the website.
Best regards,
Omri
Application Engineer & Community Manager
Hailo