Recommendations for PC Specs for Training AI Models Compatible with Hailo-8, Jetson, or Similar Hardware (Computer Vision & Signal Classification)

Hey everyone,

I’m looking to build or buy a PC tailored specifically for training AI models for Computer Vision and Signal Classification that will eventually be deployed on edge hardware like the Hailo-8, NVIDIA Jetson, or similar accelerators. My goal is to create an efficient setup that balances cost and performance while ensuring smooth training and compatibility with these devices.

Details About My Needs

  • Model Training: I’ll be training deep learning models (e.g., CNNs, RNNs) using frameworks like TensorFlow, PyTorch, HuggingFace, and ONNX.
  • Edge Device Constraints: The edge devices I’m targeting have limited resources, so my workflow might include model optimization techniques like quantization and pruning.
  • Inference Testing: I plan to experiment with real-time inference tests on Hailo-8 or Jetson hardware during the development phase.
  • Use Case: My primary application involves object detection (for work) and, at a later stage, signal classification. In both cases, recall is our highest priority (missed true positives are fatal). Precision also matters: we don’t want false alarms, but a few false alarms are better than missing an event.
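Since recall is the stated priority, it may help to make that metric explicit during evaluation. Here is a minimal sketch in plain Python with hypothetical detection counts (the function name is made up, not from any framework):

```python
# Sketch: recall-first evaluation of a detector (hypothetical counts).
# tp = true positives, fn = missed events, fp = false alarms.
def recall_precision(tp: int, fn: int, fp: int) -> tuple[float, float]:
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return recall, precision

# Lowering the detector's confidence threshold typically trades precision
# for recall: you catch more events (fewer fn) but accept more fp.
r, p = recall_precision(tp=95, fn=5, fp=20)
print(f"recall={r:.2f}, precision={p:.2f}")  # recall=0.95, precision=0.83
```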

Questions for Recommendations

  1. CPU: What’s the ideal number of cores, and which models would be most suitable?
  2. GPU: Suggestions for GPUs with sufficient VRAM and CUDA support for training large models?
  3. RAM: How much memory is optimal for this type of work?
  4. Storage: What NVMe SSD sizes and additional HDD/SSD options would you recommend for data storage?
  5. Motherboard & Other Components: Any advice on compatibility with Hailo-8 or considerations for future upgrades?
  6. Additional Tips: Any recommendations for OS, cooling, or other peripherals that might improve efficiency?

If you’ve worked on similar projects or have experience training models for deployment on these devices, I’d love to hear your thoughts and recommendations!

Thanks in advance for your help!

Your question is exactly the one I wanted to ask next, once mine has been answered in my own topic. (How many Hailo modules can be used simultaneously on a PC, and how to do that.)

I will try to help to the best of my knowledge, even if I think you are rushing things ^^.

For the moment I have not yet seen a PC running fully on Hailo.
From what I understand, on a PC you can only run one M.2 or USB card. (I have not yet seen a build with more, but I am still reading through the entire forum.)

On Arduino/Raspberry Pi modules I have seen builds with 4 Hailos, and theoretically we could do 6 (from what I understand).

So here is what I hope/dream to build:
-1 multi-GPU motherboard (4 GPUs), with at least 5 PCIe slots = 1200€
-A CPU capable of driving such a motherboard = 1000€
-200 GB of CPU RAM = 2000€
PCIe:
-1 basic graphics card just to get display output = 50€
-4 PCIe-to-M.2 adapter cards with 4 M.2 slots each = 400€
-16 Hailo-8s in the M.2 slots (16 × 200€ = 3200€)

Total: ~6600€/$
16 Hailo-8 × 26 = 416 TOPS (Hailo quotes TOPS; no fp16/fp32 figures given)
For about 50 watts, plus the rest of the PC.

Hailo modules will use the CPU RAM as VRAM (if I have not made an error), so 200 GB of CPU RAM ≈ 100-150 GB of “VRAM” (it’s less efficient from what I understand, but I don’t know by how much).

Otherwise, another system that really works at the moment, while staying at a contained price:
4× 7900 XTX at ~1000€ each.

It works very well under Linux, and has for the past 6-8 months under Windows.
But I must admit that some models, such as Flux, deliberately don’t support this brand. (I don’t know why.)

~7400€ for the PC:
fp32: 4 × 61 ≈ 240 TFLOPS
fp16: 4 × 122 ≈ 480 TFLOPS
For 500-800 watts × 4 = 2000-3200 watts.
And 24 GB × 4 = 96 GB of VRAM.

The 4090s offer similar performance but cost ~2000€ each, which brings the PC to 11,000€.
But they are better supported under Windows.

For NVIDIA Jetsons… I don’t know enough about the subject.

And the legendary H100 at 20,000€ each:
200 TFLOPS (fp16)
50 TFLOPS (fp32)
(source: TechPowerUp)
And 80 GB of VRAM each.

The Llama 405B model requires 230 GB of VRAM.
So just 3 H100s can run it…
But some motherboards can take 300 GB of CPU RAM… so Hailo could theoretically run it too.
(If one day a motherboard finally exists that supports 4 parallel GPUs & 300 GB of RAM.)
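The H100 count above is just a ceiling division; a one-liner makes it easy to re-check for other models (the function name is made up, and the 230 GB / 80 GB figures come from this post; real deployments also need headroom for activations and caches):

```python
import math

# Minimum number of GPUs needed just to hold a model's weights,
# ignoring overhead (figures from the post: 230 GB model, 80 GB per H100).
def gpus_needed(model_vram_gb: float, vram_per_gpu_gb: float) -> int:
    return math.ceil(model_vram_gb / vram_per_gpu_gb)

print(gpus_needed(230, 80))  # 3
```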

Here is the state of the market according to my current limited (and imperfect) knowledge.
I hope this helps ^^.

What you are looking for is a fairly standard “train on the server, infer at the edge” architecture, which has been popular since the concept of “edge AI” first appeared. I have identified some key phrases in your post, such as “tailored specifically for training AI models”, “efficient setup”, and “smooth training and compatibility”. If you can find the answers to the questions in this post, you will be pretty much at the balanced system you are looking for. Give it a try. :smiley:

Tailored? Training. Server Side.

There isn’t much that is “tailored” for training. More powerful GPUs and CPUs come out every 6 months and, as long as you have the resources/money, you can always buy new ones to accelerate your training. Should you? The key question to answer is:

  • How often must you re-train your models? Every day? Once a month?

Usually, a higher re-training frequency calls for a more powerful GPU. As to which GPU strikes that balance, you first need to know:

  • What models are you going to train?

If you rely on existing “standard models” such as YOLO, ResNet, etc., there are many benchmarks you can use as references to select a GPU. As a side note, RTX cards are economical if you don’t have to worry about power consumption; otherwise, look at data center GPUs. So carry this question along if you care about the utility bills:

  • What is the euro/kWh * GPU watts * hours of running?
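That utility-bill question is simple arithmetic; a tiny helper makes the shape of the estimate concrete (the function name and all input figures below are hypothetical, for illustration only):

```python
# Rough running-cost estimate: price per kWh x power drawn (kW) x hours.
def training_energy_cost(eur_per_kwh: float, gpu_watts: float, hours: float) -> float:
    return eur_per_kwh * (gpu_watts / 1000.0) * hours

# e.g. 0.30 EUR/kWh, a 350 W GPU, 100 hours of training:
print(f"{training_energy_cost(0.30, 350, 100):.2f} EUR")  # 10.50 EUR
```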

On the CPU side, I assume you are looking at the AMD64 (x86-64) architecture. (Correct me if I am wrong.) Again, it all depends on the models you must handle. For example, if your goal is to develop models that eventually run on HAILO-8, a good starting point is the user manual for the DataFlow Compiler. Usually the latest Intel or AMD CPUs tick all the boxes.

By the way, I am not sure why you’d like to perform model optimisation on the edge device. Usually, quantisation is performed on the server side, and the optimised model is then loaded onto the edge device to “run”.
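To illustrate that server-side step, here is a toy sketch of symmetric int8 post-training quantisation in plain Python. In practice a real toolchain (e.g. Hailo’s DataFlow Compiler or your framework’s quantisation tooling) does this on calibration data; the function names here are made up:

```python
# Toy post-training symmetric int8 quantisation: done on the server,
# so the edge device only ever sees the int8 weights plus a scale.
def quantize_int8(weights):
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

w = [0.5, -1.27, 0.031, 1.0]
q, s = quantize_int8(w)
print(q)  # [50, -127, 3, 100]
print(dequantize(q, s))  # approximately the original weights
```

The gap between `w` and `dequantize(q, s)` is the quantisation error that calibration and quantisation-aware training try to minimise.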

Tailored? Inferencing. Edge.

On the inferencing side, however, you may have more-or-less tailored hardware solutions. Here, the key questions are:

  • What sensor interface do you need? If we are talking about cameras, PoE camera? USB camera? Or perhaps GMSL?
  • What connectivity? Your edge device has to be connected to the server, either wired or wireless.
  • Is the edge device used in a regulated environment? Here, certification may be required and not every edge device can survive.

For example, if you are looking for tailored HAILO-8 inferencing appliances, there are plenty: some with a single HAILO-8 chip for inferencing, and some with multiple HAILO-8 chips to run multiple models.

How much RAM?

On your CPU + GPU machine, the more the merrier. But do you NEED that much? Perhaps try this rule-of-thumb formula: system RAM = 1.5 × total GPU VRAM.
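That rule of thumb is trivial to compute; a quick sketch (hypothetical helper name, and only a rule of thumb, not a hard requirement):

```python
# Rule-of-thumb from above: system RAM ~= 1.5 x total GPU VRAM.
def suggested_ram_gb(total_gpu_vram_gb: float) -> float:
    return 1.5 * total_gpu_vram_gb

print(suggested_ram_gb(24))  # 36.0  (e.g. a single 24 GB card)
print(suggested_ram_gb(96))  # 144.0 (e.g. a 4 x 24 GB setup)
```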

PCIe lanes

Take the HAILO-8 accelerator as an example: to maximise its performance, you had better give it 4 PCIe lanes. So, if you want to use M.2, don’t just look for an M.2 slot; also pay attention to the number of PCIe lanes wired to that slot.
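To see why the lane count matters, here is a rough estimate of theoretical PCIe Gen3 bandwidth (8 GT/s per lane with 128b/130b encoding; the helper name is made up, and real-world throughput is lower due to protocol overhead):

```python
# Theoretical PCIe Gen3 throughput per link width.
# 8 GT/s per lane with 128b/130b encoding ~= 0.985 GB/s usable per lane.
def pcie_gen3_gbps(lanes: int) -> float:
    per_lane_gbs = 8.0 * (128 / 130) / 8.0  # GB/s per lane
    return lanes * per_lane_gbs

print(f"x1: {pcie_gen3_gbps(1):.2f} GB/s")  # x1: 0.98 GB/s
print(f"x4: {pcie_gen3_gbps(4):.2f} GB/s")  # x4: 3.94 GB/s
```

So a slot wired x1 caps the accelerator at roughly a quarter of the bandwidth an x4 slot provides, which can bottleneck high-resolution video pipelines.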