Hello. I have built my own inference DLL that runs on Windows and successfully commercialized it. I have since added inference functions for ONNX Runtime and OpenVINO (CPU, GPU, NPU) to the DLL, along with job scheduling, parallel processing, and synchronization.
I would like to ask whether a Hailo device can be used at the same time as these other devices, with the work distributed across them.
Yes, it is possible to do what you’re asking, and there are a few different ways to approach it.
Using ONNX Runtime:
You can integrate ONNX Runtime into your DLL for inference on CPU, GPU, and NPU. Check out this example for a guide.
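For illustration, here is a minimal C++ sketch of registering GPU (CUDA) and OpenVINO (CPU/GPU/NPU) execution providers on an ONNX Runtime session and running one inference. The model path, tensor names, shapes, and the "NPU" device string are placeholder assumptions, and which providers are actually available depends on how your ONNX Runtime build was compiled; the wide-string model path is the Windows convention.

```cpp
// Minimal sketch: select ONNX Runtime execution providers, then run one inference.
// Model path, tensor names, and shapes below are placeholders for illustration.
#include <onnxruntime_cxx_api.h>
#include <vector>

int main() {
    Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "my_inference_dll");
    Ort::SessionOptions opts;

    // GPU via the CUDA execution provider (only if your ORT build includes it).
    OrtCUDAProviderOptions cuda_opts{};
    opts.AppendExecutionProvider_CUDA(cuda_opts);

    // CPU/GPU/NPU via the OpenVINO execution provider (the device string is an
    // assumption; check the OpenVINO EP documentation for the exact values).
    OrtOpenVINOProviderOptions ov_opts{};
    ov_opts.device_type = "NPU";
    opts.AppendExecutionProvider_OpenVINO(ov_opts);

    // Nodes the registered providers cannot handle fall back to the default CPU provider.
    Ort::Session session(env, L"model.onnx", opts);

    // Build a dummy input tensor (name and shape are placeholders).
    std::vector<float> input(1 * 3 * 224 * 224, 0.0f);
    std::vector<int64_t> shape{1, 3, 224, 224};
    auto mem = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
    Ort::Value input_tensor = Ort::Value::CreateTensor<float>(
        mem, input.data(), input.size(), shape.data(), shape.size());

    const char* input_names[]  = {"input"};
    const char* output_names[] = {"output"};
    auto outputs = session.Run(Ort::RunOptions{nullptr},
                               input_names, &input_tensor, 1,
                               output_names, 1);
    return 0;
}
```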
Splitting the HEF Model:
You can split your HEF model into two parts:
Run the first part on a Hailo-8 (26 TOPS) or Hailo-8L (13 TOPS) chip.
Pipeline the output to a CPU or GPU for post-processing, or to run the remainder of the model.
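As a rough, runtime-agnostic illustration of this pipelining idea, the sketch below runs a two-stage pipeline: one thread produces intermediate tensors (standing in for the Hailo part of the split model) and a second thread consumes them (standing in for the CPU/GPU part). The run_hailo_stage and run_host_stage functions are hypothetical placeholders for your actual HailoRT and ONNX Runtime/OpenVINO calls.

```cpp
// Sketch of a two-stage pipeline: stage 1 (e.g. the first part of a split model on a
// Hailo device) feeds stage 2 (e.g. the remainder on CPU/GPU) through a bounded,
// thread-safe queue. run_hailo_stage / run_host_stage are hypothetical placeholders.
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

using Tensor = std::vector<float>;

Tensor run_hailo_stage(int frame_id) { return Tensor(1024, float(frame_id)); } // placeholder
void   run_host_stage(const Tensor&) { /* post-process or run rest of model */ } // placeholder

int main() {
    std::queue<Tensor> q;
    std::mutex m;
    std::condition_variable cv;
    bool done = false;
    const size_t kMaxQueue = 4;   // bound the queue so the producer cannot run far ahead

    std::thread producer([&] {
        for (int frame = 0; frame < 100; ++frame) {
            Tensor t = run_hailo_stage(frame);
            std::unique_lock<std::mutex> lk(m);
            cv.wait(lk, [&] { return q.size() < kMaxQueue; });
            q.push(std::move(t));
            cv.notify_all();
        }
        std::lock_guard<std::mutex> lk(m);
        done = true;
        cv.notify_all();
    });

    std::thread consumer([&] {
        for (;;) {
            Tensor t;
            {
                std::unique_lock<std::mutex> lk(m);
                cv.wait(lk, [&] { return !q.empty() || done; });
                if (q.empty() && done) break;
                t = std::move(q.front());
                q.pop();
                cv.notify_all();
            }
            run_host_stage(t);   // overlaps with the next Hailo inference
        }
    });

    producer.join();
    consumer.join();
    return 0;
}
```

Because the queue is bounded, the Hailo stage and the host stage stay roughly in lockstep instead of one side buffering unbounded amounts of data.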
Parallel Processing:
Hailo devices support parallel processing, allowing you to run different model parts across multiple devices, including CPUs and GPUs, as long as your application keeps the parallel work synchronized.
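To make the scheduling and synchronization point concrete, here is a small sketch of a scheduler that fans independent inference jobs out across device-specific workers and waits for all of them. The Device enum and run_on_device function are hypothetical placeholders; in your DLL they would route each job to HailoRT, ONNX Runtime, or OpenVINO as appropriate.

```cpp
// Sketch: dispatch independent inference jobs to per-device workers and synchronize
// on completion with futures. run_on_device is a hypothetical placeholder that would
// call HailoRT, ONNX Runtime, or OpenVINO depending on the device.
#include <future>
#include <string>
#include <vector>

enum class Device { Hailo, Gpu, Npu, Cpu };

// Placeholder: in a real DLL this would route to the matching runtime.
std::string run_on_device(Device d, int job_id) {
    return "job " + std::to_string(job_id) + " done on device " +
           std::to_string(static_cast<int>(d));
}

int main() {
    const std::vector<Device> devices{Device::Hailo, Device::Gpu, Device::Npu, Device::Cpu};
    std::vector<std::future<std::string>> results;

    // Round-robin jobs across the available devices; each job runs in parallel.
    for (int job = 0; job < 8; ++job) {
        Device d = devices[job % devices.size()];
        results.push_back(std::async(std::launch::async, run_on_device, d, job));
    }

    // Synchronization point: wait for every job to finish before using the outputs.
    for (auto& r : results) {
        std::string out = r.get();
        (void)out;  // aggregate or forward results here
    }
    return 0;
}
```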