Inference for a Multi-Modal Model

Dear community,

Is it possible to run inference on a multi-modal model with the Hailo-8? My project requires both object detection (image data as input) and sound recognition (audio data as input), so I would like to use one multi-modal model instead of two separate models. Is that possible?

Thank you!

Hey @nino,

Yes, the Hailo-8 can run multi-modal models. For details on how to set this up and use the model effectively, I recommend the following sections of our documentation:

HailoRT User Guide:

  • Input and Output VStream Setup
  • Virtual Devices and Multi-Network Support

Hailo Model Zoo User Guide:

  • Model Quantization and Compilation
  • Supported Architectures

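Additionally, if you end up with two separate models (one for detection, one for sound), you can run them in parallel on the same Hailo-8 using the HailoRT model scheduler.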
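
To illustrate running two models in parallel, here is a minimal Python sketch of the host-side pattern: one thread per modality, each feeding its own model. Note that `detect_objects` and `classify_sound` are hypothetical placeholders, not HailoRT calls. On a real device you would replace them with inference on each network, and the scheduler would share the Hailo-8 between the two. Treat it as a sketch under those assumptions rather than a reference implementation.

```python
import threading


# Hypothetical stand-ins for the two compiled models. Nothing here calls
# HailoRT; on real hardware each function would wrap inference on its network.
def detect_objects(frame):
    return {"modality": "image", "input_size": len(frame)}


def classify_sound(clip):
    return {"modality": "audio", "input_size": len(clip)}


def worker(model_fn, inputs, results):
    """Feed every input to one model and collect the outputs in order."""
    for item in inputs:
        results.append(model_fn(item))


def run_in_parallel(frames, clips):
    """Run the image model and the audio model concurrently, one thread each."""
    image_results, audio_results = [], []
    threads = [
        threading.Thread(target=worker, args=(detect_objects, frames, image_results)),
        threading.Thread(target=worker, args=(classify_sound, clips, audio_results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until both modalities have been processed
    return image_results, audio_results


if __name__ == "__main__":
    frames = [[0] * 12, [0] * 12]  # two dummy "images"
    clips = [[0.0] * 8]            # one dummy "audio clip"
    print(run_in_parallel(frames, clips))
```

Each thread writes only to its own result list, so no locking is needed in this sketch. The image and audio streams can also run at different rates, which is usually the case for camera frames versus audio windows.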

Best Regards,
Omria