Is it possible to run inference with a multi-modal model on the Hailo-8? My project needs both object detection (image input) and sound recognition (audio input), so I would like to use one multi-modal model instead of two separate models. Is that possible?
Yes, it is possible to run multi-modal models on the Hailo-8. For details on how to set this up and use the model effectively, see the following sections of our documentation:
HailoRT User Guide:
  Input and Output VStream Setup
  Virtual Devices and Multi-Network Support
Hailo Model Zoo User Guide:
  Model Quantization and Compilation
  Supported Architectures
Additionally, if you end up with two separate models (one for images, one for audio), you can run them in parallel on the same device using the HailoRT scheduler.
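As a rough illustration of the parallel option, the host-side pattern is usually one worker thread per model, each feeding its own input stream. The sketch below shows only that threading structure. It assumes nothing about the HailoRT API: `detect_objects` and `classify_sound` are hypothetical placeholders, not HailoRT calls. In a real application you would swap them for the send/receive code of each compiled network, as described in the HailoRT User Guide.

```python
import queue
import threading


def detect_objects(frame):
    # Hypothetical placeholder for the image model; NOT a HailoRT call.
    return f"boxes for {frame}"


def classify_sound(clip):
    # Hypothetical placeholder for the audio model; NOT a HailoRT call.
    return f"label for {clip}"


def run_model(name, infer_fn, inputs, results):
    """Feed every input to one model and push its outputs, in order, to a shared queue."""
    for item in inputs:
        results.put((name, infer_fn(item)))


def run_in_parallel(frames, clips):
    """Run both placeholder models concurrently, one worker thread per model."""
    results = queue.Queue()
    workers = [
        threading.Thread(target=run_model,
                         args=("detector", detect_objects, frames, results)),
        threading.Thread(target=run_model,
                         args=("audio", classify_sound, clips, results)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()

    # Group the interleaved results by model; per-model order is preserved
    # because each worker enqueues its outputs sequentially.
    outputs = {"detector": [], "audio": []}
    while not results.empty():
        name, value = results.get()
        outputs[name].append(value)
    return outputs
```

Calling `run_in_parallel(camera_frames, audio_clips)` returns a dictionary with one result list per model. Keeping each modality on its own thread means a slow audio clip does not hold up the video frames, and the reverse is also true.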