Hello. I am creating various AI applications using the Hailo-8 accelerator module.
The Hailo Model Zoo publishes models for various image ML tasks, but models for tasks that use videos as inputs, such as Action Recognition, are not available.
This is likely due to the limited support for Conv3D, which is frequently used in video recognition tasks.
Quoted from the Hailo Dataflow Compiler (DFC) user guide:
Note: Models that contain Conv3D layer must have rank-4 input and output (at most 4 dimensions), so the Conv3D layer must reside inside a “2D” model.
Considering the architecture of the Hailo-8, is it difficult to fully support Conv3D? I would like to know whether it might be supported in the future and what its priority would be.
Fully supporting Conv3D with rank-5 inputs is not necessarily an architecture question. However, it would affect all parts of the Hailo Dataflow Compiler and the HailoRT runtime. It is not currently on our roadmap for this year.
I recommend you get in contact with your local Hailo FAE and sales team. We can provide feedback to our R&D and include business information that will be taken into consideration when they plan the development roadmap.
Hello. I’m looking at pages 132 and 133 of the manual (hailo_dataflow_compiler_v3.28.0_user_guide.pdf) about Conv3D. However, I don’t quite understand what it means. Could you please explain how to implement a rank-4 input Conv3D? It would be very helpful if you could provide some sample code.
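Not an official answer, but here is one possible reading of the rank-4 requirement, sketched in plain NumPy rather than with any Hailo API. The idea is that the model's input and output stay rank-4 (N, H, W, channels), the frame (depth) axis is folded into the channel axis at the boundary, and the Conv3D operates on the unfolded rank-5 tensor only internally. The folding layout, the helper names (`fold_depth`, `conv3d_rank4`), and the "valid" padding are my own assumptions for illustration; the layout the DFC actually expects should be checked against the user guide.

```python
import numpy as np

def conv3d_valid(x, w):
    """Naive 'valid' 3D convolution (cross-correlation, as in DL frameworks).
    x: (D, H, W, C_in), w: (kd, kh, kw, C_in, C_out) -> (D', H', W', C_out)."""
    D, H, W, _ = x.shape
    kd, kh, kw, _, c_out = w.shape
    out = np.zeros((D - kd + 1, H - kh + 1, W - kw + 1, c_out))
    for d in range(out.shape[0]):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                patch = x[d:d + kd, i:i + kh, j:j + kw, :]
                # Contract all four patch axes with the first four kernel axes.
                out[d, i, j] = np.tensordot(patch, w, axes=4)
    return out

def fold_depth(x5):
    """Hypothetical helper: (N, D, H, W, C) -> rank-4 (N, H, W, D*C)."""
    n, d, h, wd, c = x5.shape
    return x5.transpose(0, 2, 3, 1, 4).reshape(n, h, wd, d * c)

def conv3d_rank4(x4, w, depth):
    """Hypothetical wrapper: rank-4 in, rank-4 out, Conv3D in between.
    x4: (N, H, W, depth*C_in) -> (N, H', W', D'*C_out)."""
    n, h, wd, dc = x4.shape
    c_in = dc // depth
    outs = []
    for sample in x4:
        # Unfold the channel axis back into (D, H, W, C_in).
        x = sample.reshape(h, wd, depth, c_in).transpose(2, 0, 1, 3)
        y = conv3d_valid(x, w)
        d_out, h_out, w_out, c_out = y.shape
        # Fold the output depth back into channels so the result is rank-4.
        outs.append(y.transpose(1, 2, 0, 3).reshape(h_out, w_out, d_out * c_out))
    return np.stack(outs)

rng = np.random.default_rng(0)
clip = rng.normal(size=(1, 4, 5, 5, 2))    # 4 frames of 5x5 with 2 channels
kernel = rng.normal(size=(3, 3, 3, 2, 4))  # 3x3x3 kernel, 2 -> 4 channels
x4 = fold_depth(clip)
y4 = conv3d_rank4(x4, kernel, depth=4)
print(x4.shape, y4.shape)  # (1, 5, 5, 8) (1, 3, 3, 8)
```

In a real framework model the same pattern would presumably be a reshape, a Conv3D layer, and a reshape back, so that only rank-4 tensors cross the model boundary. Whether the DFC parser accepts that exact graph is something I cannot confirm.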