I am currently trying to run inference on a custom model with Hailo8. When I attempt to parse the model’s ONNX file using Hailo’s Data Flow Compiler, I encounter the following unsupported-layer errors. The Data Flow Compiler version is 2024_04.
UnexpectedNodeError in op /Unsqueeze_1: Unexpected node /Unsqueeze_1 (Unsqueeze)
UnsupportedReduceMeanLayerError in op /_attention/layer_norm/ReduceMean: Reduce mean layer /_attention/layer_norm/ReduceMean has unsupported axis -1 (must be over one spatial dimension only).
UnsupportedReduceMeanLayerError in op /_mlp/layer_norm/ReduceMean: Reduce mean layer /_mlp/layer_norm/ReduceMean has unsupported axis 2 (must be over one spatial dimension only).
UnsupportedReduceMeanLayerError in op /_attention/layer_norm/ReduceMean_1: Reduce mean layer /_attention/layer_norm/ReduceMean_1 has unsupported axis 2 (must be over one spatial dimension only).
UnsupportedReduceMeanLayerError in op /_mlp/layer_norm/ReduceMean_1: Reduce mean layer /_mlp/layer_norm/ReduceMean_1 has unsupported axis 2 (must be over one spatial dimension only).
UnsupportedFeatureSplitterLayerError in op /_attention/Split: Feature splitter vertex /_attention/Split is splitting input over unsupported axis 2
UnexpectedNodeError in op /Unsqueeze: Unexpected node /Unsqueeze (Unsqueeze)
However, according to the Data Flow Compiler user guide, both the ReduceMean and Unsqueeze layers are supposed to be supported.
The Unsqueeze layer can be used only in specific cases, such as
between Dense and Conv layers.
The manual states that Average Pooling (ReduceMean) is supported, but in practice it sometimes fails for unclear reasons.
I can suggest a few possible solutions.
First, update the DFC and HailoRT to the newest versions.
Second, translate from TFLite or TensorFlow to HEF instead of ONNX.
Last, convert the unsupported layers into layers that Hailo does support.
Yes, I am using the supercombo.onnx model.
As you pointed out, I was able to compile the model and perform inference up to a certain point. However, if possible, I would prefer to run the entire supercombo.onnx model, including the final layers, on Hailo8.
I was able to specify an intermediate point in the ML model using --end-node-names and perform inference up to that point with Hailo. For the remaining part, I currently use a tool of my own to edit the ONNX and create an ONNX file that performs inference only from the point after --end-node-names.
However, I believe that the Hailo Dataflow Compiler might have a feature to offload the remaining part to the CPU. When performing inference with Hailo up to an intermediate point in the ML model, how is the remaining part expected to be inferred?