How to compile and use YOLOv5 model without NMS?

I would like to compile a YOLOv5s model without NMS so that the confidence and IoU thresholds can be configured at runtime without interruptions. I have the following model graph:

If I select the 3 Conv layers as the end_node_names when compiling the HEF, the compilation succeeds. However, my model output is not scaled correctly and contains both positive and negative values. I'm assuming this is because the add/mul/pow operations that follow the Conv layers are being lost.

However, when I select the 3 Concat layers as the end nodes, the compilation fails with the following errors:

[info] Simplified ONNX model for a parsing retry attempt (completion time: 00:00:01.40)
Traceback (most recent call last):
  File "/app/omnipro_venv/lib/python3.9/site-packages/hailo_sdk_client/sdk_backend/parser/", line 235, in translate_onnx_model
    parsing_results = self._parse_onnx_model_to_hn(
  File "/app/omnipro_venv/lib/python3.9/site-packages/hailo_sdk_client/sdk_backend/parser/", line 316, in _parse_onnx_model_to_hn
    return self.parse_model_to_hn(
  File "/app/omnipro_venv/lib/python3.9/site-packages/hailo_sdk_client/sdk_backend/parser/", line 367, in parse_model_to_hn
    fuser = HailoNNFuser(converter.convert_model(), net_name, converter.end_node_names)
  File "/app/omnipro_venv/lib/python3.9/site-packages/hailo_sdk_client/model_translator/", line 83, in convert_model
  File "/app/omnipro_venv/lib/python3.9/site-packages/hailo_sdk_client/model_translator/", line 40, in _create_layers
  File "/app/omnipro_venv/lib/python3.9/site-packages/hailo_sdk_client/model_translator/", line 163, in _add_direct_layers
    raise ParsingWithRecommendationException(
hailo_sdk_client.model_translator.exceptions.ParsingWithRecommendationException: Parsing failed. The errors found in the graph are:
 UnsupportedModelError in op /model.24/Add: In vertex /model.24/Add_input the constant value shape (1, 3, 80, 80, 2) must be broadcastable to the output shape [80, 80, 6]
 UnsupportedModelError in op /model.24/Mul_3: In vertex /model.24/Mul_3_input the constant value shape (1, 3, 80, 80, 2) must be broadcastable to the output shape [80, 80, 6]
 UnsupportedModelError in op /model.24/Add_1: In vertex /model.24/Add_1_input the constant value shape (1, 3, 40, 40, 2) must be broadcastable to the output shape [40, 40, 6]
 UnsupportedModelError in op /model.24/Mul_7: In vertex /model.24/Mul_7_input the constant value shape (1, 3, 40, 40, 2) must be broadcastable to the output shape [40, 40, 6]
 UnsupportedModelError in op /model.24/Add_2: In vertex /model.24/Add_2_input the constant value shape (1, 3, 20, 20, 2) must be broadcastable to the output shape [20, 20, 6]
 UnsupportedModelError in op /model.24/Mul_11: In vertex /model.24/Mul_11_input the constant value shape (1, 3, 20, 20, 2) must be broadcastable to the output shape [20, 20, 6]
Please try to parse the model again, using these end node names: /model.24/Sigmoid_2, /model.24/Sigmoid_1, /model.24/Sigmoid

Is there something I can do to accomplish my goal of getting correctly scaled model output WITHOUT any non max suppression?

I should also note that based on the above errors, I don’t see any reason that those input shapes should not be broadcastable to the listed output shapes, given that the number of total elements remains unchanged. Any help would be appreciated, thank you!
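
On the broadcasting question, a quick NumPy check may help. This is only a sketch using NumPy's broadcasting rules, which the Hailo parser is assumed (not confirmed) to follow in spirit; the helper name is_broadcastable is made up for illustration. Broadcasting compares shapes dimension by dimension from the right, so an equal element count is not enough.

```python
import numpy as np

const_shape = (1, 3, 80, 80, 2)  # constant shape quoted in the parser error
out_shape = (80, 80, 6)          # output shape quoted in the parser error


def is_broadcastable(src, dst):
    """Return True if an array of shape `src` can be broadcast to shape `dst`."""
    try:
        np.broadcast_to(np.empty(src), dst)
        return True
    except ValueError:
        return False


# Both shapes hold the same number of elements...
same_size = int(np.prod(const_shape)) == int(np.prod(out_shape))
print(same_size)

# ...but broadcasting never rearranges or merges axes, so this fails:
# the source has more dimensions and its trailing axis is 2, not 6 or 1.
print(is_broadcastable(const_shape, out_shape))

# A shape whose trailing axes are each 1 or equal to the target does broadcast.
print(is_broadcastable((1, 1, 6), out_shape))
```

Going from (1, 3, 80, 80, 2) to (80, 80, 6) would need an explicit transpose and reshape, which is a different operation from broadcasting.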

Hi @connor.malley, please put the last 3 Convs as the end nodes. That should work.

Hi Nadav,

When I add the 3 conv layers as the end nodes the model successfully compiles but the output is not scaled correctly, most likely due to the fact that the next few operations are not being applied (sigmoid, mul, add, and pow).

It looks like these operations are only being applied to the X,Y,W,H portions of these bounding boxes (I have 15 classes). For future reference, is there any reason why the shapes listed above would not be considered broadcastable? Ideally I would like to compile the model with everything EXCEPT for NMS. I appreciate any suggestions you may have.
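
For reference, the operations that are cut off after the Conv end nodes can be reproduced on the host. Below is a minimal NumPy sketch of the standard YOLOv5 box decoding, under several assumptions that are not confirmed in this thread: the outputs are already dequantized to float, each output has layout (H, W, anchors * (5 + classes)) with the channels grouped per anchor, and the model uses the default YOLOv5s anchors and strides 8/16/32. The name decode_head is hypothetical.

```python
import numpy as np

# Assumed default YOLOv5s anchors (width, height) for the stride-8 head.
ANCHORS_P3 = [(10, 13), (16, 30), (33, 23)]


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def decode_head(raw, anchors, stride, num_classes):
    """Decode one raw Conv output of shape (H, W, A * (5 + num_classes)).

    Returns rows of [cx, cy, w, h, objectness, class scores...] with the
    box given in input-image pixels. No NMS is applied.
    """
    h, w, _ = raw.shape
    a = len(anchors)
    # Assumed channel order: anchor-major, then [x, y, w, h, obj, classes].
    y = sigmoid(raw.reshape(h, w, a, 5 + num_classes))

    # Cell coordinates, shaped (H, W, 1, 2) so they broadcast over anchors.
    gy, gx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    grid = np.stack((gx, gy), axis=-1)[:, :, None, :]

    xy = (y[..., 0:2] * 2.0 - 0.5 + grid) * stride              # mul, add, mul
    wh = (y[..., 2:4] * 2.0) ** 2 * np.asarray(anchors, dtype=np.float32)  # mul, pow, mul
    return np.concatenate((xy, wh, y[..., 4:]), axis=-1).reshape(-1, 5 + num_classes)


# Example with 15 classes on an 80x80 head filled with placeholder zeros.
raw_p3 = np.zeros((80, 80, 3 * (5 + 15)), dtype=np.float32)
decoded = decode_head(raw_p3, ANCHORS_P3, stride=8, num_classes=15)
print(decoded.shape)
```

The same function would be called once per output with that head's anchors and stride, and the three results concatenated before applying NMS with whatever thresholds are wanted.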

Hi @connor.malley,
We have that option; we call it “bbox_only”. See this link for an example of how to achieve this.

The reason we mostly use the “full” version, including the NMS, is that it is usually more practical: it also removes the need for the post-processing Python code that would otherwise have to be applied after the NN part.
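
To give an idea of the host-side post-processing that is needed when NMS is left out of the compiled model, here is a minimal NumPy sketch of greedy, class-agnostic NMS with both thresholds passed at call time. It is an illustration only, not Hailo's implementation; the function name nms, the default thresholds, and the [x1, y1, x2, y2] box format are assumptions.

```python
import numpy as np


def nms(boxes, scores, score_threshold=0.25, iou_threshold=0.45):
    """Greedy NMS over boxes (N, 4) given as [x1, y1, x2, y2] and scores (N,).

    Returns the indices of the kept boxes, highest score first.
    """
    boxes = np.asarray(boxes, dtype=np.float32)
    scores = np.asarray(scores, dtype=np.float32)

    # Drop low-confidence boxes, then sort the rest by descending score.
    candidates = np.flatnonzero(scores >= score_threshold)
    order = candidates[np.argsort(-scores[candidates])]
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])

    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]

        # Intersection of the current best box with every remaining box.
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        iou = inter / (areas[i] + areas[rest] - inter + 1e-9)

        # Keep only boxes that do not overlap the current one too much.
        order = rest[iou <= iou_threshold]
    return keep


boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]
scores = [0.9, 0.8, 0.3]
print(nms(boxes, scores))                        # default thresholds
print(nms(boxes, scores, score_threshold=0.5))   # stricter confidence
print(nms(boxes, scores, iou_threshold=0.7))     # more tolerant overlap
```

Because the thresholds are plain function arguments, they can be changed between frames without recompiling anything.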

Hi @connor.malley, I’m facing the same problem you had. Are the last three convolutional layers the end nodes? Could you share how your postprocessing configuration file looks?


Yes, the last 3 convolutional layers should be the end nodes.

In the postprocessing script you can set:
nms_postprocess("../../postprocess_config/yolov5s_nms_config.json", yolov5, engine=cpu)

Then just use the InferVStreams.set_nms_score_threshold(…) and InferVStreams.set_nms_iou_threshold(…) methods to change the NMS thresholds dynamically at runtime.
