Raspberry Pi 5 + Hailo8L for instance segmentation using YOLOv5

I am seeing this error while converting the yolov5m-seg model (in ONNX format) to .hef using the hailomz command:

hailo_sdk_client.model_translator.exceptions.MisspellNodeError: Unable to find end node names: ['output1', 'Conv_326', 'Conv_305', 'Conv_284'], please verify and try again.

Steps followed so far :
I re-trained yolov5m-seg.pt on a custom annotated dataset and verified that it works on an Ubuntu 22.04 PC. I then converted this model to ONNX format using the Python script below:

from ultralytics import YOLO
model = YOLO("runs/train-seg/exp4/weights/best.pt")
model.export(format="onnx", imgsz=640, opset=11)

I have followed this example for exporting
Retraining example
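
Before running hailomz, a quick sanity check of the exported ONNX can rule out export problems. This is only a sketch (not part of the retraining example); it assumes onnxruntime is installed and that the exported file is runs/train-seg/exp4/weights/best.onnx:

import numpy as np
import onnxruntime as ort

# load the exported model and run it once on a dummy 640x640 input
sess = ort.InferenceSession("runs/train-seg/exp4/weights/best.onnx")
dummy = np.zeros((1, 3, 640, 640), dtype=np.float32)
outputs = sess.run(None, {sess.get_inputs()[0].name: dummy})

# print each output name and shape; these names matter later for the Hailo parser
for meta, out in zip(sess.get_outputs(), outputs):
    print(meta.name, out.shape)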

After this, I tried to convert the ONNX model into a .hef file using the command below:

hailomz compile yolov5m_seg --ckpt=retrained_model.onnx --hw-arch hailo8l --calib-path dataset/train/images --classes 1

This resulted in the error below:

Start run for network yolov5m_seg …
Initializing the hailo8l runner…
[info] Translation started on ONNX model yolov5m_seg
[info] Restored ONNX model yolov5m_seg (completion time: 00:00:00.39)
[info] Extracted ONNXRuntime meta-data for Hailo model (completion time: 00:00:01.39)
Traceback (most recent call last):
  ...
    raise MisspellNodeError(f"Unable to find {err_str}: {wrong_names}, please verify and try again.")
hailo_sdk_client.model_translator.exceptions.MisspellNodeError: Unable to find end node names: ['output1', 'Conv_326', 'Conv_305', 'Conv_284'], please verify and try again.

Please help me navigate through this. [I have successfully converted a YOLOv8 model into a .hef file for the same segmentation task, but the postprocessing was not optimized for yolov8-seg, so I had to fall back to yolov5-seg.]

Hey @mailtoprasan,

I noticed that the issue you're experiencing is related to incorrect output node names in your model. The output node names do not match the ones specified in the compilation command, possibly because the model was modified during retraining and export.

To resolve this, I suggest the following steps:

  1. Open your model file (retrained_model.onnx) in Netron (https://netron.app) to inspect the model architecture.
  2. In Netron, navigate to the "Model Properties" section on the left sidebar and expand it if necessary.
  3. Look for the "Outputs" subsection within "Model Properties" and note the names listed under it. These are the current output node names of your model (a programmatic alternative is sketched after this list).
  4. Update the --end-node-names argument in the hailomz compile command, or the end nodes in the model's configuration, with the correct output node names you found in Netron.
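
If Netron is not handy, the same output names can be read programmatically. This is just a sketch (not part of the Hailo tooling) and only assumes the onnx Python package is installed:

import onnx

# load the exported model and print the names of its graph outputs;
# these are the names to pass as end nodes when parsing with Hailo
model = onnx.load("retrained_model.onnx")
print([output.name for output in model.graph.output])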

If you have any further questions or if you encounter any problems while following these steps, please let me know, and I’ll be happy to assist you further.


Thank you @omria for the suggestion. Recompiling with --end-node-names solved this problem for me!

Steps followed:

I opened the ONNX model in the Netron app and noted the output node names it showed.

I then updated the hailomz compile arguments to include --end-node-names as below:

hailomz compile yolov5m_seg --ckpt=retrained_model.onnx --hw-arch hailo8l --calib-path dataset/train/images --classes 1  --end-node-names output0 465 onnx::Split_480 onnx::Split_518 onnx::Split_556

This resolved the earlier error about the missing output node names, but I am still unable to complete the compilation, which now cites a dimension mismatch:

<Hailo Model Zoo INFO> Start run for network yolov5m_seg ...
<Hailo Model Zoo INFO> Initializing the hailo8l runner...
[info] Translation started on ONNX model yolov5m_seg
[info] Restored ONNX model yolov5m_seg (completion time: 00:00:00.39)
[info] Extracted ONNXRuntime meta-data for Hailo model (completion time: 00:00:01.25)
[info] Simplified ONNX model for a parsing retry attempt (completion time: 00:00:02.48)
Traceback (most recent call last): 
.....
.....
.....
hailo_sdk_client.model_translator.exceptions.ParsingWithRecommendationException: Parsing failed. The errors found in the graph are:
 UnsupportedShuffleLayerError in op /model.24/Reshape_1: Failed to determine type of layer to create in node /model.24/Reshape_1
 UnsupportedModelError in op /model.24/Add: In vertex /model.24/Add_input the constant value shape (1, 3, 80, 80, 2) must be broadcastable to the output shape [80, 80, 6]
 UnsupportedModelError in op /model.24/Mul_3: In vertex /model.24/Mul_3_input the constant value shape (1, 3, 80, 80, 2) must be broadcastable to the output shape [80, 80, 6]
 UnsupportedShuffleLayerError in op /model.24/Reshape_3: Failed to determine type of layer to create in node /model.24/Reshape_3
 UnsupportedModelError in op /model.24/Add_1: In vertex /model.24/Add_1_input the constant value shape (1, 3, 40, 40, 2) must be broadcastable to the output shape [40, 40, 6]
 UnsupportedModelError in op /model.24/Mul_7: In vertex /model.24/Mul_7_input the constant value shape (1, 3, 40, 40, 2) must be broadcastable to the output shape [40, 40, 6]
 UnsupportedShuffleLayerError in op /model.24/Reshape_5: Failed to determine type of layer to create in node /model.24/Reshape_5
 UnsupportedModelError in op /model.24/Add_2: In vertex /model.24/Add_2_input the constant value shape (1, 3, 20, 20, 2) must be broadcastable to the output shape [20, 20, 6]
 UnsupportedModelError in op /model.24/Mul_11: In vertex /model.24/Mul_11_input the constant value shape (1, 3, 20, 20, 2) must be broadcastable to the output shape [20, 20, 6]
Please try to parse the model again, using these end node names: /model.24/proto/cv3/act/Mul, /model.24/Transpose_1, /model.24/Transpose, /model.24/Transpose_2

When the end node names were changed as suggested in the error message, as below:

hailomz compile yolov5m_seg --ckpt=retrained_model.onnx --hw-arch hailo8l --calib-path Dataset/train/images --classes 1 --end-node-names /model.24/Transpose /model.24/Transpose_2 /model.24/Transpose_1 /model.24/proto/cv3/act/Mul

Compilation was successful and generated a .hef file.
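
As a quick check before moving to the Pi, the compiled HEF's stream names and shapes can be listed with the HailoRT Python bindings. This is only a sketch; it assumes hailo_platform is installed and that the output file is named yolov5m_seg.hef:

from hailo_platform import HEF

# load the compiled HEF and print its input/output virtual stream infos
hef = HEF("yolov5m_seg.hef")
for info in hef.get_input_vstream_infos():
    print("input: ", info.name, info.shape)
for info in hef.get_output_vstream_infos():
    print("output:", info.name, info.shape)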


To test this .hef file, I put it on a Raspberry Pi 5 with a Hailo8L:

  1. Tried replacing this .hef file in the basic pipelines example (hailo-rpi5-examples/doc/basic-pipelines.md at main · hailo-ai/hailo-rpi5-examples · GitHub),
    but it gave a segmentation fault.
  2. Tried to use it in Hailo-Application-Code-Examples/runtime/python/instance_segmentation at main · hailo-ai/Hailo-Application-Code-Examples · GitHub,
    but the post-processing was not successful.

Is this the right way to test?

Hi! Were you able to get your model inferring on the raspi5? I am currently stuck at the same step you’ve described.

@narbuts
Are you specifically interested in yolov5-seg only, or is yolov8 also OK?

I am actually trying to get yolov8-seg working, but I am having issues with the inference pipeline. I'm quite sure my .hef model is correct. Do you have any helpful tips? @shashi

@narbuts
We have integrated yolov8 models (classification, detection, keypoints, and segmentation) into our PySDK. You can now easily run all these models with the same code. Please see yolov8 for the demo code. Please let us know if you encounter any issues.


@shashi
Great! I'll have a look. Is this method only viable for models you have provided, or can it also be used on a yolov8 model I've trained myself?

@narbuts
It works for any yolov8 segmentation model, provided you compile it to have 10 output tensors (which is the default) and use our JSON.

@narbuts I am still not able to get it run on my raspberry pi 5.

@shashi

I tried PySDK and it infers! The only thing I am quite surprised about is that the average inference time per image is about 2 seconds. Is there any way to optimize the pipeline? The current model is trained at an input size of 640x640. I'll attach a sample image below together with the code.

import degirum as dg, degirum_tools
import time
# choose a model to run inference on by uncommenting one of the following lines
model_name = "yolov8m_seg"


inference_host_address = "@local"

zoo_url = "/home/dryerpi/hailo_examples/models/yolov8_SEG"

# choose image source
image_source = "/home/dryerpi/hailo_examples/test_images_resize/image_20241020_084510_017850.png"

token = '' # leave empty for local inference

model = dg.load_model(
    model_name=model_name,
    inference_host_address=inference_host_address,
    zoo_url=zoo_url,
    token=token
)
model.output_confidence_threshold = 0.6

# perform AI model inference on given image source
time_start = time.time()
print(f" Running inference using '{model_name}' on image source '{image_source}'")
inference_result = model(image_source)
print(time.time() - time_start)

# print('Inference Results \n', inference_result)  # numeric results
#print(inference_result)
print("Press 'x' or 'q' to stop.")

# show results of inference
with degirum_tools.Display("AI Camera") as output_display:
    output_display.show_image(inference_result)

Thanks!

@narbuts
Glad to see that you got your own segmentation model compiled and integrated into PySDK. Regarding the inference time, we need to analyze a little more to know where the problem could be. Sometimes it is just the first inference that is slow, as the model needs to be prepared for inference. Can you please try the following and see what happens:

import degirum as dg, degirum_tools
import time
# choose a model to run inference on by uncommenting one of the following lines
model_name = "yolov8m_seg"


inference_host_address = "@local"

zoo_url = "/home/dryerpi/hailo_examples/models/yolov8_SEG"

# choose image source
image_source = "/home/dryerpi/hailo_examples/test_images_resize/image_20241020_084510_017850.png"

token = '' # leave empty for local inference

model = dg.load_model(
    model_name=model_name,
    inference_host_address=inference_host_address,
    zoo_url=zoo_url,
    token=token
)
model.output_confidence_threshold = 0.6

# perform AI model inference on a batch of images
time_start = time.time()
for result in model.predict_batch([image_source]*100):
    pass
print(time.time() - time_start)

This will give us an idea of how long it takes to run 100 images, and will tell us whether only the first inference is slow.
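
If you want to see the warm-up effect explicitly, a small variation of the loop above (reusing model and image_source) times each frame separately. This is just a sketch:

import time

frame_times = []
prev = time.time()
for result in model.predict_batch([image_source] * 100):
    now = time.time()
    frame_times.append(now - prev)  # time spent on this frame
    prev = now

rest = frame_times[1:]
print("first frame:", round(frame_times[0], 3), "s")
print("average of the rest:", round(sum(rest) / len(rest), 3), "s")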

PySDK supports measuring the time spent in various parts of the inference pipeline to analyze the bottleneck. You can run the following code to check the time statistics:

import degirum as dg, degirum_tools
import time
# choose a model to run inference on by uncommenting one of the following lines
model_name = "yolov8m_seg"


inference_host_address = "@local"

zoo_url = "/home/dryerpi/hailo_examples/models/yolov8_SEG"

# choose image source
image_source = "/home/dryerpi/hailo_examples/test_images_resize/image_20241020_084510_017850.png"

token = '' # leave empty for local inference

model = dg.load_model(
    model_name=model_name,
    inference_host_address=inference_host_address,
    zoo_url=zoo_url,
    token=token,
    measure_time=True
)
model.output_confidence_threshold = 0.6

# perform AI model inference on a batch of images
time_start = time.time()
for result in model.predict_batch([image_source]*100):
    pass
print(time.time() - time_start)
print(model.time_stats())

Please let me know how these results look and we can see what the bottlenecks are.

@shashi

Thank you for your reply. You were right; the inference indeed speeds up after the initial iterations, finally averaging around 220ms per image, which is more than acceptable!

Another question: does DeGirum provide a concise guide on how to convert YOLO .pt models into .hef?

For the model I am currently using, I followed this guide, converting the .onnx model into .hef with:

hailomz compile --ckpt yolov8s-seg.onnx --calib-path /path/to/calibration/imgs/dir/ --yaml path/to/yolov8s-seg.yaml --start-node-names name1 name2 --end-node-names name1

This method worked fine for the model with an input size of 640x640, but when trying to do the same with a 1280x1280 input-size model, the process hangs indefinitely and does not finish.

Might there be a better way of converting the models? Thanks!

Another update: I managed to compile a 1280x1280 model, but I am getting this error:

Running inference using 'yolov8n_seg' on image source '/home/dryerpi/hailo_examples/test_images_resize/image_20241020_084510_017850.png'
degirum.exceptions.DegirumException: [ERROR]Operation failed
HailoRT Runtime Agent: Failed to configure infer model, status = HAILO_INVALID_HEF.
hailo_runtime_agent.cpp: 500 [DG::HailoRuntimeAgentImpl::Configure]
When running model 'yolov8n_seg'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/dryerpi/degirum_test.py", line 34, in <module>
    inference_result = model(image_source)
  File "/home/dryerpi/miniconda3/envs/degirum/lib/python3.9/site-packages/degirum/log.py", line 59, in wrap
    return f(*args, **kwargs)
  File "/home/dryerpi/miniconda3/envs/degirum/lib/python3.9/site-packages/degirum/model.py", line 232, in __call__
    return self.predict(data)
  File "/home/dryerpi/miniconda3/envs/degirum/lib/python3.9/site-packages/degirum/log.py", line 59, in wrap
    return f(*args, **kwargs)
  File "/home/dryerpi/miniconda3/envs/degirum/lib/python3.9/site-packages/degirum/model.py", line 223, in predict
    res = list(self._predict_impl(source))
  File "/home/dryerpi/miniconda3/envs/degirum/lib/python3.9/site-packages/degirum/model.py", line 1249, in _predict_impl
    raise DegirumException(
degirum.exceptions.DegirumException: Failed to perform model 'yolov8n_seg' inference: [ERROR]Operation failed
HailoRT Runtime Agent: Failed to configure infer model, status = HAILO_INVALID_HEF.
hailo_runtime_agent.cpp: 500 [DG::HailoRuntimeAgentImpl::Configure]
When running model 'yolov8n_seg'

@narbuts
Glad to hear that you have acceptable performance.

@narbuts
This looks like an error thrown by the HailoRT agent. Just to rule out any PySDK-related issue, you can do the following:

  1. Run hailortcli commands to benchmark the model and see if you get the INVALID_HEF error (example command after this list).
  2. Reboot the device and see if the error is repeatable.
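
For example (the .hef file name is just a placeholder for your compiled model):

hailortcli benchmark yolov8n_seg_1280.hef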

@shashi
By running hailortcli benchmark on the model, this is what I get:

Starting Measurements...
Measuring FPS in HW-only mode
[HailoRT] [error] HEF format is not compatible with device. Device arch: HAILO8L, HEF arch: HAILO8
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_INVALID_HEF(26)
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_INVALID_HEF(26)
[HailoRT CLI] [error] CHECK_SUCCESS failed with status=HAILO_INVALID_HEF(26) - Failed configure vdevice from hef
[HailoRT CLI] [error] CHECK_SUCCESS failed with status=HAILO_INVALID_HEF(26) - Measuring FPS in HW-only mode failed

It seems like the model has been compiled for the Hailo8, not 8L, which is weird since my workflow has been the same as for the first model, which is functioning.

@narbuts
We have added yolov8n-seg models compiled at 1280x1280 for both Hailo8 and Hailo8L; they are available in our model zoo, and both work fine on their respective devices. So your issue is probably related to the compiler settings: you need to specify the target device.
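
For reference, the yolov5 compile commands earlier in this thread select the target with --hw-arch; the same flag would need to be passed for the 1280x1280 yolov8 compile (the file and yaml paths below are placeholders):

hailomz compile --ckpt yolov8n-seg.onnx --calib-path /path/to/calibration/imgs/dir/ --yaml path/to/yolov8n-seg.yaml --hw-arch hailo8l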