I can compile Hailo's yolov5m.onnx and my own i3yolov5m ONNX to HEF with the NMS layer, and I get detection results. But when I integrate the Hailo detection into my test application and run it over the test cases, the accuracy drops noticeably compared with the TensorRT model running on a GPU (the TensorRT engine is built from the same i3yolov5m.onnx).
Following the tutorial, I am trying to build the ONNX into a HEF without quantization and optimization, but it seems I cannot. I use the following code to build it:
import os
import numpy as np
from hailo_sdk_client import ClientRunner

onnx_model_name = 'i3yolov5m23'
onnx_path = '…/models/i3yolov5m23.onnx'
assert os.path.isfile(onnx_path), 'Please provide valid path for ONNX file'

# Initialize a new client runner (any other hw_arch can be used as well)
runner = ClientRunner(hw_arch='hailo8')

# Translate the YOLO model from ONNX
runner.translate_onnx_model(onnx_path, onnx_model_name,
                            start_node_names=['images'],
                            end_node_names=['Conv_344', 'Conv_309', 'Conv_274'],
                            net_input_shapes={'images': [1, 3, 640, 640]})

calib_dataset = np.load('i3calib_set_640_640.npy')

# Call optimize to perform the optimization process
runner.optimize(calib_dataset)

# Save the result state to a quantized HAR file
quantized_model_har_path = f'{onnx_model_name}_quantized_model_nms.har'
runner.save_har(quantized_model_har_path)

hef = runner.compile()
file_name = f'{onnx_model_name}.hef'
with open(file_name, 'wb') as f:
    f.write(hef)
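For completeness, this is how I load and sanity-check the calibration set before calling optimize (a small sketch; I am assuming the DFC expects an NHWC array whose spatial size matches the 640x640 network input):

import numpy as np

calib_dataset = np.load('i3calib_set_640_640.npy')
print(calib_dataset.shape, calib_dataset.dtype)
# My assumption: the optimizer expects (num_images, height, width, channels).
assert calib_dataset.ndim == 4 and calib_dataset.shape[1:3] == (640, 640), \
    'calibration set should be (N, 640, 640, 3)'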
I found that I have to invoke runner.optimize(calib_dataset); otherwise, when I compile to HEF, I get a warning that there are no weights.
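The only non-quantized path I could find in the API is optimize_full_precision(). Below is a minimal sketch of what I understand the full-precision flow to be, assuming this method is the intended one (please correct me if not):

# My assumption: optimize_full_precision() applies the model-modification
# stages without quantizing the weights.
runner.optimize_full_precision()
runner.save_har(f'{onnx_model_name}_full_precision_model.har')
# I assume compile() still requires quantized weights at this point, which
# may be why I get the "no weights" warning when I skip optimize().
hef = runner.compile()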
The HEF I got runs on the Hailo-8, and I get 3 output streams. But when I tested it, the output boxes were messy. I tried compiling yolov5m from the Hailo website and got the same wrong boxes. I also found that the qp_zp of the output streams is about 181, 182, or 184 for yolov5s. I downloaded yolo_vehicle from the Hailo website, and it works correctly.
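For reference, this is roughly how I read the quantization parameters from the HEF (a minimal sketch using the HailoRT Python API; the attribute names are from my environment, so please treat them as assumptions):

from hailo_platform import HEF

hef = HEF('i3yolov5m23.hef')
for info in hef.get_output_vstream_infos():
    q = info.quant_info
    # As I understand the docs, the float value is recovered as
    # float_value = (quantized_value - qp_zp) * qp_scale
    print(info.name, q.qp_zp, q.qp_scale)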
Can you please let me know how to compile yolov5m and my own YOLO model from ONNX to HEF without quantization and optimization? And could you please upload a yolov5m ONNX without the NMS layer for me to test?
Thanks,
Dan