We just updated from Hailo 4.20 to Hailo 4.22 in order to get PaddleOCR working reliably (see "paddle_ocr example times out", Issue #353, hailo-ai/Hailo-Application-Code-Examples on GitHub). We run an object detection pipeline before the OCR, and that pipeline is now broken. What previously worked with 4.20 now fails with:
Aug 21 15:15:17 raspberrypi bash[12695]: terminate called after throwing an instance of 'std::invalid_argument'
Aug 21 15:15:17 raspberrypi bash[12695]: what(): Output tensor best/yolov8_nms_postprocess is not an NMS type
Aug 21 15:15:17 raspberrypi bash[12695]: --------------------------------------
Aug 21 15:15:17 raspberrypi bash[12695]: C++ Traceback (most recent call last):
Aug 21 15:15:17 raspberrypi bash[12695]: --------------------------------------
Aug 21 15:15:17 raspberrypi bash[12695]: 0 filter_letterbox
Aug 21 15:15:17 raspberrypi bash[12695]: 1 filter
Aug 21 15:15:17 raspberrypi bash[12695]: 2 HailoNMSDecode::HailoNMSDecode(std::shared_ptr<HailoTensor>, std::map<unsigned char, std::string, std::less<unsigned char>, std::allocator<std::pair<unsigned char const, std::string > > >&, float, unsigned int, bool)
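For context, the exception comes from a guard in the post-process decoder: `HailoNMSDecode` refuses to decode an output tensor whose format is not an NMS type, even though the tensor is named `best/yolov8_nms_postprocess`. A rough Python stand-in for that C++ check (the function and field names here are ours, purely for illustration, not the actual HailoRT implementation):

```python
# Illustrative sketch of the guard that throws above: the decoder checks
# the tensor's format order, not its name, so a tensor named
# "best/yolov8_nms_postprocess" can still fail if the runtime does not
# see it as an NMS-formatted output. Field names are invented stand-ins.
def decode_nms(tensor_info):
    if tensor_info.get("format_order") != "HAILO_NMS":
        raise ValueError(
            f"Output tensor {tensor_info['name']} is not an NMS type")
    return "decoded"

# Passes: the runtime reports an NMS format order for the tensor.
decode_nms({"name": "best/yolov8_nms_postprocess",
            "format_order": "HAILO_NMS"})
```

In other words, the question is why the 4.22 runtime no longer reports the compiled output as NMS-formatted.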
The model was compiled with NMS; here is the log of the ONNX-to-HEF conversion:
xxx:~$ hailo parser onnx best.onnx --hw-arch hailo8
[info] No GPU chosen, Selected GPU 0
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1755811062.500358 9861 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1755811062.507213 9861 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
[info] Current Time: 21:17:49, 08/21/25
[info] CPU: Architecture: x86_64, Model: Intel(R) Xeon(R) CPU @ 2.20GHz, Number Of Cores: 12, Utilization: 0.0%
[info] Memory: Total: 83GB, Available: 81GB
[info] System info: OS: Linux, Kernel: 6.8.0-1034-gcp
[info] Hailo DFC Version: 3.32.0
[info] HailoRT Version: Not Installed
[info] PCIe: No Hailo PCIe device was found
[info] Running `hailo parser onnx best.onnx --hw-arch hailo8`
[info] Translation started on ONNX model best
[info] Restored ONNX model best (completion time: 00:00:00.25)
[info] Extracted ONNXRuntime meta-data for Hailo model (completion time: 00:00:00.93)
[info] Simplified ONNX model for a parsing retry attempt (completion time: 00:00:02.17)
Parsing failed with recommendations for end node names: ['/model.23/Concat_3'].
Would you like to parse again with the recommendation? (y/n)
y
[info] According to recommendations, retrying parsing with end node names: ['/model.23/Concat_3'].
[info] Translation started on ONNX model best
[info] Restored ONNX model best (completion time: 00:00:00.18)
[info] Extracted ONNXRuntime meta-data for Hailo model (completion time: 00:00:00.84)
[info] NMS structure of yolov8 (or equivalent architecture) was detected.
[info] In order to use HailoRT post-processing capabilities, these end node names should be used: /model.23/cv2.0/cv2.0.2/Conv /model.23/cv3.0/cv3.0.2/Conv /model.23/cv2.1/cv2.1.2/Conv /model.23/cv3.1/cv3.1.2/Conv /model.23/cv2.2/cv2.2.2/Conv /model.23/cv3.2/cv3.2.2/Conv.
[info] Start nodes mapped from original model: 'images': 'best/input_layer1'.
[info] End nodes mapped from original model: '/model.23/Concat_3'.
[info] Translation completed on ONNX model best (completion time: 00:00:02.12)
Would you like to parse the model again with the mentioned end nodes and add nms postprocess command to the model script? (y/n)
y
[info] Translation started on ONNX model best
[info] Restored ONNX model best (completion time: 00:00:00.19)
[info] Extracted ONNXRuntime meta-data for Hailo model (completion time: 00:00:00.89)
[info] NMS structure of yolov8 (or equivalent architecture) was detected.
[info] In order to use HailoRT post-processing capabilities, these end node names should be used: /model.23/cv2.0/cv2.0.2/Conv /model.23/cv3.0/cv3.0.2/Conv /model.23/cv2.1/cv2.1.2/Conv /model.23/cv3.1/cv3.1.2/Conv /model.23/cv3.2/cv3.2.2/Conv /model.23/cv2.2/cv2.2.2/Conv.
[info] Start nodes mapped from original model: 'images': 'best/input_layer1'.
[info] End nodes mapped from original model: '/model.23/cv2.0/cv2.0.2/Conv', '/model.23/cv3.0/cv3.0.2/Conv', '/model.23/cv2.1/cv2.1.2/Conv', '/model.23/cv3.1/cv3.1.2/Conv', '/model.23/cv2.2/cv2.2.2/Conv', '/model.23/cv3.2/cv3.2.2/Conv'.
[info] Translation completed on ONNX model best (completion time: 00:00:02.40)
[info] Appending model script commands to best from string
[info] Added nms postprocess command to model script.
[info] Saved HAR to: /home/stanislas.duprey/tmp/best.har
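Here the NMS postprocess was added automatically by confirming the parser's prompt. As the optimizer later hints, it can instead be added manually as a model-script command, which also lets you pick the engine and thresholds explicitly. An illustrative `.alls` fragment (argument names are from our reading of the DFC model-script reference and should be checked against your DFC version; a JSON config path can also be passed as the first argument):

```
nms_postprocess(meta_arch=yolov8, engine=cpu)
```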
xxx:~$ hailo optimize best.har --calib-set-path calib_set.npy --hw-arch hailo8
[info] No GPU chosen, Selected GPU 0
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
[info] Current Time: 21:20:22, 08/21/25
[info] CPU: Architecture: x86_64, Model: Intel(R) Xeon(R) CPU @ 2.20GHz, Number Of Cores: 12, Utilization: 0.1%
[info] Memory: Total: 83GB, Available: 81GB
[info] System info: OS: Linux, Kernel: 6.8.0-1034-gcp
[info] Hailo DFC Version: 3.32.0
[info] HailoRT Version: Not Installed
[info] PCIe: No Hailo PCIe device was found
[info] Running `hailo optimize best.har --calib-set-path calib_set.npy --hw-arch hailo8`
[info] For NMS architecture yolov8 the default engine is cpu. For other engine please use the 'engine' flag in the nms_postprocess model script command. If the NMS has been added during parsing, please parse the model again without confirming the addition of the NMS, and add the command manually with the desired engine.
[info] The layer best/conv80 was detected as cls_layer.
[info] Using the default score threshold of 0.001 (range is [0-1], where 1 performs maximum suppression) and IoU threshold of 0.7 (range is [0-1], where 0 performs maximum suppression).
Changing the values is possible using the nms_postprocess model script command.
[info] The activation function of layer best/conv80 was replaced by a Sigmoid
[info] Starting Model Optimization
[warning] Reducing optimization level to 1 (the accuracy won't be optimized and compression won't be used) because there's less data than the recommended amount (1024)
[info] Model received quantization params from the hn
...
[info] The calibration set seems to not be normalized, because the values range is [(0.0, 1.0), (0.0, 1.0), (0.0, 1.0)].
Since the neural core works in 8-bit (between 0 to 255), a quantization will occur on the CPU of the runtime platform.
Add a normalization layer to the model to offload the normalization to the neural core.
Refer to the user guide Hailo Dataflow Compiler user guide / Model Optimization / Optimization Related Model Script Commands / model_modification_commands / normalization for details.
[info] Model Optimization is done
[info] Saved HAR to: /home/stanislas.duprey/tmp/best_optimized.har
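The optimizer's normalization hint above can be acted on with a one-line model-script command before re-running `hailo optimize`. Assuming the model expects inputs in [0, 1] (matching the calibration set's reported range) and the runtime pipeline delivers raw 0-255 pixels, the mean/std values below offload the divide-by-255 onto the neural core (the values are our assumption; see the user-guide section the log points to):

```
normalization1 = normalization([0.0, 0.0, 0.0], [255.0, 255.0, 255.0])
```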
xxx:~$ hailo compiler best_optimized.har --hw-arch hailo8
[info] No GPU chosen, Selected GPU 0
[info] Current Time: 21:40:13, 08/21/25
[info] CPU: Architecture: x86_64, Model: Intel(R) Xeon(R) CPU @ 2.20GHz, Number Of Cores: 12, Utilization: 0.0%
[info] Memory: Total: 83GB, Available: 81GB
[info] System info: OS: Linux, Kernel: 6.8.0-1034-gcp
[info] Hailo DFC Version: 3.32.0
[info] HailoRT Version: Not Installed
[info] PCIe: No Hailo PCIe device was found
[info] Running `hailo compiler best_optimized.har --hw-arch hailo8`
[info] Compiling network
[info] To achieve optimal performance, set the compiler_optimization_level to "max" by adding performance_param(compiler_optimization_level=max) to the model script. Note that this may increase compilation time.
[info] Loading network parameters
[info] Starting Hailo allocation and compilation flow
[info] Adding an output layer after conv80
[info] Building optimization options for network layers...
[info] Successfully built optimization options - 10s 337ms
[info] Trying to compile the network in a single context
[info] Single context flow failed: Recoverable single context error
[info] Building optimization options for network layers...
[info] Successfully built optimization options - 17s 878ms
[info] Using Multi-context flow
[info] Resources optimization params: max_control_utilization=60%, max_compute_utilization=60%, max_compute_16bit_utilization=60%, max_memory_utilization (weights)=60%, max_input_aligner_utilization=60%, max_apu_utilization=60%
[info] Finding the best partition to contexts...
[info] Searching for a better partition...
[...<==>.................................] Elapsed: 00:00:38
[info] Partition to contexts finished successfully
[info] Partitioner finished after 152 iterations, Time it took: 26m 15s 958ms
[info] Applying selected partition to 3 contexts...
[info] Validating layers feasibility
...
[info] Successful Mapping (allocation time: 29m 0s)
[info] Compiling kernels of best_context_2...
[info] Bandwidth of model inputs: 9.375 Mbps, outputs: 6.79321 Mbps (for a single frame)
[info] Bandwidth of DDR buffers: 14.0625 Mbps (for a single frame)
[info] Bandwidth of inter context tensors: 60.9375 Mbps (for a single frame)
[info] Building HEF...
[info] Successful Compilation (compilation time: 28s)
[info] Compilation complete
[info] Saved HEF to: /home/stanislas.duprey/tmp/best.hef
[info] Saved HAR to: /home/stanislas.duprey/tmp/best_compiled.har
This was done with the following Hailo version:
Hailo Dataflow Compiler v3.32.0