Running hailo optimize --use-random-calib-set model.har on a custom model leads to the error
hailo_sdk_client.sdk_backend.sdk_backend_exceptions.BackendValueError: Unsupported NaN values were found in params
Before this, I called hailo parser tf model.tflite --parsing-report-path ./parser.log without issues. At no step in the process (TF → TFLite → .har) could I observe NaNs in my parameters. I describe the issue in more detail on GitHub (I admit the repo does not quite fit the issue, since this is a custom model and not a zoo model).
The error “Unsupported NaN values were found in params” indicates that your model contains invalid numerical values (NaNs) in its parameters. Here are some suggestions to help you resolve this issue:
Inspect the parameters for NaNs: Begin by loading the .tflite (or intermediate ONNX) model with a small script and scanning its weights for NaN or Inf values. If they show up there, the problem originates before the Hailo toolchain; if not, it is being introduced later in the flow.
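For the TFLite case, a minimal sketch (assuming the file is named model.tflite) that scans the model's constant tensors, i.e. its weights, for non-finite values:

```python
import numpy as np
import tensorflow as tf

# Scan the constant tensors (weights) of the TFLite model for NaN/Inf values.
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

for detail in interpreter.get_tensor_details():
    try:
        tensor = interpreter.get_tensor(detail["index"])
    except ValueError:
        # Activation tensors hold no data before inference runs; skip them.
        continue
    if np.issubdtype(tensor.dtype, np.floating) and not np.isfinite(tensor).all():
        print(f"Non-finite values found in tensor '{detail['name']}'")
```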
Check for NaNs during TensorFlow training: Make sure no layers produce NaNs while training. You can add a TensorFlow check that flags NaNs in the trainable variables, for example after each epoch:
```python
# Flag any trainable variable that contains NaNs (run e.g. after each epoch).
for var in model.trainable_variables:
    if tf.reduce_any(tf.math.is_nan(var)).numpy():
        print(f"NaN detected in variable: {var.name}")
```
If NaNs appear during backpropagation, consider applying gradient clipping:
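A sketch of two common clipping approaches in TensorFlow/Keras (the learning rate and clip values below are placeholders; keep your existing training setup):

```python
import tensorflow as tf

# (1) Let the Keras optimizer clip each gradient's norm individually:
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3, clipnorm=1.0)

# (2) In a custom training loop, clip by global norm before applying gradients
#     (`grads` and `model` come from your existing loop):
# grads, _ = tf.clip_by_global_norm(grads, clip_norm=1.0)
# optimizer.apply_gradients(zip(grads, model.trainable_variables))
```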
Replace NaNs in model parameters: To keep the optimization step from failing, replace NaNs in the weights with zeros or a small value (e.g., 1e-6). Keep in mind that this is a workaround and does not address the underlying cause.
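A sketch for a Keras model (`model` is assumed to be your trained tf.keras model; adjust as needed):

```python
import numpy as np

# Zero out NaN weights layer by layer before exporting the model.
for layer in model.layers:
    weights = layer.get_weights()
    if any(np.isnan(w).any() for w in weights):
        layer.set_weights([np.nan_to_num(w, nan=0.0) for w in weights])
        print(f"Replaced NaNs in layer: {layer.name}")
```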
Use a custom calibration dataset: Instead of using the --use-random-calib-set option, which might introduce instability, opt for a custom calibration dataset that is similar to your deployment data.
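A minimal sketch for building such a set from real images and saving it as a .npy file (the folder name, input size, and dtype are assumptions; match your model's preprocessing):

```python
import glob
import numpy as np
from PIL import Image

# Stack preprocessed images into a single (N, H, W, C) calibration array.
paths = sorted(glob.glob("calib_images/*.jpg"))[:64]
calib_set = np.stack([
    np.asarray(Image.open(p).convert("RGB").resize((224, 224)), dtype=np.float32)
    for p in paths
])
np.save("calib_set.npy", calib_set)
```

The resulting calib_set.npy can then be passed to the optimization step in place of --use-random-calib-set; check hailo optimize --help for the exact calibration-set option in your DFC version.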
If you have any further questions or need additional assistance, please don’t hesitate to ask!