Model compilation run-time error

liamw9534 · September 2, 2024, 2:17pm

I have a yolov8n model trained on a custom dataset with 1 class label. I am running the following command:

The optimization and calibration process appears to run fine until I encounter the following error:

[info] Fine Tune is done (completion time is 00:04:07.25)
[info] Starting Layer Noise Analysis
Full Quant Analysis:  50%|█████     | 1/2 [00:00<00:00,  7.61iterations/s]
Traceback (most recent call last):
  File "/root/hailo/bin/hailomz", line 33, in <module>
    sys.exit(load_entry_point('hailo-model-zoo', 'console_scripts', 'hailomz')())
  File "/root/hailo_model_zoo/hailo_model_zoo/main.py", line 122, in main
    run(args)
  File "/root/hailo_model_zoo/hailo_model_zoo/main.py", line 111, in run
    return handlers[args.command](args)
  File "/root/hailo_model_zoo/hailo_model_zoo/main_driver.py", line 250, in compile
    _ensure_optimized(runner, logger, args, network_info)
  File "/root/hailo_model_zoo/hailo_model_zoo/main_driver.py", line 91, in _ensure_optimized
    optimize_model(
  File "/root/hailo_model_zoo/hailo_model_zoo/core/main_utils.py", line 321, in optimize_model
    runner.optimize(calib_feed_callback)
  File "/root/hailo/lib/python3.10/site-packages/hailo_sdk_common/states/states.py", line 16, in wrapped_func
    return func(self, *args, **kwargs)
  File "/root/hailo/lib/python3.10/site-packages/hailo_sdk_client/runner/client_runner.py", line 2093, in optimize
    self._optimize(calib_data, data_type=data_type, work_dir=work_dir)
  File "/root/hailo/lib/python3.10/site-packages/hailo_sdk_common/states/states.py", line 16, in wrapped_func
    return func(self, *args, **kwargs)
  File "/root/hailo/lib/python3.10/site-packages/hailo_sdk_client/runner/client_runner.py", line 1935, in _optimize
    self._sdk_backend.full_quantization(calib_data, data_type=data_type, work_dir=work_dir)
  File "/root/hailo/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/sdk_backend.py", line 1045, in full_quantization
    self._full_acceleras_run(self.calibration_data, data_type)
  File "/root/hailo/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/sdk_backend.py", line 1229, in _full_acceleras_run
    optimization_flow.run()
  File "/root/hailo/lib/python3.10/site-packages/hailo_model_optimization/tools/orchestator.py", line 306, in wrapper
    return func(self, *args, **kwargs)
  File "/root/hailo/lib/python3.10/site-packages/hailo_model_optimization/flows/optimization_flow.py", line 316, in run
    step_func()
  File "/root/hailo/lib/python3.10/site-packages/hailo_model_optimization/tools/orchestator.py", line 250, in wrapped
    result = method(*args, **kwargs)
  File "/root/hailo/lib/python3.10/site-packages/hailo_model_optimization/tools/subprocess_wrapper.py", line 111, in parent_wrapper
    raise SubprocessTracebackFailure(*child_messages)
hailo_model_optimization.acceleras.utils.acceleras_exceptions.SubprocessTracebackFailure: Subprocess failed with traceback

Traceback (most recent call last):
  File "/root/hailo/lib/python3.10/site-packages/hailo_model_optimization/tools/subprocess_wrapper.py", line 73, in child_wrapper
    func(self, *args, **kwargs)
  File "/root/hailo/lib/python3.10/site-packages/hailo_model_optimization/flows/optimization_flow.py", line 347, in step3
    self.finalize_optimization()
  File "/root/hailo/lib/python3.10/site-packages/hailo_model_optimization/tools/orchestator.py", line 250, in wrapped
    result = method(*args, **kwargs)
  File "/root/hailo/lib/python3.10/site-packages/hailo_model_optimization/flows/optimization_flow.py", line 405, in finalize_optimization
    self._noise_analysis()
  File "/root/hailo/lib/python3.10/site-packages/hailo_model_optimization/tools/orchestator.py", line 250, in wrapped
    result = method(*args, **kwargs)
  File "/root/hailo/lib/python3.10/site-packages/hailo_model_optimization/flows/optimization_flow.py", line 585, in _noise_analysis
    algo.run()
  File "/root/hailo/lib/python3.10/site-packages/hailo_model_optimization/algorithms/optimization_algorithm.py", line 50, in run
    return super().run()
  File "/root/hailo/lib/python3.10/site-packages/hailo_model_optimization/algorithms/algorithm_base.py", line 151, in run
    self._run_int()
  File "/root/hailo/lib/python3.10/site-packages/hailo_model_optimization/algorithms/hailo_layer_noise_analysis.py", line 83, in _run_int
    self.analyze_full_quant_net()
  File "/root/hailo/lib/python3.10/site-packages/hailo_model_optimization/algorithms/hailo_layer_noise_analysis.py", line 197, in analyze_full_quant_net
    lat_model.predict_on_batch(inputs)
  File "/root/hailo/lib/python3.10/site-packages/keras/engine/training.py", line 2603, in predict_on_batch
    outputs = self.predict_function(iterator)
  File "/root/hailo/lib/python3.10/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/tmp/__autograph_generated_filenvpcgmfo.py", line 15, in tf__predict_function
    retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
  File "/root/hailo/lib/python3.10/site-packages/keras/engine/training.py", line 2155, in step_function
    outputs = model.distribute_strategy.run(run_step, args=(data,))
  File "/root/hailo/lib/python3.10/site-packages/keras/engine/training.py", line 2143, in run_step
    outputs = model.predict_step(data)
  File "/root/hailo/lib/python3.10/site-packages/keras/engine/training.py", line 2111, in predict_step
    return self(x, training=False)
  File "/root/hailo/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/tmp/__autograph_generated_fileau2rkomm.py", line 188, in tf__call
    ag__.for_stmt(ag__.converted_call(ag__.ld(self)._model.flow.toposort, (), None, fscope), None, loop_body_5, get_state_9, set_state_9, (), {'iterate_names': 'lname'})
  File "/tmp/__autograph_generated_fileau2rkomm.py", line 167, in loop_body_5
    ag__.if_stmt(ag__.not_(continue__1), if_body_3, else_body_3, get_state_8, set_state_8, (), 0)
  File "/tmp/__autograph_generated_fileau2rkomm.py", line 94, in if_body_3
    n_ancestors = ag__.converted_call(ag__.ld(self)._native_model.flow.ancestors, (ag__.ld(lname),), None, fscope)
  File "/tmp/__autograph_generated_filewueztmji.py", line 12, in tf__ancestors
    retval_ = ag__.converted_call(ag__.ld(nx).ancestors, (ag__.ld(self), ag__.ld(source)), None, fscope)
TypeError: in user code:

    File "/root/hailo/lib/python3.10/site-packages/keras/engine/training.py", line 2169, in predict_function  *
        return step_function(self, iterator)
    File "/root/hailo/lib/python3.10/site-packages/keras/engine/training.py", line 2155, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/root/hailo/lib/python3.10/site-packages/keras/engine/training.py", line 2143, in run_step  **
        outputs = model.predict_step(data)
    File "/root/hailo/lib/python3.10/site-packages/keras/engine/training.py", line 2111, in predict_step
        return self(x, training=False)
    File "/root/hailo/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
        raise e.with_traceback(filtered_tb) from None
    File "/tmp/__autograph_generated_fileau2rkomm.py", line 188, in tf__call
        ag__.for_stmt(ag__.converted_call(ag__.ld(self)._model.flow.toposort, (), None, fscope), None, loop_body_5, get_state_9, set_state_9, (), {'iterate_names': 'lname'})
    File "/tmp/__autograph_generated_fileau2rkomm.py", line 167, in loop_body_5
        ag__.if_stmt(ag__.not_(continue__1), if_body_3, else_body_3, get_state_8, set_state_8, (), 0)
    File "/tmp/__autograph_generated_fileau2rkomm.py", line 94, in if_body_3
        n_ancestors = ag__.converted_call(ag__.ld(self)._native_model.flow.ancestors, (ag__.ld(lname),), None, fscope)
    File "/tmp/__autograph_generated_filewueztmji.py", line 12, in tf__ancestors
        retval_ = ag__.converted_call(ag__.ld(nx).ancestors, (ag__.ld(self), ag__.ld(source)), None, fscope)

    TypeError: Exception encountered when calling layer 'lat_model' (type LATModel).
    
    in user code:
    
        File "/root/hailo/lib/python3.10/site-packages/hailo_model_optimization/algorithms/lat_utils/lat_model.py", line 340, in call  *
            n_ancestors = self._native_model.flow.ancestors(lname)
        File "/root/hailo/lib/python3.10/site-packages/hailo_model_optimization/acceleras/model/hailo_model/model_flow.py", line 31, in ancestors  *
            return nx.ancestors(self, source)
    
        TypeError: outer_factory.<locals>.inner_factory.<locals>.tf__func() missing 1 required keyword-only argument: '__wrapper'
    
    
    Call arguments received by layer 'lat_model' (type LATModel):
      • inputs=tf.Tensor(shape=(8, 640, 640, 3), dtype=float32)

I searched the forum for similar issues and found the post "Problem With Model Optimization" (#31, by klausk), which seemed relevant to my model since it has only one output class.

I attempted to re-run the steps with a modified yolov8n.alls file but got the same error, so I am not sure how to proceed. I also tried other model sizes and got a similar error to the above.

liamw9534 · September 2, 2024, 2:27pm

Sorry, I omitted the command I was running. It is here:

hailomz compile --ckpt aicam-v5n.onnx --calib-path hailo-pkgs/val --yaml hailo_model_zoo/hailo_model_zoo/cfg/networks/yolov8n.yaml --hw-arch hailo8l --classes 1 --performance

I should also point out that I ran my converted ONNX file through onnx-simplifier, as recommended in "Dataflow compiler best practice".

alexf · September 3, 2024, 12:13pm

Hi, Alex here. Did you try running without --performance?
It may activate some additional tools, one of which throws this error.

liamw9534 · September 3, 2024, 12:17pm

Yes, I tried without --performance and got the same result.

nina-vilela · September 3, 2024, 12:39pm

Hi @liamw9534,

We have sometimes seen this happen due to a GPU driver incompatibility.

Are you following the requirements below?

liamw9534 · September 3, 2024, 12:51pm

Hello,

I am using a cloud instance with:

Ubuntu 22.04 64-bit
RTX 3090 GPU
24 GB RAM

My docker is derived from nvidia/cuda:11.8.0-devel-ubuntu22.04 with some apt package additions as follows:

unzip
python3.10-venv
python3.10-dev
graphviz-dev
libgl1-mesa-glx
libcudnn8=8.9.0.*-1+cuda11.8

I don’t know which GPU driver version is installed by the base docker image. If there is a way I can check then please let me know.

liamw9534 · September 3, 2024, 3:21pm

Ok, nvidia-smi is indicating a GPU driver version of 550. Since this is a cloud instance I won’t be able to update this. Is the GPU driver version 525 the only one that is working?
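
For anyone who wants to script the driver check mentioned above, here is a minimal sketch. It assumes nvidia-smi is on the PATH inside the container. The helper names and the RECOMMENDED_MAJOR constant are hypothetical, and 525 is simply the version quoted in this thread, not a confirmed requirement.

```python
import shutil
import subprocess

# Assumption: 525 is the driver major version mentioned in this thread.
RECOMMENDED_MAJOR = 525


def parse_driver_major(version_text):
    """Return the major driver version from text such as '550.54.15'."""
    first_line = version_text.strip().splitlines()[0]
    return int(first_line.split(".")[0])


def query_driver_version():
    """Ask nvidia-smi for the driver version; return None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0 or not result.stdout.strip():
        return None
    # With several GPUs, nvidia-smi prints one line per GPU; take the first.
    return result.stdout.strip().splitlines()[0]


if __name__ == "__main__":
    version = query_driver_version()
    if version is None:
        print("nvidia-smi not found or returned no driver version")
    else:
        major = parse_driver_major(version)
        note = "matches" if major == RECOMMENDED_MAJOR else "differs from"
        print(f"Driver {version} {note} major version {RECOMMENDED_MAJOR}")
```

Inside a docker container the reported driver is the host's, so the base image alone does not determine it.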

liamw9534 · September 5, 2024, 9:52am

Hello,

I set up a WSL2 instance (because I can’t control the driver version when running in the cloud). I got exactly the same error. Below is my setup:

Windows 11 PC x86 64-bit
GTX 1050 Ti GPU (Windows Driver Version 31.0.15.2879)
Ubuntu 22.04.3 LTS
CUDA 11.8
cuDNN 8.9.0.131
NVIDIA SMI 525.104
Driver Version: 528.79 (allows up to CUDA Version: 12.0)
Python 3.10
DFC 3.28.0

I don’t think this issue is related to the driver version, because I am fairly sure this is a compatible line-up of software based on NVIDIA’s compatibility charts. How should I proceed?
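
When sharing a setup like the one above, it may also help to list the installed Python package versions, since the traceback passes through keras, tensorflow, and a networkx call. The sketch below only collects versions and does not diagnose anything. The helper name and the package list are assumptions (in particular, "hailo-dataflow-compiler" is a guess at the DFC distribution name); adjust them to whatever `pip list` shows in your environment.

```python
from importlib import metadata

# Assumption: distribution names as they might appear in `pip list`.
PACKAGES = [
    "tensorflow",
    "keras",
    "networkx",
    "hailo-model-zoo",
    "hailo-dataflow-compiler",
]


def collect_versions(names):
    """Map each distribution name to its installed version, if any."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    for name, version in collect_versions(PACKAGES).items():
        print(f"{name}: {version}")
```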

nina-vilela · September 11, 2024, 3:33pm

Hi @liamw9534,

From the cases we’ve seen, this issue is solved once the driver incompatibility is resolved. We know that the error message is currently very unclear and are working on solving it.

I’ll confirm whether the quantization is successful on my end using the model you sent me. Could you please also share your model script commands?
