Hailo Quantization Level 4 (AdaRound) fails with unexpected error

Hi, I am trying out the AdaRound algorithm to optimize my YOLOX detection model with the model-script outlined below. Unfortunately, I am getting an unexpected error during optimization. Any ideas why this happens and how to solve it?

model_optimization_config(calibration, batch_size=16, calibset_size=4096)
model_optimization_flavor(optimization_level=4, compression_level=1)
post_quantization_optimization(adaround, policy=enabled, batch_size=8, dataset_size=4096)
nms_postprocess("nms.json", yolox, engine=cpu)
File "/local/workspace/hailo_virtualenv/bin/hailomz", line 8, in <module>loss: 0.5620 - round_loss: 0.0000 - annealing_b: 20.0000]
    sys.exit(main())
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_model_zoo/main.py", line 122, in main
    run(args)
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_model_zoo/main.py", line 111, in run
    return handlers[args.command](args)
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_model_zoo/main_driver.py", line 227, in optimize
    optimize_model(
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_model_zoo/core/main_utils.py", line 321, in optimize_model
    runner.optimize(calib_feed_callback)
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_sdk_common/states/states.py", line 16, in wrapped_func
    return func(self, *args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_sdk_client/runner/client_runner.py", line 2093, in optimize
    self._optimize(calib_data, data_type=data_type, work_dir=work_dir)
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_sdk_common/states/states.py", line 16, in wrapped_func
    return func(self, *args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_sdk_client/runner/client_runner.py", line 1935, in _optimize
    self._sdk_backend.full_quantization(calib_data, data_type=data_type, work_dir=work_dir)
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_sdk_client/sdk_backend/sdk_backend.py", line 1045, in full_quantization
    self._full_acceleras_run(self.calibration_data, data_type)
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_sdk_client/sdk_backend/sdk_backend.py", line 1229, in _full_acceleras_run
    optimization_flow.run()
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_model_optimization/tools/orchestator.py", line 306, in wrapper
    return func(self, *args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_model_optimization/flows/optimization_flow.py", line 316, in run
    step_func()
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_model_optimization/tools/orchestator.py", line 250, in wrapped
    result = method(*args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_model_optimization/tools/subprocess_wrapper.py", line 113, in parent_wrapper
    raise SubprocessUnexpectedFailure(
hailo_model_optimization.acceleras.utils.acceleras_exceptions.SubprocessUnexpectedFailure: Subprocess step2 failed with unexpected error. exitcode -9

I am using the v3.28 Hailo DFC Docker container.

Hi @stwerner,
It’s difficult to know what exactly the issue is without seeing the contents of nms.json and the ONNX/HAR file you are using.

Can you please provide more details and/or the ONNX you used? Is it the yolox from the Hailo Model Zoo?

Regards,

Hi @Omer,

I cannot share the ONNX file of my model at the moment. It is similar to the yolox_l_leaky model from the Hailo Model Zoo, just with a different number of classes. The NMS and post-processing file is essentially the same as the one used in the Model Zoo.

Note that I’ve already successfully optimized and compiled this model with other model-script configurations (e.g., several using optimization level 2 and compression level 1), so the NMS file and the other configuration options should be fine.

Kind Regards,

Hi @stwerner,
Which configurations did you use that compiled successfully?

Regards,

Hi @Omer,

This is the NMS File:

{
	"nms_scores_th": 0.01,
	"nms_iou_th": 0.65,
	"number_of_detection_heads": 3,
	"image_dims": [
		640,
		640
	],
	"max_proposals_per_class": 100,
	"classes": 1,
	"bbox_decoders": [
		{
			"name": "bbox_decoder_8",
			"stride": 8,
			"reg_layer": "conv95",
			"objectness_layer": "conv96",
			"cls_layer": "conv94"
		},
		{
			"name": "bbox_decoder_16",
			"stride": 16,
			"reg_layer": "conv113",
			"objectness_layer": "conv114",
			"cls_layer": "conv112"
		},
		{
			"name": "bbox_decoder_32",
			"stride": 32,
			"reg_layer": "conv130",
			"objectness_layer": "conv131",
			"cls_layer": "conv129"
		}
	]
}

Hi @stwerner,
Thanks for the info. Which optimization commands did you use with this JSON config that completed successfully?

Regards,

Hi @Omer,

I use the following command:

hailomz optimize --yaml model.yaml --model-script optim.alls --ckpt <...> --calib-path <...>

Here is the model.yaml file:

base:
- networks/yolox_l_leaky.yaml

network:
  network_name: yolox_l_leaky

paths:
  alls_script: null

parser:
  nodes:
  - null
  - - Conv_307
    - Sigmoid_309
    - Sigmoid_310
    - Conv_323
    - Sigmoid_325
    - Sigmoid_326
    - Conv_339
    - Sigmoid_341
    - Sigmoid_342


postprocessing:
  device_pre_post_layers:
    nms: true
  postprocess_config_file: nms.json
  meta_arch: yolox
  hpp: true

evaluation:
  labels_offset: 1
  classes: 1

An optim.alls file that works:

model_optimization_config(calibration, batch_size=16, calibset_size=16384)
post_quantization_optimization(finetune, policy=enabled, dataset_size=16384, batch_size=8, epochs=20, learning_rate=0.0001)
nms_postprocess("nms.json", yolox, engine=cpu)

The optim.alls file that doesn’t work:

model_optimization_config(calibration, batch_size=16, calibset_size=4096)
model_optimization_flavor(optimization_level=4, compression_level=1)
post_quantization_optimization(adaround, policy=enabled, batch_size=8, dataset_size=4096)
nms_postprocess("nms.json", yolox, engine=cpu)

Regards,

Hi @stwerner,
I believe you are getting this error because of resource exhaustion: exitcode -9 means the step2 subprocess was killed with SIGKILL, which on Linux usually comes from the kernel’s out-of-memory killer.
Optimization level 4 activates the AdaRound optimization algorithm, which is quite costly in terms of memory usage.
You can try lowering the batch size. Optimization will take longer to run, but memory usage can drop significantly. Try batch_size=4 or batch_size=2.

By the way, model_optimization_flavor(optimization_level=4) and post_quantization_optimization(adaround, policy=enabled, batch_size=8) do the same thing, so one of them is redundant. You can also define the calibration size in model_optimization_flavor (see the Dataflow Compiler user guide for more info).
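For illustration, here is a minimal sketch of an adjusted optim.alls that combines both suggestions: the redundant optimization_level=4 is dropped in favor of the explicit adaround command (so its batch size can be set), and the AdaRound batch size is lowered to 2. Whether model_optimization_flavor accepts compression_level on its own is an assumption to verify against the DFC user guide:

model_optimization_config(calibration, batch_size=16, calibset_size=4096)
model_optimization_flavor(compression_level=1)
post_quantization_optimization(adaround, policy=enabled, batch_size=2, dataset_size=4096)
nms_postprocess("nms.json", yolox, engine=cpu)

If batch_size=2 still exhausts memory, the calibration batch_size in model_optimization_config can presumably be reduced in the same way.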

Regards,

Hi @Omer,

Thanks for the suggestions! I tested this config on a machine with ~64 GB of RAM, so I am a bit surprised to learn that the AdaRound algorithm is so heavy in terms of memory.

Regards,