Error while compiling hef yolo11 optimization=4 compression=0

I never managed to get optimization=4 compression=0 working.
First it complained that DALI was not installed in the Docker image, so I ran:
pip install nvidia-dali-cuda110
pip install nvidia-dali-tf-plugin-cuda110
(not sure why it didn't come with the Docker image).
Now, see the "long novel" below with the error messages; they occur during the Adaround calibration phase, I think. Totally cryptic for most of us, I guess.
My yaml and json are the originals for yolov11m as found in the Docker image, with just the size changed to 1024x1024 in a couple of appropriate places.
My alls is as follows.
Can you please help with a set of yaml/json/alls that would actually let me get through the HEF creation process, using max optimization and zero compression?
Thanks.
alls:
normalization1 = normalization([0.0, 0.0, 0.0], [255.0, 255.0, 255.0])
change_output_activation(conv74, sigmoid)
change_output_activation(conv90, sigmoid)
change_output_activation(conv105, sigmoid)
nms_postprocess("/local/workspace/Yolo/yolov11m_nms_config.json", meta_arch=yolov8, engine=cpu)
model_optimization_config(calibration, batch_size=4)
post_quantization_optimization(finetune, policy=enabled, learning_rate=0.00001)
model_optimization_flavor(optimization_level=4, compression_level=0)

Hef compilation:

(hailo_virtualenv) hailo@a:/local/workspace/Yolo$ hailomz compile --ckpt best_1024x1024.onnx --calib-path valid_16mar2025_for_Hailo_training --yaml ./yolov11m.yaml
Start run for network yolov11m …
Initializing the hailo8 runner…
[info] Translation started on ONNX model yolov11m
[info] Restored ONNX model yolov11m (completion time: 00:00:00.20)
[info] Extracted ONNXRuntime meta-data for Hailo model (completion time: 00:00:00.69)
[info] NMS structure of yolov8 (or equivalent architecture) was detected.
[info] In order to use HailoRT post-processing capabilities, these end node names should be used: /model.23/cv2.0/cv2.0.2/Conv /model.23/cv3.0/cv3.0.2/Conv /model.23/cv3.1/cv3.1.2/Conv /model.23/cv2.1/cv2.1.2/Conv /model.23/cv2.2/cv2.2.2/Conv /model.23/cv3.2/cv3.2.2/Conv.
[info] Start nodes mapped from original model: 'images': 'yolov11m/input_layer1'.
[info] End nodes mapped from original model: '/model.23/cv2.0/cv2.0.2/Conv', '/model.23/cv3.0/cv3.0.2/Conv', '/model.23/cv3.1/cv3.1.2/Conv', '/model.23/cv2.1/cv2.1.2/Conv', '/model.23/cv2.2/cv2.2.2/Conv', '/model.23/cv3.2/cv3.2.2/Conv'.
[info] Translation completed on ONNX model yolov11m (completion time: 00:00:01.28)
[info] Appending model script commands to yolov11m from string
[info] Added nms postprocess command to model script.
[info] Saved HAR to: /local/workspace/Yolo/yolov11m.har
Preparing calibration data…
[info] Loading model script commands to yolov11m from /local/workspace/Yolo/yolov11m.alls
[info] Starting Model Optimization
[info] Model received quantization params from the hn
[info] MatmulDecompose skipped
[info] Starting Mixed Precision
[info] Model Optimization Algorithm Mixed Precision is done (completion time is 00:00:00.47)
[info] LayerNorm Decomposition skipped
[info] Starting Statistics Collector
[info] Using dataset with 64 entries for calibration
Calibration: 100%|███████████████████████████████████████████████████████████████████████████████| 64/64 [00:26<00:00, 2.44entries/s]
[info] Model Optimization Algorithm Statistics Collector is done (completion time is 00:00:27.97)
[info] Starting Fix zp_comp Encoding
[info] Model Optimization Algorithm Fix zp_comp Encoding is done (completion time is 00:00:00.00)
[info] Starting Matmul Equalization
[info] Model Optimization Algorithm Matmul Equalization is done (completion time is 00:00:00.02)
[info] activation fitting started for yolov11m/reduce_sum_softmax1/act_op
[info] No shifts available for layer yolov11m/conv48/conv_op, using max shift instead. delta=0.3565
[info] No shifts available for layer yolov11m/conv48/conv_op, using max shift instead. delta=0.1783
[info] Finetune encoding skipped
[info] Bias Correction skipped
[warning] Adaround: Dataset didn't have enough data for dataset_size of 1024. Quantizing using calibration size of 1023
[info] Starting Adaround
[info] The algorithm Adaround will use up to 188.79 GB of storage space
[info] Using dataset with 1023 entries for Adaround
[info] Using dataset with 64 entries for bias correction
Adaround: 1%|▎ | 1/125 [1:57:20<242:31:16, 7040.94s/blocks, Layers=['yolov11m/conv1_output_0']]
Traceback (most recent call last):
  File "/local/workspace/hailo_virtualenv/bin/hailomz", line 33, in <module>
    sys.exit(load_entry_point('hailo-model-zoo', 'console_scripts', 'hailomz')())
  File "/local/workspace/hailo_model_zoo/hailo_model_zoo/main.py", line 122, in main
    run(args)
  File "/local/workspace/hailo_model_zoo/hailo_model_zoo/main.py", line 111, in run
    return handlers[args.command](args)
  File "/local/workspace/hailo_model_zoo/hailo_model_zoo/main_driver.py", line 250, in compile
    runner = _ensure_optimized(runner, logger, args, network_info)
  File "/local/workspace/hailo_model_zoo/hailo_model_zoo/main_driver.py", line 91, in _ensure_optimized
    optimize_model(
  File "/local/workspace/hailo_model_zoo/hailo_model_zoo/core/main_utils.py", line 353, in optimize_model
    runner.optimize(calib_feed_callback)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_common/states/states.py", line 16, in wrapped_func
    return func(self, *args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/runner/client_runner.py", line 2128, in optimize
    self._optimize(calib_data, data_type=data_type, work_dir=work_dir)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_common/states/states.py", line 16, in wrapped_func
    return func(self, *args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/runner/client_runner.py", line 1970, in _optimize
    self._sdk_backend.full_quantization(calib_data, data_type=data_type, work_dir=work_dir)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/sdk_backend.py", line 1125, in full_quantization
    self._full_acceleras_run(self.calibration_data, data_type)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/sdk_backend.py", line 1319, in _full_acceleras_run
    optimization_flow.run()
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/tools/orchestator.py", line 306, in wrapper
    return func(self, *args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/flows/optimization_flow.py", line 335, in run
    step_func()
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/tools/orchestator.py", line 250, in wrapped
    result = method(*args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/tools/subprocess_wrapper.py", line 111, in parent_wrapper
    raise SubprocessTracebackFailure(*child_messages)
hailo_model_optimization.acceleras.utils.acceleras_exceptions.SubprocessTracebackFailure: Subprocess failed with traceback

Traceback (most recent call last):
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/tools/subprocess_wrapper.py", line 73, in child_wrapper
    func(self, *args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/flows/optimization_flow.py", line 360, in step2
    self.post_quantization_optimization()
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/tools/orchestator.py", line 250, in wrapped
    result = method(*args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/flows/optimization_flow.py", line 420, in post_quantization_optimization
    self._adaround()
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/tools/orchestator.py", line 250, in wrapped
    result = method(*args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/flows/optimization_flow.py", line 683, in _adaround
    algo.run()
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/algorithms/block_by_block/block_by_block.py", line 344, in run
    retval = super().run()
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/algorithms/optimization_algorithm.py", line 54, in run
    return super().run()
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/algorithms/algorithm_base.py", line 150, in run
    self._run_int()
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/algorithms/ada_round/ada_round_v2.py", line 140, in _run_int
    self._comperative_run(pre_quant_cb=self.core_logic)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/algorithms/block_by_block/block_by_block.py", line 288, in _comperative_run
    self.infer_block_with_cache(block_model, blocks, interlayer_results_quant, quant_cache_dir, dataset_size)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/algorithms/block_by_block/block_by_block.py", line 145, in infer_block_with_cache
    self._infer_block(
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/algorithms/block_by_block/block_by_block.py", line 222, in _infer_block
    result = call_block(sample)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/tensorflow/python/eager/execute.py", line 52, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.ResourceExhaustedError: Graph execution error:

Detected at node 'yolov11m_submodel/conv1/act_op/SelectV2' defined at (most recent call last):
  File "/local/workspace/hailo_virtualenv/bin/hailomz", line 33, in <module>
    sys.exit(load_entry_point('hailo-model-zoo', 'console_scripts', 'hailomz')())
  File "/local/workspace/hailo_model_zoo/hailo_model_zoo/main.py", line 122, in main
    run(args)
  File "/local/workspace/hailo_model_zoo/hailo_model_zoo/main.py", line 111, in run
    return handlers[args.command](args)
  File "/local/workspace/hailo_model_zoo/hailo_model_zoo/main_driver.py", line 250, in compile
    runner = _ensure_optimized(runner, logger, args, network_info)
  File "/local/workspace/hailo_model_zoo/hailo_model_zoo/main_driver.py", line 91, in _ensure_optimized
    optimize_model(
  File "/local/workspace/hailo_model_zoo/hailo_model_zoo/core/main_utils.py", line 353, in optimize_model
    runner.optimize(calib_feed_callback)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_common/states/states.py", line 16, in wrapped_func
    return func(self, *args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/runner/client_runner.py", line 2128, in optimize
    self._optimize(calib_data, data_type=data_type, work_dir=work_dir)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_common/states/states.py", line 16, in wrapped_func
    return func(self, *args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/runner/client_runner.py", line 1970, in _optimize
    self._sdk_backend.full_quantization(calib_data, data_type=data_type, work_dir=work_dir)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/sdk_backend.py", line 1125, in full_quantization
    self._full_acceleras_run(self.calibration_data, data_type)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/sdk_backend.py", line 1319, in _full_acceleras_run
    optimization_flow.run()
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/tools/orchestator.py", line 306, in wrapper
    return func(self, *args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/flows/optimization_flow.py", line 335, in run
    step_func()
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/tools/orchestator.py", line 250, in wrapped
    result = method(*args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/tools/subprocess_wrapper.py", line 94, in parent_wrapper
    proc.start()
  File "/usr/lib/python3.10/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/usr/lib/python3.10/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/usr/lib/python3.10/multiprocessing/context.py", line 281, in _Popen
    return Popen(process_obj)
  File "/usr/lib/python3.10/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/usr/lib/python3.10/multiprocessing/popen_fork.py", line 71, in _launch
    code = process_obj._bootstrap(parent_sentinel=child_r)
  File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/tools/subprocess_wrapper.py", line 73, in child_wrapper
    func(self, *args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/flows/optimization_flow.py", line 360, in step2
    self.post_quantization_optimization()
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/tools/orchestator.py", line 250, in wrapped
    result = method(*args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/flows/optimization_flow.py", line 420, in post_quantization_optimization
    self._adaround()
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/tools/orchestator.py", line 250, in wrapped
    result = method(*args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/flows/optimization_flow.py", line 683, in _adaround
    algo.run()
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/algorithms/block_by_block/block_by_block.py", line 344, in run
    retval = super().run()
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/algorithms/optimization_algorithm.py", line 54, in run
    return super().run()
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/algorithms/algorithm_base.py", line 150, in run
    self._run_int()
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/algorithms/ada_round/ada_round_v2.py", line 140, in _run_int
    self._comperative_run(pre_quant_cb=self.core_logic)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/algorithms/block_by_block/block_by_block.py", line 288, in _comperative_run
    self.infer_block_with_cache(block_model, blocks, interlayer_results_quant, quant_cache_dir, dataset_size)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/algorithms/block_by_block/block_by_block.py", line 145, in infer_block_with_cache
    self._infer_block(
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/algorithms/block_by_block/block_by_block.py", line 222, in _infer_block
    result = call_block(sample)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/algorithms/block_by_block/block_by_block.py", line 217, in call_block
    result = block_model(inputs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 65, in error_handler
    return fn(*args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/keras/engine/training.py", line 558, in __call__
    return super().__call__(*args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 65, in error_handler
    return fn(*args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/keras/engine/base_layer.py", line 1145, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 96, in error_handler
    return fn(*args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/acceleras/model/hailo_model/hailo_model.py", line 1203, in call
    for lname in self.flow.toposort():
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/acceleras/model/hailo_model/hailo_model.py", line 1210, in call
    output = self._call_layer(
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/acceleras/model/hailo_model/hailo_model.py", line 1334, in _call_layer
    outputs = acceleras_layer(inputs, training=training, encoding_tensors=encoding_tensors, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 65, in error_handler
    return fn(*args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/keras/engine/base_layer.py", line 1145, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 96, in error_handler
    return fn(*args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/acceleras/hailo_layers/base_hailo_layer.py", line 152, in call
    for op_name in self._layer_flow.toposort_ops():
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/acceleras/hailo_layers/base_hailo_layer.py", line 163, in call
    op_result = op(
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 65, in error_handler
    return fn(*args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/keras/engine/base_layer.py", line 1145, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 96, in error_handler
    return fn(*args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/acceleras/atomic_ops/base_atomic_op.py", line 1141, in call
    if fully_native:
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/acceleras/atomic_ops/base_atomic_op.py", line 1143, in call
    elif self.bit_exact:
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/acceleras/atomic_ops/base_atomic_op.py", line 1146, in call
    outputs = self._numeric_run(inputs, training=training, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/acceleras/atomic_ops/base_atomic_op.py", line 1222, in _numeric_run
    outputs_num = self.call_hw_sim(inputs_num_lossy, training=training, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/acceleras/atomic_ops/activation_op.py", line 1460, in call_hw_sim
    for i in range(num_pieces):
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/acceleras/atomic_ops/activation_op.py", line 1468, in call_hw_sim
    slopes = tf.where(condition, current_slope, slopes)
Node: 'yolov11m_submodel/conv1/act_op/SelectV2'
failed to allocate memory
[[{{node yolov11m_submodel/conv1/act_op/SelectV2}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
[Op:__inference_call_block_290260]

Hey @Thor ,

The issue you are facing is a memory issue, as seen in the OOM error above ("failed to allocate memory" at the SelectV2 node).

Separate Model Compilation Process

Instead of running everything in one command, break it down into three key steps:
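From the shell, the three-step flow could look roughly like this. This is only a sketch: the command and flag names below are assumptions from memory and vary between Dataflow Compiler / Model Zoo versions, so check `hailomz --help` and `hailo --help` inside the container before running.

```shell
# 1. Parse: translate the ONNX into a HAR (already done in your run)
hailomz parse --ckpt best_1024x1024.onnx --yaml ./yolov11m.yaml

# 2. Optimize: quantize the parsed HAR with your model script and calibration set
hailo optimize yolov11m.har \
    --calib-set-path valid_16mar2025_for_Hailo_training \
    --model-script yolov11m.alls

# 3. Compile: turn the optimized HAR into a HEF
hailo compiler yolov11m_quantized.har
```

Splitting the flow this way means a failure in the Adaround step only costs you the optimize stage, not a re-parse, and you can iterate on the alls script against the same HAR.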

You already did the parsing part as part of the compilation run, so please take that HAR file and try to run the optimize step on it, with the following debug steps:

Dataset Size Adjustment

  • Add 1 additional image to the dataset (current size: 1023)
  • Aim to have a minimal, representative dataset
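Padding the calibration set by one image can be done by hand, or sketched in Python. The helper below is hypothetical (not part of the Hailo SDK); it assumes calibration images are interchangeable, so repeating existing samples is acceptable.

```python
# Hypothetical helper: pad a calibration image list up to the dataset
# size Adaround expects, by recycling existing samples in order.
def pad_calib_list(paths, target_size):
    """Return a list of exactly target_size paths, repeating the
    originals in order when there are too few."""
    if not paths:
        raise ValueError("calibration set is empty")
    padded = list(paths)
    i = 0
    while len(padded) < target_size:
        padded.append(paths[i % len(paths)])
        i += 1
    return padded[:target_size]

# Example: pad the 1023-image set to the 1024 Adaround asked for.
calib = [f"img_{n:04d}.jpg" for n in range(1023)]
print(len(pad_calib_list(calib, 1024)))  # 1024
```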

Optimization Strategies for Adaround

You will have to add these before the finetune command, or remove the finetune entirely, because the two may conflict at optimization_level=4.

Approach 1: Controlled Dataset Size

post_quantization_optimization(
    adaround, 
    policy=enabled, 
    dataset_size=256, 
    batch_size=8, 
    epochs=100, 
    train_bias=False, 
    cache_compression=enabled
)

Approach 2: Simplified Optimization

post_quantization_optimization(
    adaround, 
    policy=enabled, 
    batch_size=8
)

Approach 3: Epoch-Focused Optimization

post_quantization_optimization(
    adaround, 
    policy=enabled, 
    epochs=100
)
  • Experiment with these approaches incrementally, and please provide the output of each run. If none of these fix the issue, we will then have more info to pinpoint it.
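Putting it together with Approach 1, your full model script could be laid out like this. Treat this as a sketch: the placement of the post_quantization_optimization line before model_optimization_flavor, and the removal of the finetune line, follow the conflict note above rather than any documented requirement, so verify the ordering against your SDK version's documentation.

```
normalization1 = normalization([0.0, 0.0, 0.0], [255.0, 255.0, 255.0])
change_output_activation(conv74, sigmoid)
change_output_activation(conv90, sigmoid)
change_output_activation(conv105, sigmoid)
nms_postprocess("/local/workspace/Yolo/yolov11m_nms_config.json", meta_arch=yolov8, engine=cpu)
model_optimization_config(calibration, batch_size=4)
post_quantization_optimization(adaround, policy=enabled, dataset_size=256, batch_size=8, epochs=100, train_bias=False, cache_compression=enabled)
model_optimization_flavor(optimization_level=4, compression_level=0)
```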

Thank you, @Omria
My machine has about 700GB of free disk space, so storage should not be an issue.
I'll try the 3 approaches you suggested, but I wanted to ask for some clarifications first.
You mentioned breaking the compilation down into 3 steps.
Step 1: I already have the HAR, so let's skip it.
Steps 2 and 3: can you please give me the exact command lines, with parameters, to execute from the shell (I'll add one image to reach 1024)?

“you will have to add these before the finetune or remove the finetune” :
I'm still a little confused about what to specify in the alls to achieve the best possible model accuracy, which is why I'm trying to use optimization=4.
So, assuming I keep the following lines untouched:
normalization1 = normalization([0.0, 0.0, 0.0], [255.0, 255.0, 255.0])
change_output_activation(conv74, sigmoid)
change_output_activation(conv90, sigmoid)
change_output_activation(conv105, sigmoid)
nms_postprocess("/local/workspace/Yolo/yolov11m_nms_config.json", meta_arch=yolov8, engine=cpu)

Should I add, after those lines, one of the 3 post_quantization_optimization options you mentioned, followed by:
model_optimization_flavor(optimization_level=4, compression_level=0)

Do you recommend adding a post_quantization_optimization after the "flavor" line, and if so, with which parameters?

Thank you again!

@omria, sorry, in my reply I messed up the order of post_quantization_optimization() and model_optimization_flavor(), and there is also model_optimization_config().
Too many similar names.
Anyway, I'm sure you understood what I meant; please provide some clarity if you can.
Thanks!