**This is `hailo -h`:**
[info] No GPU chosen and no suitable GPU found, falling back to CPU.
[info] Current Time: 12:58:46, 12/11/25
[info] CPU: Architecture: x86_64, Model: AMD Ryzen 7 7700 8-Core Processor, Number Of Cores: 16, Utilization: 0.4%
[info] Memory: Total: 124GB, Available: 104GB
[info] System info: OS: Linux, Kernel: 6.8.0-88-generic
[info] Hailo DFC Version: 3.33.0
[info] HailoRT Version: 4.23.0
[info] PCIe: No Hailo PCIe device was found
[info] Running hailo -h
usage: hailo [-h] [--version]
{fw-update,ssb-update,fw-config,udp-rate-limiter,fw-control,fw-logger,scan,sensor-config,run,benchmark,monitor,parse-hef,measure-power,tutorial,analyze-noise,compiler,params-csv,parser,profiler,optimize,visualizer,har,join,har-onnx-rt,runtime-profiler,dfc-studio,help}
…
Hailo Command Line Utility
positional arguments:
{fw-update,ssb-update,fw-config,udp-rate-limiter,fw-control,fw-logger,scan,sensor-config,run,benchmark,monitor,parse-hef,measure-power,tutorial,analyze-noise,compiler,params-csv,parser,profiler,optimize,visualizer,har,join,har-onnx-rt,runtime-profiler,dfc-studio,help}
Hailo utilities aimed to help with everything you need
fw-update Firmware update tool
ssb-update Second stage boot update tool
fw-config Firmware configuration tool
udp-rate-limiter Limit the UDP rate
fw-control Useful firmware control operations
fw-logger Download fw logs to a file
scan Scans for devices (Ethernet or PCIE)
sensor-config Sensor configuration tool
run Run a compiled network
benchmark Measure basic performance on compiled network
monitor Monitor of networks - Presents information about the running networks. To enable monitor, set in the application process the environment variable 'HAILO_MONITOR' to 1.
parse-hef Parse HEF to get information about its components
measure-power Measures power consumption
tutorial Runs the tutorials in jupyter notebook
analyze-noise Analyze network quantization noise
compiler Compile Hailo model to HEF binary files
params-csv Convert translated params to csv
parser Translate network to Hailo network
profiler Hailo models Profiler
optimize Optimize model
visualizer HAR visualization tool
har Query and extract information from Hailo Archive file
join Join two Hailo models to a single model
har-onnx-rt Generates ONNX-Runtime model including pre/post processing
runtime-profiler Hailo Runtime Profiler
dfc-studio Start DFC Studio
help show the list of commands
options:
-h, --help show this help message and exit
--version show program's version number and exit
I also tried converting manually via the notebooks: I ran `hailo tutorial` in a terminal, then parsed the model, optimized it, and tried to compile. Parsing the ONNX is successful; here's the notebook output:
runner = ClientRunner(hw_arch=chosen_hw_arch)
hn, npz = runner.translate_onnx_model(
    model=onnx_path,
    net_name="yolov11m_seg",
    start_node_names=["images"],
    end_node_names=[
        "/model.23/Concat_4",
        "/model.23/Concat",
        "/model.23/proto/cv3/act/Mul",
    ],
    net_input_shapes={"images": [1, 3, 640, 640]},
)
[info] Translation started on ONNX model yolov11m_seg
[info] Restored ONNX model yolov11m_seg (completion time: 00:00:00.13)
[info] Extracted ONNXRuntime meta-data for Hailo model (completion time: 00:00:00.55)
[info] Start nodes mapped from original model: 'images': 'yolov11m_seg/input_layer1'.
[info] End nodes mapped from original model: '/model.23/Concat_4', '/model.23/Concat', '/model.23/proto/cv3/act/Mul'.
[info] Translation completed on ONNX model yolov11m_seg (completion time: 00:00:02.02)
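(For context: the calib_dataset used in the optimization step below is a NumPy array of real RGB frames resized to the 640x640 network input. A rough sketch is below; the folder name and preprocessing are simplified placeholders, not my exact code.)

```python
import numpy as np
from pathlib import Path
from PIL import Image

# Placeholder folder; my real preprocessing (and any normalization) is not shown here.
calib_paths = sorted(Path("calib_images").glob("*.jpg"))
calib_dataset = np.stack([
    np.asarray(Image.open(p).convert("RGB").resize((640, 640)), dtype=np.float32)
    for p in calib_paths
])
print(calib_dataset.shape)  # e.g. (978, 640, 640, 3), NHWC as runner.optimize expects
```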
Optimization takes time, but it is also successful:
alls = """
# Slightly increase optimization level for segmentation networks
model_optimization_flavor(optimization_level=2)
"""
runner.load_model_script(alls)

# Perform optimization using your calibration dataset
runner.optimize(calib_dataset)

# Save quantized model
quantized_model_har_path = "yolov11m_seg_quantized.har"
runner.save_har(quantized_model_har_path)
print("Saved:", quantized_model_har_path)
[info] Loading model script commands to yolov11m_seg from string
[info] Found model with 3 input channels, using real RGB images for calibration instead of sampling random data.
[info] Starting Model Optimization
[info] Using default compression level of 1
[info] Model received quantization params from the hn
[info] MatmulDecompose skipped
[info] Starting Mixed Precision
[info] Assigning 4bit weights to layer yolov11m_seg/conv22 with 2359.30k parameters
[info] Assigning 4bit weights to layer yolov11m_seg/conv32 with 2359.30k parameters
[info] Ratio of weights in 4bit is 0.21
[info] Model Optimization Algorithm Mixed Precision is done (completion time is 00:00:00.42)
[info] LayerNorm Decomposition skipped
[info] Starting Statistics Collector
[info] Using dataset with 64 entries for calibration
Calibration: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 64/64 [01:24<00:00, 1.33s/entries]
[info] Model Optimization Algorithm Statistics Collector is done (completion time is 00:01:25.67)
[info] Starting Fix zp_comp Encoding
[info] Model Optimization Algorithm Fix zp_comp Encoding is done (completion time is 00:00:00.00)
[info] Starting Matmul Equalization
[info] Model Optimization Algorithm Matmul Equalization is done (completion time is 00:00:00.02)
[info] Starting MatmulDecomposeFix
[info] Model Optimization Algorithm MatmulDecomposeFix is done (completion time is 00:00:00.00)
[info] activation fitting started for yolov11m_seg/reduce_sum_softmax1/act_op
[info] Finetune encoding skipped
[info] Bias Correction skipped
[info] Adaround skipped
[warning] Quantization-Aware Fine-Tuning: Dataset didn't have enough data for dataset_size of 1024 Quantizing using calibration size of 978
[info] Starting Quantization-Aware Fine-Tuning
[info] Using dataset with 978 entries for finetune
Epoch 1/4
122/122 ━━━━━━━━━━━━━━━━━━━━ 2104s 16s/step - _distill_loss_yolov11m_seg/concat26: 0.1934 - _distill_loss_yolov11m_seg/concat27: 0.1521 - _distill_loss_yolov11m_seg/conv108: 0.2182 - _distill_loss_yolov11m_seg/conv67: 0.1900 - _distill_loss_yolov11m_seg/conv77: 0.2091 - total_distill_loss: 0.9628
Epoch 2/4
122/122 ━━━━━━━━━━━━━━━━━━━━ 1986s 16s/step - _distill_loss_yolov11m_seg/concat26: 0.2034 - _distill_loss_yolov11m_seg/concat27: 0.1575 - _distill_loss_yolov11m_seg/conv108: 0.2328 - _distill_loss_yolov11m_seg/conv67: 0.1854 - _distill_loss_yolov11m_seg/conv77: 0.2136 - total_distill_loss: 0.9925
Epoch 3/4
122/122 ━━━━━━━━━━━━━━━━━━━━ 1982s 16s/step - _distill_loss_yolov11m_seg/concat26: 0.1894 - _distill_loss_yolov11m_seg/concat27: 0.1474 - _distill_loss_yolov11m_seg/conv108: 0.2142 - _distill_loss_yolov11m_seg/conv67: 0.1748 - _distill_loss_yolov11m_seg/conv77: 0.2067 - total_distill_loss: 0.9326
Epoch 4/4
122/122 ━━━━━━━━━━━━━━━━━━━━ 1983s 16s/step - _distill_loss_yolov11m_seg/concat26: 0.1690 - _distill_loss_yolov11m_seg/concat27: 0.1330 - _distill_loss_yolov11m_seg/conv108: 0.1871 - _distill_loss_yolov11m_seg/conv67: 0.1604 - _distill_loss_yolov11m_seg/conv77: 0.1871 - total_distill_loss: 0.8367
[info] Model Optimization Algorithm Quantization-Aware Fine-Tuning is done (completion time is 02:14:17.65)
[info] Starting Layer Noise Analysis
Full Quant Analysis: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [02:18<00:00, 69.02s/iterations]
[info] Model Optimization Algorithm Layer Noise Analysis is done (completion time is 00:02:19.87)
[info] Output layers signal-to-noise ratio (SNR): measures the quantization noise (higher is better)
[info] yolov11m_seg/output_layer3 SNR: 17.71 dB
[info] yolov11m_seg/output_layer2 SNR: 15.0 dB
[info] yolov11m_seg/output_layer1 SNR: 13.25 dB
[info] Model Optimization is done
[info] Saved HAR to: /local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_tutorials/notebooks/yolov11m_seg_quantized.har
Saved: yolov11m_seg_quantized.har
**Here I have the .har archive, which I'm trying to compile:**
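The compile cell is essentially the stock tutorial code; I've reconstructed it here from the traceback below, so minor details may differ from my actual cell:

```python
from hailo_sdk_client import ClientRunner

model_name = "yolov11m_seg"
quantized_model_har_path = "yolov11m_seg_quantized.har"

# Load the quantized HAR produced by the optimization step above
runner = ClientRunner(har=quantized_model_har_path)

hef = runner.compile()

file_name = f"{model_name}.hef"
with open(file_name, "wb") as f:
    f.write(hef)
```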
[info] To achieve optimal performance, set the compiler_optimization_level to "max" by adding performance_param(compiler_optimization_level=max) to the model script. Note that this may increase compilation time.
[info] Loading network parameters
[info] Starting Hailo allocation and compilation flow
[info] Building optimization options for network layers...
[info] Successfully built optimization options - 41s 145ms
[info] Trying to compile the network in a single context
[info] Single context flow failed: Recoverable single context error
[info] Building optimization options for network layers...
[info] Successfully built optimization options - 53s 193ms
[error] Mapping Failed (allocation time: 53s)
[error] Failed to produce compiled graph
No successful assignments: concat27 errors:
Agent infeasible
---------------------------------------------------------------------------
BackendAllocatorException Traceback (most recent call last)
Cell In [4], line 1
----> 1 hef = runner.compile()
3 file_name = f"{model_name}.hef"
4 with open(file_name, "wb") as f:
File /local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/runner/client_runner.py:911, in ClientRunner.compile(self)
899 def compile(self):
900 """
901 DFC API for compiling current model to Hailo hardware.
902
(...)
909
910 """
--> 911 return self._compile()
File /local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_common/states/states.py:16, in allowed_states.<locals>.wrap.<locals>.wrapped_func(self, *args, **kwargs)
12 if self._state not in states:
13 raise InvalidStateException(
14 f"The execution of {func.__name__} is not available under the state: {self._state.value}",
15 )
---> 16 return func(self, *args, **kwargs)
File /local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/runner/client_runner.py:1128, in ClientRunner._compile(self, fps, mapping_timeout, allocator_script_filename)
1122 self._logger.warning(
1123 f"Taking model script commands from {allocator_script_filename} and ignoring "
1124 f"previous allocation script commands",
1125 )
1126 self.load_model_script(allocator_script_filename)
-> 1128 serialized_hef = self._sdk_backend.compile(fps, self.model_script, mapping_timeout)
1130 self._auto_model_script = self._sdk_backend.get_auto_alls()
1131 self._state = States.COMPILED_SLIM_MODEL if orig_state in SLIM_STATES else States.COMPILED_MODEL
File /local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/sdk_backend.py:1852, in SdkBackendCompilation.compile(self, fps, allocator_script, mapping_timeout)
1850 def compile(self, fps, allocator_script=None, mapping_timeout=None):
1851 self._model.fill_default_quantization_params(logger=self._logger)
-> 1852 hef, mapped_graph_file = self._compile(fps, allocator_script, mapping_timeout)
1853 # TODO: https://hailotech.atlassian.net/browse/SDK-31038
1854 if not SDKPaths().is_internal:
File /local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/sdk_backend.py:1846, in SdkBackendCompilation._compile(self, fps, allocator_script, mapping_timeout)
1840 if not model_params and self.requires_quantized_weights:
1841 raise BackendRuntimeException(
1842 "Model requires quantized weights in order to run on HW, but none were given. "
1843 "Did you forget to quantize?",
1844 )
-> 1846 hef, mapped_graph_file, auto_alls = self.hef_full_build(fps, mapping_timeout, model_params, allocator_script)
1847 self._auto_alls = auto_alls
1848 return hef, mapped_graph_file
File /local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/sdk_backend.py:1822, in SdkBackendCompilation.hef_full_build(self, fps, mapping_timeout, params, allocator_script)
1820 config_paths = ConfigPaths(self._hw_arch, self._model.name)
1821 config_paths.set_stage("inference")
-> 1822 auto_alls, self._hef_data, self._integrated_graph = allocator.create_mapping_and_full_build_hef(
1823 config_paths.get_path("network_graph"),
1824 config_paths.get_path("mapped_graph"),
1825 config_paths.get_path("compilation_output_proto"),
1826 params=params,
1827 allocator_script=allocator_script,
1828 compiler_statistics_path=config_paths.get_path("compiler_statistics"),
1829 nms_metadata=self._nms_metadata,
1830 har=self.har,
1831 alls_ignore_invalid_cmds=self._alls_ignore_invalid_cmds,
1832 )
1834 return self._hef_data, config_paths.get_path("mapped_graph"), auto_alls
File /local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/allocator/hailo_tools_runner.py:764, in HailoToolsRunner.create_mapping_and_full_build_hef(self, network_graph_path, output_path, compilation_output_proto, agent, strategy, auto_mapping, params, expected_output_tensor, expected_pre_acts, network_inputs, network_outputs, allocator_script, allocator_script_mode, compiler_statistics_path, nms_metadata, har, alls_ignore_invalid_cmds)
759 if self.hn.net_params.clusters_placement != [[]]:
760 assert (
761 len(self.hn.net_params.clusters_placement) <= self._number_of_clusters
762 ), "Number of clusters in layer placements is larger than allowed number of clusters"
--> 764 self.call_builder(
765 network_graph_path,
766 output_path,
767 compilation_output_proto=compilation_output_proto,
768 agent=agent,
769 strategy=strategy,
770 exit_point=BuilderExitPoint.POST_CAT,
771 params=params,
772 expected_output_tensor=expected_output_tensor,
773 expected_pre_acts=expected_pre_acts,
774 network_inputs=network_inputs,
775 network_outputs=network_outputs,
776 allocator_script=allocator_script,
777 allocator_script_mode=allocator_script_mode,
778 compiler_statistics_path=compiler_statistics_path,
779 nms_metadata=nms_metadata,
780 har=har,
781 alls_ignore_invalid_cmds=alls_ignore_invalid_cmds,
782 )
784 return self._auto_alls, self._output_hef_data, self._output_integrated_pb_graph
File /local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/allocator/hailo_tools_runner.py:696, in HailoToolsRunner.call_builder(self, network_graph_path, output_path, blind_deserialize, **kwargs)
694 sys.excepthook = _hailo_tools_exception_hook
695 try:
--> 696 self.run_builder(network_graph_path, output_path, **kwargs)
697 except BackendInternalException:
698 try:
File /local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/allocator/hailo_tools_runner.py:570, in HailoToolsRunner.run_builder(self, network_graph_filename, output_filename, compilation_output_proto, agent, strategy, exit_point, params, expected_output_tensor, expected_pre_acts, network_inputs, network_outputs, allocator_script, allocator_script_mode, compiler_statistics_path, is_debug, nms_metadata, har, alls_ignore_invalid_cmds)
568 compiler_msg = e.hailo_tools_error
569 if compiler_msg:
--> 570 raise e.internal_exception("Compilation failed:", hailo_tools_error=compiler_msg) from None
571 else:
572 raise e.internal_exception("Compilation failed with unexpected crash") from None
BackendAllocatorException: Compilation failed: No successful assignments: concat27 errors:
Agent infeasible
GPT says:
Compiler:
All available optimization passes were applied.
An initial attempt to compile the model using a single context (full-graph mode) failed due to size constraints.
It then proceeded with a more granular multi-context partitioning strategy.
Despite multiple retries, the compiler is unable to partition the concat27 layer. The layer cannot be decomposed in a way that simultaneously:
• fits the required intermediate activations into the cluster’s SRAM, and
• meets the bandwidth and context-switch limitations.
But I don't understand what I should do to fix this. The only concrete hint I see in the compiler log is the compiler_optimization_level note. Would something like the sketch below be the right direction, or is the problem elsewhere?
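(Sketch only: this just reloads the quantized HAR and adds the hint from the compiler log via load_model_script; I haven't verified that it has any effect on the concat27 "Agent infeasible" failure.)

```python
from hailo_sdk_client import ClientRunner

# Reload the quantized HAR and add the performance hint suggested by the compiler log.
# Unverified assumption: this may only affect compilation time and resource allocation,
# not the allocation failure itself.
runner = ClientRunner(har="yolov11m_seg_quantized.har")
runner.load_model_script("performance_param(compiler_optimization_level=max)\n")
hef = runner.compile()
```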
Sorry for the long message. I'm not very proficient in ML, and I really need some help. Thanks!