Converting YOLOv11 segmentation to hef

Hello, I am currently working on deploying a YOLO11m segmentation model to the Raspberry Pi AI Kit (13 TOPS / Hailo-8L), and I am facing issues while trying to compile the model inside the Hailo AI SW Suite 2025-10 Docker environment.

Here is what I have tried so far:

I am using the AI SW Suite 2025-10 Docker installation, which includes Dataflow Compiler v3.33.0, Hailo Model Zoo v2.17.0, and HailoRT v4.23.0. My workspace is located at /local/workspace inside the container. My model is unified_yolov11m-seg-640-0.2.onnx, and I am using a train/ folder for calibration images.

  1. I tried compiling the model through Hailo Model Zoo using hailomz compile with various configurations, for example:

hailomz compile --ckpt unified_yolov11m-seg-640-0.2.onnx --hw-arch hailo8l --calib-path train/ --classes 7 --yaml custom/yolov11m_seg.yaml

my yaml (yolov11m_seg.yaml):

network:
  network_name: yolov11m_seg
  network_type: segmentation
  num_classes: 7
  input_shape: [1, 3, 640, 640]

paths:
  model_path: /local/workspace/unified_yolov11m-seg-640-0.2.onnx

parser:
  start_node_names: ["images"]
  end_node_names: ["output0", "output1"]

postprocess:
  type: yolov8_seg
  conf_threshold: 0.25
  iou_threshold: 0.45

This produced multiple YAML-related errors, such as missing keys like network_name or start_node_shapes, and made it clear that Model Zoo does not accept custom segmentation ONNX models through a user-defined YAML file. The YAML structure I attempted to provide did not match the strict Model Zoo schema.

  2. I attempted to convert the model manually using the Dataflow Compiler:

hailo parser onnx unified_yolov11m-seg-640-0.2.onnx --start-node-names images --end-node-names output0 output1 --input-shapes images=1,3,640,640 --output yolov11m_seg.har

However, the hailo binary inside the container is actually the HailoRT CLI, not the Dataflow Compiler parser tool. As a result, flags like --input-shapes or --output are not recognized. This suggests that the correct Dataflow Compiler executable is not being picked up by PATH inside the Suite Docker environment.

  3. I tried locating the correct DFC command inside the container by checking which hailo or inspecting the virtualenv bin directory, but the environment seems to point primarily to HailoRT tools rather than the compiler tools required for the ONNX → HAR → optimize → HEF workflow (a quick import check is sketched below).
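A quick way to confirm whether the DFC Python package is actually present, regardless of what the hailo binary resolves to, is to import it directly from the Suite virtualenv; a minimal sketch:

# Confirm the Dataflow Compiler's Python package is importable inside the
# Suite virtualenv; if this succeeds, the DFC is installed and the problem is
# CLI argument syntax rather than a missing tool.
from hailo_sdk_client import ClientRunner
print("DFC Python API is available:", ClientRunner.__name__)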

My goal is to generate a valid .hef for the YOLO11m segmentation model for use on the Raspberry Pi AI Kit. (If that is even possible: I checked the officially supported segmentation models in the Hailo Model Zoo and see only YOLOv8, so does that mean YOLOv11 segmentation is not supported?)

I can provide additional information (ONNX graph, node names, shapes) if needed.

Hey @Ivan_Hostar,

Welcome to the Hailo Community!

YOLOv11 object detection is available in the Hailo Model Zoo, but YOLOv11 segmentation isn’t something we currently offer as a ready-made flow.

For yolov11m-seg, there's no official support at the moment, but many users successfully compile models that aren't included in the Zoo. What I'd recommend is going through the standard flow (hailo parser, hailo optimize, and hailo compiler) and making sure you provide the correct .alls file along with the right start/end node names and any other details needed for compilation; a sketch of the same flow via the Python API follows.
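For reference, that flow is also available through the DFC Python API (ClientRunner), which the tutorial notebooks use. A minimal sketch, with node names, file paths, and the calibration array as placeholders to adapt:

# End-to-end DFC flow via the Python API: parse -> optimize -> compile.
import numpy as np
from hailo_sdk_client import ClientRunner

runner = ClientRunner(hw_arch="hailo8l")

# 1) Parse: translate the ONNX into a Hailo network
runner.translate_onnx_model(
    model="unified_yolov11m-seg-640-0.2.onnx",
    net_name="yolov11m_seg",
    start_node_names=["images"],
    end_node_names=["output0", "output1"],  # cut before unsupported postprocessing
    net_input_shapes={"images": [1, 3, 640, 640]},
)

# 2) Optimize: quantize against a calibration set (NHWC array) plus any .alls commands
calib = np.random.rand(64, 640, 640, 3).astype(np.float32)  # stand-in for real images
runner.load_model_script("model_optimization_flavor(optimization_level=2)\n")
runner.optimize(calib)

# 3) Compile: produce the HEF for the target device
hef = runner.compile()
with open("yolov11m_seg.hef", "wb") as f:
    f.write(hef)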

Also, can you run hailo -h and share the output? That’ll help me confirm that DFC is installed correctly on your side.

Hello, thank you for your answer. I am using the Docker image, which I downloaded from https://hailo.ai/developer-zone/software-downloads

This is the output of hailo -h:


[info] No GPU chosen and no suitable GPU found, falling back to CPU.
[info] Current Time: 12:58:46, 12/11/25
[info] CPU: Architecture: x86_64, Model: AMD Ryzen 7 7700 8-Core Processor, Number Of Cores: 16, Utilization: 0.4%
[info] Memory: Total: 124GB, Available: 104GB
[info] System info: OS: Linux, Kernel: 6.8.0-88-generic
[info] Hailo DFC Version: 3.33.0
[info] HailoRT Version: 4.23.0
[info] PCIe: No Hailo PCIe device was found
[info] Running hailo -h
usage: hailo [-h] [--version]
{fw-update,ssb-update,fw-config,udp-rate-limiter,fw-control,fw-logger,scan,sensor-config,run,benchmark,monitor,parse-hef,measure-power,tutorial,analyze-noise,compiler,params-csv,parser,profiler,optimize,visualizer,har,join,har-onnx-rt,runtime-profiler,dfc-studio,help}
…

Hailo Command Line Utility

positional arguments:
{fw-update,ssb-update,fw-config,udp-rate-limiter,fw-control,fw-logger,scan,sensor-config,run,benchmark,monitor,parse-hef,measure-power,tutorial,analyze-noise,compiler,params-csv,parser,profiler,optimize,visualizer,har,join,har-onnx-rt,runtime-profiler,dfc-studio,help}
Hailo utilities aimed to help with everything you need
fw-update           Firmware update tool
ssb-update          Second stage boot update tool
fw-config           Firmware configuration tool
udp-rate-limiter    Limit the UDP rate
fw-control          Useful firmware control operations
fw-logger           Download fw logs to a file
scan                Scans for devices (Ethernet or PCIE)
sensor-config       Sensor configuration tool
run                 Run a compiled network
benchmark           Measure basic performance on compiled network
monitor             Monitor of networks - Presents information about the running networks. To enable monitor, set in the application process the environment variable 'HAILO_MONITOR' to 1.
parse-hef           Parse HEF to get information about its components
measure-power       Measures power consumption
tutorial            Runs the tutorials in jupyter notebook
analyze-noise       Analyze network quantization noise
compiler            Compile Hailo model to HEF binary files
params-csv          Convert translated params to csv
parser              Translate network to Hailo network
profiler            Hailo models Profiler
optimize            Optimize model
visualizer          HAR visualization tool
har                 Query and extract information from Hailo Archive file
join                Join two Hailo models to a single model
har-onnx-rt         Generates ONNX-Runtime model including pre/post processing
runtime-profiler    Hailo Runtime Profiler
dfc-studio          Start DFC Studio
help                show the list of commands

options:
-h, --help            show this help message and exit
--version             show program's version number and exit


I also tried converting manually via the notebooks: I ran "hailo tutorial" in the terminal, then parsed the model, optimized it, and tried to compile.

Parsing the ONNX was successful; here's the notebook output:

runner = ClientRunner(hw_arch=chosen_hw_arch)

hn, npz = runner.translate_onnx_model(
    model=onnx_path,
    net_name="yolov11m_seg",
    start_node_names=["images"],
    end_node_names=[
        "/model.23/Concat_4",
        "/model.23/Concat",
        "/model.23/proto/cv3/act/Mul",
    ],
    net_input_shapes={"images": [1, 3, 640, 640]},
)

[info] Translation started on ONNX model yolov11m_seg
[info] Restored ONNX model yolov11m_seg (completion time: 00:00:00.13)
[info] Extracted ONNXRuntime meta-data for Hailo model (completion time: 00:00:00.55)
[info] Start nodes mapped from original model: 'images': 'yolov11m_seg/input_layer1'.
[info] End nodes mapped from original model: '/model.23/Concat_4', '/model.23/Concat', '/model.23/proto/cv3/act/Mul'.
[info] Translation completed on ONNX model yolov11m_seg (completion time: 00:00:02.02)
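For anyone choosing end nodes on their own export: the three nodes above cut the graph just before the detection and mask postprocessing, which is not compiled onto the device. Candidate cut points can be listed straight from the ONNX; a minimal sketch, assuming the standard onnx package and this export's node naming:

# List candidate end nodes in the ONNX graph. For YOLO-seg exports, the usual
# cut points are the last Concat nodes of the detection head and the final
# activation of the proto (mask) branch.
import onnx

model = onnx.load("unified_yolov11m-seg-640-0.2.onnx")
for node in model.graph.node:
    if node.op_type in ("Concat", "Mul") and "/model.23/" in node.name:
        print(node.name, node.op_type, "->", list(node.output))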

Optimization takes time, but was also successful:

# Slightly increase optimization level for segmentation networks
alls = """
model_optimization_flavor(optimization_level=2)
"""
runner.load_model_script(alls)

# Perform optimization using the calibration dataset
runner.optimize(calib_dataset)

# Save the quantized model
quantized_model_har_path = "yolov11m_seg_quantized.har"
runner.save_har(quantized_model_har_path)

print("Saved:", quantized_model_har_path)

[info] Loading model script commands to yolov11m_seg from string
[info] Found model with 3 input channels, using real RGB images for calibration instead of sampling random data.
[info] Starting Model Optimization
[info] Using default compression level of 1
[info] Model received quantization params from the hn
[info] MatmulDecompose skipped
[info] Starting Mixed Precision
[info] Assigning 4bit weights to layer yolov11m_seg/conv22 with 2359.30k parameters
[info] Assigning 4bit weights to layer yolov11m_seg/conv32 with 2359.30k parameters
[info] Ratio of weights in 4bit is 0.21
[info] Model Optimization Algorithm Mixed Precision is done (completion time is 00:00:00.42)
[info] LayerNorm Decomposition skipped
[info] Starting Statistics Collector
[info] Using dataset with 64 entries for calibration

Calibration: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 64/64 [01:24<00:00,  1.33s/entries]

[info] Model Optimization Algorithm Statistics Collector is done (completion time is 00:01:25.67)
[info] Starting Fix zp_comp Encoding
[info] Model Optimization Algorithm Fix zp_comp Encoding is done (completion time is 00:00:00.00)
[info] Starting Matmul Equalization
[info] Model Optimization Algorithm Matmul Equalization is done (completion time is 00:00:00.02)
[info] Starting MatmulDecomposeFix
[info] Model Optimization Algorithm MatmulDecomposeFix is done (completion time is 00:00:00.00)
[info] activation fitting started for yolov11m_seg/reduce_sum_softmax1/act_op
[info] Finetune encoding skipped
[info] Bias Correction skipped
[info] Adaround skipped
[warning] Quantization-Aware Fine-Tuning:	Dataset didn't have enough data for dataset_size of 1024 	Quantizing using calibration size of 978
[info] Starting Quantization-Aware Fine-Tuning
[info] Using dataset with 978 entries for finetune
Epoch 1/4
122/122 ━━━━━━━━━━━━━━━━━━━━ 2104s 16s/step - _distill_loss_yolov11m_seg/concat26: 0.1934 - _distill_loss_yolov11m_seg/concat27: 0.1521 - _distill_loss_yolov11m_seg/conv108: 0.2182 - _distill_loss_yolov11m_seg/conv67: 0.1900 - _distill_loss_yolov11m_seg/conv77: 0.2091 - total_distill_loss: 0.9628
Epoch 2/4
122/122 ━━━━━━━━━━━━━━━━━━━━ 1986s 16s/step - _distill_loss_yolov11m_seg/concat26: 0.2034 - _distill_loss_yolov11m_seg/concat27: 0.1575 - _distill_loss_yolov11m_seg/conv108: 0.2328 - _distill_loss_yolov11m_seg/conv67: 0.1854 - _distill_loss_yolov11m_seg/conv77: 0.2136 - total_distill_loss: 0.9925
Epoch 3/4
122/122 ━━━━━━━━━━━━━━━━━━━━ 1982s 16s/step - _distill_loss_yolov11m_seg/concat26: 0.1894 - _distill_loss_yolov11m_seg/concat27: 0.1474 - _distill_loss_yolov11m_seg/conv108: 0.2142 - _distill_loss_yolov11m_seg/conv67: 0.1748 - _distill_loss_yolov11m_seg/conv77: 0.2067 - total_distill_loss: 0.9326
Epoch 4/4
122/122 ━━━━━━━━━━━━━━━━━━━━ 1983s 16s/step - _distill_loss_yolov11m_seg/concat26: 0.1690 - _distill_loss_yolov11m_seg/concat27: 0.1330 - _distill_loss_yolov11m_seg/conv108: 0.1871 - _distill_loss_yolov11m_seg/conv67: 0.1604 - _distill_loss_yolov11m_seg/conv77: 0.1871 - total_distill_loss: 0.8367
[info] Model Optimization Algorithm Quantization-Aware Fine-Tuning is done (completion time is 02:14:17.65)
[info] Starting Layer Noise Analysis

Full Quant Analysis: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [02:18<00:00, 69.02s/iterations]

[info] Model Optimization Algorithm Layer Noise Analysis is done (completion time is 00:02:19.87)
[info] Output layers signal-to-noise ratio (SNR): measures the quantization noise (higher is better)
[info] 	yolov11m_seg/output_layer3 SNR:	17.71 dB
[info] 	yolov11m_seg/output_layer2 SNR:	15.0 dB
[info] 	yolov11m_seg/output_layer1 SNR:	13.25 dB
[info] Model Optimization is done
[info] Saved HAR to: /local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_tutorials/notebooks/yolov11m_seg_quantized.har
Saved: yolov11m_seg_quantized.har
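For completeness, the calib_dataset passed to runner.optimize above is just an array of preprocessed images. A minimal sketch of how it could be built from the train/ folder, assuming Pillow and NumPy and a plain (N, 640, 640, 3) layout:

# Build a calibration array from the train/ images. Sketch only: the
# preprocessing (resize, RGB order, value range) must match what the model
# script and normalization commands expect.
from pathlib import Path
import numpy as np
from PIL import Image

paths = sorted(Path("train").glob("*.jpg"))[:64]
calib_dataset = np.stack(
    [np.array(Image.open(p).convert("RGB").resize((640, 640))) for p in paths]
).astype(np.float32)
print(calib_dataset.shape)  # (64, 640, 640, 3)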

Here I have the .har archive, which I am trying to compile:

[info] To achieve optimal performance, set the compiler_optimization_level to "max" by adding performance_param(compiler_optimization_level=max) to the model script. Note that this may increase compilation time.
[info] Loading network parameters
[info] Starting Hailo allocation and compilation flow
[info] Building optimization options for network layers...
[info] Successfully built optimization options - 41s 145ms
[info] Trying to compile the network in a single context
[info] Single context flow failed: Recoverable single context error
[info] Building optimization options for network layers...
[info] Successfully built optimization options - 53s 193ms
[error] Mapping Failed (allocation time: 53s)

[error] Failed to produce compiled graph


No successful assignments: concat27 errors:
	Agent infeasible



---------------------------------------------------------------------------
BackendAllocatorException                 Traceback (most recent call last)
Cell In [4], line 1
----> 1 hef = runner.compile()
      3 file_name = f"{model_name}.hef"
      4 with open(file_name, "wb") as f:

File /local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/runner/client_runner.py:911, in ClientRunner.compile(self)
    899 def compile(self):
    900     """
    901     DFC API for compiling current model to Hailo hardware.
    902 
   (...)
    909 
    910     """
--> 911     return self._compile()

File /local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_common/states/states.py:16, in allowed_states.<locals>.wrap.<locals>.wrapped_func(self, *args, **kwargs)
     12 if self._state not in states:
     13     raise InvalidStateException(
     14         f"The execution of {func.__name__} is not available under the state: {self._state.value}",
     15     )
---> 16 return func(self, *args, **kwargs)

File /local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/runner/client_runner.py:1128, in ClientRunner._compile(self, fps, mapping_timeout, allocator_script_filename)
   1122         self._logger.warning(
   1123             f"Taking model script commands from {allocator_script_filename} and ignoring "
   1124             f"previous allocation script commands",
   1125         )
   1126     self.load_model_script(allocator_script_filename)
-> 1128 serialized_hef = self._sdk_backend.compile(fps, self.model_script, mapping_timeout)
   1130 self._auto_model_script = self._sdk_backend.get_auto_alls()
   1131 self._state = States.COMPILED_SLIM_MODEL if orig_state in SLIM_STATES else States.COMPILED_MODEL

File /local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/sdk_backend.py:1852, in SdkBackendCompilation.compile(self, fps, allocator_script, mapping_timeout)
   1850 def compile(self, fps, allocator_script=None, mapping_timeout=None):
   1851     self._model.fill_default_quantization_params(logger=self._logger)
-> 1852     hef, mapped_graph_file = self._compile(fps, allocator_script, mapping_timeout)
   1853     # TODO: https://hailotech.atlassian.net/browse/SDK-31038
   1854     if not SDKPaths().is_internal:

File /local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/sdk_backend.py:1846, in SdkBackendCompilation._compile(self, fps, allocator_script, mapping_timeout)
   1840 if not model_params and self.requires_quantized_weights:
   1841     raise BackendRuntimeException(
   1842         "Model requires quantized weights in order to run on HW, but none were given. "
   1843         "Did you forget to quantize?",
   1844     )
-> 1846 hef, mapped_graph_file, auto_alls = self.hef_full_build(fps, mapping_timeout, model_params, allocator_script)
   1847 self._auto_alls = auto_alls
   1848 return hef, mapped_graph_file

File /local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/sdk_backend.py:1822, in SdkBackendCompilation.hef_full_build(self, fps, mapping_timeout, params, allocator_script)
   1820 config_paths = ConfigPaths(self._hw_arch, self._model.name)
   1821 config_paths.set_stage("inference")
-> 1822 auto_alls, self._hef_data, self._integrated_graph = allocator.create_mapping_and_full_build_hef(
   1823     config_paths.get_path("network_graph"),
   1824     config_paths.get_path("mapped_graph"),
   1825     config_paths.get_path("compilation_output_proto"),
   1826     params=params,
   1827     allocator_script=allocator_script,
   1828     compiler_statistics_path=config_paths.get_path("compiler_statistics"),
   1829     nms_metadata=self._nms_metadata,
   1830     har=self.har,
   1831     alls_ignore_invalid_cmds=self._alls_ignore_invalid_cmds,
   1832 )
   1834 return self._hef_data, config_paths.get_path("mapped_graph"), auto_alls

File /local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/allocator/hailo_tools_runner.py:764, in HailoToolsRunner.create_mapping_and_full_build_hef(self, network_graph_path, output_path, compilation_output_proto, agent, strategy, auto_mapping, params, expected_output_tensor, expected_pre_acts, network_inputs, network_outputs, allocator_script, allocator_script_mode, compiler_statistics_path, nms_metadata, har, alls_ignore_invalid_cmds)
    759 if self.hn.net_params.clusters_placement != [[]]:
    760     assert (
    761         len(self.hn.net_params.clusters_placement) <= self._number_of_clusters
    762     ), "Number of clusters in layer placements is larger than allowed number of clusters"
--> 764 self.call_builder(
    765     network_graph_path,
    766     output_path,
    767     compilation_output_proto=compilation_output_proto,
    768     agent=agent,
    769     strategy=strategy,
    770     exit_point=BuilderExitPoint.POST_CAT,
    771     params=params,
    772     expected_output_tensor=expected_output_tensor,
    773     expected_pre_acts=expected_pre_acts,
    774     network_inputs=network_inputs,
    775     network_outputs=network_outputs,
    776     allocator_script=allocator_script,
    777     allocator_script_mode=allocator_script_mode,
    778     compiler_statistics_path=compiler_statistics_path,
    779     nms_metadata=nms_metadata,
    780     har=har,
    781     alls_ignore_invalid_cmds=alls_ignore_invalid_cmds,
    782 )
    784 return self._auto_alls, self._output_hef_data, self._output_integrated_pb_graph

File /local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/allocator/hailo_tools_runner.py:696, in HailoToolsRunner.call_builder(self, network_graph_path, output_path, blind_deserialize, **kwargs)
    694 sys.excepthook = _hailo_tools_exception_hook
    695 try:
--> 696     self.run_builder(network_graph_path, output_path, **kwargs)
    697 except BackendInternalException:
    698     try:

File /local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/allocator/hailo_tools_runner.py:570, in HailoToolsRunner.run_builder(self, network_graph_filename, output_filename, compilation_output_proto, agent, strategy, exit_point, params, expected_output_tensor, expected_pre_acts, network_inputs, network_outputs, allocator_script, allocator_script_mode, compiler_statistics_path, is_debug, nms_metadata, har, alls_ignore_invalid_cmds)
    568 compiler_msg = e.hailo_tools_error
    569 if compiler_msg:
--> 570     raise e.internal_exception("Compilation failed:", hailo_tools_error=compiler_msg) from None
    571 else:
    572     raise e.internal_exception("Compilation failed with unexpected crash") from None

BackendAllocatorException: Compilation failed: No successful assignments: concat27 errors:
	Agent infeasible



  • GPT says:
    Compiler:
    All available optimization passes were applied.
    An initial attempt to compile the model using a single context (full-graph mode) failed due to size constraints.
    The compiler then proceeded with a more granular multi-context partitioning strategy.

    Despite multiple retries, the compiler is unable to partition the concat27 layer. The layer cannot be decomposed in a way that simultaneously:
    • fits the required intermediate activations into the cluster’s SRAM, and
    • meets the bandwidth and context-switch limitations.

But I don't understand what I should do to fix this.

Sorry for the long message. I'm not very proficient in ML, and I really need some help. Thanks!

Hi Ivan. I'm new to this too, but I encountered almost the same issue when compiling a custom-trained version of the YOLOv11m model. I got a lot of help from GPT-5 via Microsoft Copilot, so I had the instructions I was given distilled into the guide below. I can't guarantee that it will be foolproof in your case, because you're using different baseline versions of the Hailo environment than I did.

BTW - when I started down this path, it was recommended to use Compiler version 3.31. I had to modify the Linux kernel I was using, and got other advice from GPT-5 along the way.

Cover Page

Title: YOLOv11m Compilation on Hailo with Fudge Factor (Docker Workflow)

Date: December 14, 2025

Abstract: Compiling large YOLO models on Hailo hardware can fail due to allocator infeasibility, especially at concat layers. This guide provides a reproducible Docker workflow for YOLOv11m using a “fudge factor” adjustment to quantization parameters. The method relaxes allocator constraints, enabling successful compilation while maintaining accuracy.


:rocket: Guide Contents

Workflow Diagram

Docker start (mount volume)
          |
          v
Apply fudge factor (adjust HAR)
          |
          v
Model script (optimization + timeout)
          |
          v
Compile HAR → HEF output
          |
          v
Validate (accuracy & performance)

Prerequisites

  • Hailo SDK installed in a Docker image (hailo-sdk:latest).
  • Quantized YOLOv11m HAR file (yolov11m_quantized.har).
  • Recommended environment: Compiler v3.31, Driver v3.21.

Step 1: Start Docker with volume mount

# Start Docker container with current directory mounted to /workspace

docker run -it --rm \
  -v "$(pwd)":/workspace \
  hailo-sdk:latest /bin/bash

Step 2: Apply the fudge factor

Create /workspace/fudge_har.py with consistent indentation and no extra line breaks:

import hailo_sdk_client as hailo

def apply_fudge_factor_to_har(input_har: str, output_har: str, scale_factor: float = 1.01, zero_point_shift: int = 0) -> None:
    """
    Apply a small adjustment to quantized tensor parameters in a HAR to relax
    allocator constraints and improve the chances of successful compilation.
    """
    runner = hailo.ClientRunner(har=input_har)

    for tensor in runner.model.quantized_tensors:
        tensor.scale *= scale_factor
        tensor.zero_point = max(0, min(255, tensor.zero_point + zero_point_shift))

    runner.save_har(output_har)
    print(f"Adjusted HAR saved to: {output_har}")


if __name__ == "__main__":
    apply_fudge_factor_to_har(
        input_har="/workspace/yolov11m_quantized.har",
        output_har="/workspace/yolov11m_fudged.har",
        scale_factor=1.01,
        zero_point_shift=1,
    )

Run:

python3 /workspace/fudge_har.py

Step 3: Model script

The Dataflow Compiler's model script is a plain-text .alls file of script commands, not a Python file. Create /workspace/model_script.alls containing the line the compiler itself suggests in its log above; this gives the allocator more freedom to find a feasible mapping, at the cost of longer compilation time:

performance_param(compiler_optimization_level=max)

Step 4: Compile the adjusted HAR

import hailo_sdk_client as hailo

runner = hailo.ClientRunner(har="/workspace/yolov11m_fudged.har")
runner.load_model_script("/workspace/model_script.alls")

hef = runner.compile()

with open("/workspace/yolov11m.hef", "wb") as f:
    f.write(hef)

print("Compilation complete: /workspace/yolov11m.hef")

Step 5: Validate

  • Run inference with the .hef to confirm allocator success (a metadata sanity check is sketched after this list).
  • Compare accuracy against the original quantized model.
  • Expect <1% drop if fudge factor ≤2%.
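As a first sanity check before deploying, the compiled HEF's metadata can be inspected: hailo parse-hef does this from the CLI, and the HailoRT Python bindings expose the same information. A rough sketch, assuming the hailo_platform package (API details vary by HailoRT version):

# Inspect the compiled HEF's network groups and I/O streams; names and shapes
# should line up with the parsed end nodes.
from hailo_platform import HEF

hef = HEF("/workspace/yolov11m.hef")
print("Network groups:", hef.get_network_group_names())
for info in hef.get_input_vstream_infos():
    print("input:", info.name, info.shape)
for info in hef.get_output_vstream_infos():
    print("output:", info.name, info.shape)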

Troubleshooting Appendix

  • Allocator still fails:
    • Increase scale_factor (e.g., 1.02).
    • Try zero_point_shift=0 or -1.
    • Give the allocator more time to search for a feasible mapping (compilation at optimization level max can take much longer).
  • Environment stability:
    • Use compiler v3.31 + driver v3.21.
  • Files not persisting:
    • Ensure HAR/HEF paths are under /workspace.

Quick Command Sequence

# Run all steps in sequence inside Docker

docker run -it --rm -v "$(pwd)":/workspace hailo-sdk:latest /bin/bash
python3 /workspace/fudge_har.py
python3 - <<'EOF'
import hailo_sdk_client as hailo
runner = hailo.ClientRunner(har="/workspace/yolov11m_fudged.har")
runner.load_model_script("/workspace/model_script.alls")
hef = runner.compile()
with open("/workspace/yolov11m.hef", "wb") as f:
    f.write(hef)
print("Compilation complete: /workspace/yolov11m.hef")
EOF

Hi Arthur, I really appreciate your help. I spent the whole week trying to find a solution. I even switched to YOLOv8. I will try this method and hope it works.

Continuing the discussion from Converting YOLOv11 segmentation to hef:

I am trying to compile a custom yolo11n object detection model with the Hailo 2025-10 suite. I am running Ubuntu 24.04 from a USB SSD. Hardware is a Dell Alienware Area 51 tower with 64 GB RAM and a 5090 GPU, using the basic GPU driver from the Ubuntu install plus the NVIDIA Container Toolkit. Everything else works fine: the Ultralytics container, the container nbody benchmark, and the Hailo 2025-01 suite seems to run OK with the same hailomz command. It also runs fine if I delete the --gpus all line from the shell script (but then at optimization level 0, so FPS is low running on my RPi5).

Here is what happens:

(hailo_virtualenv) hailo@SGAIWS:/local/workspace$ ls /local/shared_with_docker
best4.onnx doc images
(hailo_virtualenv) hailo@SGAIWS:/local/workspace$ hailomz compile yolov11n --ckpt=/local/shared_with_docker/best4.onnx --hw-arch hailo8 --calib-path /local/shared_with_docker/images --classes 1 --performance
[info] No GPU chosen, Selected GPU 0
Traceback (most recent call last):
File "/local/workspace/hailo_virtualenv/bin/hailomz", line 33, in <module>
sys.exit(load_entry_point('hailo-model-zoo', 'console_scripts', 'hailomz')())
File "/local/workspace/hailo_model_zoo/hailo_model_zoo/main.py", line 122, in main
run(args)
File "/local/workspace/hailo_model_zoo/hailo_model_zoo/main.py", line 101, in run
from hailo_model_zoo.main_driver import compile, evaluate, optimize, parse, profile
File "/local/workspace/hailo_model_zoo/hailo_model_zoo/main_driver.py", line 10, in <module>
from hailo_sdk_client import ClientRunner, InferenceContext
File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_sdk_client/__init__.py", line 29, in <module>
import hailo_model_optimization  # noqa: F401
File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/__init__.py", line 53, in <module>
tf.constant(0.0) + 1.0
File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/tensorflow/python/framework/ops.py", line 6002, in raise_from_not_ok_status
raise core._status_to_exception(e) from None  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InternalError: {{function_node __wrapped__AddV2_device_/job:localhost/replica:0/task:0/device:GPU:0}} 'cuLaunchKernel(function, gridX, gridY, gridZ, blockX, blockY, blockZ, 0, reinterpret_cast<CUstream>(stream), params, nullptr)' failed with 'CUDA_ERROR_INVALID_HANDLE' [Op:AddV2] name:
(hailo_virtualenv) hailo@SGAIWS:/local/workspace$
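The failing op is the trivial tf.constant(0.0) + 1.0 that hailo_model_optimization evaluates at import time, so the problem can be isolated from the Hailo tooling entirely. A minimal reproduction sketch, run inside the Suite container:

# Reproduce the failing GPU op with no Hailo code involved: this is the exact
# expression hailo_model_optimization runs when it is imported.
import tensorflow as tf

print(tf.config.list_physical_devices("GPU"))
with tf.device("/GPU:0"):
    print(tf.constant(0.0) + 1.0)  # raises the same CUDA error if TF cannot launch kernels on this GPU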

Hi Ivan. No problem, glad to help. It's not easy to find a distilled set of instructions for this stuff. By the way, I don't know whether you're trying to train a customized model before compilation (that's quite a journey on its own), but if not, there is a precompiled, ready-to-use version of yolov8m_seg available for download: https://hailo-model-zoo.s3.eu-west-2.amazonaws.com/ModelZoo/Compiled/v2.17.0/hailo8l/yolov8m_seg.hef