This guide walks you through the end-to-end process of retraining YoloV11 and compiling it from a PyTorch file into a HEF file executable by the Hailo accelerators, using Google Colab. I decided to use Colab because it's a managed environment and should be accessible to everyone, although it does introduce some complexities. This also works for other Yolo versions; keep in mind that YoloV8 through YoloV11 use "equivalent architectures", at least in the eyes of the DFC.
Preparing our dataset
Yolo expects custom datasets to be formatted as shown below, so make sure you have a dataset that follows this structure. I wrote my own Python script to parse my dataset into the file structure indicated below; a sketch of a similar script follows the tree.
- dataset
  - train
    - images
    - labels
  - val
    - images
    - labels
  - test
    - images
    - labels
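For reference, here is a minimal sketch of such a split script; the source paths and the 80/10/10 split are assumptions, so adjust them to your data:
import os
import random
import shutil

SRC_IMAGES = "/path/to/raw/images"  # flat folder of images (placeholder)
SRC_LABELS = "/path/to/raw/labels"  # flat folder of YOLO .txt labels (placeholder)
DST = "/path/to/dataset"
SPLITS = {"train": 0.8, "val": 0.1, "test": 0.1}  # adjust to taste

images = [f for f in os.listdir(SRC_IMAGES) if f.lower().endswith((".jpg", ".jpeg", ".png"))]
random.shuffle(images)

start = 0
for split, frac in SPLITS.items():
    # Rounding may leave a stray image or two unassigned; fine for a sketch
    count = round(len(images) * frac)
    for img_name in images[start:start + count]:
        label_name = os.path.splitext(img_name)[0] + ".txt"
        for sub, src_dir, name in [("images", SRC_IMAGES, img_name), ("labels", SRC_LABELS, label_name)]:
            os.makedirs(os.path.join(DST, split, sub), exist_ok=True)
            src_path = os.path.join(src_dir, name)
            if os.path.exists(src_path):
                shutil.copy(src_path, os.path.join(DST, split, sub, name))
    start += count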
Additionally, you may have to make a data.yaml file that points to your data:
data_yaml_content = """
train: /path/to/dataset/train/images
val: /path/to/dataset/val/images
nc: 1  # number of classes
names: ['eartag']  # name of the class
"""
Generating a Custom YAML file
Since I'm simplifying the YoloV11 architecture, I made a custom YAML file; you can find the YAML files here. You can skip this section if you plan to use the default version of YoloV11.
I navigated to "11", copied yolo11.yaml locally, and then made my modifications to the architecture there. Ensure that all the layer numbers match up, or else you will get a multitude of errors when trying to retrain.
You can find more information here: Ultralytics
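Once ultralytics is installed (next section), you can sanity-check the modified YAML by letting Ultralytics build the model from it; this is just a quick sketch, assuming your file is saved locally. A broken head will fail fast here instead of mid-training:
from ultralytics import YOLO
# Builds the architecture from the YAML with random weights;
# mismatched layer indices will raise an error at this point
model = YOLO('/path/to/yolo11s.yaml')
model.info()  # prints the layer and parameter summary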
Installing YoloV11
Run these commands to install YoloV11:
# Installing the Python package
!pip install ultralytics
# Verifying the installation
!pip show ultralytics
import ultralytics
ultralytics.checks()
from ultralytics import YOLO
from IPython.display import Image
Now to retrain:
from ultralytics import YOLO
detection_dataset = "/path/to/dataset"
"""
Note the model size (n,s,m,l,x) is automatically parsed
by Ultralytics. If you save your file as yolo11n.yaml, it will
retrain the n version of YoloV11, same goes for yolo11s.yaml
which I will be using.
"""
model = YOLO('/path/to/yolo11s.yaml')
model.train(data=f"{detection_dataset}/data.yaml", epochs=100, imgsz=640)
You can also run validation on your model here:
!yolo task=detect mode=val \
model="/path/to/YoloRetrained/weights/best.pt" \
data="/path/to/dataset.yaml"
Converting our .PT file into ONNX
Now that you have your model, let's export it to ONNX first, which is a format Hailo can compile.
!yolo export model=path/to/best.pt format=onnx # export custom trained model
You can also do this by using:
import torch
# Load our model into our environment
checkpoint = torch.load('/path/to/best.pt', weights_only=False)  # the checkpoint holds a full nn.Module, not just tensors
model = checkpoint['model']
model = model.float()  # Ultralytics saves weights in FP16; cast to FP32 for export
model.eval()
# Dummy input in FP32
dummy_input = torch.randn(16, 3, 640, 640, dtype=torch.float)
# Export to ONNX
torch.onnx.export(
    model,
    dummy_input,
    "modified_run_3.onnx",
    export_params=True,
    opset_version=11,  # Adjust opset version if needed
    do_constant_folding=True,
    input_names=['input'],
    output_names=['output']
)
print("ONNX model exported successfully!")
Now, to verify that it’s a valid model:
import onnx
import onnxruntime as ort
import torch
# Load the ONNX model
onnx_model = onnx.load("modified_run_3.onnx")
onnx.checker.check_model(onnx_model)
print("ONNX model is valid!")
# Test the ONNX model with ONNX Runtime
dummy_input = torch.randn(16, 3, 640, 640).numpy()
ort_session = ort.InferenceSession("modified_run_3.onnx")
outputs = ort_session.run(None, {"input": dummy_input})
print(outputs[0])
You should get something like this:
Installing the DFC
Installing all necessary site packages
!sudo apt-get update
!sudo apt-get install -y python3-dev python3-distutils python3-tk libfuse2 graphviz libgraphviz-dev
# Will need a venv to install the DFC in
!pip install --upgrade pip virtualenv
!virtualenv my_env
Note: you will run into errors if you don't use the venv. I was getting the error below further into the compilation process because I wasn't in a venv:
TypeError: expected str, bytes or os.PathLike object, not NoneType
Install the DFC wheel; I'm using the newest version:
#Installing the WHL file for Hailo DFC
!my_env/bin/pip install /content/hailo_dataflow_compiler-3.29.0-py3-none-linux_x86_64.whl
# Making sure it's installed properly
!my_env/bin/hailo --version
You should get something like this:
Identifying appropriate end node names
This step is extremely important, as wrong end node names can be pretty frustrating to debug later on, so make sure you have the right ones when parsing.
The first step of the parsing process is to tell the tool which layers/nodes to expect the output from, a.k.a. the end nodes. Once we identify the correct end nodes by uploading our ONNX file to Netron, we can continue with the first step of our compilation: parsing.
In our case, YoloV8 through YoloV11 all use the "same architecture" in the eyes of the compiling tool, with two end nodes per feature map output. In our modified version of Yolo, we removed one of the feature maps, leaving two, which means we have four total end nodes instead of six.
Yolo's end nodes are the nodes right before the post-processing operations at the very bottom of the model, shown below:
And here is the other branch leading to the concat operation at the end in the image above.
If you have any trouble identifying them, the parsing tool will recommend end nodes to choose if you specify "output" as the end node; however, verify them with the process above. There are also several other articles on the Hailo forum describing how to find the proper end nodes.
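As a complement to Netron, you can also list candidate end nodes programmatically. This sketch assumes the Ultralytics exporter's usual node naming, where the detection-head branches contain "cv2" (regression) and "cv3" (classification):
import onnx

model = onnx.load("modified_run_3.onnx")
# Print Conv nodes from the detection-head branches; the end nodes
# should appear here as the last Conv of each cv2/cv3 branch
for node in model.graph.node:
    if node.op_type == "Conv" and ("cv2" in node.name or "cv3" in node.name):
        print(node.name)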
Step 1: Parsing our ONNX file
This is the script I used to parse my model; I saved it locally:
from hailo_sdk_client import ClientRunner
# Define the ONNX model path and configuration
onnx_path = "/content/modified_run_3.onnx"
onnx_model_name = "modified_run_3_renamed"
chosen_hw_arch = "hailo8" # Specify the target hardware architecture
# Initialize the ClientRunner
runner = ClientRunner(hw_arch=chosen_hw_arch)
# Use the recommended end node names for translation
end_node_names = [
    "/model.14/cv2.0/cv2.0.2/Conv",  # P4 regression_layer
    "/model.14/cv3.0/cv3.0.2/Conv",  # P4 cls_layer
    "/model.14/cv2.1/cv2.1.2/Conv",  # P5 regression_layer
    "/model.14/cv3.1/cv3.1.2/Conv",  # P5 cls_layer
]
try:
    # Translate the ONNX model to Hailo's format
    hn, npz = runner.translate_onnx_model(
        onnx_path,
        onnx_model_name,
        end_node_names=end_node_names,
        net_input_shapes={"input": [16, 3, 640, 640]},  # Adjust input shapes if needed
    )
    print("Model translation successful.")
except Exception as e:
    print(f"Error during model translation: {e}")
    raise
# Save the Hailo model HAR file
hailo_model_har_name = f"{onnx_model_name}_hailo_model.har"
try:
    runner.save_har(hailo_model_har_name)
    print(f"HAR file saved as: {hailo_model_har_name}")
except Exception as e:
    print(f"Error saving HAR file: {e}")
Then I ran this with:
!my_env/bin/python translate_model.py
You should get something like this:
Step 2: Optimizing our model
For this step, we will have to make a custom .alls file (model script), an NMS config, and calibration data.
First, since Hailo has its own renaming process, we have to find the new names of our end nodes. Below is code to print out the dictionary of layers and operations stored in the .har file; find the end node names identified by output_layers_order.
from hailo_sdk_client import ClientRunner
# Load the HAR file
har_path = "modified_run_3_renamed_hailo_model.har"
runner = ClientRunner(har=har_path)
from pprint import pprint
try:
    # Access the HailoNet as an OrderedDict
    hn_dict = runner.get_hn()  # Or use runner._hn if get_hn() is unavailable
    print("Inspecting layers from HailoNet (OrderedDict):")
    # Pretty-print each layer
    for key, value in hn_dict.items():
        print(f"Key: {key}")
        pprint(value)
        print("\n" + "="*80 + "\n")  # Add a separator between layers for clarity
except Exception as e:
    print(f"Error while inspecting hn_dict: {e}")
Expected output:
Now, you can scroll through the output to verify which layers correspond to which end node in your ONNX model. In this dict, each layer is stored under a new name, and its original name is kept within the layer under 'original_names'. You will need these names when generating an NMS config for your model; you can find example NMS configs here. A sketch for looking the renamed layers up programmatically follows.
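As a shortcut, here is a sketch that searches the dict from the previous cell for our ONNX end node names; it assumes the structure shown in the printout above, with layers stored under a 'layers' key and 'original_names' inside each layer:
onnx_end_nodes = [
    "/model.14/cv2.0/cv2.0.2/Conv",
    "/model.14/cv3.0/cv3.0.2/Conv",
    "/model.14/cv2.1/cv2.1.2/Conv",
    "/model.14/cv3.1/cv3.1.2/Conv",
]
# Match each Hailo-renamed layer back to its ONNX original
for layer_name, props in hn_dict.get("layers", {}).items():
    originals = props.get("original_names", [])
    if any(end in originals for end in onnx_end_nodes):
        print(f"{layer_name} <- {originals}")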
Here is how I did it:
import json
import os
from google.colab import drive
# Mount Google Drive
drive.mount('/content/drive/', force_remount=True)
# Updated NMS layer configuration dictionary
nms_layer_config = {
    "nms_scores_th": 0.3,
    "nms_iou_th": 0.7,
    "image_dims": [640, 640],
    "max_proposals_per_class": 25,
    "classes": 1,
    "regression_length": 16,
    "background_removal": False,
    "background_removal_index": 0,
    "bbox_decoders": [
        {
            "name": "bbox_decoder23",
            "stride": 16,
            "reg_layer": "conv23",
            "cls_layer": "conv26"
        },
        {
            "name": "bbox_decoder38",
            "stride": 32,
            "reg_layer": "conv38",
            "cls_layer": "conv41"
        }
    ]
}
# Path to save the updated JSON configuration
output_dir = "/save/path/"
os.makedirs(output_dir, exist_ok=True) # Create the directory if it doesn't exist
output_path = os.path.join(output_dir, "nms_layer_config.json")
# Save the updated configuration as a JSON file
with open(output_path, "w") as json_file:
    json.dump(nms_layer_config, json_file, indent=4)
print(f"NMS layer configuration saved to {output_path}")
After this, I made calibration data for the optimization step.
import numpy as np
from PIL import Image
import os
from google.colab import drive
# Mounting Google Drive
drive.mount('/content/drive/', force_remount=True)
# Paths to directories and files
image_dir = '/input/path'
output_dir = '/path/to/output_dir'
os.makedirs(output_dir, exist_ok=True) # Create the directory if it doesn't exist
# File paths for saving calibration data
calibration_data_path = os.path.join(output_dir, "calibration_data.npy")
processed_data_path = os.path.join(output_dir, "processed_calibration_data.npy")
# Initialize an empty list for calibration data
calib_data = []
# Process all image files in the directory
for img_name in os.listdir(image_dir):
    img_path = os.path.join(image_dir, img_name)
    if img_name.lower().endswith(('.jpg', '.jpeg', '.png')):
        img = Image.open(img_path).convert('RGB').resize((640, 640))  # Force 3 channels and resize
        img_array = np.array(img) / 255.0  # Normalize to [0, 1]
        calib_data.append(img_array)
# Convert the calibration data to a NumPy array
calib_data = np.array(calib_data)
# Save the normalized calibration data
np.save(calibration_data_path, calib_data)
print(f"Normalized calibration dataset saved with shape: {calib_data.shape} to {calibration_data_path}")
# Scale the normalized data back to [0, 255]
processed_calibration_data = calib_data * 255.0
# Save the processed calibration data
np.save(processed_data_path, processed_calibration_data)
print(f"Processed calibration dataset saved with shape: {processed_calibration_data.shape} to {processed_data_path}")
Now, we're finally ready to optimize the model with the script below. You can find sample .alls files here; I referenced yolo10nms.json as a base for my alls file.
Note that change_output_activation is applied to my cls_layers (conv26 and conv41); you can go back and verify these with Netron as described above.
import os
from hailo_sdk_client import ClientRunner
# Define your model's HAR file name
model_name = "modified_run_3_renamed"
hailo_model_har_name = f"{model_name}_hailo_model.har"
# Ensure the HAR file exists
assert os.path.isfile(hailo_model_har_name), "Please provide a valid path for the HAR file"
# Initialize the ClientRunner with the HAR file
runner = ClientRunner(har=hailo_model_har_name)
# Define the model script to add a normalization layer
# Normalization for [0, 1] range
alls = """
normalization1 = normalization([0.0, 0.0, 0.0], [255.0, 255.0, 255.0])
change_output_activation(conv26, sigmoid)
change_output_activation(conv41, sigmoid)
nms_postprocess("/content/nms_layer_config.json", meta_arch=yolov8, engine=cpu)
performance_param(compiler_optimization_level=max)
"""
# Load the model script into the ClientRunner
runner.load_model_script(alls)
# Define a calibration dataset
# Replace 'calib_dataset' with the actual dataset you're using for calibration
# For example, if it's a directory of images, prepare the dataset accordingly
calib_dataset = "/content/processed_calibration_data.npy"
# Perform optimization with the calibration dataset
runner.optimize(calib_dataset)
# Save the optimized model to a new Quantized HAR file
quantized_model_har_path = f"{model_name}_quantized_model.har"
runner.save_har(quantized_model_har_path)
print(f"Quantized HAR file saved to: {quantized_model_har_path}")
Now running it:
!my_env/bin/python optimize_model.py
Expected output:
Step 3: Compiling our model
Now for the final step: compilation. This is a local script I made.
from hailo_sdk_client import ClientRunner
# Define the quantized model HAR file
model_name = "modified_run_3_renamed"
quantized_model_har_path = f"{model_name}_quantized_model.har"
# Initialize the ClientRunner with the HAR file
runner = ClientRunner(har=quantized_model_har_path)
print("[info] ClientRunner initialized successfully.")
# Compile the model
try:
    hef = runner.compile()
    print("[info] Compilation completed successfully.")
except Exception as e:
    print(f"[error] Failed to compile the model: {e}")
    raise

file_name = f"{model_name}.hef"
with open(file_name, "wb") as f:
    f.write(hef)
Now run:
!my_env/bin/python compile_model.py
Expected output, something like this:
And there you go! That should be the entire compilation process with Colab! For the actual inference steps, I'm doing object detection with a Raspberry Pi camera, and picamera2 just released new examples for the Hailo accelerators that you can run on startup. You can find them here.