Need Urgent Conversion

Hello, first of all I want to apologize for making this unusual request. I am in a desperate state where I need to finish my engineering graduation project before it’s due this weekend. All I want is to convert my .onnx model, which is based on YOLOv5, to a .hef model.

I have been trying to convert it for the past week and could not even reach the conversion stage due to my limited hardware; I am also limited by my internet connection. I know I am asking for a lot, but I would be grateful if anyone could create a .hef model from this dataset: Drone Dataset by Ayushkumawat Object Detection Dataset and Pre-Trained Model by Drone Detection

Here’s the YOLOv8 .onnx model: original.onnx - Google Drive

I know it’s considered begging at this point, but I would not have reached this state if I weren’t so desperate.

So please, if there is anyone who can do this, even in their spare time, I will be in their debt and extremely thankful.

Thank you.

Hi @kryptonicrevolution
If you provide the PyTorch checkpoint (.pt) file, I can convert it to a .hef file for you. We have developed a compiler stack to convert the .pt file to a .hef file. I will create a guide for this process soon so everyone else can use it.

I have sent you a DM; I will also provide the link to the checkpoint file here:

This is based on YOLOv5, not YOLOv8, sorry for the typo.

Hi @kryptonicrevolution
The compilation tool that we have works with the YOLOv8 architecture at this point.
Do you need yolov5s for any specific reason, or would yolov8n also work for your purposes?

YOLOv8 would also work! Here is the link for the .pt model: originalYV8.pt - Google Drive

Hi @kryptonicrevolution
What is your device? Hailo8 or Hailo8L?

The device I am using is the Hailo-8L.

Hi @kryptonicrevolution
I have compiled the yolov8s .pt file to a .hef file.
drone hef

The script that I used is straightforward to automate. Here it is:


import argparse
from hailo_sdk_client import ClientRunner


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Compile an ONNX/TFLite model to a Hailo HEF file')
    parser.add_argument('--device-type', choices=["hailo8", "hailo8r", "hailo8l", "hailo15h", "hailo15m", "hailo15l", "hailo10h"], default="hailo8", help='Target Hailo device architecture')
    parser.add_argument('--model-path', type=str, required=True, help='Path to the input .onnx or .tflite model')
    parser.add_argument('--output-path', type=str, default=None, help='Output .hef path (defaults to the model path with a .hef extension)')
    parser.add_argument('--end-node-names', nargs='+', help='Array of output/end node names')
    parser.add_argument('--calibration-npy-path', type=str, default="compiler/runtime/hailo_calibration_data.npy", help='Path to the calibration data .npy file')
    parser.add_argument('--optimization-level', type=int, default=0, help='Optimization level')
    parser.add_argument('--compression-level', type=int, default=0, help='Compression level')
    parser.add_argument('--compiler-optimization-level', type=int, default=2, help='Compiler optimization level -- max = 2')

    args = parser.parse_args()

    runner = ClientRunner(hw_arch=args.device_type)
    # extract extension from model path
    ext = args.model_path.rsplit('.', maxsplit=1)[-1]
    model_name = args.model_path.split('/')[-1].split('.')[0]
    if args.output_path is None:
        # replace extension with .hef
        args.output_path = args.model_path.replace(f".{ext}", ".hef")

    if ext == "onnx":
        hn, npz = runner.translate_onnx_model(
            model=args.model_path,
            net_name=model_name,
            end_node_names=args.end_node_names,
        )
    elif ext == "tflite":
        hn, npz = runner.translate_tf_model(
            model_path=args.model_path,
            net_name=model_name,
            end_node_names=args.end_node_names,
        )
    else:
        raise ValueError(f"Unsupported model extension: .{ext} (expected .onnx or .tflite)")

    batch_size = 32
    alls_lines = [
        "normalization_in = normalization([0.0, 0.0, 0.0], [255.0, 255.0, 255.0])\n",
        f"model_optimization_flavor(optimization_level={args.optimization_level}, compression_level={args.compression_level}, batch_size={batch_size})\n",
        f"performance_param(compiler_optimization_level={args.compiler_optimization_level})\n",
    ]

    runner.load_model_script("".join(alls_lines))
    runner.optimize_full_precision(args.calibration_npy_path)
    runner.optimize(args.calibration_npy_path)
    hef = runner.compile()

    with open(args.output_path, "wb") as f:
        f.write(hef)
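
Assuming you export the .pt checkpoint to ONNX first (for example with the Ultralytics export command) and save the script above as, say, compile_model.py (a hypothetical name; all paths below are placeholders), a typical invocation for your Hailo-8L would look something like this. You also need a calibration .npy file containing a batch of preprocessed sample images for the quantization step:

python compile_model.py \
    --device-type hailo8l \
    --model-path originalYV8.onnx \
    --calibration-npy-path hailo_calibration_data.npy \
    --end-node-names <your ONNX output node names>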

However, the YOLOv8 postprocessor is not part of it.
When I tried to compile with the postprocessor included, I got this error:

hailo_sdk_client.sdk_backend.sdk_backend_exceptions.AllocatorScriptParserException: Cannot infer bbox conv layers automatically. Please specify the bbox layer in the json configuration file.

This means the NMS needs a config JSON similar to the one at the link below, which we have not yet integrated into our compile process:

https://github.com/hailo-ai/hailo_model_zoo/blob/9f1bb27570757e6398a6bfe44545e5f02a26e017/hailo_model_zoo/cfg/postprocess_config/yolov8s_bbox_decoding_only_nms_config.json

At this point, I tested the model in our cloud inference and it works.


Here is a link to try the cloud inference.

At this point, the model provides 6 outputs: 3 bounding-box heads and 3 corresponding class-probability heads. Given these outputs, a YOLOv8 postprocessor is needed to produce the final detection results.
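
For reference, here is a rough numpy sketch of what a YOLOv8 postprocessor typically does with those 6 outputs: DFL box decoding per scale, sigmoid class scores, a confidence threshold, and NMS. The layout assumptions (three (H, W, 64) box tensors and three (H, W, num_classes) class tensors at strides 8/16/32, raw logits for the class outputs) may not match your exported model exactly, so treat this as a starting point rather than a drop-in solution.

import numpy as np

REG_MAX = 16           # DFL bins per box side (YOLOv8 default)
STRIDES = (8, 16, 32)  # one stride per output scale

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def decode_scale(box_out, cls_out, stride):
    """Decode one scale: DFL box regression + sigmoid class scores."""
    h, w, _ = box_out.shape
    # DFL decoding: 4 sides x REG_MAX bins -> expected distance per side (in cells).
    dist = softmax(box_out.reshape(h, w, 4, REG_MAX), axis=-1) @ np.arange(REG_MAX)
    # Anchor points (cell centers) in input-image pixels.
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    cx, cy = (xs + 0.5) * stride, (ys + 0.5) * stride
    l, t, r, b = [dist[..., i] * stride for i in range(4)]
    boxes = np.stack([cx - l, cy - t, cx + r, cy + b], axis=-1).reshape(-1, 4)
    scores = (1.0 / (1.0 + np.exp(-cls_out))).reshape(-1, cls_out.shape[-1])
    return boxes, scores

def nms(boxes, scores, iou_th=0.45):
    """Greedy IoU-based non-maximum suppression; returns indices of kept boxes."""
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        order = order[1:][iou <= iou_th]
    return keep

def postprocess(box_outs, cls_outs, score_th=0.25, iou_th=0.45):
    """box_outs / cls_outs: lists of the three per-scale tensors, ordered by stride."""
    all_boxes, all_scores = [], []
    for box_out, cls_out, stride in zip(box_outs, cls_outs, STRIDES):
        b, s = decode_scale(box_out, cls_out, stride)
        all_boxes.append(b)
        all_scores.append(s)
    boxes, scores = np.concatenate(all_boxes), np.concatenate(all_scores)
    cls_ids, cls_scores = scores.argmax(axis=1), scores.max(axis=1)
    mask = cls_scores >= score_th
    boxes, cls_ids, cls_scores = boxes[mask], cls_ids[mask], cls_scores[mask]
    keep = nms(boxes, cls_scores, iou_th)
    return boxes[keep], cls_ids[keep], cls_scores[keep]
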
I hope this will help.

Thank you so much for the help! I wrote a Python program that uses the .hef model along with the .json files to run detection and tracking on an IP camera video feed:

import cv2
import time
import degirum as dg
import degirum_tools
from copy import deepcopy

# --- Configuration ---
THRESHOLD = 1  # Minimum confidence threshold (adjust as needed)

# --- Model and file locations ---
model_name = "yolov8s_drone--640x640_quant_hailort_hailo8l_1"
zoo_url = "/home/hasan/yolov8s_drone--640x640_quant_hailort_hailo8l_1"
inference_host_address = "@local"
token = ""  # Not required for local inference

# --- Load the custom drone model ---
model = dg.load_model(
    model_name=model_name,
    inference_host_address=inference_host_address,
    zoo_url=zoo_url,
    token=token
)

# --- Define the IP camera URL ---
video_source = "http://172.20.10.4:4747/video"

# --- Function to test stream connectivity ---
def test_stream(url):
    cap = cv2.VideoCapture(url)
    if not cap.isOpened():
        cap.release()
        return False
    ret, _ = cap.read()
    cap.release()
    return ret

# --- Helper function for filtering detections ---
def filter_detections(detections, threshold):
    filtered = []
    for detection in detections:
        # Uncomment the following line to debug the keys of each detection:
        # print("Detection keys:", detection.keys())
        # Try "score" first; if not found, try "confidence"
        score = detection.get("score", detection.get("confidence", 0))
        if score >= threshold:
            filtered.append(detection)
    return filtered

# Initial connectivity check
if not test_stream(video_source):
    print(f"Error: Unable to connect to IP camera at {video_source}")
    exit(1)

# --- Main loop with reconnection and threshold filtering ---
while True:
    try:
        with degirum_tools.Display("Drone Tracking") as output_display:
            print("Connected to IP camera. Starting stream...")
            # The predict_stream function reads frames from the IP camera and returns inference result objects.
            for inference_result in degirum_tools.predict_stream(model, video_source):
                # Use deepcopy to avoid modifying a read-only property.
                filtered_inference_result = deepcopy(inference_result)
                if hasattr(inference_result, "results"):
                    # Apply threshold filtering using the helper function.
                    filtered_inference_result.__dict__['results'] = filter_detections(inference_result.results, THRESHOLD)
                output_display.show(filtered_inference_result)
                # Allow the user to exit by pressing 'q' or 'x'
                if cv2.waitKey(1) & 0xFF in [ord('q'), ord('x')]:
                    raise KeyboardInterrupt
    except KeyboardInterrupt:
        print("Exiting on user request.")
        break
    except Exception as e:
        print("Stream error encountered:", e)
        print("Attempting to reconnect in 2 seconds...")
        time.sleep(2)
        continue

The thing is, the .hef model runs without errors, but when I run it, it displays random drone detection boxes. Does this have something to do with the .hef model, or is it because I used a small model like yolov8s?

Hi @kryptonicrevolution
Are you filtering out all detections with score <1?

Well, actually I also tried thresholds of 0.5, 0.6, and up to 1, but it’s still the same issue; it’s as if the threshold value doesn’t do anything. I thought that implementing a threshold variable might help reduce false positives, but for some reason it still did not work.

Maybe it has something to do with the .hef file itself?

Hi @kryptonicrevolution
Before testing on video streams, did you test it on a few images to make sure the model is integrated correctly? It would probably be good to test a few images side by side with the PyTorch model and the .hef file. Also, if you have a validation dataset, you can evaluate the mAP to check that the accuracy of the compiled model has not degraded too much from the original performance.
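
If it helps, here is a minimal side-by-side check on a single image. It assumes the .pt model can be loaded with the Ultralytics package and reuses the model name and zoo_url from your own script; the image path is a placeholder:

import degirum as dg
from ultralytics import YOLO

image_path = "test_drone.jpg"  # placeholder: any image from your validation set

# Reference predictions from the original PyTorch checkpoint
pt_model = YOLO("originalYV8.pt")
pt_result = pt_model(image_path)[0]
print("PyTorch detections:")
for box in pt_result.boxes:
    print(f"  class={int(box.cls)}, conf={float(box.conf):.2f}, xyxy={box.xyxy.tolist()}")

# Predictions from the compiled .hef through DeGirum
hef_model = dg.load_model(
    model_name="yolov8s_drone--640x640_quant_hailort_hailo8l_1",
    inference_host_address="@local",
    zoo_url="/home/hasan/yolov8s_drone--640x640_quant_hailort_hailo8l_1",
    token="",
)
hef_result = hef_model(image_path)
print("HEF detections:")
for det in hef_result.results:
    print(f"  label={det.get('label')}, score={det.get('score')}, bbox={det.get('bbox')}")

If the two sets of boxes roughly agree, the .hef and its postprocessing JSON are integrated correctly, and the problem is more likely in the video pipeline or the thresholding.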

I haven’t tried it on images yet, but I will shortly. As for the validation images, I do have a validation dataset, but how will I be able to evaluate the mAP and see the accuracy of the compiled model? And if it has degraded, how would I be able to improve it?

Hi @kryptonicrevolution
We will soon publish a user guide on evaluating the model so that you can test its accuracy. In our experience, object detection models do not lose much accuracy when compiled with the proper settings.

I see, thanks for the help!

Also, would the detection accuracy increase if the .pt model were originally a yolov8m instead of a yolov8s? Since you have compiled the yolov8s version for this drone detection project, is it possible to edit the .json or the code that I provided to improve detection in any way?

Hi @kryptonicrevolution
When you trained your yolov8s model, what were your final mAP50 and mAP50:95? This should give some idea of how good the model is supposed to be.
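
If you still have the .pt checkpoint and the dataset YAML used for training, you can get these numbers locally with the Ultralytics package (a rough sketch; "originalYV8.pt" and "data.yaml" are placeholders for your own files):

from ultralytics import YOLO

model = YOLO("originalYV8.pt")
metrics = model.val(data="data.yaml")  # runs validation on the dataset's val split
print("mAP50:", metrics.box.map50)     # mean average precision at IoU 0.5
print("mAP50:95:", metrics.box.map)    # averaged over IoU thresholds 0.5 to 0.95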

I actually don’t know; I just trained it with Ultralytics HUB and a dataset from Roboflow Universe. I don’t even know what these terms mean :sweat_smile:

Here’s the link to the Roboflow Universe dataset:

I am extremely new to machine learning and object detection. Sadly, I do not have much time to learn most of this stuff, given the extremely tight schedule and the fact that I need to finish a working prototype within the next few days :confused: