Hi,
I am facing an issue when training YOLO models with imgsz values above 640px. The detections themselves are correct, but the bounding boxes are shifted.
I came across this post, where someone faced the same problem: Trouble running custom yolo.hef models with imgz = 1088 - #2 by omria
Honestly, I don’t understand what is happening here, what @omria was trying to explain, or how I would apply it to the dataset I use (WIDERFACE), which contains images with wildly different dimensions.
What I understand so far:
According to Ultralytics, using higher imgsz values shouldn’t be a problem. During training, the imgsz parameter just fixes one side of the model’s input size. For example, imgsz=1024 simply means that one side is fixed at 1024 (e.g. 1024 x XXXX), which would mean the aspect ratio of the training data is preserved and that letterboxing and padding are applied automatically.
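Just so we are talking about the same thing, this is roughly what I assume "letterboxing and padding" means during preprocessing. It is a minimal sketch, not Ultralytics' actual code; the 1024 target size and the 114 pad value are assumptions based on the defaults I have seen:

```python
import cv2
import numpy as np

def letterbox(img, target=1024, pad_value=114):
    """Resize keeping the aspect ratio, then pad the result onto a target x target canvas."""
    h, w = img.shape[:2]
    scale = target / max(h, w)                        # fit the longer side to the target
    new_w, new_h = int(round(w * scale)), int(round(h * scale))
    resized = cv2.resize(img, (new_w, new_h))
    pad_left = (target - new_w) // 2                  # centre horizontally
    pad_top = (target - new_h) // 2                   # centre vertically
    canvas = np.full((target, target, 3), pad_value, dtype=img.dtype)
    canvas[pad_top:pad_top + new_h, pad_left:pad_left + new_w] = resized
    return canvas, scale, (pad_left, pad_top)         # needed later to map boxes back
```

If the scale and the (pad_left, pad_top) offset are not undone on the detections afterwards, I would expect exactly the kind of shift I am seeing, which is part of why I am asking.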
This is my standard training command:
!yolo task=detect mode=train model=yolov8n.pt data=/root/datasets/datasets/UPSCALED_WIDERFACE_YOLO/yolodataset.yaml epochs=15 batch=4 imgsz=1024 plots=True device=0,1
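If it matters, the same run through the Python API would, as far as I know, look like this (nothing special beyond the CLI call above):

```python
from ultralytics import YOLO

# Same settings as the CLI command: yolov8n base model, imgsz=1024, two GPUs
model = YOLO("yolov8n.pt")
model.train(
    data="/root/datasets/datasets/UPSCALED_WIDERFACE_YOLO/yolodataset.yaml",
    epochs=15,
    batch=4,
    imgsz=1024,
    plots=True,
    device=[0, 1],
)
```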
My questions are:
- Why do these issues (shifted bounding boxes) only become apparent when using imgsz values above 640px?
- Is it possible that this discrepancy stems from a toolchain configuration that I could change?
- From my understanding, there can’t be a one-size-fits-all ratio or padding value for varied training data like WIDERFACE.
- Why would this need to be accounted for in the compilation or inference process in the first place, given a model trained on such varied data?
- And why does inference work correctly for a model trained with imgsz=640 on the same wildly varied training data, where padding and letterboxing also seem to be applied during training (standard Ultralytics configuration)?
- Has anyone encountered and resolved this issue by modifying the compile-time configuration or the post-processing pipeline? (See the sketch below for what I mean by post-processing.)
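To be explicit about the last point: by "post-processing" I mean mapping the boxes from the letterboxed model input back to the original frame. Roughly like this, a sketch with names I made up for illustration (undo_letterbox, pad_left, pad_top), not anything taken from the Hailo or Ultralytics code:

```python
def undo_letterbox(boxes_xyxy, scale, pad_left, pad_top):
    """Map [x1, y1, x2, y2] boxes from letterboxed-input coordinates back to the original image."""
    restored = []
    for x1, y1, x2, y2 in boxes_xyxy:
        restored.append((
            (x1 - pad_left) / scale,   # remove horizontal padding, then undo the resize
            (y1 - pad_top) / scale,    # remove vertical padding, then undo the resize
            (x2 - pad_left) / scale,
            (y2 - pad_top) / scale,
        ))
    return restored
```

My naive guess is that if the compiled model's post-processing assumes a different scale or padding than what was actually applied, the boxes end up offset by exactly that difference; but that is precisely the part I don't understand.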
I don’t get it…
The good news is that I could live with the status quo, because resizing the higher-resolution camera feed down to the model’s expected input dimensions is sufficient for the moment.
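Concretely, the workaround is nothing more than a plain resize of each frame before inference (a sketch; the camera source and the 1024x1024 input size are specific to my setup):

```python
import cv2

cap = cv2.VideoCapture(0)            # placeholder camera source
input_size = (1024, 1024)            # the model's expected (width, height) in my case

ok, frame = cap.read()
if ok:
    model_input = cv2.resize(frame, input_size)   # plain resize, no letterboxing
    # inference runs on model_input; boxes map back to the frame with a simple scale factor
```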
But still: I would appreciate clarification on what is happening here. Any material is welcome.
Thank you!