Catastrophic degradation after optimizing YOLOv5 with VisDrone data

I've tested this using both hailomz and the DFC and experienced the same issue: complete degradation of the fine-tuned yolov5s.onnx model. The target hardware is the Hailo-8L.

I more or less followed this guide on retraining YOLOv5, with the slight variation that I only used 2 classes, not 3.
My YAML looked like the one below; I moved background to the end of the names list (unlike in the guide) because it seemed to be causing conflicts during training.

train: /workspace/local/datasets/coco-2017/train/images # train images (relative to 'path')  1500 images
val: /workspace/local/datasets/coco-2017/val/images # val images (relative to 'path')  1500 images
# number of classes
nc: 3
# class names
names: ['person', 'car', 'background']

There were some extra steps not mentioned in the guide, such as editing the .alls model script and the NMS config, but overall the COCO 'fine-tuned' model worked: best.onnx was getting around 45-50 mAP@50 on the 2-class COCO-subset val data.
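
For reference, the DFC side of those extra steps boiled down to roughly this; a minimal sketch rather than my exact script, with the net name, ONNX end-node names, NMS config path and calibration file all being placeholders:

import numpy as np
from hailo_sdk_client import ClientRunner

runner = ClientRunner(hw_arch="hailo8l")

# parse the retrained ONNX; end nodes are the three output convs (placeholder names)
runner.translate_onnx_model(
    "best.onnx",
    "yolov5s_finetuned",
    start_node_names=["images"],
    end_node_names=["/model.24/m.0/Conv", "/model.24/m.1/Conv", "/model.24/m.2/Conv"],
    net_input_shapes={"images": [1, 3, 640, 640]},
)

# model script: on-chip normalization plus NMS, with the config JSON edited for the 3-class head
alls = (
    'normalization1 = normalization([0.0, 0.0, 0.0], [255.0, 255.0, 255.0])\n'
    'nms_postprocess("yolov5s_nms_config.json", meta_arch=yolov5, engine=cpu)\n'
)
runner.load_model_script(alls)

calib = np.load("calib_set.npy")  # >= 1024 preprocessed images, NHWC, raw 0-255
runner.optimize(calib)
runner.save_har("yolov5s_finetuned_quantized.har")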

Once INT8-optimized, I plotted the boxes on some images from VisDrone2019 (since that is the end goal) and I got some valid detections (though I noticed that if I ignored the confidence score I'd get huge boxes spanning the entire image, but only at <<0.05 conf).

Since it wasn't fine-tuned on that dataset, I wasn't expecting great results, but it seems to work at least - so I simply adapted the dataset YAML to the VisDrone subset, then fine-tuned on VisDrone instead of the COCO subset:

/workspace/yolov5# python train.py --img 640 --batch 16 --epochs 50 --data ../local/data_files/VisDrone.yaml --weights yolov5s.pt --cfg models/yolov5s.yaml

This is the output - complete degradation, it seems.

I'm quite puzzled as to how I can get some reasonable results with the quantized COCO-retrained model, yet the VisDrone-retrained model is completely lobotomized even at FP optimization.
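
For concreteness, the FP-optimized numbers come from the emulator run on the same preprocessed batch, roughly like this; a sketch following the DFC inference-tutorial pattern, with placeholder file names:

import numpy as np
from hailo_sdk_client import ClientRunner, InferenceContext

# HAR produced by the optimize step above (placeholder name); the HAR carries the hw_arch
runner = ClientRunner(har="yolov5s_finetuned_quantized.har")

# small batch of VisDrone images, letterboxed to 640x640, NHWC, raw 0-255
# (normalization is in the model script, so it is applied inside the model)
images = np.load("visdrone_val_batch.npy")

# full-precision (optimized but not quantized) emulation
with runner.infer_context(InferenceContext.SDK_FP_OPTIMIZED) as ctx:
    fp_outputs = runner.infer(ctx, images)

# quantized emulation, for comparison against the FP run and against best.onnx
with runner.infer_context(InferenceContext.SDK_QUANTIZED) as ctx:
    q_outputs = runner.infer(ctx, images)

# if fp_outputs are already garbage relative to the ONNX model, the problem is
# upstream of int8 quantization (parsing, model script, pre/post-processing)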

Does anyone here have experience with VisDrone, or more generally with small-object detection and YOLOv5 (or any other model), where you saw this kind of degradation that was absent when the same model was trained on a different dataset?

Any suggestions on next steps?

I explored pretty much all of the options suggested in the docs as follows:

    1. Make sure there are at least 1024 images in the calibration dataset and that the machine has a GPU. :white_check_mark:
    2. Validate the accuracy of the model using the SDK_FP_OPTIMIZED emulator to ensure pre- and post-processing are applied correctly. :white_check_mark: - FP-optimized is also degraded in much the same way
    3. Usage of BatchNormalization :white_check_mark: - nothing in the docs on this, but I can see BatchNorm nodes in the model architecture.
    4. Run the layer noise analysis tool to identify the source of degradation. :white_check_mark: - my entire model came in under 10 dB; I still tried to 'fix' the worst-performing layers and ran again, which made the entire model even worse than before.
    5. If you have used compression_level, lower its value :cross_mark: - haven't tried this yet
    6. Configure a higher optimization_level in the model script :white_check_mark: - level 4 is just as bad (see the model-script sketch after this list)
    7. Configure 16-bit output. :cross_mark: - seems pointless given the FP-optimized results are as bad
    8. Configure 16-bit on specific layers that are sensitive to quantization. :cross_mark: - as above
    9. Try to run with activation clipping :white_check_mark: - again, this does nothing to improve it
    10. Use more data and a longer optimization process in Finetune :white_check_mark: - I'm using post_quantization_optimization(finetune, policy=enabled, learning_rate=0.0001, epochs=8, dataset_size=4000)
    11. Use a different loss type in Finetune :cross_mark: - given how bad performance is, this is not high on my list of attempted fixes unless someone has experience with good returns here
    12. Use quantization-aware training (QAT). For more information see the QAT Tutorial :cross_mark: - at this point, thinking this is my only realistic option?
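
The model-script side of those attempts looks roughly like this; a sketch that picks up `runner` and `calib` from the parsing sketch further up, with the clipping layer names and percentile values being illustrative placeholders rather than what I settled on:

# model script combining options 5/6, 9 and 10 from the list above
alls = "\n".join([
    "normalization1 = normalization([0.0, 0.0, 0.0], [255.0, 255.0, 255.0])",
    # options 5/6: compression level down, optimization level up
    "model_optimization_flavor(optimization_level=4, compression_level=0)",
    # option 9: activation clipping (placeholder layer names, percentile mode)
    "pre_quantization_optimization(activation_clipping, layers=[conv41, conv42], mode=percentile, clipping_values=[0.01, 99.99])",
    # option 10: longer finetune on more data
    "post_quantization_optimization(finetune, policy=enabled, learning_rate=0.0001, epochs=8, dataset_size=4000)",
])
runner.load_model_script(alls)
runner.optimize(calib)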

Since posting this, I came across this comment regarding anchors, which reminded me that when running python train.py ... I got the following warning:

Analyzing anchors... Best Possible Recall (BPR) = 0.8737. Attempting to generate improved anchors, please wait..
WARNING: Extremely small objects found. 19703 of 106396 labels are < 3 pixels in width or height.
Running kmeans for 9 anchors on 106164 points...
thr=0.25: 0.9997 best possible recall, 6.98 anchors past thr
n=9, img_size=640, metric_all=0.439/0.794-mean/best, past_thr=0.517-mean: 3,5,  4,9,  6,8,  6,13,  10,11,  8,19,  17,18,  13,28,  26,40
Evolving anchors with Genetic Algorithm: fitness = 0.8067: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:02<00:00, 384.28it/s]
thr=0.25: 0.9997 best possible recall, 7.43 anchors past thr
n=9, img_size=640, metric_all=0.467/0.806-mean/best, past_thr=0.529-mean: 2,4,  3,7,  4,7,  5,11,  7,10,  6,16,  10,14,  11,23,  20,29
New anchors saved to model. Update model *.yaml to use these anchors in the future.

So the 40-50 mAP@50 I was seeing with these retrained yolov5s models was obtained using these anchors. It is no surprise I get such terrible performance when converting to a HAR model, since it is using the anchors suited to the COCO dataset. I'm currently attempting to update the anchors, but I'm not 100% sure how - e.g. should I update the base file, or add an anchors: key to the yolov5s.yaml?

Hey @natsayin_nahin,

You got it right about the anchors. Have you managed to generate them for your dataset?
You can use this script from the YOLOv5 repo to calculate the best anchors for your data.
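
Assuming a reasonably recent YOLOv5 checkout, the relevant helper is kmean_anchors in utils/autoanchor.py; something along these lines, run from the repo root and pointed at your VisDrone data YAML:

# run from the root of the yolov5 repo (/workspace/yolov5 in your case)
from utils.autoanchor import kmean_anchors

# same settings as train.py's AutoAnchor pass: 9 anchors, 640 px, thr 4.0, 1000 GA generations
anchors = kmean_anchors("../local/data_files/VisDrone.yaml", n=9, img_size=640, thr=4.0, gen=1000, verbose=True)
print(anchors.round())  # 9 (w, h) pairs, smallest to largest

Note that your train.py log above already printed a set of evolved anchors (and saved them into the trained weights), so you can also just reuse those values.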

To your question:
Yes, add the anchors: key with your values directly to your model’s YAML file (like yolov5.yaml). You don’t need to update the base file; just put the anchors into the specific YAML you’re using for your training or conversion.
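
For example, something like this, using the anchors from your AutoAnchor log above (a sketch: cfg_path and the output filename are just examples, and dumping the YAML this way drops the original comments):

import yaml

cfg_path = "models/yolov5s.yaml"  # the file you pass to train.py via --cfg
with open(cfg_path) as f:
    cfg = yaml.safe_load(f)

# anchors from your AutoAnchor output, smallest to largest: P3/8, P4/16, P5/32
cfg["anchors"] = [
    [2, 4, 3, 7, 4, 7],
    [5, 11, 7, 10, 6, 16],
    [10, 14, 11, 23, 20, 29],
]

with open("models/yolov5s_visdrone.yaml", "w") as f:
    yaml.safe_dump(cfg, f, sort_keys=False)

Then train with --cfg models/yolov5s_visdrone.yaml and export/convert from that run as before.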

Let me know if you get stuck or want to double-check your anchors setup!