I trained a YOLOv8s model on two custom classes and converted it to HEF, but the detection performance of the resulting HEF is very poor. I have included the full conversion process below and would appreciate any help. Thank you very much.
My compilation command is as follows:
hailomz compile --ckpt best.onnx --calib-path phone_clothes_val/ --yaml ./hailo_model_zoo/hailo_model_zoo/cfg/networks/yolov8s.yaml --classes 2 --hw-arch hailo8
The content of the yolov8s.alls file is as follows:
normalization1 = normalization([127.5, 127.5, 127.5], [128.0, 128.0, 128.0])
change_output_activation(conv42, sigmoid)
change_output_activation(conv53, sigmoid)
change_output_activation(conv63, sigmoid)
model_optimization_flavor(optimization_level=2)
nms_postprocess("../../postprocess_config/yolov8s_nms_config.json", meta_arch=yolov8, engine=cpu)
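One thing that may be worth checking in the script above is the normalization line. The sketch below only works through the arithmetic, under two assumptions that are not confirmed anywhere in this post: that the `normalization` command applies `(x - mean) / std` per channel, and that the model was trained with the usual Ultralytics preprocessing, which scales pixels to the range 0 to 1. The helper names are hypothetical.

```python
# Sketch: value range produced by a per-channel (x - mean) / std normalization
# for 8-bit pixel inputs. Assumes this is what the .alls normalization does.

def normalize(pixel, mean, std):
    """Apply (pixel - mean) / std to a single channel value."""
    return (pixel - mean) / std


def output_range(mean, std):
    """Return the normalized values of the darkest (0) and brightest (255) pixel."""
    return normalize(0.0, mean, std), normalize(255.0, mean, std)


# Values taken from the .alls script in this post.
alls_range = output_range(127.5, 128.0)

# Assumed training-time scaling (divide by 255), for comparison only.
unit_range = output_range(0.0, 255.0)

print("range with mean=127.5, std=128.0:", alls_range)
print("range with mean=0.0,   std=255.0:", unit_range)
```

If the two ranges differ, the network would be seeing inputs on a different scale than during training, which could plausibly hurt accuracy. This is only a hypothesis to rule out, not a confirmed cause.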
The content of the yolov8s.yaml file is as follows:
base:
- base/yolov8.yaml
postprocessing:
device_pre_post_layers:
nms: true
hpp: true
network:
network_name: yolov8s
paths:
network_path:
- /home/bbibm/Project/03_mading/ultralytics-main/runs/detect/phone_clothes/weights/best.onnx
alls_script: yolov8s.alls
url: https://hailo-model-zoo.s3.eu-west-2.amazonaws.com/ObjectDetection/Detection-COCO/yolo/yolov8s/2023-02-02/yolov8s.zip
info:
task: object detection
input_shape: 640x640x3
output_shape: 2x5x100
operations: 28.6G
parameters: 11.2M
framework: pytorch
training_data: coco train2017
validation_data: coco val2017
eval_metric: mAP
full_precision_result: 44.58
source: https://github.com/ultralytics/ultralytics
license_url: https://github.com/ultralytics/ultralytics/blob/main/LICENSE
license_name: AGPL-3.0
parser:
nodes:
- null
-
- /model.22/cv2.0/cv2.0.2/Conv
- /model.22/cv3.0/cv3.0.2/Conv
- /model.22/cv2.1/cv2.1.2/Conv
- /model.22/cv3.1/cv3.1.2/Conv
- /model.22/cv2.2/cv2.2.2/Conv
- /model.22/cv3.2/cv3.2.2/Conv
- /home/bbibm/Project/03_mading/ultralytics-main/runs/detect/phone_clothes/weights/best.onnx
The content of the yolov8s_nms_config.json file is as follows:
{
    "nms_scores_th": 0.2,
    "nms_iou_th": 0.7,
    "image_dims": [
        640,
        640
    ],
    "max_proposals_per_class": 100,
    "classes": 80,
    "regression_length": 16,
    "background_removal": false,
    "background_removal_index": 0,
    "bbox_decoders": [
        {
            "name": "bbox_decoder41",
            "stride": 8,
            "reg_layer": "conv41",
            "cls_layer": "conv42"
        },
        {
            "name": "bbox_decoder52",
            "stride": 16,
            "reg_layer": "conv52",
            "cls_layer": "conv53"
        },
        {
            "name": "bbox_decoder62",
            "stride": 32,
            "reg_layer": "conv62",
            "cls_layer": "conv63"
        }
    ]
}
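The config above lists 80 classes, while the compile command passes `--classes 2`. Whether hailomz overrides the JSON value from the command line is not shown in this post, so the following is only a minimal sanity check that flags the difference. The function name `check_nms_classes` is hypothetical, and the sample string is a trimmed copy of the config.

```python
import json


def check_nms_classes(config_text, expected_classes):
    """Return (matches, found) for the "classes" field of an NMS config."""
    config = json.loads(config_text)
    found = config.get("classes")
    return found == expected_classes, found


# Trimmed copy of the NMS config from this post (only a few fields kept).
sample_config = """
{
    "nms_scores_th": 0.2,
    "nms_iou_th": 0.7,
    "classes": 80,
    "regression_length": 16
}
"""

# 2 is the value passed to --classes in the compile command.
matches, found = check_nms_classes(sample_config, expected_classes=2)
if not matches:
    print(f"NMS config declares {found} classes, but the model was trained on 2")
```

To run it against the real file, read `yolov8s_nms_config.json` into a string and pass that instead of `sample_config`.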
The output result is:
Start run for network yolov8s ...
Initializing the hailo8 runner...
[info] Translation started on ONNX model yolov8s
[info] Restored ONNX model yolov8s (completion time: 00:00:00.29)
[info] Extracted ONNXRuntime meta-data for Hailo model (completion time: 00:00:01.51)
[info] NMS structure of yolov8 (or equivalent architecture) was detected.
[info] In order to use HailoRT post-processing capabilities, these end node names should be used: /model.22/cv2.0/cv2.0.2/Conv /model.22/cv3.0/cv3.0.2/Conv /model.22/cv2.1/cv2.1.2/Conv /model.22/cv3.1/cv3.1.2/Conv /model.22/cv2.2/cv2.2.2/Conv /model.22/cv3.2/cv3.2.2/Conv.
[info] Start nodes mapped from original model: 'images': 'yolov8s/input_layer1'.
[info] End nodes mapped from original model: '/model.22/cv2.0/cv2.0.2/Conv', '/model.22/cv3.0/cv3.0.2/Conv', '/model.22/cv2.1/cv2.1.2/Conv', '/model.22/cv3.1/cv3.1.2/Conv', '/model.22/cv2.2/cv2.2.2/Conv', '/model.22/cv3.2/cv3.2.2/Conv'.
[info] Translation completed on ONNX model yolov8s (completion time: 00:00:02.85)
[info] Appending model script commands to yolov8s from string
[info] Added nms postprocess command to model script.
[info] Saved HAR to: /home/bbibm/Project/03_mading/hailo_v/yolov8s.har
Preparing calibration data…
[info] Loading model script commands to yolov8s from /home/bbibm/Project/03_mading/hailo_v/hailo_model_zoo/hailo_model_zoo/cfg/alls/generic/yolov8s.alls
[info] Loading model script commands to yolov8s from string
[info] Starting Model Optimization
[info] Model received quantization params from the hn
[info] MatmulDecompose skipped
[info] Starting Mixed Precision
[info] Model Optimization Algorithm Mixed Precision is done (completion time is 00:00:00.75)
[info] LayerNorm Decomposition skipped
[info] Starting Statistics Collector
[info] Using dataset with 64 entries for calibration
Calibration: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 64/64 [00:39<00:00, 1.64entries/s]
[info] Model Optimization Algorithm Statistics Collector is done (completion time is 00:00:42.67)
[info] Starting Fix zp_comp Encoding
[info] Model Optimization Algorithm Fix zp_comp Encoding is done (completion time is 00:00:00.00)
[info] Matmul Equalization skipped
[info] Finetune encoding skipped
[info] Bias Correction skipped
[info] Adaround skipped
[info] Starting Quantization-Aware Fine-Tuning
[warning] Dataset is larger than expected size. Increasing the algorithm dataset size might improve the results
[info] Using dataset with 1024 entries for finetune
Epoch 1/4
128/128 [==============================] - 327s 1s/step - total_distill_loss: 4.4894 - _distill_loss_yolov8s/conv41: 0.2064 - _distill_loss_yolov8s/conv42: 0.6920 - _distill_loss_yolov8s/conv52: 0.2222 - _distill_loss_yolov8s/conv53: 0.6396 - _distill_loss_yolov8s/conv62: 0.2488 - _distill_loss_yolov8s/conv63: 1.3986 - _distill_loss_yolov8s/conv46: 0.3972 - _distill_loss_yolov8s/conv35: 0.2671 - _distill_loss_yolov8s/conv57: 0.4176
Epoch 2/4
128/128 [==============================] - 161s 1s/step - total_distill_loss: 7.0507 - _distill_loss_yolov8s/conv41: 0.4138 - _distill_loss_yolov8s/conv42: 0.9390 - _distill_loss_yolov8s/conv52: 0.4335 - _distill_loss_yolov8s/conv53: 0.9305 - _distill_loss_yolov8s/conv62: 0.5666 - _distill_loss_yolov8s/conv63: 1.4972 - _distill_loss_yolov8s/conv46: 0.7388 - _distill_loss_yolov8s/conv35: 0.5749 - _distill_loss_yolov8s/conv57: 0.9563
Epoch 3/4
128/128 [==============================] - 161s 1s/step - total_distill_loss: 6.5922 - _distill_loss_yolov8s/conv41: 0.4091 - _distill_loss_yolov8s/conv42: 0.9982 - _distill_loss_yolov8s/conv52: 0.4148 - _distill_loss_yolov8s/conv53: 1.0001 - _distill_loss_yolov8s/conv62: 0.5205 - _distill_loss_yolov8s/conv63: 1.0000 - _distill_loss_yolov8s/conv46: 0.7590 - _distill_loss_yolov8s/conv35: 0.6152 - _distill_loss_yolov8s/conv57: 0.8753
Epoch 4/4
128/128 [==============================] - 161s 1s/step - total_distill_loss: 6.3592 - _distill_loss_yolov8s/conv41: 0.3663 - _distill_loss_yolov8s/conv42: 0.9990 - _distill_loss_yolov8s/conv52: 0.3890 - _distill_loss_yolov8s/conv53: 1.0001 - _distill_loss_yolov8s/conv62: 0.4769 - _distill_loss_yolov8s/conv63: 1.0000 - _distill_loss_yolov8s/conv46: 0.7297 - _distill_loss_yolov8s/conv35: 0.5925 - _distill_loss_yolov8s/conv57: 0.8055
[info] Model Optimization Algorithm Quantization-Aware Fine-Tuning is done (completion time is 00:13:34.82)
[info] Starting Layer Noise Analysis
Full Quant Analysis: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [03:08<00:00, 94.44s/iterations]
[info] Model Optimization Algorithm Layer Noise Analysis is done (completion time is 00:03:14.92)
[info] Model Optimization is done
[info] Saved HAR to: /home/bbibm/Project/03_mading/hailo_v/yolov8s.har
[info] Loading model script commands to yolov8s from /home/bbibm/Project/03_mading/hailo_v/hailo_model_zoo/hailo_model_zoo/cfg/alls/generic/yolov8s.alls
[info] To achieve optimal performance, set the compiler_optimization_level to "max" by adding performance_param(compiler_optimization_level=max) to the model script. Note that this may increase compilation time.
[info] Loading network parameters
[info] Starting Hailo allocation and compilation flow
[info] Adding an output layer after conv41
[info] Adding an output layer after conv42
[info] Adding an output layer after conv52
[info] Adding an output layer after conv53
[info] Adding an output layer after conv62
[info] Adding an output layer after conv63
[info] Using Single-context flow
[info] Resources optimization guidelines: Strategy → GREEDY Objective → MAX_FPS
[info] Resources optimization params: max_control_utilization=75%, max_compute_utilization=75%, max_compute_16bit_utilization=75%, max_memory_utilization (weights)=75%, max_input_aligner_utilization=75%, max_apu_utilization=75%
[info] Using Single-context flow
[info] Resources optimization guidelines: Strategy → GREEDY Objective → MAX_FPS
[info] Resources optimization params: max_control_utilization=75%, max_compute_utilization=75%, max_compute_16bit_utilization=75%, max_memory_utilization (weights)=75%, max_input_aligner_utilization=75%, max_apu_utilization=75%
Validating context_0 layer by layer (100%)
● Finished
[info] Solving the allocation (Mapping), time per context: 59m 59s
Context:0/0 Iteration 4: Trying parallel mapping…
cluster_0 cluster_1 cluster_2 cluster_3 cluster_4 cluster_5 cluster_6 cluster_7 prepost
worker0 * * * * * * * * V
worker1 V V V V V V V V V
worker2 * * * * * * * * V
worker3 V V V V V V V V V
00:08
Reverts on cluster mapping: 0
Reverts on inter-cluster connectivity: 0
Reverts on pre-mapping validation: 0
Reverts on split failed: 0
[info] Iterations: 4
Reverts on cluster mapping: 0
Reverts on inter-cluster connectivity: 0
Reverts on pre-mapping validation: 1
Reverts on split failed: 0
[info] +-----------+---------------------+---------------------+--------------------+
[info] | Cluster   | Control Utilization | Compute Utilization | Memory Utilization |
[info] +-----------+---------------------+---------------------+--------------------+
[info] | cluster_0 | 62.5%               | 75%                 | 53.1%              |
[info] | cluster_1 | 56.3%               | 50%                 | 23.4%              |
[info] | cluster_2 | 75%                 | 71.9%               | 61.7%              |
[info] | cluster_3 | 87.5%               | 57.8%               | 39.8%              |
[info] | cluster_4 | 75%                 | 57.8%               | 73.4%              |
[info] | cluster_5 | 62.5%               | 76.6%               | 51.6%              |
[info] | cluster_6 | 100%                | 73.4%               | 50%                |
[info] | cluster_7 | 81.3%               | 40.6%               | 47.7%              |
[info] +-----------+---------------------+---------------------+--------------------+
[info] | Total     | 75%                 | 62.9%               | 50.1%              |
[info] +-----------+---------------------+---------------------+--------------------+
[info] Successful Mapping (allocation time: 41s)
[info] Compiling context_0…
[info] Bandwidth of model inputs: 9.375 Mbps, outputs: 4.22974 Mbps (for a single frame)
[info] Bandwidth of DDR buffers: 0.0 Mbps (for a single frame)
[info] Bandwidth of inter context tensors: 0.0 Mbps (for a single frame)
[info] Building HEF…
[info] Successful Compilation (compilation time: 28s)
[info] Saved HAR to: /home/bbibm/Project/03_mading/hailo_v/yolov8s.har
HEF file written to yolov8s.hef
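In the Quantization-Aware Fine-Tuning part of the log above, `total_distill_loss` ends higher than it started. A rough sketch for pulling those values out of a saved log and checking the trend is below. The regex and helper names are my own, and the excerpt is shortened to the `total_distill_loss` field of each epoch line.

```python
import re

# Matches the total loss reported at the end of each fine-tuning epoch.
LOSS_RE = re.compile(r"total_distill_loss: ([0-9.]+)")


def total_losses(log_text):
    """Extract every total_distill_loss value, in the order it appears."""
    return [float(value) for value in LOSS_RE.findall(log_text)]


def is_decreasing(values):
    """True only if each value is strictly lower than the previous one."""
    return all(later < earlier for earlier, later in zip(values, values[1:]))


# Shortened copy of the four epoch lines from the log in this post.
log_excerpt = """
128/128 [====] - 327s 1s/step - total_distill_loss: 4.4894 - _distill_loss_yolov8s/conv41: 0.2064
128/128 [====] - 161s 1s/step - total_distill_loss: 7.0507 - _distill_loss_yolov8s/conv41: 0.4138
128/128 [====] - 161s 1s/step - total_distill_loss: 6.5922 - _distill_loss_yolov8s/conv41: 0.4091
128/128 [====] - 161s 1s/step - total_distill_loss: 6.3592 - _distill_loss_yolov8s/conv41: 0.3663
"""

losses = total_losses(log_excerpt)
print("total_distill_loss per epoch:", losses)
if not is_decreasing(losses):
    print("Loss did not decrease steadily; fine-tuning may not be converging.")
```

A loss that rises after the first epoch, together with the conv53 and conv63 losses settling at about 1.0, may indicate that the quantized model diverged from the full-precision one during fine-tuning. That is an interpretation of the log, not something the tool reports.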