I’ve trained a yolov5-p2 model on VisDrone and get decent results (mAP@50 = 0.6), but I’m unable to port it over to Hailo.
The ONNX output looks like so:
I used that to put together this code for parsing:
from hailo_sdk_client import ClientRunner

# chosen_hw_arch, onnx_path and onnx_model_name are set earlier in my script
runner = ClientRunner(hw_arch=chosen_hw_arch)
hn, npz = runner.translate_onnx_model(
    onnx_path,
    onnx_model_name,
    start_node_names=["/model.0/conv/Conv"],
    # the four detection-head convs (model.31/m.0 .. m.3) as end nodes
    end_node_names=[
        "/model.31/m.3/Conv",
        "/model.31/m.2/Conv",
        "/model.31/m.1/Conv",
        "/model.31/m.0/Conv",
    ],
    net_input_shapes={"/model.0/conv/Conv": [1, 3, 640, 640]},
)
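To sanity-check the parse I also list the output layers straight from the HN. A rough sketch of how I do that (I'm assuming get_hn() hands back the HN as a dict with a "layers" entry; on some DFC versions it returns a JSON string that needs json.loads() first):

import json

hn_dict = runner.get_hn()
if isinstance(hn_dict, str):  # some DFC versions return the HN as JSON text
    hn_dict = json.loads(hn_dict)

print("output layers in parsed model")
for name, layer in hn_dict["layers"].items():
    if layer["type"] == "output_layer":
        print((name, layer))
print("=" * 80)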
The parse succeeds, and these are the output layers I end up with:
output layers in parsed model
('yolov5np2_visdrone/output_layer1',
OrderedDict([('type', 'output_layer'),
('input', ['yolov5np2_visdrone/conv89']),
('output', []),
('input_shapes', [[-1, 160, 160, 18]]),
('output_shapes', [[-1, 160, 160, 18]]),
('original_names', ['out']),
('compilation_params', {}),
('quantization_params', {}),
('transposed', False),
('engine', 'nn_core'),
('io_type', 'standard')])),
('yolov5np2_visdrone/output_layer2',
OrderedDict([('type', 'output_layer'),
('input', ['yolov5np2_visdrone/conv99']),
('output', []),
('input_shapes', [[-1, 80, 80, 18]]),
('output_shapes', [[-1, 80, 80, 18]]),
('original_names', ['out']),
('compilation_params', {}),
('quantization_params', {}),
('transposed', False),
('engine', 'nn_core'),
('io_type', 'standard')])),
('yolov5np2_visdrone/output_layer3',
OrderedDict([('type', 'output_layer'),
('input', ['yolov5np2_visdrone/conv111']),
('output', []),
('input_shapes', [[-1, 40, 40, 18]]),
('output_shapes', [[-1, 40, 40, 18]]),
('original_names', ['out']),
('compilation_params', {}),
('quantization_params', {}),
('transposed', False),
('engine', 'nn_core'),
('io_type', 'standard')])),
('yolov5np2_visdrone/output_layer4',
OrderedDict([('type', 'output_layer'),
('input', ['yolov5np2_visdrone/conv121']),
('output', []),
('input_shapes', [[-1, 20, 20, 18]]),
('output_shapes', [[-1, 20, 20, 18]]),
('original_names', ['out']),
('compilation_params', {}),
('quantization_params', {}),
('transposed', False),
('engine', 'nn_core'),
('io_type', 'standard')]))])
================================================================================
My next step was creating an NMS config, which I did by first extracting the anchors from the ONNX model (a rough sketch of that extraction follows the config below). With 4 output nodes I assumed I’d need 4 decoders. This is how my config file looks:
nms config file
{
"nms_scores_th": 0.001,
"nms_iou_th": 0.6,
"image_dims": [
640,
640
],
"max_proposals_per_class": 100,
"background_removal": false,
"classes": 1,
"bbox_decoders": [
{
"name": "bbox_decoder89",
"w": [
2.01172,
2.68945,
4.41016
],
"h": [
3.97266,
5.95312,
5.50000
],
"stride": 8,
"encoded_layer": "conv89"
},
{
"name": "bbox_decoder99",
"w": [
3.53125,
5.31641,
5.08594
],
"h": [
8.80469,
8.64062,
12.39062
],
"stride": 16,
"encoded_layer": "conv99"
},
{
"name": "bbox_decoder111",
"w": [
8.03906,
6.73438,
9.27344
],
"h": [
10.12500,
15.82812,
17.54688
],
"stride": 32,
"encoded_layer": "conv111"
},
{
"name": "bbox_decoder121",
"w": [
11.79688,
15.61719,
34.09375
],
"h": [
21.31250,
29.06250,
36.06250
],
"stride": 64,
"encoded_layer": "conv121"
}
]
}
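For reference, the anchor extraction looks roughly like this (simplified sketch; it assumes the Ultralytics export keeps the anchors as an initializer with "anchor" in its name and that those values are in pixels, so they get divided by each branch's stride to give the w/h values used above):

import onnx
import numpy as np
from onnx import numpy_helper

model = onnx.load(onnx_path)
strides = [8, 16, 32, 64]  # the strides used in the config above

for init in model.graph.initializer:
    if "anchor" not in init.name.lower():
        continue
    anchors = numpy_helper.to_array(init).squeeze()     # assumed pixel-space w/h pairs
    per_branch = anchors.reshape(len(strides), -1, 2)   # (branch, anchor, wh)
    for stride, branch in zip(strides, per_branch):
        # divide by the branch stride to get the grid-unit anchors the NMS config expects
        print(stride, np.round(branch / stride, 5).tolist())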
Finally, the alls script:
alls script
alls = """
normalization1 = normalization([0.0, 0.0, 0.0], [255.0, 255.0, 255.0])
change_output_activation(sigmoid)
model_optimization_config(calibration, batch_size=8, calibset_size=64)
post_quantization_optimization(finetune, policy=enabled, learning_rate=0.00001, epochs=4, batch_size=8, dataset_size=1024)
nms_postprocess("/local/shared_with_docker/visdrone/postprocess_config/yolov5np2v6_nms_config_custom.json", yolov5, engine=cpu)
performance_param(compiler_optimization_level=max)
allocator_param(width_splitter_defuse=disabled)
# """
However, when I run inference, the total detection counts look roughly as expected, but the detections themselves are all wrong and I get mAP@50 = 0, as in this image:
Any ideas where I’m going wrong here?