Hello.
I am trying to convert a Whisper ONNX model to HAR format (the final target format is HEF).
I am using this hailomz command:
hailomz optimize --ckpt en.onnx --yaml onnx.yaml --hw-arch "hailo8l"
My yaml file:
---
name: "whisper_model"
network:
  network_name: "whisper"
  input_layers:
    - "mel"
  output_layers:
    - "n_layer_cross_k"
    - "n_layer_cross_v"
inputs:
  - name: "mel"
    shape: ["n_audio", 80, "T"]
    data_type: "float32"
outputs:
  - name: "n_layer_cross_k"
    shape: [12, "n_audio", "T", 768]
    data_type: "float32"
  - name: "n_layer_cross_v"
    shape: [12, "n_audio", "T", 768]
    data_type: "float32"
model:
  framework: "onnx"
  opset_version: 12
  path: "model.onnx"
  format: "onnx"
quantization:
  mode: "symmetric"
  target: "performance"
  calib_set:
    path: "./calib_data"
    type: "folder"
parser:
  type: "onnx"
  start_node_shapes:
    tensor_shape:
      mel: [1, 80, 3000]
  end_node_shapes:
    tensor_shapes:
      n_layer_cross_k: [12, 1, 3000, 768]
      n_layer_cross_v: [12, 1, 3000, 768]
  nodes:
    - name: "mel"
      inputs: ["mel"]
      outputs: ["onnx::Conv_2370"]
    - name: "n_layer_cross_k"
      inputs: ["onnx::Conv_2370"]
      outputs: ["n_layer_cross_k"]
    - name: "n_layer_cross_v"
      inputs: ["onnx::Conv_2371"]
      outputs: ["n_layer_cross_v"]
I read the inputs and outputs from the ONNX model:
Inputs:
Name: mel, Type: tensor_type {
  elem_type: 1
  shape {
    dim {
      dim_param: "n_audio"
    }
    dim {
      dim_value: 80
    }
    dim {
      dim_param: "T"
    }
  }
}
, Shape: [0, 80, 0]
Outputs:
Name: n_layer_cross_k, Type: tensor_type {
  elem_type: 1
  shape {
    dim {
      dim_value: 12
    }
    dim {
      dim_param: "n_audio"
    }
    dim {
      dim_param: "T"
    }
    dim {
      dim_value: 768
    }
  }
}
, Shape: [12, 0, 0, 768]
Name: n_layer_cross_v, Type: tensor_type {
  elem_type: 1
  shape {
    dim {
      dim_value: 12
    }
    dim {
      dim_param: "n_audio"
    }
    dim {
      dim_param: "T"
    }
    dim {
      dim_value: 768
    }
  }
}
, Shape: [12, 0, 0, 768]
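To make the relation between the symbolic dimensions in this dump and the fixed shapes in the yaml explicit, here is a minimal Python sketch. The helper name resolve_shape and the bindings n_audio = 1, T = 3000 are assumptions for illustration (the values are the ones used for mel in the yaml); it is not part of hailomz or onnx.

```python
def resolve_shape(shape, bindings):
    """Replace symbolic dimension names with concrete sizes.

    resolve_shape is a hypothetical helper, not a hailomz or onnx API.
    """
    resolved = []
    for dim in shape:
        if isinstance(dim, str):
            # Symbolic dimension such as "n_audio" or "T".
            if dim not in bindings:
                raise KeyError(f"no concrete value for symbolic dim {dim!r}")
            resolved.append(int(bindings[dim]))
        else:
            # Already a fixed size such as 80 or 768.
            resolved.append(int(dim))
    return resolved


# Assumed bindings: one audio clip, 3000 mel frames (as in the yaml).
bindings = {"n_audio": 1, "T": 3000}

mel_shape = resolve_shape(["n_audio", 80, "T"], bindings)
cross_k_shape = resolve_shape([12, "n_audio", "T", 768], bindings)
print(mel_shape)      # [1, 80, 3000]
print(cross_k_shape)  # [12, 1, 3000, 768]
```

This only substitutes names with numbers; it does not check that the model actually produces those output sizes.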
What is wrong with the yaml file?
I get this error:
raise Exception(f"Encountered error during parsing: {err}") from None
Exception: Encountered error during parsing: net_input_shapes must be a dictionary for multiple input networks
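The error text says the input shapes must be a dictionary when the network has more than one input. A rough Python sketch of that rule follows. validate_net_input_shapes is a hypothetical function that only mirrors the wording of the message, not Hailo's actual parser code, and the second input name in the usage is made up for illustration.

```python
def validate_net_input_shapes(start_node_names, net_input_shapes):
    """Return a {input_name: shape} dict or raise, mirroring the error text.

    Hypothetical illustration only; not the real Hailo parser logic.
    """
    if isinstance(net_input_shapes, dict):
        missing = [n for n in start_node_names if n not in net_input_shapes]
        if missing:
            raise ValueError(f"no shape given for inputs: {missing}")
        return {n: list(net_input_shapes[n]) for n in start_node_names}
    if len(start_node_names) > 1:
        # A bare shape is ambiguous once there is more than one input.
        raise ValueError(
            "net_input_shapes must be a dictionary for multiple input networks"
        )
    return {start_node_names[0]: list(net_input_shapes)}


# Single input: a bare shape or a dict both work in this sketch.
print(validate_net_input_shapes(["mel"], [1, 80, 3000]))
print(validate_net_input_shapes(["mel"], {"mel": [1, 80, 3000]}))

# Two inputs ("other_input" is a made-up name) with a bare shape fail.
try:
    validate_net_input_shapes(["mel", "other_input"], [1, 80, 3000])
except ValueError as err:
    print(err)
```

If the parser detects more than one start node in this model, that would be consistent with the message, but that is only a guess from the error text.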