Hi! For some custom-trained yolov8s models I encounter the following error during hailomz compilation:
```
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/keras/engine/training.py", line 1284, in train_function  *
    return step_function(self, iterator)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/keras/engine/training.py", line 1268, in step_function  **
    outputs = model.distribute_strategy.run(run_step, args=(data,))
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/keras/engine/training.py", line 1249, in run_step  **
    outputs = model.train_step(data)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/acceleras/model/distiller.py", line 122, in train_step
    distillation_loss, components_dict = self.distillation_loss_fn(self.teacher, self.student)
  File "/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/algorithms/finetune/deep_distill_loss.py", line 84, in compare
    if teacher.layers[layer_name].num_outputs == 1:
KeyError: 'yolov8s/ne_activation_conv53'
```
The output layers of yolov8s are 'conv42', 'conv53', and 'conv63', but apparently the student model sometimes receives a new output layer, 'ne_activation_conv53'.
After some investigation, I discovered that the prefix "ne_activation_" is added by the function _create_successors() in
/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/algorithms/neg_exponent_fixer/layer_splitter.py
which is ultimately called by fix_output() in
/local/workspace/hailo_virtualenv/lib/python3.10/site-packages/hailo_model_optimization/algorithms/neg_exponent_fixer/neg_exp_fixer.py
when certain conditions (related to negative_exponent in the Dataflow Compiler documentation) are satisfied. So I think what happens is: the student model gets an additional output layer while the teacher model does not, and this then breaks the loss computation.
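To make the failure mode concrete, here is a toy sketch (not the actual Hailo code; the `ToyModel` class and its `layers` dict are made up for illustration) of why looking up the student's layer names in the teacher raises the KeyError:

```python
class ToyModel:
    def __init__(self, layer_names):
        # Map layer name -> dummy layer object, mimicking teacher.layers[name]
        self.layers = {name: object() for name in layer_names}

# Teacher keeps the original output layers.
teacher = ToyModel(["conv42", "conv53", "conv63"])

# The negative-exponent fixer splits conv53 in the student only,
# prepending "ne_activation_" to the new output layer's name.
student = ToyModel(["conv42", "ne_activation_conv53", "conv63"])

# The distillation loss iterates over student layers and looks each
# one up in the teacher -- the renamed layer is missing there.
for layer_name in student.layers:
    try:
        teacher.layers[layer_name]
    except KeyError as e:
        print("KeyError:", e)  # prints: KeyError: 'ne_activation_conv53'
```

This mirrors the lookup in deep_distill_loss.py's compare(): any layer name that exists only in the student breaks the teacher-side lookup.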
I assume Hailo might want to fix this bug.
By the way, the error can be avoided by adding the following line to the alls script:

```
model_optimization_config(negative_exponent, layers=[conv42, conv53, conv63], split_threshold=1, rank=0)
```

This prevents the negative_exponent fix from adding new layers for the output layers conv42, conv53, and conv63, so the student and teacher keep matching layer names.