Argmax 16bit output

Luca_Gessi · April 27, 2025, 10:13pm

Hi!
I am using a Hailo 8 and I am having a problem forcing 16-bit output on an argmax layer.
I have a dense layer that produces a tensor with 5000 values, and argmax finds the maximum along that dimension.
Compilation works, but inference crashes with an error saying that 8-bit output is not enough.
I have tried to force 16-bit output on the argmax layer (using the layer name I read in the Hailo profiler), but nothing changed. I used a model script with a quantization param command.
The script finds the node and compiles fine, but the connection from argmax to the output node remains 8-bit, although the argmax layer itself seems to have changed from 8-bit to 16-bit.
Is it possible to use a 16-bit output here? If so, how?
Thank you

shashi · April 28, 2025, 7:22pm

Hi @Luca_Gessi
Welcome to the Hailo community.
Is the argmax layer towards the end of the model, or is it somewhere in the middle? If it is towards the end, you can try specifying the preceding node as input and running argmax on the CPU, as it is not a very expensive op.
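
To illustrate the CPU alternative, here is a minimal sketch of a host-side argmax over the 5000-value tensor. It assumes the dense layer's output is already available on the host as a NumPy array; `host_argmax` and `NUM_CLASSES` are hypothetical names for illustration, not part of any Hailo API.

```python
import numpy as np

NUM_CLASSES = 5000  # size of the dense layer output mentioned in the first post

def host_argmax(scores: np.ndarray) -> np.ndarray:
    """Return the index of the largest value along the last axis as uint16."""
    idx = np.argmax(scores, axis=-1)
    # Indices 0..4999 do not fit in uint8 (max 255) but do fit in uint16 (max 65535).
    return idx.astype(np.uint16)

# Stand-in for a batch of two raw 8-bit output tensors read back from the device.
scores = np.zeros((2, NUM_CLASSES), dtype=np.uint8)
scores[0, 4999] = 200
scores[1, 300] = 17

indices = host_argmax(scores)
print(indices, indices.dtype)
```

If the raw output is quantized with a positive scale, dequantization preserves ordering, so the argmax can be taken directly on the quantized values without converting to float first.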

Luca_Gessi · April 29, 2025, 4:29am

Hi Shashi.
Thank you for your response.
Yes, it’s true that it could be done on the CPU, but the CPU in question is an ARM one, which I would like to keep as free as possible for other tasks. We have to achieve a very high FPS, so I want to do as much as I can on the Hailo 8.
The argmax is the last layer of the network.
Is this a limitation of Hailo 8? It is really strange to have only 256 possible output values.
Do you need any other information?
Can you confirm that argmax can work with 16-bit output data?
Thank you

shashi · April 29, 2025, 4:48am

Hi @Luca_Gessi
I think someone from the Hailo team can answer whether argmax can work with 16-bit output data. I was just suggesting this as an alternative, but it looks like it is not viable in your case.

nina-vilela · May 5, 2025, 7:43am

Hi @Luca_Gessi,

Could you clarify what you mean by the inference crashing? It would be helpful to have the error messages too.

Luca_Gessi · May 5, 2025, 9:36am

Hi nina-vilela.
Below is the output I got when I tried to run the model:

[HailoRT] [error] CHECK failed - Output format type UINT8 can’t represent possible range 5000 for Argmax op
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_INVALID_OPERATION(6)
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_INVALID_OPERATION(6)
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_INVALID_OPERATION(6)
[HailoRT CLI] [error] CHECK_SUCCESS failed with status=HAILO_INVALID_OPERATION(6)
[HailoRT CLI] [error] CHECK_SUCCESS failed with status=HAILO_INVALID_OPERATION(6)
[HailoRT CLI] [error] CHECK_SUCCESS failed with status=HAILO_INVALID_OPERATION(6) - Error failed running inference
[HailoRT CLI] [error] CHECK_SUCCESS failed with status=HAILO_INVALID_OPERATION(6) - Error while running inference

After this I tried to force a 16-bit argmax output, but it did not work. It seems that even with the model script, the output of argmax remains 8-bit.
I am using HailoRT version 4.21.0 on the target board.
Thank you
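
The first error line in the log above comes down to simple range arithmetic: an argmax over 5000 values can return any index from 0 to 4999, while UINT8 stops at 255. The sketch below only illustrates that arithmetic; it is not HailoRT's actual check, and `min_unsigned_bits` is a hypothetical helper.

```python
def min_unsigned_bits(num_classes: int) -> int:
    """Smallest of 8/16/32 bits whose unsigned range covers indices 0..num_classes-1."""
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    max_index = num_classes - 1
    for bits in (8, 16, 32):
        if max_index <= 2**bits - 1:
            return bits
    raise ValueError("index range does not fit in 32 bits")

print(min_unsigned_bits(256))   # 8-bit is enough for up to 256 classes
print(min_unsigned_bits(5000))  # 5000 classes need a 16-bit index
```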

Luca_Gessi · May 8, 2025, 9:35am

@nina-vilela, do you have any ideas?
