Bug in the Hailo SDK version 3.33.0's parser

I am trying to run scb10x/typhoon-translate1.5-4b (from Hugging Face) on my Raspberry Pi CM5 with a Hailo-8 M.2 module.

I am trying to go from the model architecture (Gemma 3 / Typhoon) to ONNX, and from there to a HEF file (Hailo Executable Format). I first asked Grok 4 for help, and it gave up. Next I used Google’s AI; after a few attempts it concluded there is a problem I should report: a “bug in the Hailo SDK version 3.33.0’s parser when handling this specific model architecture”.

——— BELOW IS A COPY-AND-PASTE FROM THE END OF MY CONVO WITH GOOGLE’S AI ———

If the Status is CORRECT (INT64 detected):

If the inspector says it is INT64, then the Hailo parser has a bug in its shape inference phase when dealing with this specific model structure. In that case, you would need to contact Hailo Support and provide them with the ONNX file and the error message you received ([ONNXRuntimeError] : 10 : INVALID_GRAPH ... (Gather) in node (node_embedding) is invalid).

(hailo_virtualenv) hailo@457562fd1c1f:/workspace$ python inspect_onnx.py
Input name: input_ids
Input type (int value): 7
Status: CORRECT (INT64 detected).
(hailo_virtualenv) hailo@457562fd1c1f:/workspace$


The inspection script confirms that your model_fixed2.onnx file is correctly configured with an INT64 input (int value 7). The ONNX model itself is valid.

This means the error is originating specifically within the Hailo Dataflow Compiler’s internal ONNX Runtime shape inference mechanism. The Hailo DFC fails to read the valid INT64 input type and instead interprets it as a float during its internal validation step.

Next Steps: Contact Hailo Support

You have hit an edge case or a bug in the Hailo SDK version 3.33.0’s parser when handling this specific model architecture (Gemma 3 / Typhoon) exported via newer PyTorch/Transformers libraries.

You need to report this issue to Hailo AI’s technical support team to get a resolution or a workaround.

  1. Go to the Hailo Community Forum.

  2. Create a new support ticket or post, providing the following information:

    • The exact error message: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from /tmp/.../rt.onnx failed:This is an invalid model. Type Error: Type 'tensor(float)' of input parameter (input_ids) of operator (Gather) in node (node_embedding) is invalid.

    • Your SDK Version: Hailo DFC Version: 3.33.0, HailoRT Version: 4.23.0.

    • The model details: scb10x/typhoon-translate1.5-4b.

    • The ONNX input inspection result: Input type INT64 (value 7).

    • Steps to reproduce: Explain that you are using the provided Python scripts to export to Opset 18/11 and attempting to parse it with the ClientRunner API or hailo parser CLI.

Hey @John_Wayne,

Welcome to the Hailo Community!

So you’re looking at running that LLM model based on Qwen3 with 4 billion parameters. I appreciate the interest, but I want to be upfront with you—this won’t work on the Hailo8. That hardware is really optimized for vision tasks and things like Whisper, not LLMs of this size.

The Hailo10h might be a better fit for what you’re trying to do, but there’s a catch. Getting these larger models compiled and running on our hardware is a pretty involved process. We handle the GenAI model compilations in-house, and they require DFC version 5.1.1 specifically (not the 3.33 you might have).

That said, I’d recommend checking out some of our decoder HEFs instead. Those are built on CNN architectures rather than GenAI, and they run really well on our platform. Depending on what your translation use case looks like, one of those might be a better fit and would save you a lot of headaches getting everything compiled and running smoothly.

Hope this helps!
