YOLOX-S to ONNX to compiled .hef process flow

I have been struggling for almost two weeks to get a custom-trained YOLOX-S model to compile and run on a Hailo-8L. I have done my own research and tried using Claude Code, but I am still not successful. The following is a summary (written with the help of Claude Code).

Questions for Hailo Support

Primary Question

How should YOLOX models with Focus layers be compiled for Hailo-8L?

The Hailo Model Zoo includes YOLOX, so there must be a standard workflow. Are there:

  • Special compilation flags needed?
  • Required ONNX export settings?
  • A recommended way to handle the Focus layer?

Specific Questions

  1. Does the Hailo SDK support YOLOX Focus layers?

    • If yes, what’s the correct compilation procedure?
    • If no, how should the ONNX be modified?
  2. Is this a known issue with 640×640 input resolution?

    • The batch dimension appears to be confused with the spatial dimension (both are 640)
    • Would a different input size (e.g., 608×608) avoid this issue?
  3. What ONNX opset version is recommended for YOLOX?

    • Currently using opset 11
    • Would opset 13 or another version help?
  4. Should we use the CLI compiler instead of Python SDK?

    hailo parser onnx --batch-size 1 ...
    
  5. Is there a working YOLOX compilation example?

    • Specifically for custom-trained models (not just Model Zoo pre-compiled)
    • With 3 output heads format

Problem

When compiling a custom YOLOX-S model for Hailo-8L, the resulting HEF expects batch_size=640 instead of batch_size=1, causing inference to fail.

Error Message

[HailoRT] [error] CHECK failed - Memory size of vstream does not match the frame count!
(Expected 786432000, got 1228800)

Analysis:

  • Expected: 786,432,000 bytes = 1,228,800 × 640
  • Got: 1,228,800 bytes (single frame: 640 × 640 × 3 pixels)
  • The HEF is expecting 640 frames instead of 1 frame
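The factor between the two sizes can be checked with a few lines of arithmetic (numbers taken directly from the error message above):

```python
# One uint8 RGB frame at 640x640 resolution
frame_bytes = 640 * 640 * 3
assert frame_bytes == 1_228_800           # matches the "got" value

# The buffer the HEF demands is exactly 640 frames' worth
expected_bytes = 786_432_000              # the "expected" value from the error
assert expected_bytes == frame_bytes * 640
print(expected_bytes // frame_bytes)      # -> 640, the batch size baked into the HEF
```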

Environment

  • Hardware: Hailo-8L
  • Software: Hailo AI Suite 2025-10 (Docker)
    • HailoRT: 4.23.0
    • Dataflow Compiler: v3.33.0
    • Model Zoo: v2.17.0
  • Model: Custom YOLOX-S trained on 4 classes
  • Input Shape: [1, 3, 640, 640] (verified correct in ONNX)

Root Cause Analysis

ONNX Model Structure

The ONNX export contains a Focus layer at the network input (nodes 4-30):

  • 6 Slice operations that split the input spatially
  • 1 Concat operation that combines them

This is standard YOLOX architecture, but it appears to confuse the Hailo SDK’s batch dimension parser.

Evidence

[Node 4] Slice: /backbone/backbone/stem/Slice
  Inputs: ['images', ...]
[Node 9] Slice: /backbone/backbone/stem/Slice_1
  Inputs: ['/backbone/backbone/stem/Slice_output_0', ...]
...
[Node 30] Concat: /backbone/backbone/stem/Concat
  Inputs: [Slice outputs]

The Hailo compiler appears to misinterpret these Slice operations and treats the spatial dimension (640) as the batch dimension.

Diagnostic Test Results

:white_check_mark: ONNX Model Verification

$ python3 check_onnx_shape.py models/yolox_s_3concat.onnx
Input shape: [1, 3, 640, 640]  ✓ CORRECT
Output shapes:
  [1, 80, 80, 9]   ✓ CORRECT
  [1, 40, 40, 9]   ✓ CORRECT
  [1, 20, 20, 9]   ✓ CORRECT

:cross_mark: Hailo HEF Batch Size

$ python3 check_hef_batch_size.py yolox_s_3output.hef
Input shape: (640, 640, 3)
Expected memory: 1,228,800 bytes  ✓ Correct for single frame
ACTUAL memory required: 786,432,000 bytes  ❌ 640x too large!

:white_check_mark: Batch=640 Workaround Test

Sending 640 frames at once succeeds:

batch_data = np.stack([frame] * 640, axis=0)  # Shape: (640, 640, 640, 3)
output = infer_pipeline.infer(batch_data)  # ✓ WORKS!

This confirms the HEF is compiled with batch=640.


Attempted Solutions (All Failed)

1. :cross_mark: Explicit batch_size=1 in compilation

runner.compile(batch_size=1)  # Parameter ignored

2. :cross_mark: Explicit net_input_shapes parameter

net_input_shapes = {'images': [1, 3, 640, 640]}
runner.translate_onnx_model(..., net_input_shapes=net_input_shapes)  # Still produces batch=640

3. :cross_mark: Auto-detect shape (removed net_input_shapes)

runner.translate_onnx_model(...)  # No shape parameter - still batch=640

4. :cross_mark: Correct output node names

Used actual ONNX output names: ['output_stride8', 'output_stride16', 'output_stride32']
Still batch=640.

5. :cross_mark: Disable Focus layer in YOLOX config

self.use_focus = False  # In custom_yolox_s.py
# Focus layer still appears in exported ONNX - appears to be in checkpoint weights
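Since disabling Focus at training time did not change the export, another avenue (not yet tried here) would be to cut the Focus layer out of the ONNX itself, e.g. with `onnx.utils.extract_model`, and reproduce its space-to-depth rearrangement on the host before inference. A sketch of the host-side preprocessing, assuming the upstream YOLOX patch order (verify it against the Concat input order in your own export):

```python
import numpy as np

def focus_preprocess(img: np.ndarray) -> np.ndarray:
    """Host-side equivalent of the YOLOX Focus layer: turn an
    (N, 3, 640, 640) tensor into (N, 12, 320, 320) by sampling every
    second pixel and stacking the four offsets on the channel axis.
    Patch order follows the upstream YOLOX implementation
    (top-left, bottom-left, top-right, bottom-right) -- confirm it
    against your model's Concat before relying on it."""
    tl = img[..., ::2, ::2]    # even rows, even cols
    bl = img[..., 1::2, ::2]   # odd rows, even cols
    tr = img[..., ::2, 1::2]   # even rows, odd cols
    br = img[..., 1::2, 1::2]  # odd rows, odd cols
    return np.concatenate([tl, bl, tr, br], axis=1)
```

The extracted model's new input would then be `[1, 12, 320, 320]`, and the Slice/Concat pattern never reaches the Hailo parser.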

Hey @user262,

Welcome to the community!

The Focus layer is part of the standard YOLOX architecture and should work fine with our toolchain. We’ve had plenty of users successfully compile YOLOX models. As long as you export to ONNX correctly and use the right config, you should be good.

About that batch=640 issue:

This is definitely unusual and not something we commonly see reported. My guess is there’s a mismatch somewhere between your ONNX export and your YAML configuration. I’ve seen similar issues pop up with other YOLO variants where node names or input shapes didn’t line up properly between the model and config file.

On the 640×640 resolution:

I haven’t seen any issues specifically with 640×640 causing batch dimension problems. That’s a pretty standard size for YOLO models. The issue is more likely in how the model was exported or configured.

Here’s what I’d suggest trying:

Inspect your ONNX model in Netron to verify node names/shapes, update your YAML config to match exactly, then compile using the --yaml flag with your custom config.

If these don’t help, we recommend moving to the DFC workflow for custom models: you write an .alls model script from scratch, and this approach has far fewer issues with custom models!
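For anyone going the DFC route, an .alls model script is just a text file of model-script commands. A minimal sketch (the normalization values below are placeholders for your own preprocessing, not recommendations):

```
# Normalize inputs on-chip instead of on the host (placeholder values)
normalization1 = normalization([0.0, 0.0, 0.0], [255.0, 255.0, 255.0])
```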

Hope this helps! Let me know how it goes.

Hello Omria, thank you for the response. I will start from the beginning by inspecting and comparing the nodes between the example YOLOX and my custom model.