Hailo-10H on RPi5: Undocumented API findings + DFC conversion failures with Transformer-based models (SwinV2/ViT/ConvNeXt)

I have been developing a local AI image management application (eauesque / YU AI Manager) integrating Hailo-10H (AI HAT 2 for Raspberry Pi 5) with HailoRT v5.2.0 and DFC v5.2.0. This post shares both undocumented findings from low-level API development and specific DFC conversion failures, in hopes that Hailo engineers can provide guidance.


What I have implemented (all using low-level hailo_platform API)

All working features use pre-compiled HEF files from the official Model Zoo. I intentionally avoided hailo-apps and hailo-ollama, instead building directly on the hailo_platform wheel:

  • CLIP semantic search – VDevice.create_infer_model() + uint8 dequantization pipeline
  • YOLO object detection – same InferModel API
  • LLM / VLM chat – hailo_platform.genai.LLM / VLM
  • Whisper speech-to-text – hailo_platform.genai.Speech2Text
  • VDevice exclusive-access device manager – automatic switching between CLIP / YOLO / LLM / VLM / S2T on a single VDevice (hailo-apps has no equivalent)
  • Multi-backend fallback – Hailo → CoreML → ONNX Runtime, transparent auto-switching
  • LAN distributed inference – work-stealing parallel tagging across multiple machines
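The multi-backend fallback above is, at its core, ordered try-and-import backend selection. A minimal sketch of the pattern (the function and module list are illustrative, not the project's actual code; the real chain is hailo_platform → coremltools → onnxruntime):

```python
import importlib

# Preference order: first importable backend wins.
# Module names are the real packages, but this helper is a simplified sketch.
FALLBACK_CHAIN = ["hailo_platform", "coremltools", "onnxruntime"]

def select_backend(candidates):
    """Return (name, module) for the first importable backend, else (None, None)."""
    for name in candidates:
        try:
            return name, importlib.import_module(name)
        except ImportError:
            continue  # backend not installed on this machine; try the next one
    return None, None
```

In the actual app each backend also has to pass a probe inference before being selected, but the import-order idea is the same.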

Undocumented behaviors I had to discover by trial and error

All of the following were resolved through error messages and source code inspection, as no documentation existed:

  1. InferModel API is the correct API – the legacy VStreams API (InferVStreams, ConfigureParams.create_from_hef) returns HAILO_NOT_IMPLEMENTED on Hailo-10H. This is not documented anywhere.
  2. Output buffers must be uint8 – allocating float32 buffers causes a buffer size mismatch. You must allocate uint8 and dequantize afterward.
  3. input() / output() are properties, not methods – inconsistent with other parts of the API.
  4. quant_info retrieval – infer_model.output().quant_info provides the scale / zero_point needed for dequantization. No documentation exists for this.
  5. hailo-ollama exclusivity – VDevice usage requires stopping hailo-ollama first. The resulting error message does not indicate the cause clearly.
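Putting points 2 and 4 together: the post-processing step is a standard affine dequantization. A minimal NumPy-only sketch (the scale / zero_point values here are placeholders; on device they come from infer_model.output().quant_info):

```python
import numpy as np

def dequantize(raw_uint8, scale, zero_point):
    """Map raw uint8 activations to float32: real = scale * (q - zero_point)."""
    return (raw_uint8.astype(np.float32) - zero_point) * scale

# Placeholder quantization parameters; on hardware, read them from
# infer_model.output().quant_info instead of hard-coding.
scale, zero_point = 0.05, 128
raw = np.array([128, 148, 108], dtype=np.uint8)  # as returned in the uint8 buffer
vals = dequantize(raw, scale, zero_point)        # [0.0, 1.0, -1.0]
```

The same function covers every output tensor; only the per-output quant_info differs.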

I'm sharing these in case they are useful to other developers or to Hailo for documentation improvements.


DFC conversion failures: Transformer-based models (March 2026, DFC v5.2.0)

I attempted to convert WD-Tagger models (Danbooru tag classification) from ONNX to HEF. All three failed at the parser stage, before reaching optimization:

| Model | Size | Error | Stage |
|---|---|---|---|
| wd-swinv2-tagger-v3 | 446 MB | IndexError in _convert_axes_to_nhwc | Pre-optimization |
| wd-vit-tagger-v3 | 362 MB | Same IndexError | Pre-optimization |
| wd-convnext-tagger-v3 | 377 MB | UnsupportedShuffleLayerError | Pre-optimization |

I had prepared 500 calibration images, but none of the conversions ever reached the quantization stage.

Root cause (as I understand it): the DFC ONNX parser cannot handle LayerNormalization (multi-dimensional axis conversion) and certain Transpose patterns. These are fundamental building blocks of SwinV2, ViT, and ConvNeXt, and of most Transformer-based vision architectures introduced since 2022.

I note that CLIP ViT exists in the Model Zoo as a working HEF, which suggests Hailo may have applied internal graph transformations that are not available to end users through DFC.


Questions / feature requests

  1. Is there any plan to support LayerNormalization and general Transpose patterns in DFC? These are required for essentially all Transformer-based vision models.
  2. Is an ONNX Runtime Execution Provider for Hailo-10H under consideration? This would be the most developer-friendly solution, eliminating the conversion step entirely. For comparison, Ryzen AI (XDNA) requires only ort.InferenceSession("model.onnx", providers=["VitisAIExecutionProvider"]). The absence of an equivalent for Hailo-10H is a significant barrier.
  3. Is there any workaround or additional tooling for converting SwinV2 / ViT / ConvNeXt models that is not publicly documented?

Any guidance from Hailo engineers would be greatly appreciated.


Environment: Raspberry Pi 5 (aarch64), AI HAT 2, HailoRT v5.2.0, DFC v5.2.0 (x86_64 Linux), Python 3.11
Project: eauesque/yu_ai_manager (GitHub) – AI-generated image metadata manager: browse, search, tag and rate your Stable Diffusion / NovelAI / ComfyUI outputs. Quart + SQLite + TypeScript WebUI with Tauri desktop app.
