Hailo-10H on Raspberry Pi 5: Undocumented API findings + DFC conversion failures with Transformer-based models
I have been developing a local AI image management application (eauesque / YU AI Manager) integrating Hailo-10H (AI HAT 2 for Raspberry Pi 5) with HailoRT v5.2.0 and DFC v5.2.0. This post shares both undocumented findings from low-level API development and specific DFC conversion failures, in hopes that Hailo engineers can provide guidance.
What I have implemented (all using low-level hailo_platform API)
All working features use pre-compiled HEF files from the official Model Zoo. I intentionally avoided hailo-apps and hailo-ollama, and instead built directly on the `hailo_platform` wheel:
- CLIP semantic search: `VDevice.create_infer_model()` + uint8 dequantization pipeline
- YOLO object detection: same InferModel API
- LLM / VLM chat: `hailo_platform.genai.LLM` / `VLM`
- Whisper speech-to-text: `hailo_platform.genai.Speech2Text`
- VDevice exclusive-access device manager: automatic switching between CLIP / YOLO / LLM / VLM / S2T on a single VDevice (hailo-apps has no equivalent)
- Multi-backend fallback: Hailo → CoreML → ONNX Runtime, transparent auto-switching
- LAN distributed inference: work-stealing parallel tagging across multiple machines
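To illustrate the exclusive-access idea, here is a minimal sketch of a manager that keeps at most one workload loaded at a time and releases it before switching. It is not the code from the project: `ExclusiveDeviceManager`, `register`, `use`, and `FakeWorkload` are hypothetical names, and the only assumption about a real workload is that it can be created by a factory and released with `close()`.

```python
import threading
from contextlib import contextmanager
from typing import Any, Callable, Dict, Iterator, List, Optional


class ExclusiveDeviceManager:
    """Keeps at most one workload loaded on a shared accelerator (hypothetical)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._loaders: Dict[str, Callable[[], Any]] = {}
        self._active_name: Optional[str] = None
        self._active: Any = None

    def register(self, name: str, loader: Callable[[], Any]) -> None:
        # loader() is an assumed factory that returns an object with .close()
        self._loaders[name] = loader

    @contextmanager
    def use(self, name: str) -> Iterator[Any]:
        # The lock serializes callers, so only one workload uses the device.
        with self._lock:
            if self._active_name != name:
                if self._active is not None:
                    self._active.close()  # release the device before switching
                    self._active, self._active_name = None, None
                self._active = self._loaders[name]()
                self._active_name = name
            yield self._active


class FakeWorkload:
    """Stand-in for a real CLIP / YOLO / LLM wrapper, used only for the demo."""

    def __init__(self, name: str, log: List[str]) -> None:
        self.name, self.log = name, log
        log.append(f"load {name}")

    def close(self) -> None:
        self.log.append(f"close {self.name}")
```

A caller would wrap each inference in `with manager.use("clip") as model: ...`. Repeated use of the same name reuses the loaded workload, and requesting a different name closes the previous one first.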
Undocumented behaviors I had to discover by trial and error
All of the following were resolved through error messages and source code inspection, as no documentation existed:
- InferModel API is the correct API: The legacy VStreams API (`InferVStreams`, `ConfigureParams.create_from_hef`) returns `HAILO_NOT_IMPLEMENTED` on Hailo-10H. This is not documented anywhere.
- Output buffers must be uint8: Allocating float32 buffers causes `buffer size mismatch`. You must allocate uint8 and dequantize afterward.
- `input()` / `output()` are properties, not methods: Inconsistent with other parts of the API.
- `quant_info` retrieval: `infer_model.output().quant_info` provides `scale` / `zero_point` for dequantization. No documentation exists for this.
- hailo-ollama exclusivity: VDevice usage requires stopping hailo-ollama first. The resulting error message does not indicate the cause clearly.
I'm sharing these in case they are useful to other developers or to Hailo for documentation improvements.
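For anyone reproducing the uint8 output handling, here is a minimal sketch of the dequantization step in NumPy. It assumes the usual affine convention, `real = scale * (quantized - zero_point)`, and that `scale` and `zero_point` have already been read from the output's quantization info. The helper name `dequantize` and the sample values are made up for illustration, and no Hailo API is called here.

```python
import numpy as np


def dequantize(raw: np.ndarray, scale: float, zero_point: float) -> np.ndarray:
    """Affine dequantization (assumed convention): scale * (raw - zero_point)."""
    # Cast to float32 first so the subtraction cannot wrap around in uint8.
    return (raw.astype(np.float32) - np.float32(zero_point)) * np.float32(scale)


# The output buffer is allocated as uint8, as described above; the device
# would fill it. Here it is filled by hand with sample values.
raw = np.empty(3, dtype=np.uint8)
raw[:] = [0, 128, 255]

out = dequantize(raw, scale=0.5, zero_point=128)
print(out.tolist())  # [-64.0, 0.0, 63.5]
```

Casting before subtracting matters: `raw - zero_point` computed in uint8 would wrap around for values below the zero point.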
DFC conversion failures: Transformer-based models (March 2026, DFC v5.2.0)
I attempted to convert WD-Tagger models (Danbooru tag classification) from ONNX to HEF. All three failed at the parser stage, before reaching optimization:
| Model | Size | Error | Stage |
|---|---|---|---|
| wd-swinv2-tagger-v3 | 446 MB | `IndexError` in `_convert_axes_to_nhwc` | Pre-optimization |
| wd-vit-tagger-v3 | 362 MB | Same | Pre-optimization |
| wd-convnext-tagger-v3 | 377 MB | `UnsupportedShuffleLayerError` | Pre-optimization |
500 calibration images were prepared but never reached the quantization stage.
Root cause (as I understand it): The DFC ONNX parser cannot handle `LayerNormalization` (multi-dimensional axis conversion) and certain `Transpose` patterns. These are fundamental building blocks of the SwinV2, ViT, and ConvNeXt architectures, which cover the majority of vision models developed since 2022.
I note that CLIP ViT exists in the Model Zoo as a working HEF, which suggests Hailo may have applied internal graph transformations that are not available to end users through DFC.
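To check whether a given ONNX export contains these operators before running it through the parser, something like the following sketch can help. It uses the `onnx` Python package (`onnx.load` and `model.graph.node`). The helper names and the `SUSPECT_OPS` list are my own, taken from the root-cause guess above rather than from any Hailo documentation.

```python
from collections import Counter
from typing import Dict, Iterable, List

# Op types suspected above of breaking the DFC parser (an assumption, not a spec).
SUSPECT_OPS = ("LayerNormalization", "Transpose")


def count_suspect_ops(op_types: Iterable[str]) -> Dict[str, int]:
    """Count how often each suspect op type appears in a list of op types."""
    counts = Counter(op_types)
    return {op: counts[op] for op in SUSPECT_OPS if counts[op] > 0}


def op_types_in_model(path: str) -> List[str]:
    """Return the op type of every top-level node in an ONNX file."""
    import onnx  # imported lazily so count_suspect_ops works without onnx

    model = onnx.load(path)
    # Only the top-level graph is walked; nested subgraphs are not inspected.
    return [node.op_type for node in model.graph.node]


# Demo on a hand-written op list instead of a real model file.
print(count_suspect_ops(["Conv", "Transpose", "LayerNormalization", "Transpose"]))
# {'LayerNormalization': 1, 'Transpose': 2}
```

Note that exports with an older opset may not contain a `LayerNormalization` node at all, because the operation is decomposed into primitives such as `ReduceMean`, so a zero count does not prove the pattern is absent.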
Questions / feature requests
- Is there any plan to support `LayerNormalization` and general `Transpose` patterns in DFC? These are required for essentially all Transformer-based vision models.
- Is an ONNX Runtime Execution Provider for Hailo-10H under consideration? This would be the most developer-friendly solution, since it would eliminate the conversion step entirely. For comparison, Ryzen AI (XDNA) requires only `ort.InferenceSession("model.onnx", providers=["DmlExecutionProvider"])`. The absence of an equivalent for Hailo-10H is a significant barrier.
- Is there any workaround or additional tooling for converting SwinV2 / ViT / ConvNeXt models that is not publicly documented?
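To make the execution-provider request concrete, this is roughly what provider-based selection looks like in ONNX Runtime today. It is a sketch with hypothetical helper names (`pick_providers`, `open_session`). Only `ort.get_available_providers()` and `ort.InferenceSession(..., providers=...)` are real ONNX Runtime calls. No Hailo provider exists as far as I know, so none is named here; if one did, it would simply be another entry at the front of the preference list.

```python
from typing import List, Sequence


def pick_providers(preferred: Sequence[str], available: Sequence[str]) -> List[str]:
    """Keep the preferred providers that are actually available, in order."""
    chosen = [p for p in preferred if p in available]
    if not chosen:
        raise RuntimeError(f"none of {list(preferred)} is available")
    return chosen


def open_session(model_path: str, preferred: Sequence[str]):
    """Open an ONNX Runtime session using the best available provider."""
    import onnxruntime as ort  # imported lazily; pick_providers needs no ORT

    providers = pick_providers(preferred, ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)


# Demo with a hand-written availability list instead of a real ORT install.
print(pick_providers(
    ["CoreMLExecutionProvider", "CPUExecutionProvider"],
    ["CPUExecutionProvider"],
))  # ['CPUExecutionProvider']
```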
Any guidance from Hailo engineers would be greatly appreciated.
Environment: Raspberry Pi 5 (aarch64), AI HAT 2, HailoRT v5.2.0, DFC v5.2.0 (x86_64 Linux), Python 3.11
Project: eauesque/yu_ai_manager on GitHub. AI-generated image metadata manager: browse, search, tag and rate your Stable Diffusion / NovelAI / ComfyUI outputs. Quart + SQLite + TypeScript WebUI with Tauri desktop app.