Incorrect input/output of custom compiled HEF

I compiled the PIPs BasicEncoder net into an HEF file, but the resulting network did not meet my expectations.

BasicEncoder operates on 320x320x3 images normalized to the [-1, 1] interval and produces a 40x40x128 representation of the image.
I created a calibration set that is already normalized (I deliberately omitted normalization in model_script.alls for debugging purposes), so I'd expect the net's input and output layers to be of floating-point type. Also, while the input layer is named pips/input_layer1 (the same as in pips_compiled_model.html produced by hailo profiler pips_compiled.har), the output layer seems to be pips/conv22 (the one just before pips/output_layer1).
I checked that using the command:

docker-user@quczer-seagle:/mnt/ml-infra$ hailo parse-hef pips.hef
[info] Current Time: 11:44:12, 12/20/24
[info] CPU: Architecture: x86_64, Model: Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz, Number Of Cores: 16, Utilization: 0.4%
[info] Memory: Total: 30GB, Available: 18GB
[info] System info: OS: Linux, Kernel: 6.8.0-49-generic
[info] Hailo DFC Version: 3.29.0
[info] HailoRT Version: 4.19.0
[info] PCIe: No Hailo PCIe device was found
[info] Running `hailo parse-hef pips.hef`
(hailo) Running command 'parse-hef' with 'hailortcli'
Architecture HEF was compiled for: HAILO8L
Network group name: pips, Multi Context - Number of contexts: 9
    Network name: pips/pips
        VStream infos:
            Input  pips/input_layer1 UINT8, NHWC(320x320x3)
            Output pips/conv22 UINT8, FCR(40x40x128)

I looked at the 1D distribution of outputs and indeed it is off by some factor (GT - ground truth, hailo - net output, hailo_scaled - output manually rescaled).

Another problem is that both pips/input_layer1 and pips/conv22 are of UINT8 type, not FLOAT32 as expected. I can manually change this when creating hailo_platform.ConfiguredInferModel, but it is a hint that something is wrong.

Do you have an idea why the output layer is missing and whether that's a problem?

Thanks in advance,
Michał

Hey @quczer,

Why are the input and output layers UINT8?

By default, the Hailo Dataflow Compiler (DFC) quantizes all model inputs and outputs to UINT8 for optimal performance on edge devices. This behavior is described in the Dataflow Compiler User Guide, specifically in section 1.3 (“Deployment Process”) and section 5.3 (“Model Optimization”). Unless you override it during inference, the compiler assumes quantized I/O.
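If you keep the default UINT8 I/O, you can also dequantize the raw outputs yourself. HailoRT reports a scale and zero point per vstream (in the Python API these are typically found under the vstream info's quant_info as qp_scale and qp_zp; verify the attribute names against your HailoRT version). The mapping itself is simple; here is a minimal numpy sketch with made-up quantization parameters:

```python
import numpy as np

def dequantize(raw: np.ndarray, qp_scale: float, qp_zp: float) -> np.ndarray:
    """Map quantized UINT8 values back to float: f = scale * (q - zero_point)."""
    return qp_scale * (raw.astype(np.float32) - qp_zp)

# Toy values only -- read the real qp_scale/qp_zp from the output
# vstream's quant_info at runtime.
raw = np.array([0, 128, 255], dtype=np.uint8)
print(dequantize(raw, qp_scale=0.05, qp_zp=128))  # approximately [-6.4, 0.0, 6.35]
```

This is the same transform the runtime applies for you when you request FLOAT32 vstreams (see below), so doing it manually is only needed if you want to keep the raw UINT8 path.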


Why is pips/output_layer1 missing?

This usually happens when the ONNX export doesn’t explicitly mark output_layer1 as an output. The compiler then picks the last operational layer—pips/conv22 in your case—as the default output.

Fix: Define the output during ONNX export like this:

torch.onnx.export(
    model,
    input_tensor,
    "pips.onnx",
    input_names=["pips/input_layer1"],
    output_names=["pips/output_layer1"]
)

Then check the result with:

hailo parser onnx pips.onnx

Optional: Higher precision quantization

If needed, you can request higher-precision quantization (like 16-bit activations) using:

quantization_param(pips/output_layer1, precision_mode=a16_w16)

This won’t give you full float32 precision but can help reduce quantization loss.
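As a toy illustration of why 16-bit activations reduce quantization loss, here is a plain-numpy comparison of the round-trip error of uniform 8-bit vs 16-bit quantization on a [-1, 1] signal (this is only a sketch, not the DFC's actual quantization scheme):

```python
import numpy as np

def quantize_roundtrip(x: np.ndarray, bits: int, lo: float = -1.0, hi: float = 1.0) -> np.ndarray:
    """Uniformly quantize x to `bits` bits over [lo, hi], then dequantize."""
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels
    q = np.round((x - lo) / scale)
    return q * scale + lo

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=10_000)

for bits in (8, 16):
    err = np.max(np.abs(quantize_roundtrip(x, bits) - x))
    print(f"{bits}-bit max error: {err:.2e}")
# Max error is bounded by half a quantization step: about 3.9e-03
# for 8-bit and about 1.5e-05 for 16-bit over [-1, 1].
```

The worst-case error shrinks by a factor of roughly 257 (2^16-1 vs 2^8-1 levels), which is why a16_w16 often recovers most of the accuracy lost to 8-bit quantization without going to float32.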


Dequantize to float32 at runtime

HailoRT lets you configure VStreams to automatically dequantize tensors, so your application receives float32 outputs:

from hailo_platform import FormatType, InputVStreamParams, OutputVStreamParams

input_params = InputVStreamParams.make(network_group, quantized=False, format_type=FormatType.FLOAT32)
output_params = OutputVStreamParams.make(network_group, quantized=False, format_type=FormatType.FLOAT32)

Let me know if anything’s unclear or if you need help with a specific setup.

Hi, thanks for the reply.

I don’t think these commands are valid in model_script.alls; I get

docker-user@quczer-seagle:/mnt/ml-infra$ hailo compiler /mnt/ml-infra/tmp/optimized.har --hw-arch hailo8l --output-dir /mnt/ml-infra/tmp --model-script /mnt/ml-infra/tmp/model_script.alls
[info] Current Time: 13:46:01, 02/04/25
[info] CPU: Architecture: x86_64, Model: Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz, Number Of Cores: 16, Utilization: 0.4%
[info] Memory: Total: 30GB, Available: 23GB
[info] System info: OS: Linux, Kernel: 6.8.0-52-generic
[info] Hailo DFC Version: 3.29.0
[info] HailoRT Version: 4.19.0
[info] PCIe: No Hailo PCIe device was found
[info] Running `hailo compiler /mnt/ml-infra/tmp/optimized.har --hw-arch hailo8l --output-dir /mnt/ml-infra/tmp --model-script /mnt/ml-infra/tmp/model_script.alls`
[info] Loading model script commands to pips from /mnt/ml-infra/tmp/model_script.alls
Traceback (most recent call last):
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/script_parser/model_script_parser.py", line 372, in parse_script
    script_grammar.parseString(input_script, parseAll=True)
  File "/home/docker-user/.local/lib/python3.10/site-packages/pyparsing.py", line 1955, in parseString
    raise exc
  File "/home/docker-user/.local/lib/python3.10/site-packages/pyparsing.py", line 3814, in parseImpl
    raise ParseException(instring, loc, self.errmsg, self)
pyparsing.ParseException: Expected end of text, found 'i'  (at char 0), (line:1, col:1)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/docker-user/.local/bin/hailo", line 8, in <module>
    sys.exit(main())
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_client/tools/cmd_utils/main.py", line 111, in main
    ret_val = client_command_runner.run()
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_platform/tools/hailocli/main.py", line 64, in run
    return self._run(argv)
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_platform/tools/hailocli/main.py", line 104, in _run
    return args.func(args)
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_client/tools/compiler_cli.py", line 52, in run
    self._compile(
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_client/tools/compiler_cli.py", line 79, in _compile
    runner.load_model_script(alls)
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_common/states/states.py", line 16, in wrapped_func
    return func(self, *args, **kwargs)
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_client/runner/client_runner.py", line 477, in load_model_script
    self._sdk_backend.load_model_script_from_file(model_script, append)
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/sdk_backend.py", line 391, in load_model_script_from_file
    self._script_parser.parse_script_from_file(model_script_path, nms_config, append)
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/script_parser/model_script_parser.py", line 302, in parse_script_from_file
    return self.parse_script(f.read(), append, nms_config_file)
  File "/home/docker-user/.local/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/script_parser/model_script_parser.py", line 380, in parse_script
    raise BackendScriptParserException(f"Parsing failed at:\n{e.markInputline()}")
hailo_sdk_client.sdk_backend.sdk_backend_exceptions.BackendScriptParserException: Parsing failed at:
>!<input_layer1=layer(data_type=float32)

Can you point me to the documentation where input_layer = ... and output_layer = ... commands are described?

Best regards,
Michał

The float32 input and output layer commands you wrote are not listed on the Model Script documentation page. (https://hailo.ai/developer-zone/documentation/dataflow-compiler-v3-30-0/?sp_referrer=sdk%2Fmodel_optimization.html%23model-modification-commands#optimization-related-model-script-commands)
Where did you find them? Also, when I tried this, I got the same error as @quczer.
Also, @quczer, did you solve the problem?
I can’t get float32 input and output working.

Hey @John_Doe,

Welcome to the Hailo Community!

I’ve updated my solution—please give it a try and let me know if it resolves your issue. Looking forward to your update.