Network Graph Compilation Failure

Hi everyone,

I ran into a roadblock while compiling my optimized semantic segmentation model.

This is the error output from the tutorial notebook I followed.

Does anyone (admins?) have an idea what causes this kind of error and can point me in the right direction?
Thank you!

[info] To achieve optimal performance, set the compiler_optimization_level to "max" by adding performance_param(compiler_optimization_level=max) to the model script. Note that this may increase compilation time.
[info] Loading network parameters
[info] Starting Hailo allocation and compilation flow
[info] Finding the best partition to contexts…
[info] Iteration #1 - Contexts: 3
[info] Iteration #2 - Contexts: 3
[info] Iteration #3 - Contexts: 3
[info] Iteration #4 - Contexts: 3
[info] Iteration #5 - Contexts: 3
[info] Iteration #6 - Contexts: 3
[info] Iteration #7 - Contexts: 3
[info] Iteration #8 - Contexts: 3
[info] Iteration #9 - Contexts: 3
[info] Iteration #10 - Contexts: 3
[info] Iteration #11 - Contexts: 3
[info] Iteration #12 - Contexts: 3
[info] Iteration #13 - Contexts: 3
[info] Iteration #14 - Contexts: 3
[info] Iteration #15 - Contexts: 3
[info] Iteration #16 - Contexts: 3
[info] Iteration #17 - Contexts: 3
[info] Iteration #18 - Contexts: 3
[info] Iteration #19 - Contexts: 3
[info] Iteration #20 - Contexts: 3
[info] Iteration #21 - Contexts: 3
[info] Iteration #22 - Contexts: 3
[info] Iteration #23 - Contexts: 3
[info] Iteration #24 - Contexts: 3
[info] Iteration #25 - Contexts: 3
[info] Iteration #26 - Contexts: 3
[info] Iteration #27 - Contexts: 3
[info] Iteration #28 - Contexts: 3
[info] Iteration #29 - Contexts: 3
[info] Iteration #30 - Contexts: 3
[info] Iteration #31 - Contexts: 3
[info] Iteration #32 - Contexts: 3
[info] Iteration #33 - Contexts: 3
[info] Iteration #34 - Contexts: 3
[info] Iteration #35 - Contexts: 3
[info] Iteration #36 - Contexts: 3
[info] Iteration #37 - Contexts: 3
[info] Iteration #38 - Contexts: 3
[info] Iteration #39 - Contexts: 3
[info] Iteration #40 - Contexts: 3
[info] Iteration #41 - Contexts: 3
[info] Iteration #42 - Contexts: 3
[info] Iteration #43 - Contexts: 3
[info] Iteration #44 - Contexts: 3
[info] Iteration #45 - Contexts: 3
[info] Iteration #46 - Contexts: 3
[info] Iteration #47 - Contexts: 3
[info] Iteration #48 - Contexts: 3
[info] Iteration #49 - Contexts: 3
[info] Iteration #50 - Contexts: 3
[info] Iteration #51 - Contexts: 3

[error] Failed to produce compiled graph

compiler: ../src/allocator/network_graph_appender.cpp:396: Status<network_graph::ShortcutNetworkNode*> allocator::NetworkGraphAppender::AddShortcut(std::string, network_graph::NetworkNode&, std::vector<network_graph::NetworkNode*>): Assertion `src_node.output_format() == (*first_succ)->input_format()' failed.


BackendAllocatorException Traceback (most recent call last)
Cell In[4], line 1
----> 1 hef = runner.compile()
3 file_name = f"{model_name}.hef"
4 with open(file_name, "wb") as f:

File ~/miniconda3/envs/hailoenv/lib/python3.10/site-packages/hailo_sdk_client/runner/client_runner.py:886, in ClientRunner.compile(self)
874 def compile(self):
875 """
876 DFC API for compiling current model to Hailo hardware.
877
(…)
884
885 """
--> 886 return self._compile()

File ~/miniconda3/envs/hailoenv/lib/python3.10/site-packages/hailo_sdk_common/states/states.py:16, in allowed_states.<locals>.wrap.<locals>.wrapped_func(self, *args, **kwargs)
12 if self._state not in states:
13 raise InvalidStateException(
14 f"The execution of {func.__name__} is not available under the state: {self._state.value}",
15 )
--> 16 return func(self, *args, **kwargs)

File ~/miniconda3/envs/hailoenv/lib/python3.10/site-packages/hailo_sdk_client/runner/client_runner.py:1094, in ClientRunner._compile(self, fps, mapping_timeout, allocator_script_filename)
1088 self._logger.warning(
1089 f"Taking model script commands from {allocator_script_filename} and ignoring "
1090 f"previous allocation script commands",
1091 )
1092 self.load_model_script(allocator_script_filename)
--> 1094 serialized_hef = self._sdk_backend.compile(fps, self.model_script, mapping_timeout)
1096 self._auto_model_script = self._sdk_backend.get_auto_alls()
1097 self._state = States.COMPILED_SLIM_MODEL if orig_state in SLIM_STATES else States.COMPILED_MODEL

File ~/miniconda3/envs/hailoenv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/sdk_backend.py:1669, in SdkBackendCompilation.compile(self, fps, allocator_script, mapping_timeout)
1667 def compile(self, fps, allocator_script=None, mapping_timeout=None):
1668 self._model.fill_default_quantization_params(logger=self._logger)
--> 1669 hef, mapped_graph_file = self._compile(fps, allocator_script, mapping_timeout)
1670 # TODO: Jira
1671 if not SDKPaths().is_internal:

File ~/miniconda3/envs/hailoenv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/sdk_backend.py:1663, in SdkBackendCompilation._compile(self, fps, allocator_script, mapping_timeout)
1657 if not model_params and self.requires_quantized_weights:
1658 raise BackendRuntimeException(
1659 "Model requires quantized weights in order to run on HW, but none were given. "
1660 "Did you forget to quantize?",
1661 )
--> 1663 hef, mapped_graph_file, auto_alls = self.hef_full_build(fps, mapping_timeout, model_params, allocator_script)
1664 self._auto_alls = auto_alls
1665 return hef, mapped_graph_file

File ~/miniconda3/envs/hailoenv/lib/python3.10/site-packages/hailo_sdk_client/sdk_backend/sdk_backend.py:1639, in SdkBackendCompilation.hef_full_build(self, fps, mapping_timeout, params, allocator_script)
1637 config_paths = ConfigPaths(self._hw_arch, self._model.name)
1638 config_paths.set_stage("inference")
--> 1639 auto_alls, self._hef_data, self._integrated_graph = allocator.create_mapping_and_full_build_hef(
1640 config_paths.get_path("network_graph"),
1641 config_paths.get_path("mapped_graph"),
1642 config_paths.get_path("compilation_output_proto"),
1643 params=params,
1644 allocator_script=allocator_script,
1645 compiler_statistics_path=config_paths.get_path("compiler_statistics"),
1646 nms_metadata=self._nms_metadata,
1647 har=self.har,
1648 alls_ignore_invalid_cmds=self._alls_ignore_invalid_cmds,
1649 )
1651 return self._hef_data, config_paths.get_path(“mapped_graph”), auto_alls

File ~/miniconda3/envs/hailoenv/lib/python3.10/site-packages/hailo_sdk_client/allocator/hailo_tools_runner.py:761, in HailoToolsRunner.create_mapping_and_full_build_hef(self, network_graph_path, output_path, compilation_output_proto, agent, strategy, auto_mapping, params, expected_output_tensor, expected_pre_acts, network_inputs, network_outputs, allocator_script, allocator_script_mode, compiler_statistics_path, nms_metadata, har, alls_ignore_invalid_cmds)
756 if self.hn.net_params.clusters_placement != []:
757 assert (
758 len(self.hn.net_params.clusters_placement) <= self._number_of_clusters
759 ), "Number of clusters in layer placements is larger than allowed number of clusters"
--> 761 self.call_builder(
762 network_graph_path,
763 output_path,
764 compilation_output_proto=compilation_output_proto,
765 agent=agent,
766 strategy=strategy,
767 exit_point=BuilderExitPoint.POST_CAT,
768 params=params,
769 expected_output_tensor=expected_output_tensor,
770 expected_pre_acts=expected_pre_acts,
771 network_inputs=network_inputs,
772 network_outputs=network_outputs,
773 allocator_script=allocator_script,
774 allocator_script_mode=allocator_script_mode,
775 compiler_statistics_path=compiler_statistics_path,
776 nms_metadata=nms_metadata,
777 har=har,
778 alls_ignore_invalid_cmds=alls_ignore_invalid_cmds,
779 )
781 return self._auto_alls, self._output_hef_data, self._output_integrated_pb_graph

File ~/miniconda3/envs/hailoenv/lib/python3.10/site-packages/hailo_sdk_client/allocator/hailo_tools_runner.py:693, in HailoToolsRunner.call_builder(self, network_graph_path, output_path, blind_deserialize, **kwargs)
691 sys.excepthook = _hailo_tools_exception_hook
692 try:
--> 693 self.run_builder(network_graph_path, output_path, **kwargs)
694 except BackendInternalException:
695 try:

File ~/miniconda3/envs/hailoenv/lib/python3.10/site-packages/hailo_sdk_client/allocator/hailo_tools_runner.py:569, in HailoToolsRunner.run_builder(self, network_graph_filename, output_filename, compilation_output_proto, agent, strategy, exit_point, params, expected_output_tensor, expected_pre_acts, network_inputs, network_outputs, allocator_script, allocator_script_mode, compiler_statistics_path, is_debug, nms_metadata, har, alls_ignore_invalid_cmds)
567 raise e.internal_exception("Compilation failed:", hailo_tools_error=compiler_msg) from None
568 else:
--> 569 raise e.internal_exception("Compilation failed with unexpected crash") from None
570 finally:
571 if self._output_integrated_pb_graph is None and self._output_hef_data is None:

BackendAllocatorException: Compilation failed with unexpected crash

Do any of the moderators have an idea that could point me in the right direction?
We are stuck here, and since we cannot submit a technical support ticket, we do not know how to proceed.
Thank you

Hey @d.gentner ,

It looks like you've hit the assertion `src_node.output_format() == (*first_succ)->input_format()` during model compilation. This usually indicates a mismatch between the output format of one layer and the input format of the next layer in your model. Here are a few things you can try to resolve the issue:

  1. Use the Hailo Profiler to identify tensor format inconsistencies:

    hailo profiler your_model.har
    

    Check the output and make sure the layers immediately before and after the failing operation (a shortcut in this case) have matching formats.

  2. The error appears to be related to a shortcut connection (skip connection) being added incorrectly. Try explicitly defining the tensor format for the shortcut layers. If the shortcut is unnecessary, consider removing it or breaking it into smaller connections.

  3. Set the compiler optimization level to max in your model script, as the compiler log itself suggests:

    performance_param(compiler_optimization_level=max)
    

    This gives the compiler more room when searching for a valid allocation and can sometimes work around failures like this one, at the cost of longer compilation time. A minimal sketch of applying it from the notebook follows below.
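
For reference, here is a minimal sketch of how that could look in the tutorial notebook. It is an assumption on my part that `runner` is the same quantized ClientRunner instance from your earlier cells and that `model_name` is the variable you already use for the output file name:

    # Sketch only: "runner" and "model_name" are assumed to come from the
    # earlier tutorial cells (a quantized ClientRunner and the model's name).
    alls = "performance_param(compiler_optimization_level=max)\n"
    runner.load_model_script(alls)  # apply the model script before compiling

    hef = runner.compile()          # the call that currently crashes

    file_name = f"{model_name}.hef"
    with open(file_name, "wb") as f:
        f.write(hef)

If your DFC version expects a file path here rather than a raw command string, save the command to an .alls file and pass that path to load_model_script instead.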

Let me know if you need any help modifying your model script or if you have any other questions!