Failed during compilation with "map::at"

Here are my logs; the error message is pretty unhelpful. I am guessing it's some memory error, but I'm not sure. I removed quite a bit of the log to get below the post limit.

[info] To achieve optimal performance, set the compiler_optimization_level to "max" by adding performance_param(compiler_optimization_level=max) to the model script. Note that this may increase compilation time.
[info] Loading network parameters
[info] Starting Hailo allocation and compilation flow
[info] Finding the best partition to contexts...
[info] Iteration #1 - Contexts: 4
...
[info] Iteration #57 - Contexts: 4
[info] Using Multi-context flow
[info] Resources optimization guidelines: Strategy -> GREEDY Objective -> MAX_FPS
[info] Resources optimization params: max_control_utilization=60%, max_compute_utilization=60%, max_compute_16bit_utilization=60%, max_memory_utilization (weights)=60%, max_input_aligner_utilization=60%, max_apu_utilization=60%
[info] Solving the allocation (Mapping), time per context: 59m 59s

[info] context_0 (context_0):
Iterations: 4
Reverts on cluster mapping: 0
Reverts on inter-cluster connectivity: 0
Reverts on pre-mapping validation: 0
Reverts on split failed: 0
[info] context_1 (context_1):
Iterations: 4
Reverts on cluster mapping: 0
Reverts on inter-cluster connectivity: 0
Reverts on pre-mapping validation: 0
Reverts on split failed: 0
[info] context_2 (context_2):
Iterations: 4
Reverts on cluster mapping: 0
Reverts on inter-cluster connectivity: 0
Reverts on pre-mapping validation: 0
Reverts on split failed: 0
[info] context_3 (context_3):
Iterations: 4
Reverts on cluster mapping: 0
Reverts on inter-cluster connectivity: 0
Reverts on pre-mapping validation: 0
Reverts on split failed: 0
[info] context_0 utilization: 
[info] +-----------+---------------------+---------------------+--------------------+
[info] | Cluster   | Control Utilization | Compute Utilization | Memory Utilization |
[info] +-----------+---------------------+---------------------+--------------------+
[info] | cluster_0 | 100%                | 25%                 | 30.5%              |
[info] | cluster_1 | 43.8%               | 10.9%               | 14.8%              |
[info] | cluster_4 | 56.3%               | 14.1%               | 20.3%              |
[info] | cluster_5 | 56.3%               | 15.6%               | 32.8%              |
[info] +-----------+---------------------+---------------------+--------------------+
[info] | Total     | 64.1%               | 16.4%               | 24.6%              |
[info] +-----------+---------------------+---------------------+--------------------+
[info] context_1 utilization: 
[info] +-----------+---------------------+---------------------+--------------------+
[info] | Cluster   | Control Utilization | Compute Utilization | Memory Utilization |
[info] +-----------+---------------------+---------------------+--------------------+
[info] | cluster_0 | 50%                 | 15.6%               | 18.8%              |
[info] | cluster_1 | 75%                 | 20.3%               | 27.3%              |
[info] | cluster_4 | 100%                | 29.7%               | 30.5%              |
[info] | cluster_5 | 75%                 | 21.9%               | 31.3%              |
[info] +-----------+---------------------+---------------------+--------------------+
[info] | Total     | 75%                 | 21.9%               | 27%                |
[info] +-----------+---------------------+---------------------+--------------------+
[info] context_2 utilization: 
[info] +-----------+---------------------+---------------------+--------------------+
[info] | Cluster   | Control Utilization | Compute Utilization | Memory Utilization |
[info] +-----------+---------------------+---------------------+--------------------+
[info] | cluster_0 | 68.8%               | 17.2%               | 21.9%              |
[info] | cluster_1 | 50%                 | 12.5%               | 17.2%              |
[info] | cluster_4 | 100%                | 25%                 | 36.7%              |
[info] | cluster_5 | 37.5%               | 9.4%                | 14.8%              |
[info] +-----------+---------------------+---------------------+--------------------+
[info] | Total     | 64.1%               | 16%                 | 22.7%              |
[info] +-----------+---------------------+---------------------+--------------------+
[info] context_3 utilization: 
[info] +-----------+---------------------+---------------------+--------------------+
[info] | Cluster   | Control Utilization | Compute Utilization | Memory Utilization |
[info] +-----------+---------------------+---------------------+--------------------+
[info] | cluster_0 | 68.8%               | 17.2%               | 22.7%              |
[info] | cluster_1 | 6.3%                | 1.6%                | 2.3%               |
[info] | cluster_4 | 62.5%               | 18.8%               | 19.5%              |
[info] +-----------+---------------------+---------------------+--------------------+
[info] | Total     | 34.4%               | 9.4%                | 11.1%              |
[info] +-----------+---------------------+---------------------+--------------------+
[warning] Failed dump best-effort allocation proto: map::at
[info] Successful Mapping (allocation time: 1m 50s)
[info] Compiling context_0...
[info] Compiling context_1...
[info] Compiling context_2...
[info] Compiling context_3...
[info] Bandwidth of model inputs: 15.8203 Mbps, outputs: 47.4609 Mbps (for a single frame)
[info] Bandwidth of DDR buffers: 0.0 Mbps (for a single frame)
[info] Bandwidth of inter context tensors: 221.045 Mbps (for a single frame)
[error] Compilation failed with error: 'map::at'
[error] Failed to produce compiled graph
map::at

Is there any way to get a more informative log, or to debug this so I know how to change the model so it will compile?
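
As an aside, the [info] line at the top of the log mentions raising compiler_optimization_level via performance_param in the model script. A minimal sketch of where such a line would go, assuming the Dataflow Compiler's Python ClientRunner API (exact signatures can vary between DFC versions) and hypothetical file names:

```python
# Sketch only: assumes the Hailo Dataflow Compiler Python API (hailo_sdk_client)
# and hypothetical file names; details may differ between DFC versions.
from hailo_sdk_client import ClientRunner

runner = ClientRunner(har="model_quantized.har")  # hypothetical quantized HAR

# The directive quoted in the compiler log, passed in as a model script line.
runner.load_model_script("performance_param(compiler_optimization_level=max)\n")

hef = runner.compile()  # note: max optimization can increase compilation time
with open("model.hef", "wb") as f:
    f.write(hef)
```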

Hi @smat,
In the running directory you should see a hailo.core.log. It's a bit cryptic, but we know how to get useful information out of it.

What should I look for? I do see that log now that you mention it.

Look from the bottom up for an error. You can DM me that file, and I’ll look as well.
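
For anyone else hitting this, a quick way to do that bottom-up scan, as a minimal Python sketch that assumes hailo.core.log sits in the running directory:

```python
# Minimal sketch: list the most recent lines in hailo.core.log that mention
# an error, scanning from the bottom of the file up as suggested above.
from pathlib import Path

log_path = Path("hailo.core.log")  # assumed to be in the running directory
lines = log_path.read_text(errors="replace").splitlines()

hits = [line for line in reversed(lines) if "error" in line.lower()]
for line in hits[:10]:  # most recent matches first
    print(line)
```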

Hi @Nadav, I’m having the same issue and tried DMing you. There is no error in my core log.

Hi @l.hammond - can you share a few details on the model you are compiling: how many parameters, is it CNN- or transformer-based, and how many contexts were in the HEF?

Hi @Nadav -
The model we’re currently trying to compile is a fine-tuned CNN-based YOLOv8n model from Ultralytics with 3.2M parameters. Input shape is (1,3,576,384) with a dtype of float32. According to hailo.core.log, the whole model fits in a single context.

@Nadav now I’m not sure about the number of contexts. Where do I check to know for sure?
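
One place the context count does show up is the compiler's own output: the log above prints lines like "Iteration #57 - Contexts: 4" and a utilization table per context (context_0 through context_3). A minimal sketch, assuming the compiler stdout was captured to a file (the file name here is hypothetical):

```python
# Sketch: pull the reported context count out of a saved compiler log.
# "compile_output.log" is a hypothetical capture of the compiler's stdout.
import re

with open("compile_output.log", errors="replace") as f:
    text = f.read()

counts = re.findall(r"Contexts:\s*(\d+)", text)
if counts:
    print(f"Compiler reported {counts[-1]} contexts")  # value from the last iteration
else:
    print("No 'Contexts:' lines found in the log")
```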

Hello @Nadav, do you have any recommendations?

The model and its hyperparameters seem legit. Would you be OK with me trying to compile it on my end?

@Nadav cool … what would you need, the HAR?

ONNX is better, so we can test with different end-nodes, etc. A HAR can also work.
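
In case it helps anyone following along, exporting the fine-tuned model to ONNX from the Ultralytics side is straightforward; a minimal sketch, assuming the standard Ultralytics Python API and a hypothetical checkpoint path, at the 576x384 input size mentioned above:

```python
# Sketch: export the fine-tuned YOLOv8n checkpoint to ONNX.
# "yolov8n_finetuned.pt" is a hypothetical path to the trained weights.
from ultralytics import YOLO

model = YOLO("yolov8n_finetuned.pt")
# imgsz as (height, width) to match the (1, 3, 576, 384) input shape above.
model.export(format="onnx", imgsz=(576, 384))
```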

I can send it to you. I just realized that this model was trained by a colleague from a YOLO base model that did not come from Hailo’s Model Zoo. Should that matter?

Hi @Nadav, I DM’d you a Google Drive link.