Hi.
I’m working with Hailo-8 and experimenting with semantic segmentation models.
I noticed that while some models can be compiled and run at 640 x 640 input resolution, others fail during HEF compilation due to context/mapping issues.
Currently, I’m comparing PIDNet-M and Segformer0/1 models.
What I understand so far (please correct me if wrong)
From my experiments, it seems that:

- The limiting factor is not host RAM or GPU memory, but rather:
  - activation peak size
  - control graph complexity
  - inter-context routing / internal field limits
- Models with a multi-scale decoder and large concat operations (e.g. SegFormer) scale especially poorly with input resolution.
- On the other hand, CNN-based models like PIDNet, which use add-based fusion and have smaller activation peaks, seem more Hailo-friendly at higher resolutions.
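To make the concat-vs-add point concrete, here is a rough back-of-the-envelope sketch of the activation footprint at a single fusion point. The channel count, branch count, stride, and byte width are made-up illustrative numbers, not taken from either model or from the Hailo compiler:

```python
def fusion_activation_bytes(input_side, branch_channels=128, n_branches=2,
                            stride=8, bytes_per_elem=1):
    """Activation bytes live at a fusion point, for concat vs add fusion.

    Feature maps are assumed square, at input_side/stride resolution.
    All parameters are illustrative placeholders.
    """
    side = input_side // stride
    elems = side * side
    # concat keeps every branch's channels alive simultaneously
    concat = elems * branch_channels * n_branches * bytes_per_elem
    # add collapses the branches into a single tensor of the same width
    add = elems * branch_channels * bytes_per_elem
    return concat, add

for res in (320, 512, 640):
    c, a = fusion_activation_bytes(res)
    print(f"{res}x{res}: concat ~ {c / 1024:.0f} KiB, add ~ {a / 1024:.0f} KiB")
```

Under these assumptions the concat footprint is `n_branches` times the add footprint at every resolution, and both grow with the square of the input side, which would be consistent with concat-heavy decoders hitting on-chip limits first as resolution increases.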
So my questions are:
1. Input resolution vs. memory allocation

How exactly does input resolution affect:

- on-chip memory allocation
- control graph size
- context partitioning?

Is the growth closer to linear, quadratic, or model-structure dependent?
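For the linear-vs-quadratic part, here is a toy calculation of total activation elements for a plain conv backbone. The (channels, stride) plan is made up and is not either model's real architecture; it only illustrates how the spatial term scales:

```python
def total_activation_elems(input_side,
                           plan=((64, 4), (128, 8), (256, 16), (512, 32))):
    """Sum of H*W*C over feature maps, for illustrative (channels, stride) stages."""
    total = 0
    for channels, stride in plan:
        side = input_side // stride
        total += side * side * channels
    return total

base = total_activation_elems(320)
for res in (320, 480, 640):
    t = total_activation_elems(res)
    print(f"{res}x{res}: {t} elems (ratio vs 320: {t / base:.2f})")
```

For a pure conv stack like this toy one, activations grow quadratically in the input side length (doubling 320 to 640 quadruples every feature map). What seems model-structure dependent is the constant in front: concats and multi-scale skip paths multiply how much must be live at once, which in turn could affect how the compiler partitions contexts; whether control graph size follows the same curve is exactly what I am asking.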
2. Why 640×640 works for some models but not others

Is there a rule of thumb for why PIDNet-M at 640×640 can sometimes compile while SegFormer-B0/B1 cannot, even though their parameter counts may be similar? Is activation peak size the dominant factor here?
3. Best practices for HEF compilation at higher resolutions

Are there recommended strategies when trying to push input resolution higher?
4. Does higher input resolution affect inference performance? If so, how much does it influence the results?