Hello Hailo Community,
I encountered a critical issue while running a stress test on the Hailo-8L processor with a Raspberry Pi 5 using the 57.hef AI model. The error message I received is as follows:
[HailoRT] [critical] Got health monitor CPU ECC fatal event. memory_bitmap=4096
Test Details:
- Hardware:
- Hailo-8L
- Raspberry Pi 5
- AI Model: 57.hef (default model provided by Hailo)
- Command Used:
hailortcli run /var/hailo_integration_tool/thermal/HAILO8L/57.hef -t 172800 --measure-temp
Questions:
- What does the memory_bitmap=4096 indicate in the context of the ECC fatal event?
- Is this issue indicative of a hardware fault, or could it be related to software/firmware?
- Are there recommended steps to debug or resolve this issue?
- Has anyone else encountered similar errors during stress tests, and if so, how were they mitigated?
I would greatly appreciate any guidance or insights from the community regarding this issue. If further details are needed, I am happy to provide them.
Thank you for your support!