Hi
We are running the Hailo8 M.2 chip in our camera. During stress testing, where we load the CPU, GPU, camera capture, and the Hailo8, we sometimes see the messages below in dmesg:
[ 3023.174299] pcieport 0000:00:01.5: AER: Corrected error received: 0000:05:00.0
[ 3023.181596] hailo 0000:05:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 3023.190945] hailo 0000:05:00.0: device [1e60:2864] error status/mask=00000001/00006000
[ 3023.199055] hailo 0000:05:00.0: [ 0] RxErr (First)
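For reference, here is a minimal sketch of how we read the `status/mask` field in that log. It assumes the standard PCIe AER Correctable Error Status bit layout and the short names the Linux kernel prints for each bit; the table and the `decode_correctable` helper are our own illustration, not something taken from the Hailo driver.

```python
# Bit positions of the PCIe AER Correctable Error Status register,
# labelled with the short names the Linux kernel uses in dmesg.
CORRECTABLE_BITS = {
    0: "RxErr",         # Receiver Error
    6: "BadTLP",        # Bad TLP
    7: "BadDLLP",       # Bad DLLP
    8: "Rollover",      # REPLAY_NUM Rollover
    12: "Timeout",      # Replay Timer Timeout
    13: "NonFatalErr",  # Advisory Non-Fatal Error
    14: "CorrIntErr",   # Corrected Internal Error
    15: "HeaderOF",     # Header Log Overflow
}


def decode_correctable(value):
    """Return the names of all bits set in a correctable status or mask value."""
    return [name for bit, name in sorted(CORRECTABLE_BITS.items())
            if value & (1 << bit)]


# Values copied from the "error status/mask=" line above.
status, mask = (int(x, 16) for x in "00000001/00006000".split("/"))
print("status:", decode_correctable(status))
print("masked:", decode_correctable(mask))
```

If that reading is right, the only bit set in the status is the receiver error, which matches the `RxErr` line in the log.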
These messages even seem to occur more frequently since we also started doing power/temperature measurements for the Hailo8. We’ve mostly run with v4.15, but we’ve also tested with the latest driver and firmware.
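To put a number on "more frequently", a rough sketch like the one below could count the corrected-error reports per device and per minute from saved dmesg output. The regular expression is written against the exact line format shown above and is an assumption about how other kernels format the message; `corrected_per_minute` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches the "pcieport ... AER: Corrected error received: <device>" line,
# capturing the dmesg timestamp (seconds since boot) and the reporting device.
AER_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+pcieport \S+ AER: Corrected error received: (\S+)"
)


def corrected_per_minute(dmesg_text):
    """Count corrected-error reports keyed by (device, minute since boot)."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = AER_RE.search(line)
        if match:
            minute = int(float(match.group(1)) // 60)
            counts[(match.group(2), minute)] += 1
    return counts


sample = (
    "[ 3023.174299] pcieport 0000:00:01.5: AER: Corrected error received: 0000:05:00.0\n"
    "[ 3023.181596] hailo 0000:05:00.0: PCIe Bus Error: severity=Corrected, "
    "type=Physical Layer, (Receiver ID)\n"
)
for (device, minute), n in sorted(corrected_per_minute(sample).items()):
    print(device, "minute", minute, "count", n)
```

Comparing these counts between runs with and without the power/temperature measurements would show whether the increase is real or just an impression.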
We do not see this for our other PCIe devices (e.g. our sensor device), which we also stress.
Should we be worried about these messages? Do they indicate some issue with our setup? Or are they safe to ignore?