AI HAT+ on Raspberry Pi Failure. FW is not loaded to the device

Hello,

My AI HAT+ module was working initially, and I was able to use the DeGirum SDK to run models like scdepthv3 and yolo8x without issues. However, an error began to occur when I tried to run both yolo8x and scdepthv3 simultaneously, and the Raspberry Pi stopped detecting the Hailo NPU.

After this, I tried multiple debugging steps, but the Hailo NPU failed to be detected. I also mistakenly removed the PCIe cable while the Pi was still running.

After the hot-unplug, `hailortcli scan` fails to connect unless I manually force a PCIe reset using:

```bash
echo 1 | sudo tee /sys/bus/pci/devices/0001:01:00.0/reset
```

or

```bash
echo 1 | sudo tee /sys/bus/pci/devices/0001:01:00.0/remove
echo 1 | sudo tee /sys/bus/pci/rescan
```

After running these commands, `/dev/hailo0` appears, but when I try to use the device I get:
```
$ hailortcli fw-control identify
[HailoRT] [warning] FW is not loaded to the device. Please load FW before using the device.
Executing on device: 0001:01:00.0
[HailoRT] [error] Ioctl HAILO_FW_CONTROL failed with 19. Read dmesg log for more info
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_DRIVER_OPERATION_FAILED(36) - Failed in fw_control
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_DRIVER_OPERATION_FAILED(36) - Failed to send fw control
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_DRIVER_OPERATION_FAILED(36)
[HailoRT CLI] [error] CHECK_SUCCESS failed with status=HAILO_DRIVER_OPERATION_FAILED(36)
Failed to execute on device: 0001:01:00.0. status= HAILO_DRIVER_OPERATION_FAILED(36)
```

**Kernel logs show:**
```
[    3.941930] hailo 0001:01:00.0: Writing file hailo/hailo8_fw.bin
[    4.010889] hailo 0001:01:00.0: File hailo/hailo8_fw.bin written successfully
[    4.011002] hailo 0001:01:00.0: File hailo/hailo8_board_cfg.bin written successfully
[    4.011535] hailo 0001:01:00.0: File hailo/hailo8_fw_cfg.bin written successfully
[    9.180756] hailo 0001:01:00.0: Timeout waiting for NNC firmware file, boot status 4294967295
[    9.180766] hailo 0001:01:00.0: Firmware load failed
[    9.180828] hailo 0001:01:00.0: Failed activating board -110
[    9.180850] hailo 0001:01:00.0: probe with driver hailo failed with error -110
```
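For reference, the two numbers in this log can be decoded with plain shell arithmetic. This is only a sketch; the helper names `to_hex` and `errno_of` are made up for illustration. The boot status is all ones in hex, which on PCIe often means a register read got no real answer from the device, and -110 is the negated Linux errno for a timeout.

```shell
# Convert the decimal boot status from dmesg to hex.
to_hex() { printf '0x%X\n' "$1"; }

# Kernel drivers log failures as negated errno values; flip the sign back.
errno_of() { echo "$(( 0 - $1 ))"; }

to_hex 4294967295   # prints 0xFFFFFFFF
errno_of -110       # prints 110, which is ETIMEDOUT on Linux
```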

When attempting resets, I also see:
```
[  215.394330] hailo 0001:01:00.0: not ready 1023ms after FLR; waiting
[  283.618729] hailo 0001:01:00.0: not ready 65535ms after FLR; giving up
[  611.492005] hailo 0001:01:00.0: Failed hailo pcie soft reset. err -110
[  686.409781] hailo 0001:01:00.0: Failed writing fw control to pcie
[  700.880031] hailo 0001:01:00.0: Device disconnected while opening device
```

I have tried:

• Multiple complete power cycles (including 10+ minute waits with module removed)
• Full reinstall of Hailo software packages

When I run `sudo setpci -s 0001:01:00.0 0x0.L`, it returns 28641e60 (the correct vendor and device IDs), so the chip does respond to basic PCIe config reads. But the Neural Network Core won’t boot; the driver just reports boot status 0xFFFFFFFF (4294967295 in the kernel log).
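To show how that value breaks down, here is a small sketch (the function name `split_ids` is hypothetical). The 32-bit word at config offset 0x0 holds the vendor ID in the low 16 bits and the device ID in the high 16 bits, so 28641e60 splits into vendor 0x1e60 and device 0x2864. A device that did not answer config reads at all would typically return ffffffff here instead.

```shell
# Split the 32-bit value at PCI config offset 0x0 into its two 16-bit IDs.
# $1 is the hex string printed by setpci, without a 0x prefix.
split_ids() {
    dword=$(( 0x$1 ))
    printf 'vendor=0x%04x device=0x%04x\n' "$(( dword & 0xFFFF ))" "$(( dword >> 16 ))"
}

split_ids 28641e60   # prints vendor=0x1e60 device=0x2864
```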

Can anyone please help me?

Hey @Abcmouse ,

Welcome to the Hailo Community!

The Raspberry Pi is detecting the device fine and the driver writes the firmware files without problems, but then the Neural Network Core never comes online. We’re seeing boot status stuck at 0xFFFFFFFF and a timeout error (-110). There are also messages about “Firmware load failed” and “Failed activating board,” even though the PCIe connection looks solid and the firmware files are writing properly.

What’s interesting is that you mentioned each model ran fine on its own, and the failures started only when you ran both at the same time. That suggests this may not be a hardware problem; it’s more likely the Hailo service struggling to juggle multiple model instances. Let’s first verify that the hardware is actually okay.

Here’s What I’d Try:

  1. Do a proper reset of the connection:

    • Power down the Pi completely and unplug it
    • Carefully pull out the AI HAT+ and reseat it properly
    • Check both ends of the PCIe ribbon cable and make sure they’re seated firmly
    • Plug everything back in and run these commands:
      lspci | grep Hailo
      lsmod | grep hailo_pci
      dmesg | grep -i hailo
      hailortcli scan 
      hailortcli fw-control identify
      
  2. Make sure your software is up to date:

    • If you haven’t recently, try reinstalling the hailo-all package (version 4.23)

  3. Test with just one model:

    • Run a single model and see if it completes without errors
    • Don’t try running multiple models yet
    • If one model works, your hardware is fine and we can focus on fixing how the service manages concurrent loads
    • Once we confirm this works, I can walk you through setting up the systemd service correctly

If You Still Hit the Same Errors:

If those “Firmware load failed” and “-110” timeout errors keep showing up after all this, there’s a chance we’re looking at actual hardware damage. You mentioned you hot-unplugged the PCIe ribbon cable while everything was powered on, and honestly, that’s pretty risky. That kind of thing can damage the ribbon cable itself, the Hailo chip, or the PCIe connector.

First, completely uninstall the software and start fresh. When you do this, make sure you also remove the driver’s kernel module files; don’t just do a standard uninstall.

If that doesn’t solve it, run those diagnostic commands I mentioned earlier and send us the results. That’ll give us a much better picture of what’s going on.
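If it helps, here is a minimal sketch for gathering those outputs into one file that you can attach to your reply. The function names `run_step` and `collect_hailo_diag` and the default file name `hailo_diag.txt` are just placeholders for illustration, and on some systems `dmesg` may need `sudo`.

```shell
# Append one command's output (or its failing exit status) to the report.
# $1 = report file, $2 = command line to run
run_step() {
    printf '\n===== %s =====\n' "$2" >> "$1"
    sh -c "$2" >> "$1" 2>&1 || printf '(exit status %s)\n' "$?" >> "$1"
}

# Run the diagnostics from step 1 in order and save everything to one file.
collect_hailo_diag() {
    report=${1:-hailo_diag.txt}
    : > "$report"    # start with an empty report
    run_step "$report" "lspci | grep Hailo"
    run_step "$report" "lsmod | grep hailo_pci"
    run_step "$report" "dmesg | grep -i hailo"
    run_step "$report" "hailortcli scan"
    run_step "$report" "hailortcli fw-control identify"
    echo "Wrote $report"
}

# Usage: collect_hailo_diag /tmp/hailo_diag.txt
```

A step that fails, or a tool that is missing, is recorded with its exit status instead of stopping the run, so the report still shows how far things got.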

Let me know how it goes!