Has anyone used the Hailo 8 with KVM and PCI passthrough?

In the VM I’ve got this far:

avid@ai:~$ ai/hailort/build/hailort/hailortcli/hailortcli scan
Hailo Devices:
[-] Device: 0000:06:00.0

and

[    1.837209] hailo_pci: loading out-of-tree module taints kernel.
[    1.838390] hailo_pci: module verification failed: signature and/or required key missing - tainting kernel
[    1.841062] hailo: Init module. driver version 4.18.0
[    1.841436] hailo 0000:06:00.0: Probing on: 1e60:2864...
[    1.841853] hailo 0000:06:00.0: Probing: Allocate memory for device extension, 11632
[    1.841951] input: PC Speaker as /devices/platform/pcspkr/input/input5
[    1.842689] hailo 0000:06:00.0: Probing: Device enabled
[    1.843907] hailo 0000:06:00.0: Probing: mapped bar 0 - 0000000058cf8fd0 16384
[    1.844483] hailo 0000:06:00.0: Probing: mapped bar 2 - 00000000fe068ad9 4096
[    1.844970] hailo 0000:06:00.0: Probing: mapped bar 4 - 00000000b7f8f4c0 16384
[    1.845488] hailo 0000:06:00.0: Probing: Setting max_desc_page_size to 4096, (page_size=4096)
[    1.846087] hailo 0000:06:00.0: Probing: Enabled 64 bit dma
[    1.846468] hailo 0000:06:00.0: Probing: Using userspace allocated vdma buffers
[    1.846999] hailo 0000:06:00.0: Disabling ASPM L0s 
[    1.847467] hailo 0000:06:00.0: Successfully disabled ASPM L0s 
[    1.847971] ACPI: button: Power Button [PWRF]
[    1.866410] hailo 0000:06:00.0: firmware: failed to load hailo/hailo8_board_cfg.bin (-2)
[    1.867158] firmware_class: See https://wiki.debian.org/Firmware for information about missing firmware
[    1.868002] hailo 0000:06:00.0: firmware: failed to load hailo/hailo8_board_cfg.bin (-2)
[    1.868743] hailo 0000:06:00.0: firmware: failed to load hailo/hailo8_fw_cfg.bin (-2)
[    1.869350] hailo 0000:06:00.0: firmware: failed to load hailo/hailo8_fw_cfg.bin (-2)
[    1.870220] hailo 0000:06:00.0: firmware: direct-loading firmware hailo/hailo8_fw.bin

So it’s not too far off.

However, running hailortcli fw-control identify hangs the VM.

I’m using libvirt/kvm with:

  <devices>
...
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </hostdev>
  </devices>
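
For anyone matching up the addresses in the logs: the hostdev entry above maps the host device (the source address) to the address the guest sees. Here is a rough sketch that pulls both out of the XML. The DOMAIN_XML string is just a trimmed copy of the snippet above, and bdf and hostdev_mappings are names I made up for illustration. Whether the guest really ends up at the requested address depends on the VM's PCI topology, though it does match the guest dmesg here.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <hostdev> element from the post (the "..." is left out).
DOMAIN_XML = """
<devices>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
      <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </source>
    <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
  </hostdev>
</devices>
"""


def bdf(addr):
    """Format a libvirt <address> element as domain:bus:slot.function."""
    domain, bus, slot, func = (
        int(addr.get(key), 16) for key in ("domain", "bus", "slot", "function")
    )
    return f"{domain:04x}:{bus:02x}:{slot:02x}.{func:x}"


def hostdev_mappings(xml_text):
    """Return a list of (host address, guest address) pairs for PCI hostdevs."""
    root = ET.fromstring(xml_text)
    pairs = []
    for hostdev in root.iter("hostdev"):
        host = hostdev.find("source/address")  # device as seen on the host
        guest = hostdev.find("address")        # address requested in the guest
        if host is not None and guest is not None:
            pairs.append((bdf(host), bdf(guest)))
    return pairs


if __name__ == "__main__":
    for host, guest in hostdev_mappings(DOMAIN_XML):
        print(f"host {host} -> guest {guest}")
```

So the vfio-pci messages about 0000:01:00.0 and the guest driver messages about 0000:06:00.0 refer to the same card.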

The VM starts eventually, but I’m getting:

[ 2348.723797] vfio-pci 0000:01:00.0: enabling device (0000 -> 0002)
[ 2350.067410] vfio-pci 0000:01:00.0: not ready 1023ms after FLR; waiting
[ 2351.123236] vfio-pci 0000:01:00.0: not ready 2047ms after FLR; waiting
[ 2353.266993] vfio-pci 0000:01:00.0: not ready 4095ms after FLR; waiting
[ 2357.618647] vfio-pci 0000:01:00.0: not ready 8191ms after FLR; waiting
[ 2366.065970] vfio-pci 0000:01:00.0: not ready 16383ms after FLR; waiting
[ 2382.704794] vfio-pci 0000:01:00.0: not ready 32767ms after FLR; waiting

I snipped the guest dmesg a bit too much. It also has:

[    2.179770] hailo 0000:06:00.0: Firmware was loaded successfully
[    2.226824] hailo 0000:06:00.0: Probing: Added board 1e60-2864, /dev/hailo0

So the kernel is seeing the device clearly enough in the guest for the driver module to be happy.

And the vfio-pci issues are (obviously?) in the host.

On the host side, the AMD docs website, under /r/en-US/pg302-qdma/Function-Level-Reset (no links!!), says that

echo 1 > /sys/bus/pci/devices/$BDF/reset

will do an FLR.
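
Before writing to the reset file, it can help to see which reset methods the kernel thinks the device supports. This is a minimal sketch, assuming a kernel recent enough to expose the reset_method attribute next to reset in sysfs (I believe 5.15 or later, so Debian 12 should have it). The function name reset_info and its sysfs_root parameter are my own, added so it can be pointed at a test directory.

```python
from pathlib import Path


def reset_info(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Report whether a PCI device exposes 'reset' and which methods are listed.

    Assumption: 'reset_method' holds a space-separated list such as 'flr bus'.
    On kernels without that attribute the list simply comes back empty.
    """
    dev = Path(sysfs_root) / bdf
    method_file = dev / "reset_method"
    methods = method_file.read_text().split() if method_file.is_file() else []
    return {
        "has_reset": (dev / "reset").exists(),
        "methods": methods,
    }


if __name__ == "__main__":
    # 0000:01:00.0 is the host address of the Hailo-8 in this thread.
    print(reset_info("0000:01:00.0"))
```

If flr shows up in the list, the device advertises FLR in its capabilities, which is a separate question from whether it actually comes back after one.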

If I rmmod all the vfio modules (or reboot) and try this with “journalctl -f &” running, I get:

root@elm2:~# echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/reset
Jul 26 17:37:38 elm2 kernel: pci 0000:01:00.0: not ready 1023ms after FLR; waiting
Jul 26 17:37:39 elm2 kernel: pci 0000:01:00.0: not ready 2047ms after FLR; waiting
Jul 26 17:37:41 elm2 kernel: pci 0000:01:00.0: not ready 4095ms after FLR; waiting
Jul 26 17:37:46 elm2 kernel: pci 0000:01:00.0: not ready 8191ms after FLR; waiting
Jul 26 17:37:54 elm2 kernel: pci 0000:01:00.0: not ready 16383ms after FLR; waiting
Jul 26 17:38:11 elm2 kernel: pci 0000:01:00.0: not ready 32767ms after FLR; waiting
root@elm2:~# Jul 26 17:38:46 elm2 kernel: pci 0000:01:00.0: not ready 65535ms after FLR; giving up
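
To make the pattern in that output easier to see, here is a small sketch that parses the "not ready ... after FLR" lines and summarises them. The LOG string is copied from the output above (the shell prompt prefix on the last line is dropped); the regex and the name summarize_reset_waits are mine. It only reads the messages, so it says nothing about why the device does not come back.

```python
import re

# Host journal lines copied from the post.
LOG = """\
Jul 26 17:37:38 elm2 kernel: pci 0000:01:00.0: not ready 1023ms after FLR; waiting
Jul 26 17:37:39 elm2 kernel: pci 0000:01:00.0: not ready 2047ms after FLR; waiting
Jul 26 17:37:41 elm2 kernel: pci 0000:01:00.0: not ready 4095ms after FLR; waiting
Jul 26 17:37:46 elm2 kernel: pci 0000:01:00.0: not ready 8191ms after FLR; waiting
Jul 26 17:37:54 elm2 kernel: pci 0000:01:00.0: not ready 16383ms after FLR; waiting
Jul 26 17:38:11 elm2 kernel: pci 0000:01:00.0: not ready 32767ms after FLR; waiting
Jul 26 17:38:46 elm2 kernel: pci 0000:01:00.0: not ready 65535ms after FLR; giving up
"""

WAIT_RE = re.compile(r"not ready (\d+)ms after (\w+); (waiting|giving up)")


def summarize_reset_waits(log_text):
    """Summarise the kernel's post-reset wait messages found in log_text."""
    hits = WAIT_RE.findall(log_text)
    if not hits:
        return None
    reports = [int(ms) for ms, _, _ in hits]
    return {
        "reset_type": hits[-1][1],
        "reports_ms": reports,
        # True when each reported value is double the previous one, plus one.
        "doubling": all(b == 2 * a + 1 for a, b in zip(reports, reports[1:])),
        "gave_up": hits[-1][2] == "giving up",
    }


if __name__ == "__main__":
    print(summarize_reset_waits(LOG))
```

On the log above this shows seven reports, each roughly double the previous one, ending with the kernel giving up at 65535ms. The vfio-pci log from the VM start shows the same doubling up to 32767ms, just without the final "giving up" line in what I pasted.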

Is this a bug in the PCI handling of the device/firmware?

Welcome to the Hailo Community!

We do not validate HailoRT with VMs. You may be able to get this to work or it may fail.

Out of curiosity, can you share some details about your application and why you want to use a VM?

Hey @klausk

That last test was on bare metal. I wanted to see whether the board supported FLR there. It doesn’t. Hopefully that’s a bug :)

The application is natural-language speech intent analysis for automation systems. It’s early days at the moment.

My goal is to deploy a development VM on my main VM server. It’s a Debian 12 box with a number of dev VMs on it under KVM.

The PCI passthrough should ‘just work’ and is great for GPUs.

I understand this isn’t a supported setup, but it would be very useful to get it working, so I’d appreciate any help I can get!

Since that does not seem to be the case for the Hailo-8 right now, I suspect it required some engineers to develop this capability for GPUs.

I understand that. I am not sure how I can assist you further.