Hi,
I want to use yolov7.hef (from model zoo) with 2 hailo-8 devices.
When I ran hailortcli scan, the results showed two devices:
Device: 0000:01:00.0
Device: 0001:01:00.0
When I run hailortcli run yolov7.hef --device-count 2, it progresses to a certain point, but then the inference stops and doesn’t continue.
However, if I set the device-count option to 1 or don’t specify it at all, it runs fine.
Additionally, when using shortcut_net.hef (from libhailort’s examples) instead of yolov7.hef, it works fine even with device-count set to 2.
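The four combinations described above (two HEF files, device count 1 or 2) can be checked in one pass. This is a minimal sketch, assuming hailortcli is on the PATH and both HEF files are in the current directory. The 60-second timeout and the result labels ("ok", "failed", "hung", "not-found") are assumptions made for illustration, not part of HailoRT.

```python
import subprocess
from itertools import product


def build_command(hef_path, device_count):
    """Build the hailortcli invocation for one HEF / device-count pair."""
    return ["hailortcli", "run", hef_path, "--device-count", str(device_count)]


def run_case(hef_path, device_count, timeout_s=60):
    """Run one case and classify the outcome with a hypothetical label."""
    try:
        result = subprocess.run(
            build_command(hef_path, device_count),
            capture_output=True,
            text=True,
            timeout=timeout_s,  # assumed limit; a stalled run is reported as "hung"
        )
    except subprocess.TimeoutExpired:
        return "hung"
    except FileNotFoundError:
        return "not-found"  # hailortcli is not installed or not on the PATH
    return "ok" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    for hef, count in product(["yolov7.hef", "shortcut_net.hef"], [1, 2]):
        print(f"{hef} --device-count {count}: {run_case(hef, count)}")
```

If only the yolov7.hef case with two devices is reported as hung or failed, that matches the behaviour described in this post.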
I’m using:
Could this issue be related to yolov7.hef being multi-context? (The number of contexts for yolov7.hef is shown as 4.)
Am I the only one experiencing this issue? If anyone has this working correctly, please let me know.
Also, if anyone has any idea why this might be happening, please share your thoughts. Thank you.
This is working fine on my system with 5 Hailo-8 (4 on a Hailo-8 Century PCIe card and one on a Thunderbolt adapter).
Your PCIe IDs look different from mine. How are the Hailo-8 devices connected? What is the platform? For example, are these two Hailo-8 M.2 modules of the same type on a motherboard, or on some adapter with or without a PCIe switch?
Do you get any error messages?
Thanks for your response and for looking into my issue.
Regarding the platform I’m using with the Hailo chip, while I’m not a hardware specialist, I’ve heard from the person who’s handling the hardware side of our project that we have two Hailo chips directly integrated as a custom addition to an NXP LX2160A board.
Specifically, they are attached to PCIe controller 1 and PCIe controller 2 respectively.
I forgot to mention something earlier. When I explicitly specify --frames-count (e.g. --frames-count 100), the error becomes clear: the process stops at a certain percentage, and after 10 seconds, the following error message appears:
$ hailortcli run /var/yolov7.hef --device-count 2 --frames-count 100
Running streaming inference (/var/yolov7.hef):
Transform data: true
Type: auto
Quantized: true
Network yolov7/yolov7: 72% | 72/100 | FPS: 7.19 | ETA: 00:00:03
[HailoRT] [error] Got HAILO_TIMEOUT while waiting for input stream buffer yolov7/input_layer1
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_TIMEOUT(4) - Failed write to stream (device: 0000:01:00.0)
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_TIMEOUT(4) - HwWriteEl1yolov7/input_layer1 (H2D) failed with status=HAILO_TIMEOUT(4)
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_TIMEOUT(4)
[HailoRT] [error] Queue element PushQEl1yolov7/input_layer1 run in thread function failed! status = HAILO_TIMEOUT(4)
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_TIMEOUT(4)
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_TIMEOUT(4)
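When comparing several failing runs, it can help to reduce each log to the last frame reached, the status codes seen, and the device named in the error. The following is a rough sketch that parses output in the format shown above. The function name summarize and the shape of the returned dictionary are made up for this example, and the regular expressions are only matched against the lines in this post.

```python
import re
from collections import Counter

# Patterns derived from the log lines quoted in this post.
PROGRESS_RE = re.compile(r"(\d+)%\s*\|\s*(\d+)/(\d+)\s*\|\s*FPS:\s*([\d.]+)")
STATUS_RE = re.compile(r"status\s*=\s*(HAILO_\w+)\((\d+)\)")
DEVICE_RE = re.compile(r"\(device: ([0-9a-fA-F:.]+)\)")


def summarize(log_text):
    """Collect the last progress line, status counts, and device IDs."""
    summary = {"last_progress": None, "statuses": Counter(), "devices": set()}
    for line in log_text.splitlines():
        progress = PROGRESS_RE.search(line)
        if progress:
            # (frames done, frames requested)
            summary["last_progress"] = (int(progress.group(2)), int(progress.group(3)))
        if "[error]" in line:
            for name, _code in STATUS_RE.findall(line):
                summary["statuses"][name] += 1
            summary["devices"].update(DEVICE_RE.findall(line))
    return summary
```

Running this over the output of repeated attempts would show whether the stall always happens near the same frame and whether the failed write is always reported for the same device.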
I hope this detailed information helps in diagnosing the problem. Please let me know if you require any further clarification.
Thanks again for your assistance!
Best regards,
Dear @KlausK
I’ve made some more specific findings regarding the issues I was previously experiencing, which allows me to ask more targeted questions.
Recently, I downloaded the yolov7.hef file from the Hailo Model Zoo GitHub repository under tag v2.12, and it is working perfectly fine with my setup (libhailort, Hailo PCIe driver, and Hailo firmware version 4.18.0).
Previously, I had a yolov7.hef file that caused errors when running with two or more chips. While I believed this older file was also sourced from the Hailo Model Zoo, I am no longer certain about the exact download location due to the time elapsed.
Upon checking my download history, I found that the older, problematic yolov7.hef was downloaded around August 2024. The crucial difference I noted is the download link:
I have reviewed the commit history of the Hailo Model Zoo GitHub repository to see if there were any instances where the download link for yolov7.hef resembled the https://hailo-csdata.s3.eu-west-2.amazonaws.com/resources/hefs/h8/yolov7.hef format. However, in all the links I examined (though I haven’t gone through every single commit), the URLs followed the https://hailo-model.zoo … structure.
Therefore, I have the following two questions:
- Could the https://hailo-csdata.s3.eu-west-2.amazonaws.com/resources/hefs/h8/yolov7.hef file have indeed originated from the Hailo Model Zoo GitHub at some point? If not, could you please clarify where this specific yolov7.hef download link might have been posted?
- Could you shed some light on the differences between these two yolov7.hef files that might cause the older https://hailo-csdata.s3.eu-west-2.amazonaws.com/resources/hefs/h8/yolov7.hef to produce errors when running with two or more chips? If the exact root cause is unclear, please do not feel obligated to provide an answer to this question.
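Since both files share the name yolov7.hef, a size and checksum comparison is a quick way to confirm they are different builds and to identify each one unambiguously in later discussion. This is a minimal sketch using only the Python standard library. The function names and any file paths passed to them are placeholders, and the check says nothing about what differs inside the HEF.

```python
import hashlib
from pathlib import Path


def file_fingerprint(path, chunk_size=1 << 20):
    """Return (size in bytes, SHA-256 hex digest) for the file at path."""
    file_path = Path(path)
    digest = hashlib.sha256()
    with file_path.open("rb") as handle:
        # Read in chunks so large HEF files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return file_path.stat().st_size, digest.hexdigest()


def same_file(path_a, path_b):
    """True when both files have identical size and SHA-256 digest."""
    return file_fingerprint(path_a) == file_fingerprint(path_b)
```

Quoting the two digests alongside the download URLs would let others check which of the two files they have.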
Thank you for your time and assistance with this matter.
Sincerely,