So I bought the Raspberry Pi AI Kit, assuming it would work as easily as the Coral Edge TPU USB Accelerator. Unfortunately, that is not the case, and now I’m stuck trying to figure out the dependencies of the whole Hailo stack.
Wouldn’t it be great if one could use the AI Kit (Hailo-8L) directly with TensorFlow Lite, just like one would use the Coral Edge TPU?
So my request is, please add a delegate to TensorFlow Lite to support the Hailo-8.
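To make the request concrete, here is a minimal sketch of how a delegate plugs into TensorFlow Lite today. The Coral path (`libedgetpu.so.1`) is the real, existing Edge TPU delegate; the Hailo library name below is purely hypothetical — no such delegate exists, which is exactly what this request is asking for:

```python
def make_interpreter(model_path, delegate_lib):
    """Build a TFLite interpreter that offloads ops to a hardware delegate.

    delegate_lib is the accelerator's delegate shared library. For the
    Coral Edge TPU this is "libedgetpu.so.1" (real, shipped by Google);
    a name like "libhailo_delegate.so" is hypothetical -- it is what a
    Hailo delegate would look like if one existed.
    """
    from tflite_runtime.interpreter import Interpreter, load_delegate

    delegate = load_delegate(delegate_lib)
    return Interpreter(model_path=model_path,
                       experimental_delegates=[delegate])

# Works today on a Coral Edge TPU:
#   interpreter = make_interpreter("model_edgetpu.tflite", "libedgetpu.so.1")
# The ask: the same one-liner for the AI Kit, with no separate Hailo toolchain.
```

The point is that the accelerator-specific part collapses to a single `load_delegate()` call, and the rest of the TFLite pipeline stays identical across hardware.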
My main goal is to get the Hailo-8L working on a Raspberry Pi 5, running Ubuntu 24.04, as that is the setup our industrial customers use. I’ll pop the details of the issues in another post.
I understand this, but on the other hand this causes fragmentation in the market, where we would rather want to see standardisation.
If a customer comes to us with an existing model, we would like to deploy that model on any platform: AMD64 CPU, Nvidia, Coral TPU, Hailo, MediaTek, etc. TensorFlow and TF Lite look like a good standard to support for this. But I’m open to suggestions.
The only thing we want to avoid here is having to build and maintain a pipeline for each specific hardware accelerator - especially if these pipelines need manual intervention, with context and knowledge of the model, like they currently do.
What I want to see is an ML workload that is as portable as a Linux amd64 binary - it will run on any CPU supporting the amd64 instruction set.
Exactly the issue. We are not talking about just CPUs, but CPUs, GPUs, and specialized hardware like the Hailo-8. Even on CPUs you have different instruction sets and extensions like AVX and NEON.
While this might be inconvenient, it also allows us to innovate and find unique solutions. And it keeps us engineers employed. Would you still work on AI if it were simple, or would you want to work on the next big thing?
That is the price you pay if you want to run a neural network on a low-power, high-efficiency chip at the price of a Hailo-8 instead of an expensive GPU. That may not always be the economical thing to do: if you only have one system and plenty of space, you are better off paying for an expensive GPU. Once you want to scale, paying for extra software development pays off many times over in reduced hardware cost.
I second this request 100%. The reason I use a Raspberry Pi instead of an rkXXXX or some other hot benchmarking CPU is that I want to code my own project, not spend 40 hours debugging their solution stack. I had the 4-TOPS Coral accelerator from the beginning, and I was excited to finally see a 13/26-TOPS vs. 4-TOPS solution on the RPi5. I dropped the bucks with no hesitation, only to find out I had just bought a time sink that forced the design pattern of my code. I am glad I didn’t purchase the 26-TOPS model. You see, I pay for a solution, not just a piece of difficult hardware with TOPS written on it. To be worth the money, the Hello World example should look like this:
import cv2
from picamera2 import Picamera2

picam2 = Picamera2()
picam2.start()

model = HAILO("yolo11n.pt")  # hypothetical one-line loader, Ultralytics-style
while True:
    frame = picam2.capture_array()
    results = model(frame)
    annotated_frame = results[0].plot()
    cv2.imshow("Camera", annotated_frame)
    if cv2.waitKey(1) == ord("q"):
        break
Now show me 100 fps at 640x480, and I would buy the 26-TOPS model.
@cfrank
So glad to hear you got it to work. Happy to see that the final code looks the way you envisioned. The 30 fps is most likely limited by the camera. If you need higher performance (more cameras and/or more models), you can use the batch predict method in our PySDK. Please let me know if you need help with using batch predict, with any models, or with any other advanced features like tracking, tiling, etc.
@jpm
I am not sure if you are still looking for a solution, but at DeGirum we developed our PySDK to solve this exact problem. Please take a look at Simplifying Edge AI Development with DeGirum PySDK and Hailo. Apart from Hailo, PySDK supports Google Edge TPU, NVIDIA GPUs/SoCs, RockChip SoCs, and Intel CPUs/GPUs/NPUs, so you can use one software stack to target multiple hardware options.
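For readers wondering what that single software stack looks like in practice, here is a rough sketch based on DeGirum’s published PySDK examples. The zoo URL and model name below are illustrative placeholders, not guaranteed identifiers, and the exact connection arguments depend on your setup (local inference vs. AI server vs. cloud):

```python
def run_inference(image_path):
    """Sketch: one PySDK call path; zoo URL and model name are placeholders."""
    import degirum as dg

    # Connect to a model zoo; dg.LOCAL runs inference on this machine.
    zoo = dg.connect(dg.LOCAL, "degirum/public")
    # The same model name pattern covers Hailo, Edge TPU, GPU, etc. --
    # swapping hardware means picking a different compiled model, not
    # rewriting the pipeline.
    model = zoo.load_model("some_model_name")
    return model(image_path)  # returns a result object with detections
```

The design point being made in the post above is that the hardware-specific compilation lives in the model zoo, so application code like this stays identical across accelerators.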