I have deployed an AI model on a Raspberry Pi, and now I need to measure the TOPS consumed by the model.
As I am new to Hailo, I need help measuring the TOPS.
You cannot directly measure the TOPS of a model running on a Hailo device. The architecture is similar to an FPGA, with compute resources distributed across the device, rather than a CPU with cycle counters.
However, you can generate a profiler report for your model, which provides the number of operations (OPS) per input tensor. To do this, follow the built-in tutorials in the Hailo AI Software Suite. Inside the Docker container, run the following command:
hailo tutorial
The usage of the profiler is demonstrated in the DFC 3 Compilation tutorial.
For models available in the Model Zoo, you can download the corresponding Profiler Report directly from our GitHub page by clicking the PR link. For example:
GitHub - Hailo Model Zoo - Hailo-8 Object Detection
You can measure the maximum FPS of a compiled model on your platform using the HailoRT CLI. Simply run:
hailortcli run model.hef
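With this information, you can calculate the maximum TOPS for your setup: multiply the number of operations per input tensor from the profiler report by the measured FPS, then divide by 10^12. Keep in mind that the value may change if your application runs at a lower FPS or on a different host, for example, one with more PCIe lanes.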
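As a rough sketch of that calculation in Python: the function name `tops` and the example numbers below are hypothetical placeholders, not values from any real profiler report. Substitute the OPS figure from your own report and the FPS printed by hailortcli.

```python
def tops(ops_per_frame: float, fps: float) -> float:
    """Tera-operations per second executed at a given frame rate."""
    return ops_per_frame * fps / 1e12


# Hypothetical inputs: replace with the OPS per input tensor from your
# profiler report and the FPS measured with `hailortcli run model.hef`.
ops_per_frame = 20e9  # a model that needs 20 GOPs per input frame
fps = 60.0

print(f"{tops(ops_per_frame, fps):.3f} TOPS")
```

This is the throughput the model actually uses at that frame rate, not the rated peak of the device, so it scales linearly with whatever FPS your application really achieves.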
Can I get a more detailed explanation so that I can understand this better?
On the terminal I got the following output. Can I get some clarity on it so that I can understand it?
HW-only FPS : 61.863900
Peak TOPS (HW-only) : 1.232915655
Streaming FPS : 0.353611
Streaming TOPS : 0.007047285
hw_only mode runs the model without dequantizing the data, thereby removing the influence of host CPU performance.
Streaming mode runs the model including dequantization of the data, and NMS when it is included in the HEF. Because these steps run on the host CPU, the streaming FPS can be lower than the hw_only FPS.
In your case the streaming number is unexpectedly low. Is this from the same model?
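One quick sanity check on the numbers posted above, sketched in Python. It assumes both TOPS figures were computed as operations per frame × FPS / 10^12, which is a guess about how that output was generated; the helper name `implied_gops_per_frame` is made up for illustration.

```python
# Values copied from the terminal output posted above.
hw_fps, hw_tops = 61.863900, 1.232915655
st_fps, st_tops = 0.353611, 0.007047285


def implied_gops_per_frame(tops: float, fps: float) -> float:
    """Back out giga-operations per frame, assuming TOPS = ops * FPS / 1e12."""
    return tops * 1e12 / fps / 1e9


hw_gops = implied_gops_per_frame(hw_tops, hw_fps)
st_gops = implied_gops_per_frame(st_tops, st_fps)
slowdown = hw_fps / st_fps  # how much slower streaming is than hw_only

print(f"HW-only:   {hw_gops:.2f} GOPs per frame")
print(f"Streaming: {st_gops:.2f} GOPs per frame")
print(f"Streaming is about {slowdown:.0f}x slower than HW-only")
```

If both rows give the same operations per frame, the two TOPS values differ only because of the FPS, so the thing to investigate is why the streaming FPS is so far below the hw_only FPS.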
If your model is compiled into multiple contexts, you can achieve a higher FPS/TOPS by using the --batch-size parameter, because it reduces the context-switching overhead at the cost of latency. You can find out whether your model is single-context or multi-context by running the following command:
hailortcli parse-hef model.hef
Then run the model with a larger batch size, for example:
hailortcli run model.hef --batch-size 8