As the topic says, is there a way to run a full-load test on a Hailo-8 chip and get the power consumption as well as the CPU usage?
The user guide for the Integration Tool posted on the Developer Zone leads to this community. Does anyone have an actual user guide for Integration Tool version 1.18.0?
The Hailo Integration Tool can be used to test a system before deploying it. This will allow you to test the hardware to its limits.
If you want to test more realistic use cases without running applications, you can use the HailoRT CLI tools.
hailortcli run model.hef //for single model
hailortcli run2 set-net model.hef //for multiple models
There are many options to measure power and temperature and to set frame rates and batch sizes. Use --help to list them.
To measure power, you must use Hailo hardware equipped with a power monitoring device. This can be tested by executing the following command:
hailortcli fw-control identify --extended
You should get something like this:
...
Product Name: HAILO-8 AI ACCELERATOR M.2 M KEY MODULE
...
Device supported features: Current Monitoring, PCIE
...
Note: Current monitoring is optional and may not be available on your hardware.
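If you want to script this check, here is a minimal Python sketch. It assumes the output contains a "Device supported features:" line formatted like the excerpt above; the exact layout may differ between HailoRT versions, so treat the parsing as an assumption. The function names are my own and are not part of HailoRT.

```python
import subprocess


def supports_current_monitoring(identify_output: str) -> bool:
    """Return True if the 'Device supported features' line lists Current Monitoring.

    Assumes a 'key: comma, separated, values' layout as in the excerpt above.
    """
    for line in identify_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "device supported features":
            features = [item.strip().lower() for item in value.split(",")]
            return "current monitoring" in features
    return False


def query_device() -> str:
    """Run the identify command from this thread and return its text output."""
    result = subprocess.run(
        ["hailortcli", "fw-control", "identify", "--extended"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

On a machine with HailoRT installed, supports_current_monitoring(query_device()) should tell you whether power measurements are possible on your module.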
CPU usage is quite independent of the Hailo-8 hardware utilization. It largely depends on the model and its usage, for example, the extent of pre- and post-processing required. Therefore, you will need to conduct tests with your specific application and the models you plan to use.
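As a rough way to get a host CPU number for your own application, the sketch below runs a command to completion and compares its CPU time with the wall-clock time. It uses only the Python standard library and works on Linux/Unix only (the resource module is not available on Windows). The helper name measure_cpu is my own, and the demo command is just a placeholder for your real application.

```python
import resource
import subprocess
import sys
import time


def measure_cpu(cmd):
    """Run cmd to completion and return (wall_seconds, cpu_seconds, avg_cores_used)."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # User + system CPU time consumed by the child process (and its waited-for children).
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    avg_cores = cpu / wall if wall > 0 else 0.0
    return wall, cpu, avg_cores


# Placeholder workload; replace the list with the command line of your application.
wall, cpu, cores = measure_cpu([sys.executable, "-c", "sum(range(10**6))"])
print(f"wall={wall:.2f}s cpu={cpu:.2f}s avg cores used={cores:.2f}")
```

Note that passing ["hailortcli", "run", "model.hef"] here would only measure the CPU cost of hailortcli itself, not of your pre- and post-processing, so measure the real application where possible.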
See PM.
What should I do to make sure that the resources on the chip are 100% utilized when I’m running the stress_test based on the Integration Tool? Also, it seems that the parameter “expected_fps” is not a constraint of the test, because even if the input value is much higher than the model can reach, the test still passes.
Is there anything I could do to see the FPS and the temperature at the same time? I want to check whether the performance of the device holds up under long-term, high-temperature conditions.
This is an unrealistic scenario. Neural networks cannot make use of 100% of the hardware resources. The Integration Tool comes with some specifically designed HEFs to test the limits.
The thermal test is designed to generate a report containing the necessary data, including a graph that illustrates the maximum ambient temperature at which you can operate a neural network with a specific power.
The official website also says that the Hailo-8 chip can reach up to 26 TOPS. But most of the models I saw on
hailo_model_zoo/docs/public_models/HAILO8 at master · hailo-ai/hailo_model_zoo · GitHub
use less than 10 TOPS while running. Is 26 TOPS just a theoretical value, or can we reach it by running inference on a specific model?
Both TOPS numbers can be true at the same time.
A model may use the full 26 TOPS, or very close to it, at a specific instant in time when all compute units are active. However, because every layer requires a different amount of compute, the layers run at different FPS. Therefore, layers with a higher FPS have to wait for the layer before or after them. During these times those layers are inactive, resulting in a lower average TOPS number.
For realistic models it is not possible to make all layers run at the exact same speed and therefore the average TOPS number will be lower than the peak TOPS number.
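To make the peak-versus-average argument concrete, here is a toy calculation in Python. It is a deliberately simplified model, not a description of how the Hailo-8 actually schedules work: it assumes each layer owns a fixed share of the compute units, the whole pipeline runs at the FPS of its slowest layer, and a layer is busy only for the fraction of time it is not waiting. The layer shares and FPS values are made up for illustration.

```python
PEAK_TOPS = 26.0  # peak figure quoted in this thread

# Hypothetical layers: (share of compute units, FPS the layer could reach if never stalled).
# The shares sum to 1.0, i.e. the whole chip is allocated.
layers = [
    (0.5, 400.0),
    (0.3, 1000.0),
    (0.2, 2000.0),
]


def average_tops(peak_tops, layers):
    """Return (pipeline_fps, average_tops) under the toy model described above."""
    # The pipeline cannot run faster than its slowest layer.
    pipeline_fps = min(fps for _, fps in layers)
    # A layer that could run at `fps` is busy only pipeline_fps / fps of the time.
    busy_share = sum(share * (pipeline_fps / fps) for share, fps in layers)
    return pipeline_fps, peak_tops * busy_share


pipeline_fps, avg_tops = average_tops(PEAK_TOPS, layers)
print(f"pipeline FPS: {pipeline_fps:.0f}, average TOPS: {avg_tops:.2f}")
```

With these made-up numbers the pipeline runs at 400 FPS and averages roughly 17 TOPS, even though all compute units are allocated. Only when every layer has the same standalone FPS does the average equal the peak.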