Hello, I deployed my convolutional neural network on the Raspberry Pi 5 AI Kit (26 TOPS) with HailoRT version 4.20.
I now want to optimize the running speed of my network. One approach is to examine how well my network fits the Hailo-8.
That is, I want to measure the actual running time of each component during a single inference and obtain data like that shown in the following figure. Are there any recommended methods?
You can create a profiler report. It reports the FPS and other data for each layer. To find out how to create one, check the Hailo Dataflow Compiler User Guide or run the tutorials in the Hailo AI Software Suite Docker with the following command:
hailo tutorial
You can also run:
hailo profiler --help
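If you prefer to script this step, here is a minimal Python sketch that wraps the profiler call. It assumes the profiler takes the path to a compiled HAR file as its argument (please confirm with the --help output above), and `my_model.har` is only a placeholder name, not a file from this thread.

```python
import shutil
import subprocess


def profiler_command(har_path):
    """Build the profiler invocation for a HAR file.

    Assumption: the HAR path is passed as a positional argument.
    """
    return ["hailo", "profiler", har_path]


def run_profiler(har_path):
    """Run the profiler if the `hailo` CLI is on PATH.

    Returns the process exit code, or None when the CLI is not installed
    (for example, outside the Hailo AI Software Suite Docker).
    """
    if shutil.which("hailo") is None:
        return None
    return subprocess.run(profiler_command(har_path)).returncode


if __name__ == "__main__":
    # "my_model.har" is a hypothetical file name; replace it with your own HAR.
    print(run_profiler("my_model.har"))
```

Run it inside the environment where the Dataflow Compiler is installed; elsewhere it just prints None.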
Note: The Hailo Dataflow Compiler can make the same layer run faster or slower by assigning it a different amount of resources. The same layer will therefore run faster in a smaller network, because more resources are available for each layer.
To get the maximum performance for a specific network, you can use the performance mode of the compiler. Compilation in this mode takes significantly longer, because the compiler starts with very high utilization levels, which might not allocate successfully. If allocation fails, it automatically retries with lower utilization until it finds the highest level that works. See the Hailo Dataflow Compiler User Guide.
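To illustrate why performance mode takes longer, here is a conceptual Python sketch of the "try high, fall back lower" search described above. It is not the compiler's actual implementation: `toy_allocate`, the utilization levels, and the 80% threshold are all made up for the example.

```python
def highest_feasible_utilization(try_allocate, levels):
    """Return the highest level for which try_allocate succeeds, else None.

    Levels are tried from highest to lowest, so every failed attempt
    adds to the total search time.
    """
    for level in sorted(levels, reverse=True):
        if try_allocate(level):
            return level
    return None


def toy_allocate(level):
    # Hypothetical stand-in for an allocation attempt:
    # pretend that anything above 80% utilization fails.
    return level <= 80


if __name__ == "__main__":
    # Tries 100, then 90, then 80 (first success).
    print(highest_feasible_utilization(toy_allocate, [100, 90, 80, 70]))  # prints 80
```

In this toy run, two attempts fail before one succeeds. In a real compilation each attempt is a full allocation pass, which is where the extra time goes.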