Inference using scrfd models

Hi everyone!

I am trying to make sense of running inference using the scrfd models provided in the Hailo Model Zoo and have a few questions:

  • To run meaningful inference using a scrfd model (e.g. scrfd_2.5g), I need to/should use the postprocessor script from the Hailo Model Zoo located under hailo_model_zoo/core/postprocessing/face_detection, right?
  • To use the postprocessor, I need to supply the anchor sizes. I obtained those by looking at hailo_model_zoo/cfg/base/scrfd.yaml and copying the values from there into a simple dict like this: anchors = {"steps": [8, 16, 32], "min_sizes": [[16, 32], [64, 128], [256, 512]]}. Is this correct? (And if so, wouldn’t it make sense to declare these values as defaults in the script?)
  • Finally, I would need to actually perform the postprocessing using the tf_postproc function. Here is where I’m stuck - what exactly are the endnodes supposed to be? I copied the approach from the Hailo-Application-Example code - Python - Streaming example that is based on a yolox model. However, the desired inputs and ways to use the postprocessor differ. Based on the output shape of scrfd and the endnodes used in the code example, I assume that I have to derive the endnodes in some way from the output shape of the model - but the numbers used in the example code look essentially like magic numbers to me. Could someone help me out here?

As I plan on using Hailo-8 in a commercial environment and am unfortunately in no position to purchase a license from DeGirum, using DeGirum’s PySDK is not an option here.

Hey @Nils-Oliver ,

Your assumptions in the first two questions are correct.

Finding End Nodes

To identify the end nodes needed for post-processing:

  1. Use hailomz info scrfd_2.5g to get the complete list of model layers
  2. Look for layers with naming patterns like bbox_, cls_, or landmark_ followed by numbers
  3. Check model configuration in hailo_model_zoo/cfg/networks/scrfd_2.5g.yaml for declared outputs
  4. Visualize the model architecture in Netron:
       netron path/to/scrfd_2.5g.onnx
  5. Run with profiling enabled to trace tensor flow:
       hailomz profiler scrfd_2.5g.har or hailomz parse-hef scrfd_2.5g.hef

The typical SCRFD end nodes include three groups of tensors for each feature level (strides 8, 16, and 32), representing bounding boxes, confidence scores, and facial landmarks.
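
To make the link between output shapes and end nodes less "magic", here is a minimal sketch of how the nine tensors could be sorted into those groups by shape alone. It reuses the steps and min_sizes values from the question. Everything else is an assumption to verify against your own model: a 640x640 input, NHWC output shapes, and the usual SCRFD head layout of 1 score, 4 box values, and 10 landmark values per anchor. The function names and the layer names in the usage note are hypothetical and not part of the Hailo Model Zoo.

```python
from collections import defaultdict

# Values copied from the anchors dict in the question.
STEPS = [8, 16, 32]
MIN_SIZES = [[16, 32], [64, 128], [256, 512]]

# Assumed SCRFD head layout: values predicted per anchor for each output type.
VALUES_PER_ANCHOR = {"cls": 1, "bbox": 4, "landmark": 10}


def expected_shapes(input_hw=(640, 640)):
    """Return {(step, kind): (H, W, C)} for every expected output tensor."""
    height, width = input_hw
    shapes = {}
    for step, sizes in zip(STEPS, MIN_SIZES):
        num_anchors = len(sizes)  # two anchor sizes per feature level
        for kind, values in VALUES_PER_ANCHOR.items():
            shapes[(step, kind)] = (height // step, width // step, num_anchors * values)
    return shapes


def group_outputs(output_shapes, input_hw=(640, 640)):
    """Map {layer_name: (H, W, C)} to {step: {kind: layer_name}}."""
    lookup = {shape: key for key, shape in expected_shapes(input_hw).items()}
    groups = defaultdict(dict)
    for name, shape in output_shapes.items():
        key = lookup.get(tuple(shape))
        if key is None:
            raise ValueError(f"Unexpected output shape {shape} for layer {name}")
        step, kind = key
        groups[step][kind] = name
    return dict(groups)
```

For example, group_outputs({"conv_a": (80, 80, 8), "conv_b": (80, 80, 2)}) would file conv_a as the stride-8 bbox tensor and conv_b as the stride-8 score tensor. The order in which the postprocessor expects these tensors is not covered here and should be checked against the script itself.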