Face Detection + Gender Classification: Pipelining two models on Hailo devices

User Guide: Model Pipelining with DeGirum PySDK

Model pipelining is a versatile technique in AI in which multiple models work in sequence: the output of one model is used as the input to another, enabling more sophisticated applications by combining specialized tasks. This guide introduces the concept using a practical example of face detection followed by gender classification.


Example: Face Detection and Gender Classification

In this example, we use two models:

  1. Face Detection Model: Detects faces in a video stream and generates bounding boxes around them.
  2. Gender Classification Model: Classifies the gender of each detected face.

The models are combined into a pipeline where the face detection model processes the input video, and its outputs (cropped face regions) are passed to the gender classification model for further analysis.


Code Reference

import degirum as dg, degirum_tools

# choose inference host address
inference_host_address = "@cloud"
# inference_host_address = "@local"

# choose zoo_url
zoo_url = "degirum/models_hailort"
# zoo_url = "../models"

# set token
token = degirum_tools.get_token()
# token = '' # leave empty for local inference

face_det_model_name = "yolov8n_relu6_face--640x640_quant_hailort_hailo8l_1"
gender_cls_model_name = "yolov8n_relu6_fairface_gender--256x256_quant_hailort_hailo8l_1"
video_source = "../assets/faces_and_gender.mp4"

# Load face detection and gender classification models
face_det_model = dg.load_model(
    model_name=face_det_model_name,
    inference_host_address=inference_host_address,
    zoo_url=zoo_url,
    token=token,
    overlay_color=[(255, 255, 0), (0, 255, 0)]
)

gender_cls_model = dg.load_model(
    model_name=gender_cls_model_name,
    inference_host_address=inference_host_address,
    zoo_url=zoo_url,
    token=token,
)

# Create a compound cropping model with 30% crop extent
crop_model = degirum_tools.CroppingAndClassifyingCompoundModel(
    face_det_model,
    gender_cls_model,
    crop_extent=30.0
)

# Run AI inference on video stream and display inference results
# Press 'x' or 'q' to stop
with degirum_tools.Display("Faces and Gender") as display:
    for inference_result in degirum_tools.predict_stream(crop_model, video_source):
        display.show(inference_result)

How It Works

  1. Model Loading:

    • The face detection and gender classification models are loaded using dg.load_model.
    • The overlay_color parameter is used to differentiate detected faces visually.
  2. Pipeline Creation:

    • A compound model is created using degirum_tools.CroppingAndClassifyingCompoundModel, which combines face detection and gender classification.
    • The crop_extent parameter (30% in this example) ensures that the face region is appropriately cropped for the second model.
  3. Inference Execution:

    • The predict_stream function runs the compound model on the input video source.
    • Cropped face regions are passed from the detection model to the classification model.
  4. Result Display:

    • Detected faces and their classified genders are displayed in a dedicated window. Use the ‘x’ or ‘q’ keys to stop the display.
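The effect of the crop extent can be illustrated with plain bounding-box arithmetic. The sketch below is not PySDK code; it is one plausible interpretation of how a 30% crop extent enlarges a detected face box before the crop is passed to the classifier (the function name and box format are illustrative):

```python
def expand_bbox(bbox, crop_extent, img_w, img_h):
    """Enlarge an [x1, y1, x2, y2] box by crop_extent percent per dimension,
    clipped to the image bounds; illustrative of what a cropping compound
    model might do before handing the crop to the classifier."""
    x1, y1, x2, y2 = bbox
    w, h = x2 - x1, y2 - y1
    dx = w * crop_extent / 100.0 / 2.0  # half the extra width on each side
    dy = h * crop_extent / 100.0 / 2.0  # half the extra height on each side
    return [
        max(0, x1 - dx),
        max(0, y1 - dy),
        min(img_w, x2 + dx),
        min(img_h, y2 + dy),
    ]

# A 100x100 face box in a 640x640 frame grows by 30 pixels per dimension
print(expand_bbox([100, 100, 200, 200], 30.0, 640, 640))
# → [85.0, 85.0, 215.0, 215.0]
```

A larger crop gives the classifier more context (hair, forehead, chin), which often helps attribute classifiers such as the gender model used here.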

Applications

  • Video analytics
  • Security and surveillance
  • Retail and customer insights
  • Personalized user experiences

This pipelining approach allows you to extend workflows to include additional tasks, such as emotion detection or age estimation, making it a scalable and modular solution for complex AI applications.
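To make the pipelining pattern concrete independent of any hardware, here is a toy sketch of the detect → crop → classify composition using plain Python callables. The classes, boxes, and labels are made up purely for illustration; in PySDK the equivalent composition is done by CroppingAndClassifyingCompoundModel:

```python
class ToyDetector:
    """Stands in for a face detection model: returns bounding boxes."""
    def __call__(self, frame):
        return [(10, 10, 50, 50), (60, 20, 110, 70)]  # fake face boxes

class ToyClassifier:
    """Stands in for a gender classification model: labels one crop."""
    def __call__(self, box):
        x1, y1, x2, y2 = box
        return "female" if (x2 - x1) < 45 else "male"  # arbitrary toy rule

class ToyCompoundModel:
    """Pipelines two models: the detector's output feeds the classifier."""
    def __init__(self, detector, classifier):
        self.detector = detector
        self.classifier = classifier
    def __call__(self, frame):
        # Run detection, then classify each detected region
        return [(box, self.classifier(box)) for box in self.detector(frame)]

pipeline = ToyCompoundModel(ToyDetector(), ToyClassifier())
for box, label in pipeline("frame"):
    print(box, label)
```

Swapping in a different second-stage model (age, emotion, and so on) changes only the classifier, which is what makes the compound-model approach modular.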

Hi, where can I get information about degirum tools like CroppingAndClassifyingCompoundModel?

Hi @kyurrii
We are currently working on comprehensive docs for degirum tools. In the meantime you can look at the code and code examples as the repo is public: DeGirum/degirum_tools: Utilities for use with PySDK. Please let us know if you have any specific questions.


Ok. I looked through the examples, but it would really be good to have a list of tools with specifications. So I'm looking forward to the docs you mentioned.

Should we stop any development we are doing with the rpi5 examples that use GStreamer and start learning the PySDK if we plan to do multi model processing?

Hi @user116
Both the GStreamer-based pipelines and DeGirum PySDK are built on top of HailoRT. PySDK APIs are simpler for application development, and we try to provide working examples for many common scenarios. But it is up to the end user to choose the framework most suitable for their needs. Hope this helps.

@kyurrii
Happy to let you know that we have an initial version of the documentation for degirum_tools. It still needs work, but the current version should already serve as a good starting point.

@shashi, Thanks for the notification. Will check it out ASAP.

I can’t find any multi-model documentation with GStreamer, so I’m assuming we should switch to PySDK, as all the examples for multi-model processing point in that direction. Thank you @shashi

Hi @user116
Multi-model applications are possible with GStreamer as well, but they are hard to develop and debug. PySDK makes such use cases simpler.

@shashi I’m liking the PySDK material I’ve been reading for the last few hours. I’ve got hailo_examples installed and running with Jupyter, but I haven’t been able to get the rpi camera running in any of those examples. All the examples use static images or video files, so I’m still reading up on how to use the live camera.

Hi @user116
Please see this example: hailo_examples/examples/016_custom_video_source.ipynb at main · DeGirum/hailo_examples · GitHub
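For reference, the pattern in that notebook is a user-supplied frame source consumed one frame at a time. Below is a minimal sketch of the generator shape only; the frame contents here are placeholders, a real source would yield image arrays captured from the Raspberry Pi camera (via Picamera2 or OpenCV), and the exact way predict_stream consumes a custom source should be confirmed against the linked example:

```python
def camera_frames(num_frames=3):
    """Yield frames one at a time, as a camera-backed source would.
    Placeholder strings stand in for captured image arrays."""
    for i in range(num_frames):
        frame = f"frame-{i}"  # a real source would capture an image here
        yield frame

# Downstream code then consumes the source frame by frame:
for frame in camera_frames():
    print(frame)
```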