How do I cascade multiple models on Raspberry Pi?

My project requires some cascading of models.

I am using the Raspberry Pi 5 with the 24 TOPS AI HAT.

Here is what I want to do.

There is currently one camera, and I want to feed its stream into a YOLO model. The output of that YOLO model will then need to go into another YOLO model and potentially a classification model. How can I implement this on the RPi?
Please note that the cascade is conditional: it may need to go to a different model depending on the detected class.

Something like below:

YOLO > YOLO > Classification
  |
  +-> Classification

I’ve been able to convert my model to HEF and get it running with the basic rpi-examples object detection, but I'm unsure how to proceed with cascading multiple models.

Ideally, I do not want the video stream to be processed at all times either, only when there is a significant change from the previous frame, in hopes of reducing power consumption.

Help appreciated. Thank you!

Hey @kiran ,

Welcome to the Hailo Community!

We have a pipeline in TAPPAS (not yet integrated into the RPi examples) that cascades two models; please check it out.

You can adapt it to the RPi examples; it's not hard to do.

Alternatively, you can implement it manually, with steps along these lines:

Step 1: Architecture Overview

  1. Cascading Pipeline Logic:
  • Feed the camera stream into the first YOLO model.
  • Based on YOLO’s detection results:
    • Route specific detections to a second YOLO model.
    • Route other detections to a classification model.
  2. Conditional Execution:
  • Use the detection output to trigger subsequent models only when required, avoiding unnecessary computation.
  3. Change Detection:
  • Monitor significant changes between frames to determine whether the pipeline should process the current frame.
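The conditional routing in point 1 can be sketched as a plain Python dispatcher. The class names, stage labels, and confidence threshold below are hypothetical placeholders — substitute the classes your first YOLO model actually emits:

```python
# Hypothetical routing table mapping detected class names to the next stage.
# Class names and stage labels here are illustrative placeholders.
ROUTES = {
    "vehicle": "secondary_yolo",    # e.g. run plate detection inside vehicles
    "person": "classification",     # e.g. classify the person crop
}

def route_detection(class_name, confidence, min_conf=0.5):
    """Return the next model for a detection, or None to skip it."""
    if confidence < min_conf:
        return None                 # too weak, skip further processing
    return ROUTES.get(class_name)   # unknown classes are not forwarded
```

Keeping the routing in a data structure rather than hard-coded branches makes it easy to extend when you add more downstream models.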

Step 2: Implementation Details

Conditional Model Cascading

Use the HailoRT API to manage the cascading pipeline. Here’s how you can set up the logic:

  • Initial YOLO Inference:
    • Run the YOLO model on the input frame.
    • Retrieve the bounding boxes and class predictions.
  • Conditional Logic:
    • Use the YOLO output to decide whether to:
      • Send the ROI (Region of Interest) to a secondary YOLO model.
      • Send the ROI to a classification model.
      • Skip further processing for unimportant detections.
  • Dynamic Input for Subsequent Models:
    • Extract the ROI from the original frame based on YOLO’s detections.
    • Resize and preprocess the ROI to match the input dimensions of the next model.

Step 3: Frame Change Detection

Implement basic frame differencing or use lightweight motion detection to identify significant changes:

import cv2
import numpy as np

def detect_frame_change(prev_frame, current_frame, threshold=30,
                        min_changed_pixels=500):
    """Return True when enough pixels differ between consecutive frames."""
    # Compute the absolute per-pixel difference
    diff = cv2.absdiff(prev_frame, current_frame)
    # Convert to grayscale so we threshold a single channel
    gray_diff = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
    # Keep only differences above the intensity threshold
    _, thresh = cv2.threshold(gray_diff, threshold, 255, cv2.THRESH_BINARY)
    # Count changed pixels and compare against the minimum
    changed = int(np.sum(thresh > 0))
    return changed > min_changed_pixels

Use this function to decide whether to run the pipeline for the current frame.