Processing live video stream from opencv on raspberry pi 5

Hello @shashi ,

I have a couple of doubts with respect to the object tracking link provided:

  1. Is there any way to obtain the tracking IDs in order to check how many IDs were generated (or how many distinct objects were tracked)?
  2. When no objects are found, the inference_result will be empty. In this case, how can we check whether inference_result is empty or not?

Would you kindly help clarify?

Hi @AKSHAY_KUMAR
You can look at the results object and find the track ID assigned to every detection. Since the inference results object is a list of dictionaries, you can simply check whether it is empty or not. Please let me know if you need further help.
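
For reference, here is a minimal sketch of both points (assuming a model with an ObjectTracker analyzer attached, so that each detection dictionary carries a "track_id" key; model, video_source, and tracker below are placeholders):

unique_ids = set()

for inference_result in degirum_tools.predict_stream(model, video_source, analyzers=[tracker]):
    # an empty results list means nothing was detected/tracked in this frame
    if not inference_result.results:
        continue
    for detection in inference_result.results:
        track_id = detection.get("track_id")
        if track_id is not None:
            unique_ids.add(track_id)

print(f"Distinct objects tracked: {len(unique_ids)}")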


Hello @shashi , Thank you very much, it really helped.
I am now able to filter exactly what is needed from the tracker.

Hello Shashi,

In order to adapt this code to my example, I have changed the model name to one downloaded from the Hailo Model Zoo.
The only change made was the model name (and its location):
model_name = "mobilenet_v2_1.0"

By doing so, the following error appears:

degirum.exceptions.DegirumException: Failed to perform model 'mobilenet_v2_1.0' inference: 'module' object is not callable

Maybe I am passing the parameters incorrectly?

Thank you for your help!!

Hi @Laura_Martinez
Did your code work with a model from our model zoo? I want to eliminate basic PySDK setup issues first. In other words, do you have any PySDK examples working? If so, we can try to understand why your own model is failing.
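
For example, a minimal sanity check would be something like the following (the model name and image path below are just placeholders; any model from the cloud zoo and any local image will do):

import degirum as dg, degirum_tools

# minimal PySDK check: load a zoo model and run it on a single image
model = dg.load_model(
    model_name="yolov8n_relu6_coco--640x640_quant_hailort_hailo8l_1",  # placeholder zoo model
    inference_host_address="@cloud",
    zoo_url="degirum/models_hailort",
    token=degirum_tools.get_token(),
)
result = model("cat.jpg")  # placeholder image path
print(result)  # if this prints detections, the basic PySDK setup is fine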

Hello @shashi, I am having a problem with constantly having to change the image path in the code. Can you help me with suggestions on how to change the code so that I can run it on any source I specify when I run a command like this in the terminal?
python --input

The code is down below

import degirum as dg, degirum_tools
import matplotlib.pyplot as plt

# utility function to display images
def display_images(images, title="Images", figsize=(15, 5)):
    """
    Display a list of images in a single row using Matplotlib.

    Parameters:
    - images (list): List of images (NumPy arrays) to display.
    - title (str): Title for the plot.
    - figsize (tuple): Size of the figure.
    """
    num_images = len(images)
    fig, axes = plt.subplots(1, num_images, figsize=figsize)
    if num_images == 1:
        axes = [axes]  # Make it iterable for a single image
    for ax, image in zip(axes, images):
        image_rgb = image[:, :, ::-1]  # Convert BGR to RGB
        ax.imshow(image_rgb)
        ax.axis('off')
    fig.suptitle(title, fontsize=16)
    plt.tight_layout()
    plt.show()

# utility function to crop images based on inference results
def crop_images(image, results):
    """
    Crops regions of interest (ROIs) from an image based on inference results.

    Args:
        image (numpy.ndarray): The input image as a NumPy array.
        results (list of dict): A list of inference results, each containing:
            - bbox (list of float): Bounding box in [x_min, y_min, x_max, y_max] format.
            - category_id (int): Class ID (ignored in this function).
            - label (str): Label of the detected object (ignored in this function).
            - score (float): Confidence score (ignored in this function).

    Returns:
        list of numpy.ndarray: A list of cropped image regions.
    """
    cropped_images = []

    for result in results:
        bbox = result.get('bbox')
        if not bbox or len(bbox) != 4:
            continue

        # Convert bbox to integer pixel coordinates
        x_min, y_min, x_max, y_max = map(int, bbox)

        # Ensure the bounding box is within image bounds
        x_min = max(0, x_min)
        y_min = max(0, y_min)
        x_max = min(image.shape[1], x_max)
        y_max = min(image.shape[0], y_max)

        # Crop the region of interest
        cropped = image[y_min:y_max, x_min:x_max]
        cropped_images.append(cropped)

    return cropped_images

# utility function to rearrange detections
def rearrange_detections(detections):
    # Sort characters by leftmost x-coordinate
    detections_sorted = sorted(detections, key=lambda det: det["bbox"][0])
    # Concatenate labels to form the license plate string
    return "".join([det["label"] for det in detections_sorted])

# choose inference host address
inference_host_address = "@cloud"
# inference_host_address = "@local"

# choose zoo_url
zoo_url = "degirum/models_hailort"
# zoo_url = "../models"

# set token
token = degirum_tools.get_token()
# token = '' # leave empty for local inference

# image source
image_source = "20250402_122058.jpg"

# model names
lp_det_model_name = "yolov8n_relu6_lp--640x640_quant_hailort_hailo8l_1"
lp_rec_model_name = "yolov8n_relu6_lp_ocr--256x128_quant_hailort_hailo8l_1"

# Load license plate detection and recognition models
lp_det_model = dg.load_model(
    model_name=lp_det_model_name,
    inference_host_address=inference_host_address,
    zoo_url=zoo_url,
    token=degirum_tools.get_token(),
    overlay_color=[(255,255,0),(0,255,0)]    
)

lp_rec_model = dg.load_model(
    model_name=lp_rec_model_name,
    inference_host_address=inference_host_address,
    zoo_url=zoo_url,
    token=degirum_tools.get_token(),
    output_use_regular_nms=False,
    output_confidence_threshold=0.1
)

# Run license plate detection model
detected_license_plates = lp_det_model(image_source) 

# run OCR model on cropped license plates
if detected_license_plates.results:
    cropped_license_plates = crop_images(detected_license_plates.image, detected_license_plates.results)
    for index, cropped_license_plate in enumerate(cropped_license_plates):
        ocr_results = lp_rec_model.predict(cropped_license_plate)
        ocr_label = rearrange_detections(ocr_results.results)
        detected_license_plates.results[index]["label"] = ocr_label

# Display the final result
display_images([detected_license_plates.image_overlay], title="License Plate Recognition Result")

Hi @ajndossi
Here is how you can make input parameters specifiable through command line:

import argparse
import degirum as dg
import degirum_tools
import matplotlib.pyplot as plt

def display_images(images, title="Images", figsize=(15, 5)):
    num_images = len(images)
    fig, axes = plt.subplots(1, num_images, figsize=figsize)
    if num_images == 1:
        axes = [axes]
    for ax, image in zip(axes, images):
        image_rgb = image[:, :, ::-1]  # Convert BGR to RGB
        ax.imshow(image_rgb)
        ax.axis('off')
    fig.suptitle(title, fontsize=16)
    plt.tight_layout()
    plt.show()

def crop_images(image, results):
    cropped_images = []
    for result in results:
        bbox = result.get('bbox')
        if not bbox or len(bbox) != 4:
            continue
        x_min, y_min, x_max, y_max = map(int, bbox)
        x_min = max(0, x_min)
        y_min = max(0, y_min)
        x_max = min(image.shape[1], x_max)
        y_max = min(image.shape[0], y_max)
        cropped = image[y_min:y_max, x_min:x_max]
        cropped_images.append(cropped)
    return cropped_images

def rearrange_detections(detections):
    detections_sorted = sorted(detections, key=lambda det: det["bbox"][0])
    return "".join([det["label"] for det in detections_sorted])

def main(args):
    # Load models
    det_model = dg.load_model(
        model_name=args.det_model,
        inference_host_address=args.inference_host,
        zoo_url=args.zoo_url,
        token=args.token or degirum_tools.get_token(),
        overlay_color=[(255, 255, 0), (0, 255, 0)]
    )

    ocr_model = dg.load_model(
        model_name=args.ocr_model,
        inference_host_address=args.inference_host,
        zoo_url=args.zoo_url,
        token=args.token or degirum_tools.get_token(),
        output_use_regular_nms=False,
        output_confidence_threshold=0.1
    )

    # Run detection
    det_results = det_model(args.image)

    # OCR
    if det_results.results:
        cropped = crop_images(det_results.image, det_results.results)
        for idx, plate_img in enumerate(cropped):
            ocr_results = ocr_model.predict(plate_img)
            label = rearrange_detections(ocr_results.results)
            det_results.results[idx]["label"] = label

    # Display
    display_images([det_results.image_overlay], title="License Plate Recognition Result")

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="License Plate Recognition with OCR")

    parser.add_argument("image", help="Path to input image")
    parser.add_argument("--inference-host", default="@cloud", help="Inference host address (e.g., @cloud or @local)")
    parser.add_argument("--zoo-url", default="degirum/models_hailort", help="Model zoo URL")
    parser.add_argument("--token", default="", help="Access token (optional, auto-resolved if blank)")
    parser.add_argument("--det-model", default="yolov8n_relu6_lp--640x640_quant_hailort_hailo8l_1", help="Detection model name")
    parser.add_argument("--ocr-model", default="yolov8n_relu6_lp_ocr--256x128_quant_hailort_hailo8l_1", help="OCR model name")

    args = parser.parse_args()
    main(args)
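
You can then run the script on any image from the terminal, for example (assuming you saved it as lpr.py):

python lpr.py path/to/image.jpg --inference-host @local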

Thank you for your time and consideration. I ran the command python filename.py images/car.mp4 and this is the error that I receive:

degirum.exceptions.DegirumException: Failed to read image from ‘images/cars_lp.mp4’

Is it possible for your models to detect license plates in videos? From this error, it seems that the models you trained cannot work with videos.

@ajndossi
Since the code was written for images, it is expected that it would fail for video input. Please note that models like YOLOv8 are not trained for images or videos specifically: models are trained on a dataset containing many images, and the trained model can then be used on images as well as videos (since a video is just a sequence of images). If you want to make the code work for videos, you should follow the previous suggestions and code posted in this thread. Please let me know if you need further help. If you are specifically interested in license plate detection/recognition, please check our guide at: A Comprehensive Guide to Building a License Plate Recognition (LPR) Systems
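
As a rough sketch of that video adaptation (the window title and video path below are placeholders, and degirum_tools.Display is assumed for showing the annotated frames):

import degirum as dg, degirum_tools

# load the license plate detection model as in the script above
lp_det_model = dg.load_model(
    model_name="yolov8n_relu6_lp--640x640_quant_hailort_hailo8l_1",
    inference_host_address="@cloud",
    zoo_url="degirum/models_hailort",
    token=degirum_tools.get_token(),
)

video_source = "images/cars_lp.mp4"  # placeholder video path

# predict_stream yields one inference result per frame of the video
with degirum_tools.Display("License Plates") as display:
    for result in degirum_tools.predict_stream(lp_det_model, video_source):
        # the per-frame OCR step from the image script would go here,
        # operating on result.image and result.results
        display.show(result)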

@shashi is it possible to rotate camera using predict_stream?

Hi @AbnerDC
PySDK does not physically control the camera. Not sure if that is what you mean by rotate camera.

Kindly advise how to achieve this.

Hello @shashi, thanks for your continuous help with questions. Do you have any thoughts on rotating the camera? I would appreciate it.

Hi @AbnerDC
We just finished preparing an example for this use case using PySDK. Please see if this solves your issue: hailo_examples/examples/016_custom_video_source.ipynb at main · DeGirum/hailo_examples

It is at the very end of that example :slight_smile:
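
In case a rough sketch helps (this is not the exact code from that notebook): one way to rotate frames before inference is to read them with OpenCV, rotate each one, and feed the resulting generator to the model's predict_batch method. The camera index, rotation angle, and model name below are placeholders:

import cv2
import degirum as dg, degirum_tools

# placeholder model; load whichever detection model you are using
model = dg.load_model(
    model_name="yolov8n_relu6_coco--640x640_quant_hailort_hailo8l_1",
    inference_host_address="@local",
    zoo_url="degirum/models_hailort",
    token=degirum_tools.get_token(),
)

def rotated_frames(source=0):
    """Yield camera frames rotated 90 degrees clockwise."""
    cap = cv2.VideoCapture(source)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            yield cv2.rotate(frame, cv2.ROTATE_90_CLOCKWISE)
    finally:
        cap.release()

# predict_batch accepts an iterator of frames and yields one result per frame
for result in model.predict_batch(rotated_frames()):
    print(result.results)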


My question is going to be a bit dumb. I am using the IMX500 AI camera. I tried to deploy the code you provided. However, an error showed up saying there was no module named picamera. I am running your code in a virtual environment. I understand that picamera is installed system-wide because it is widely used. So, is there any way I can use the picamera module in my virtual environment? I already tried python -m venv my_venv --system-site-packages, but that did not work either because it said it conflicted with other packages. Please help me. I am in big trouble. Thanks in advance.

Hi @SAW_THURA_KYAW_KYAW
Does the camera work outside of the virtualenv and without PySDK? In other words, are you able to use the camera to see the video output?

Hello, I tried to replicate this example

But I could not get it working using analyzers:

tracker = degirum_tools.ObjectTracker(
    class_list=["person"],
    track_thresh=0.35,
    track_buffer=100,
    match_thresh=0.9999,
    trail_depth=50,
    anchor_point=degirum_tools.AnchorPoint.TOP_CENTER,
)

for detected_persons in degirum_tools.predict_stream(person_detection_model, video_source, analyzers=[tracker]):

Hi @AbnerDC
Sorry, I did not understand your issue. Are you saying that you are unable to attach a tracker analyzer to the example that shows how to use rotated frames?