Hello, my board is a Raspberry Pi 5 with a Hailo-8L. I want to run the Python “Hailo-Application-Code-Examples” from GitHub, but I don’t know how to install the Hailo Model Zoo.
Welcome to the Hailo Community!
You do not need to install the Hailo Model Zoo. You can download precompiled models (HEFs) from the Model Zoo GitHub page, for instance the object detection models for Hailo-8L:
GitHub - Hailo Model Zoo - Hailo-8L - Object detection
To convert your own models you will need to install the Hailo AI Software Suite Docker on an x86 Ubuntu machine. The suite comes with built-in tutorials that show you the conversion process step by step. Just call the following command inside the Docker container:
hailo tutorial
This will start a Jupyter Notebook server with notebooks for each step.
The suite also includes the hailomz command-line tool that lets you convert models from the Model Zoo.
After going through the tutorials, call hailomz --help and you will notice the same steps (parse, optimize, compile) as in the general workflow. This lets you convert the models from the Model Zoo while specifying fewer parameters yourself. The hailo_model_zoo directory contains the YAML and ALLS files that tell hailomz what parameters to use to convert each model.
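For example, converting one of the Model Zoo networks yourself could look like the command below (the model name is only an illustration; check hailomz --help and the Model Zoo documentation for the supported model names and target-specific flags):
hailomz compile yolov8s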
Hi
I want to run this code on an RPi 5 with a Hailo-8L (for fast_sam), but it requires the Hailo Model Zoo to be installed… how should I proceed?
https://github.com/hailo-ai/Hailo-Application-Code-Examples/blob/main/runtime/python/instance_segmentation/yoloseg_inference.py
To get started, please visit our Hailo Model Zoo repository on GitHub:
You’ll find a Quick Start guide there that should walk you through the process. If you run into any issues or have questions, feel free to post them here in the community.
hi Klausk
thanks for your help,
When I run Hailo-Application-Code-Examples/runtime/python/streaming at main · hailo-ai/Hailo-Application-Code-Examples · GitHub,
The script imports the Hailo Model Zoo, but the Raspberry Pi 5 can’t install the DFC, so it can’t install the Hailo Model Zoo either. Have you successfully run this on a Raspberry Pi 5?
Best regards
Sam
Hi Klausk sir,
Thanks for your help. I have followed your instructions to download the model from GitHub, but when I run “/Hailo-Application-Code-Examples/runtime/python/streaming/yolox_stream_inference.py”, I get the same error as in the photo.
I am facing the same issue… any solution?
Hey
In this part of the example, we’re utilizing the Hailo Model Zoo for post-processing. Here’s the relevant code:
# Create a dictionary that maps tensor shapes to layer names (needed for post-processing)
layer_from_shape: dict = {infer_results[key].shape: key for key in infer_results.keys()}

from hailo_model_zoo.core.postprocessing.detection import yolo

# Post-processing configuration, as recommended in hailo_model_zoo/cfg/base/yolox.yaml
anchors = {"strides": [32, 16, 8], "sizes": [[1, 1], [1, 1], [1, 1]]}
yolox_post_proc = yolo.YoloPostProc(
    img_dims=(INPUT_RES_H, INPUT_RES_W),
    nms_iou_thresh=0.65,
    score_threshold=0.01,
    anchors=anchors,
    output_scheme=None,
    classes=80,
    labels_offset=1,
    meta_arch="yolox",
    device_pre_post_layers=[]
)

# The order here is crucial since the reorganized tensor must be in (BS,H,W,85) shape
endnodes = [
    infer_results[layer_from_shape[1, 80, 80, 4]],   # stride 8
    infer_results[layer_from_shape[1, 80, 80, 1]],   # stride 8
    infer_results[layer_from_shape[1, 80, 80, 80]],  # stride 8
    infer_results[layer_from_shape[1, 40, 40, 4]],   # stride 16
    infer_results[layer_from_shape[1, 40, 40, 1]],   # stride 16
    infer_results[layer_from_shape[1, 40, 40, 80]],  # stride 16
    infer_results[layer_from_shape[1, 20, 20, 4]],   # stride 32
    infer_results[layer_from_shape[1, 20, 20, 1]],   # stride 32
    infer_results[layer_from_shape[1, 20, 20, 80]]   # stride 32
]

# Process the outputs
hailo_preds = yolox_post_proc.yolo_postprocessing(endnodes)
num_detections = int(hailo_preds['num_detections'])
scores = hailo_preds["detection_scores"][0].numpy()
classes = hailo_preds["detection_classes"][0].numpy()
boxes = hailo_preds["detection_boxes"][0].numpy()

# Set the number of detections to 0 if the first score is 0
if scores[0] == 0:
    num_detections = 0

# Create a dictionary to hold prediction results
preds_dict = {
    'scores': scores,
    'classes': classes,
    'boxes': boxes,
    'num_detections': num_detections
}

# Report and visualize detections
frame = report.report_detections(
    preds_dict,
    frame,
    scale_factor_x=orig_w,
    scale_factor_y=orig_h
)
cv2.imshow('frame', frame)
If you’d like, you can either replace the operations performed by the Hailo Model Zoo with your own post-processing steps, or try running other examples. However, please note that these examples have primarily been tested on x86 architectures and are not optimized for the Raspberry Pi (RPi) at this time. We’re working on versions specifically optimized for the RPi, similar to what we did with TAPPAS, which now runs on the RPi as RPi examples.
Let me know if you encounter any issues or if you would like further assistance with this setup.
Best regards,
Omri
Hi
We cannot install hailo_model_zoo on the RPi… so how do we test the examples from Hailo-Application-Code-Examples on the RPi?
Specifically I want to test fast sam
Thank you
Hey @dario.ravarro,
If you take a look at this inference example: Hailo YOLOSeg Inference Example, you’ll notice that the post-processing function imports from the Hailo model zoo.
To use this example on the Raspberry Pi (RPI), you’ll need to write your own post-processing function and a function to visualize the results. That’s why we didn’t recommend using these examples directly until they are optimized. The best way to proceed would be to review the post-processing functions in the Hailo model zoo and integrate them into your example instead of importing them.
Here are the functions you’ll need:
def visualize_yolov5_seg_results(
    detections, img, class_names=None, alpha=0.5, score_thres=0.25, mask_thresh=0.5, max_boxes_to_draw=20, **kwargs
):
    img_idx = 0
    img_out = img[img_idx].copy()
    boxes = detections["detection_boxes"]
    # Scale the boxes to match the input image dimensions
    boxes[:, 0::2] *= img_out.shape[1]
    boxes[:, 1::2] *= img_out.shape[0]
    masks = detections["mask"] > mask_thresh
    scores = detections["detection_scores"]
    classes = detections["detection_classes"]
    skip_boxes = kwargs.get("meta_arch", "") == "yolov8_seg_postprocess" and kwargs.get("classes", "") == 1
    keep = scores > score_thres
    boxes, masks, scores, classes = boxes[keep], masks[keep], scores[keep], classes[keep]
    max_boxes = min(max_boxes_to_draw, len(keep))
    boxes, masks, scores, classes = boxes[:max_boxes], masks[:max_boxes], scores[:max_boxes], classes[:max_boxes]
    for idx, mask in enumerate(masks):
        xmin, ymin, xmax, ymax = boxes[idx].astype(np.int32)
        color = np.random.randint(0, 255, size=3, dtype=np.uint8)
        if not skip_boxes:
            img_out = cv2.rectangle(img_out, (xmin, ymin), (xmax, ymax), [int(c) for c in color], 3)
        if np.sum(mask) > 0:
            polygons, _ = mask_to_polygons(mask)
            mask_overlay = np.repeat(mask[:, :, np.newaxis], 3, axis=2) * color
            img_out = cv2.addWeighted(mask_overlay, alpha, img_out, 1, 0)
            for polygon in polygons:
                img_out = cv2.polylines(
                    img_out, [polygon.reshape((-1, 1, 2)).astype(np.int32)], isClosed=True, color=color, thickness=1
                )
        if not skip_boxes:
            label = f"{class_names[int(classes[idx])]}"
            score = f"{int(100 * scores[idx])}%"
            text = f"{label}: {score}"
            (w, h), _ = cv2.getTextSize(text, fontFace=cv2.FONT_HERSHEY_SIMPLEX, fontScale=0.5, thickness=2)
            org = (xmin, ymin)
            img_out = cv2.putText(img_out, text, org, fontFace=cv2.FONT_HERSHEY_SIMPLEX, fontScale=0.5, color=[255, 255, 255], thickness=2)
    return img_out
def yolov8_seg_postprocess(endnodes, device_pre_post_layers=None, **kwargs):
    num_classes = kwargs["classes"]
    strides = kwargs["anchors"]["strides"][::-1]
    image_dims = tuple(kwargs["img_dims"])
    reg_max = kwargs["anchors"]["regression_length"]
    raw_boxes = endnodes[:7:3]
    scores = [np.reshape(s, (-1, s.shape[1] * s.shape[2], num_classes)) for s in endnodes[1:8:3]]
    scores = np.concatenate(scores, axis=1)
    decoded_boxes = _yolov8_decoding(raw_boxes, strides, image_dims, reg_max)
    score_thres = kwargs["score_threshold"]
    iou_thres = kwargs["nms_iou_thresh"]
    proto_data = endnodes[9]
    batch_size, _, _, n_masks = proto_data.shape
    fake_objectness = np.ones((scores.shape[0], scores.shape[1], 1))
    scores_obj = np.concatenate([fake_objectness, scores], axis=-1)
    coeffs = [np.reshape(c, (-1, c.shape[1] * c.shape[2], n_masks)) for c in endnodes[2:9:3]]
    coeffs = np.concatenate(coeffs, axis=1)
    predictions = np.concatenate([decoded_boxes, scores_obj, coeffs], axis=2)
    nms_res = non_max_suppression(predictions, conf_thres=score_thres, iou_thres=iou_thres, multi_label=True)
    outputs = []
    for b in range(batch_size):
        protos = proto_data[b]
        masks = process_mask(protos, nms_res[b]["mask"], nms_res[b]["detection_boxes"], image_dims, upsample=True)
        output = {
            "detection_boxes": np.array(nms_res[b]["detection_boxes"]),
            "mask": np.transpose(masks, (0, 1, 2)) if masks is not None else None,
            "detection_scores": np.array(nms_res[b]["detection_scores"]),
            "detection_classes": np.array(nms_res[b]["detection_classes"]).astype(int)
        }
        outputs.append(output)
    return outputs
Let me know if you have any questions!
Regards
I copied most of the content of instance_segmentation_postprocessing.py into yoloseg_inference.py and also installed Cython to be able to import cython_nms.pyx.
Where can I upload the yoloseg_inference.py?
Running
python yoloseg_inference.py fast_sam_s.hef dog_bicycle.jpg fast
generates an output image with segmentation in a new folder.
However, the segmentation is missing some parts… How can I fine-tune it?
How can I run segmentation on a live stream from the RPi camera?
Great job!
It seems the missing segmentation might be related to a post-processing issue. The object is being detected, but the coverage isn’t complete across the entire object. I recommend making some adjustments to the post-processing and testing it again. Since the app was initially developed for x86 (PC) or VMS with Hailo, it might require some modifications in post-processing to work properly in this setup.
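As a concrete illustration (not a definitive fix), the thresholds in the post-processing shown earlier are the first knobs to try when masks only partially cover an object. A minimal sketch, reusing the names from the inference script (results, processed_image, num_of_classes, func_dict, meta_arch, raw_detections, kwargs); the values are hypothetical and need to be tuned experimentally:
# Hypothetical tuning -- adjust, re-run the post-processing, and compare.
kwargs['score_threshold'] = 0.20                     # fast_sam default in the example is 0.25
results = func_dict[meta_arch](raw_detections)[0]    # re-run post-processing with the new threshold

img_out = visualize_yolov5_seg_results(
    results,
    np.expand_dims(np.array(processed_image), axis=0),
    score_thres=0.20,                                # draw lower-confidence detections as well
    mask_thresh=0.40,                                # lower cut-off -> masks cover more of each object (default 0.5)
    class_names=num_of_classes,
    **kwargs,
)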
As for live streaming, you can refer to our implementation here: Hailo Live Stream Example.
The live stream example doesn’t work with the RPi camera.
I am trying to use Picamera2 in the segmentation code, but it’s unclear how to integrate it properly into yoloseg_inference.py.
How do I modify yoloseg_inference.py to get images from the RPi camera?
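For reference, one way to feed the RPi camera into the same pipeline is to replace the image loading with picamera2 capture. This is only a rough sketch, assuming the picamera2 package is installed and reusing the names from yoloseg_inference.py (preproc, height, width, input_vstream_info, network_group, network_group_params, infer_pipeline, func_dict, meta_arch):
from picamera2 import Picamera2
import numpy as np
from PIL import Image

picam2 = Picamera2()
# Request RGB frames; preproc() letterboxes them to the model input size anyway.
picam2.configure(picam2.create_video_configuration(main={"size": (1280, 720), "format": "RGB888"}))
picam2.start()

while True:
    frame = picam2.capture_array()            # H x W x 3 numpy array
    # Note: picamera2's RGB888/BGR888 naming is counter-intuitive; verify the channel order.
    image = Image.fromarray(frame)
    processed_image = preproc(image, height=height, width=width)
    input_data = {input_vstream_info.name: np.expand_dims(processed_image, axis=0).astype(np.float32)}
    with network_group.activate(network_group_params):
        raw_detections = infer_pipeline.infer(input_data)
    results = func_dict[meta_arch](raw_detections)[0]
    # ...run the segmentation visualization on `results` as in the single-image path...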
@omria I’m running a YOLOv8 instance segmentation model and a depth model in parallel on a Hailo-8 device. I manually adapted the post-processing functions from hailo_model_zoo for the segmentation model, but the post-processing takes longer than the model inference itself. I see a potential solution in the libyolo_post.so functions available in TAPPAS; using GStreamer can significantly reduce post-processing time, but I plan to use multiprocessing instead of GStreamer. Are there any suggestions for using the libyolo_post functions without GStreamer?
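One generic pattern for the multiprocessing route (not Hailo-specific, and not using libyolo_post) is to move the NumPy post-processing into a worker process so the next inference can start while the previous frame is still being decoded. A minimal sketch, assuming yolov8_seg_postprocess and kwargs are defined as in the code posted earlier in this thread:
import multiprocessing as mp

def _postproc_worker(in_q, out_q, pp_kwargs):
    # Runs in a separate process: pull raw output tensors, push decoded detections.
    while True:
        item = in_q.get()
        if item is None:          # sentinel -> shut the worker down
            break
        frame_id, endnodes = item
        out_q.put((frame_id, yolov8_seg_postprocess(endnodes, **pp_kwargs)[0]))

in_q, out_q = mp.Queue(maxsize=4), mp.Queue()
worker = mp.Process(target=_postproc_worker, args=(in_q, out_q, kwargs), daemon=True)
worker.start()

# In the inference loop, hand the raw endnodes to the worker instead of decoding inline:
#   in_q.put((frame_id, endnodes))
# and collect finished frames with out_q.get() whenever they are ready.
# Shutdown: in_q.put(None); worker.join()
# Caveat: the tensors are pickled across the queue, so measure whether the copy
# overhead is actually smaller than the post-processing time you are trying to hide.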
Hi, can you tell me where I can find this function:
_yolov8_decoding
I got an error here: _yolov8_decoding is not defined.
Found it here: the Hailo Model Zoo GitHub.
Also sharing the code which worked on the Pi with the Hailo-8L, image size 640x640 (used my custom model):
import numpy as np
import cv2
from hailo_platform import (HEF, Device, VDevice, HailoStreamInterface, InferVStreams, ConfigureParams,
InputVStreamParams, OutputVStreamParams, FormatType)
from zenlog import log
from PIL import Image
import os
import argparse
from cython_utils.cython_nms import nms as cnms
parser = argparse.ArgumentParser(description='Running a Hailo inference with actual images using Hailo API and OpenCV')
parser.add_argument('hef', help="HEF file path")
parser.add_argument('images', help="Images path to perform inference on. Could be either a single image or a folder containing the images")
parser.add_argument('arch', help="The architecture type of the model: v5, v8 or fast")
parser.add_argument('--class-num', help="The number of classes the model is trained on. Defaults to 80 for v5 and v8, and 1 for fast_sam.", default=80)
parser.add_argument('--output_dir', help="The path to the output directory where the images will be saved. Defaults to the output_images folder in the current directory.")
args = parser.parse_args()
kwargs = {}
kwargs['device_pre_post_layers'] = None
## strides - Constant scalar per bounding box for scaling. One for each anchor by size of the anchor values
## sizes - The actual anchors for the bounding boxes
## regression_length - The regression length required for the distance estimation
arch_dict = {
    'v5': {
        'anchors': {
            'strides': [32, 16, 8],
            'sizes': np.array([[116, 90, 156, 198, 373, 326], [30, 61, 62, 45, 59, 119], [10, 13, 16, 30, 33, 23]])
        }
    },
    'v8': {
        'anchors': {
            'strides': [32, 16, 8],
            'regression_length': 15
        }
    },
    'fast': {
        'anchors': {
            'strides': [32, 16, 8],
            'regression_length': 15
        }
    }
}
# --------------------------------------------------------- #
# ---------------- Architecture functions ----------------- #
def _softmax(x):
    return np.exp(x) / np.expand_dims(np.sum(np.exp(x), axis=-1), axis=-1)
def mask_to_polygons(mask):
    mask = np.ascontiguousarray(mask)
    res = cv2.findContours(mask.astype("uint8"), cv2.RETR_CCOMP, cv2.CHAIN_APPROX_NONE)
    hierarchy = res[-1]
    if hierarchy is None:  # empty mask
        return [], False
    has_holes = (hierarchy.reshape(-1, 4)[:, 3] >= 0).sum() > 0
    res = res[-2]
    res = [x.flatten() for x in res]
    res = [x + 0.5 for x in res if len(x) >= 6]
    return res, has_holes
def crop_mask(masks, boxes):
    """
    Zeroing out mask region outside of the predicted bbox.
    Args:
        masks: numpy array of masks with shape [n, h, w]
        boxes: numpy array of bbox coords with shape [n, 4]
    """
    n_masks, _, _ = masks.shape
    integer_boxes = np.ceil(boxes).astype(int)
    x1, y1, x2, y2 = np.array_split(np.where(integer_boxes > 0, integer_boxes, 0), 4, axis=1)
    for k in range(n_masks):
        masks[k, : y1[k, 0], :] = 0
        masks[k, y2[k, 0] :, :] = 0
        masks[k, :, : x1[k, 0]] = 0
        masks[k, :, x2[k, 0] :] = 0
    return masks
def process_mask(protos, masks_in, bboxes, shape, upsample=True, downsample=False):
    mh, mw, c = protos.shape
    ih, iw = shape
    masks = _sigmoid(masks_in @ protos.reshape((-1, c)).transpose((1, 0))).reshape((-1, mh, mw))
    downsampled_bboxes = bboxes.copy()
    if downsample:
        downsampled_bboxes[:, 0] *= mw / iw
        downsampled_bboxes[:, 2] *= mw / iw
        downsampled_bboxes[:, 3] *= mh / ih
        downsampled_bboxes[:, 1] *= mh / ih
        masks = crop_mask(masks, downsampled_bboxes)
    if upsample:
        if not masks.shape[0]:
            return None
        masks = cv2.resize(np.transpose(masks, axes=(1, 2, 0)), shape, interpolation=cv2.INTER_LINEAR)
        if len(masks.shape) == 2:
            masks = masks[..., np.newaxis]
        masks = np.transpose(masks, axes=(2, 0, 1))  # CHW
    if not downsample:
        masks = crop_mask(masks, downsampled_bboxes)  # CHW
    return masks
def _sigmoid(x):
    return 1 / (1 + np.exp(-x))
def non_max_suppression(prediction, conf_thres=0.25, iou_thres=0.45, max_det=300, nm=32, multi_label=True):
    """Non-Maximum Suppression (NMS) on inference results to reject overlapping detections
    Args:
        prediction: numpy.ndarray with shape (batch_size, num_proposals, 351)
        conf_thres: confidence threshold for NMS
        iou_thres: IoU threshold for NMS
        max_det: Maximal number of detections to keep after NMS
        nm: Number of masks
        multi_label: Consider only best class per proposal or all conf_thresh passing proposals
    Returns:
        A list of per image detections, where each is a dictionary with the following structure:
        {
            'detection_boxes': numpy.ndarray with shape (num_detections, 4),
            'mask': numpy.ndarray with shape (num_detections, 32),
            'detection_classes': numpy.ndarray with shape (num_detections, 80),
            'detection_scores': numpy.ndarray with shape (num_detections, 80)
        }
    """
    assert 0 <= conf_thres <= 1, f"Invalid Confidence threshold {conf_thres}, valid values are between 0.0 and 1.0"
    assert 0 <= iou_thres <= 1, f"Invalid IoU threshold {iou_thres}, valid values are between 0.0 and 1.0"
    nc = prediction.shape[2] - nm - 5  # number of classes
    xc = prediction[..., 4] > conf_thres  # candidates
    max_wh = 7680  # (pixels) maximum box width and height
    mi = 5 + nc  # mask start index
    output = []
    for xi, x in enumerate(prediction):  # image index, image inference
        x = x[xc[xi]]  # confidence
        # If none remain process next image
        if not x.shape[0]:
            output.append(
                {
                    "detection_boxes": np.zeros((0, 4)),
                    "mask": np.zeros((0, 32)),
                    "detection_classes": np.zeros((0, 80)),
                    "detection_scores": np.zeros((0, 80)),
                }
            )
            continue
        # Confidence = Objectness X Class Score
        x[:, 5:] *= x[:, 4:5]
        # (center_x, center_y, width, height) to (x1, y1, x2, y2)
        boxes = xywh2xyxy(x[:, :4])
        mask = x[:, mi:]
        multi_label &= nc > 1
        if not multi_label:
            conf = np.expand_dims(x[:, 5:mi].max(1), 1)
            j = np.expand_dims(x[:, 5:mi].argmax(1), 1).astype(np.float32)
            keep = np.squeeze(conf, 1) > conf_thres
            x = np.concatenate((boxes, conf, j, mask), 1)[keep]
        else:
            i, j = (x[:, 5:mi] > conf_thres).nonzero()
            x = np.concatenate((boxes[i], x[i, 5 + j, None], j[:, None].astype(np.float32), mask[i]), 1)
        # sort by confidence
        x = x[x[:, 4].argsort()[::-1]]
        # per-class NMS
        cls_shift = x[:, 5:6] * max_wh
        boxes = x[:, :4] + cls_shift
        conf = x[:, 4:5]
        preds = np.hstack([boxes.astype(np.float32), conf.astype(np.float32)])
        keep = cnms(preds, iou_thres)
        if keep.shape[0] > max_det:
            keep = keep[:max_det]
        out = x[keep]
        scores = out[:, 4]
        classes = out[:, 5]
        boxes = out[:, :4]
        masks = out[:, 6:]
        out = {"detection_boxes": boxes, "mask": masks, "detection_classes": classes, "detection_scores": scores}
        output.append(out)
    return output
def xywh2xyxy(x):
    y = np.copy(x)
    y[:, 0] = x[:, 0] - x[:, 2] / 2
    y[:, 1] = x[:, 1] - x[:, 3] / 2
    y[:, 2] = x[:, 0] + x[:, 2] / 2
    y[:, 3] = x[:, 1] + x[:, 3] / 2
    return y
def _yolov8_decoding(raw_boxes, strides, image_dims, reg_max):
    boxes = None
    for box_distribute, stride in zip(raw_boxes, strides):
        # create grid
        shape = [int(x / stride) for x in image_dims]
        grid_x = np.arange(shape[1]) + 0.5
        grid_y = np.arange(shape[0]) + 0.5
        grid_x, grid_y = np.meshgrid(grid_x, grid_y)
        ct_row = grid_y.flatten() * stride
        ct_col = grid_x.flatten() * stride
        center = np.stack((ct_col, ct_row, ct_col, ct_row), axis=1)
        # box distribution to distance
        reg_range = np.arange(reg_max + 1)
        box_distribute = np.reshape(
            box_distribute, (-1, box_distribute.shape[1] * box_distribute.shape[2], 4, reg_max + 1)
        )
        box_distance = _softmax(box_distribute)
        box_distance = box_distance * np.reshape(reg_range, (1, 1, 1, -1))
        box_distance = np.sum(box_distance, axis=-1)
        box_distance = box_distance * stride
        # decode box
        box_distance = np.concatenate([box_distance[:, :, :2] * (-1), box_distance[:, :, 2:]], axis=-1)
        decode_box = np.expand_dims(center, axis=0) + box_distance
        xmin = decode_box[:, :, 0]
        ymin = decode_box[:, :, 1]
        xmax = decode_box[:, :, 2]
        ymax = decode_box[:, :, 3]
        decode_box = np.transpose([xmin, ymin, xmax, ymax], [1, 2, 0])
        xywh_box = np.transpose([(xmin + xmax) / 2, (ymin + ymax) / 2, xmax - xmin, ymax - ymin], [1, 2, 0])
        boxes = xywh_box if boxes is None else np.concatenate([boxes, xywh_box], axis=1)
    return boxes  # tf.expand_dims(boxes, axis=2)
def visualize_yolov5_seg_results(
    detections, img, class_names=None, alpha=0.5, score_thres=0.25, mask_thresh=0.5, max_boxes_to_draw=20, **kwargs
):
    img_idx = 0
    img_out = img[img_idx].copy()
    boxes = detections["detection_boxes"]
    # Scale the boxes to match the input image dimensions
    boxes[:, 0::2] *= img_out.shape[1]
    boxes[:, 1::2] *= img_out.shape[0]
    masks = detections["mask"] > mask_thresh
    scores = detections["detection_scores"]
    classes = detections["detection_classes"]
    #print("Classes:",classes)
    skip_boxes = kwargs.get("meta_arch", "") == "yolov8_seg_postprocess" and kwargs.get("classes", "") == 1
    keep = scores > score_thres
    boxes, masks, scores, classes = boxes[keep], masks[keep], scores[keep], classes[keep]
    max_boxes = min(max_boxes_to_draw, len(keep))
    boxes, masks, scores, classes = boxes[:max_boxes], masks[:max_boxes], scores[:max_boxes], classes[:max_boxes]
    for idx, mask in enumerate(masks):
        xmin, ymin, xmax, ymax = boxes[idx].astype(np.int32)
        color = np.random.randint(0, 255, size=3, dtype=np.uint8)
        color_1 = np.array(np.random.randint(0, 255, size=3, dtype=np.uint8)).tolist()
        #print("Colour:",color)
        if not skip_boxes:
            img_out = cv2.rectangle(img_out, (int(xmin/640), int(ymin/640)), (int(xmax/640), int(ymax/640)), [int(c) for c in color], 1)
        if np.sum(mask) > 0:
            polygons, _ = mask_to_polygons(mask)
            mask_overlay = np.repeat(mask[:, :, np.newaxis], 3, axis=2) * color
            img_out = cv2.addWeighted(mask_overlay, alpha, img_out, 1, 0)
            for polygon in polygons:
                img_out = cv2.polylines(
                    img_out, [polygon.reshape((-1, 1, 2)).astype(np.int32)], isClosed=True, color=color_1, thickness=1
                )
        #print("Class_index:",[int(classes[idx])])
        #print("Score_index:",int(100 * scores[idx]))
        #print("Class names:",class_names)
        if not skip_boxes:
            label = f"Class{int(classes[idx])}"
            score = f"{int(100 * scores[idx])}%"
            text = f"{label}: {score}"
            print(text)
            (w, h), _ = cv2.getTextSize(text, fontFace=cv2.FONT_HERSHEY_SIMPLEX, fontScale=0.5, thickness=2)
            org = (int(xmin/640), int(ymin/640))
            print(org)
            img_out = cv2.putText(img_out, text, org, fontFace=cv2.FONT_HERSHEY_SIMPLEX, fontScale=0.5, color=color_1, thickness=1)
    return img_out
def yolov8_seg_postprocess(endnodes, device_pre_post_layers=None, **kwargs):
    num_classes = kwargs["classes"]
    strides = kwargs["anchors"]["strides"][::-1]
    image_dims = tuple(kwargs["img_dims"])
    reg_max = kwargs["anchors"]["regression_length"]
    raw_boxes = endnodes[:7:3]
    scores = [np.reshape(s, (-1, s.shape[1] * s.shape[2], num_classes)) for s in endnodes[1:8:3]]
    scores = np.concatenate(scores, axis=1)
    decoded_boxes = _yolov8_decoding(raw_boxes, strides, image_dims, reg_max)
    score_thres = kwargs["score_threshold"]
    iou_thres = kwargs["nms_iou_thresh"]
    proto_data = endnodes[9]
    batch_size, _, _, n_masks = proto_data.shape
    fake_objectness = np.ones((scores.shape[0], scores.shape[1], 1))
    scores_obj = np.concatenate([fake_objectness, scores], axis=-1)
    coeffs = [np.reshape(c, (-1, c.shape[1] * c.shape[2], n_masks)) for c in endnodes[2:9:3]]
    coeffs = np.concatenate(coeffs, axis=1)
    predictions = np.concatenate([decoded_boxes, scores_obj, coeffs], axis=2)
    nms_res = non_max_suppression(predictions, conf_thres=score_thres, iou_thres=iou_thres, multi_label=True)
    outputs = []
    for b in range(batch_size):
        protos = proto_data[b]
        masks = process_mask(protos, nms_res[b]["mask"], nms_res[b]["detection_boxes"], image_dims, upsample=True)
        output = {
            "detection_boxes": np.array(nms_res[b]["detection_boxes"]),
            "mask": np.transpose(masks, (0, 1, 2)) if masks is not None else None,
            "detection_scores": np.array(nms_res[b]["detection_scores"]),
            "detection_classes": np.array(nms_res[b]["detection_classes"]).astype(int)
        }
        outputs.append(output)
    return outputs
def postproc_yolov8seg(raw_detections):
    raw_detections_keys = list(raw_detections.keys())
    layer_from_shape: dict = {raw_detections[key].shape: key for key in raw_detections_keys}
    mask_channels = 32
    detection_output_channels = (kwargs['anchors']['regression_length'] + 1) * 4  # (regression length + 1) * num_coordinates
    endnodes = [raw_detections[layer_from_shape[1, 80, 80, detection_output_channels]],
                raw_detections[layer_from_shape[1, 80, 80, kwargs['classes']]],
                raw_detections[layer_from_shape[1, 80, 80, mask_channels]],
                raw_detections[layer_from_shape[1, 40, 40, detection_output_channels]],
                raw_detections[layer_from_shape[1, 40, 40, kwargs['classes']]],
                raw_detections[layer_from_shape[1, 40, 40, mask_channels]],
                raw_detections[layer_from_shape[1, 20, 20, detection_output_channels]],
                raw_detections[layer_from_shape[1, 20, 20, kwargs['classes']]],
                raw_detections[layer_from_shape[1, 20, 20, mask_channels]],
                raw_detections[layer_from_shape[1, 160, 160, mask_channels]]]
    predictions_dict = yolov8_seg_postprocess(endnodes, **kwargs)
    return predictions_dict
def postproc_yolov5seg(raw_detections):
    raw_detections_keys = list(raw_detections.keys())
    layer_from_shape: dict = {raw_detections[key].shape: key for key in raw_detections_keys}
    mask_channels = 32
    detection_channels = (kwargs['classes'] + 4 + 1 + mask_channels) * len(kwargs['anchors']['strides'])  # (num_classes + num_coordinates + objectness + mask) * strides_list_len
    endnodes = [raw_detections[layer_from_shape[1, 160, 160, mask_channels]],
                raw_detections[layer_from_shape[1, 80, 80, detection_channels]],
                raw_detections[layer_from_shape[1, 40, 40, detection_channels]],
                raw_detections[layer_from_shape[1, 20, 20, detection_channels]]]
    predictions_dict = yolov5_seg_postprocess(endnodes, **kwargs)
    return predictions_dict
# ---------------- Pre-processing functions ----------------- #
def letterbox_image(image, size):
    '''resize image with unchanged aspect ratio using padding'''
    img_w, img_h = image.size
    model_input_w, model_input_h = size
    scale = min(model_input_w / img_w, model_input_h / img_h)
    scaled_w = int(img_w * scale)
    scaled_h = int(img_h * scale)
    image = image.resize((scaled_w, scaled_h), Image.Resampling.BICUBIC)
    new_image = Image.new('RGB', size, (114,114,114))
    new_image.paste(image, ((model_input_w - scaled_w) // 2, (model_input_h - scaled_h) // 2))
    return new_image
def preproc(image, width=640, height=640, normalized=True):
    image = letterbox_image(image, (width, height))
    if normalized == False:
        ## normalized_image = (base - mean) / std, given mean=0.0, std=255.0
        image = np.array(image)
        image[:,:, 0] = image[:,:, 0] / 255.0
        image[:,:, 1] = image[:,:, 1] / 255.0
        image[:,:, 2] = image[:,:, 2] / 255.0
    return image
def load_input_images(images_path, images):
    # if running inference on a single image:
    if (images_path.endswith('.jpg') or images_path.endswith('.png') or images_path.endswith('.bmp') or images_path.endswith('.jpeg')):
        images.append(Image.open(images_path))
    # if running inference on an images directory:
    if (os.path.isdir(images_path)):
        for img in os.listdir(images_path):
            if (img.endswith(".jpg") or img.endswith(".png") or img.endswith('.bmp') or img.endswith('.jpeg')):
                images.append(Image.open(os.path.join(images_path, img)))
# ---------------- Start of the example --------------------- #
func_dict = {
    'yolov5_seg': postproc_yolov5seg,
    'yolov8_seg': postproc_yolov8seg,
    'fast_sam': postproc_yolov8seg,
}
images_path = args.images
images = []
load_input_images(images_path, images)
anchors = {}
meta_arch = ''
arch = args.arch
arch_list = arch_dict.keys()
num_of_classes = args.class_num
if arch in arch_list:
    anchors = arch_dict[arch]
    kwargs['anchors'] = arch_dict[arch]['anchors']
    if arch == 'v5':
        meta_arch = 'yolov5_seg'
        kwargs['score_threshold'] = 0.001
        kwargs['nms_iou_thresh'] = 0.6
    if arch == 'v8':
        meta_arch = 'yolov8_seg'
        kwargs['score_threshold'] = 0.001
        kwargs['nms_iou_thresh'] = 0.7
        kwargs['meta_arch'] = 'yolov8_seg_postprocess'
    if arch == 'fast':
        meta_arch = 'fast_sam'
        kwargs['score_threshold'] = 0.25
        kwargs['nms_iou_thresh'] = 0.7
        kwargs['meta_arch'] = 'yolov8_seg_postprocess'
        num_of_classes = '1'
        kwargs['classes'] = 1
else:
    error = 'Not a valid architecture. Please choose an architecture from this list: v5, v8, fast'
    raise ValueError(error)
kwargs['classes'] = int(num_of_classes)
output_dir = args.output_dir
if not output_dir:
    output_dir = 'output_images'
devices = Device.scan()
hef = HEF(args.hef)
inputs = hef.get_input_vstream_infos()
outputs = hef.get_output_vstream_infos()
with VDevice(device_ids=devices) as target:
    configure_params = ConfigureParams.create_from_hef(hef, interface=HailoStreamInterface.PCIe)
    network_group = target.configure(hef, configure_params)[0]
    network_group_params = network_group.create_params()

    [log.info('Input layer: {} {}'.format(layer_info.name, layer_info.shape)) for layer_info in inputs]
    [log.info('Output layer: {} {}'.format(layer_info.name, layer_info.shape)) for layer_info in outputs]

    height, width, _ = hef.get_input_vstream_infos()[0].shape
    kwargs['img_dims'] = (height, width)

    input_vstream_info = hef.get_input_vstream_infos()[0]
    input_vstreams_params = InputVStreamParams.make_from_network_group(network_group, quantized=False, format_type=FormatType.FLOAT32)
    output_vstreams_params = OutputVStreamParams.make_from_network_group(network_group, quantized=False, format_type=FormatType.FLOAT32)

    with InferVStreams(network_group, input_vstreams_params, output_vstreams_params) as infer_pipeline:
        for i, image in enumerate(images):
            processed_image = preproc(image, height=height, width=width)
            input_data = {input_vstream_info.name: np.expand_dims(processed_image, axis=0).astype(np.float32)}
            with network_group.activate(network_group_params):
                raw_detections = infer_pipeline.infer(input_data)
            results = func_dict[meta_arch](raw_detections)[0]
            output_path = os.path.join(os.path.realpath('.'), 'output_images')
            if not os.path.isdir(output_path):
                os.mkdir(output_path)
            processed_img = Image.fromarray(visualize_yolov5_seg_results(results, np.expand_dims(np.array(processed_image), axis=0), score_thres=0.3, class_names=num_of_classes, **kwargs))
            processed_img.save(f'{output_dir}/output_image{i}.jpg', 'JPEG')
I had to copy the cython_utils folder into the same folder as my code, since the script imports:
from cython_utils.cython_nms import nms as cnms
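If there is no prebuilt cython_nms extension for your platform, one generic way to build it yourself is with Cython. This is only a sketch (a hypothetical helper script, not the repo's official build step), assuming cython_nms.pyx sits in the cython_utils folder:
# build_cython_nms.py -- run with: python build_cython_nms.py build_ext --inplace
from setuptools import setup
from Cython.Build import cythonize
import numpy as np

setup(
    ext_modules=cythonize("cython_utils/cython_nms.pyx"),
    include_dirs=[np.get_include()],  # the NMS code uses the NumPy C headers
)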
Thanks to omria and dario.ravarro