I’m working on a project to detect road users, including vehicles (cars, motorcycles, trucks, bicycles, buses, etc.) and pedestrians. I’ve been using the Hailo RPi5 examples on GitHub, which detect 80 classes from the COCO dataset by default.
What I want to achieve is to limit the detection to only the relevant road users. I noticed there’s an example in the repository for adapting the code to detect barcodes and QR codes. This example uses a barcode-labels.json file to filter detections, and it also relies on a specialized model trained for barcode detection.
I tried creating my own JSON file based on the barcode-labels example, like this:
While the JSON file successfully annotates the specified classes, the detection and bounding boxes still include all 80 COCO classes. It seems that the filtering only affects the annotation, not the detection itself.
I understand that the barcode detection example uses a model specifically trained for barcodes and QR codes. Does this mean I need to train or fine-tune a custom model that only detects road users? Or is there a way to modify the existing COCO-based model or filtering logic to achieve the desired result?
Hi @thomas38
The trained model will always detect the 80 classes on which it was trained. You can filter the results to keep only the classes you need. As you mentioned, if you do not want the other classes to be detected at all, you need a custom model. However, I think it is better to filter the results first and check whether you are satisfied with the accuracy before trying to train a custom model.
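As a rough illustration of filtering the results, here is a minimal sketch in plain Python. It is not taken from the Hailo examples: the `Detection` class and the `filter_road_users` helper are hypothetical stand-ins for whatever detection objects your callback receives, and the label strings assume the standard COCO names.

```python
from dataclasses import dataclass

# Standard COCO label names for road users (assumed to match the labels the model emits).
ROAD_USER_LABELS = {"person", "bicycle", "car", "motorcycle", "bus", "truck"}


@dataclass
class Detection:
    """Hypothetical stand-in for one detection returned by the pipeline."""
    label: str
    confidence: float
    bbox: tuple  # (xmin, ymin, xmax, ymax), normalized to [0, 1]


def filter_road_users(detections, allowed=ROAD_USER_LABELS, min_confidence=0.0):
    """Keep only detections whose label is allowed and whose score is high enough."""
    return [
        d for d in detections
        if d.label in allowed and d.confidence >= min_confidence
    ]


detections = [
    Detection("car", 0.82, (0.10, 0.40, 0.35, 0.70)),
    Detection("refrigerator", 0.40, (0.50, 0.20, 0.70, 0.90)),
    Detection("person", 0.65, (0.72, 0.30, 0.80, 0.75)),
]

kept = filter_road_users(detections)
print([d.label for d in kept])  # ['car', 'person']
```

The model still runs on all 80 classes; the filter only decides which results are drawn or passed on to the rest of the application.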
Thanks for your response. Two questions come to my mind:
First: if I apply a filter, is it still possible to access all prediction percentages for an object? For example, if a car is detected as a fridge (40%), a car (39%), and other objects (21%), can I retrieve all these predictions instead of just the top one? This would allow me to identify objects that are misclassified but still relevant as road users.
Second: if I want the best accuracy while maintaining smooth performance, what’s the best approach? I don’t always need to detect all road users. For example, sometimes I need only cars, sometimes motorcycles and bicycles, and sometimes everything except trucks. Would it be better to:
Train separate models for each vehicle type and “combine” them as needed? (Is it even possible to “combine” them?)
Or use a single model detecting all road users and apply filtering?
Hi @thomas38
Regarding your first question: YOLO models do not work the way you describe. They do not identify an object and then provide a set of labels whose probabilities add up to 1. It is possible that the same object is detected as both a fridge and a car, but it is also possible that a car is not detected at all.
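If the same object does come back as two separate detections with different labels, one way to spot that is to look for boxes that overlap heavily. The sketch below is only an assumption about how you might do this in your own code: the dictionary layout, `box_iou`, and `overlapping_alternatives` are hypothetical names, and whether both boxes survive at all depends on how post-processing (NMS) is configured, which is not covered here.

```python
def box_iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def overlapping_alternatives(detections, iou_threshold=0.5):
    """For each detection, list differently labelled detections covering the same region."""
    result = []
    for i, det in enumerate(detections):
        alternatives = [
            (other["label"], other["confidence"])
            for j, other in enumerate(detections)
            if j != i
            and other["label"] != det["label"]
            and box_iou(det["bbox"], other["bbox"]) >= iou_threshold
        ]
        result.append((det["label"], alternatives))
    return result


# Hypothetical detections: the first two boxes cover almost the same region.
sample = [
    {"label": "refrigerator", "confidence": 0.40, "bbox": (0.12, 0.38, 0.41, 0.72)},
    {"label": "car", "confidence": 0.39, "bbox": (0.10, 0.40, 0.40, 0.70)},
    {"label": "person", "confidence": 0.65, "bbox": (0.80, 0.30, 0.90, 0.75)},
]

for label, alternatives in overlapping_alternatives(sample):
    print(label, alternatives)
```

Note that the two scores are independent per-detection confidences, not shares of a single probability distribution, so they should not be expected to sum to 1.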
Regarding your second question: in your application, how do you know when you want to detect a particular class? In any case, from a performance point of view, I would suggest having a single model.
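With a single model, switching between "only cars", "motorcycles and bicycles", or "everything except trucks" can be done purely in the filtering step. A minimal sketch, assuming detections are available as `(label, confidence)` pairs and using hypothetical profile names:

```python
# Standard COCO label names for road users (assumed to match the model's labels).
ROAD_USERS = frozenset({"person", "bicycle", "car", "motorcycle", "bus", "truck"})

# Hypothetical named profiles; each one is just a set of allowed labels.
PROFILES = {
    "all_road_users": ROAD_USERS,
    "cars_only": frozenset({"car"}),
    "two_wheelers": frozenset({"motorcycle", "bicycle"}),
    "all_but_trucks": ROAD_USERS - {"truck"},
}


def select(detections, profile):
    """Return the (label, confidence) pairs allowed by the chosen profile."""
    allowed = PROFILES[profile]
    return [(label, conf) for label, conf in detections if label in allowed]


frame_detections = [
    ("car", 0.82),
    ("truck", 0.77),
    ("bicycle", 0.61),
    ("refrigerator", 0.40),
]

print(select(frame_detections, "cars_only"))       # [('car', 0.82)]
print(select(frame_detections, "two_wheelers"))    # [('bicycle', 0.61)]
print(select(frame_detections, "all_but_trucks"))  # [('car', 0.82), ('bicycle', 0.61)]
```

Because the model itself never changes, switching profiles costs nothing at inference time; only the set used in the filter is swapped.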