Improving Small Object Detection in 4K Drone Footage Using Tiling with Hailo AI Processors

alex_zh · July 3, 2025, 1:13pm

Detecting small objects in high-resolution images—especially 4K drone video—is a common challenge when using standard convolutional neural network (CNN) models like the YOLO series. Even powerful models often miss tiny targets when operating on full-frame 4K images.

So, how can we solve this problem?

In this post, I’ll share how we leveraged tiling techniques combined with the Hailo AI chip to dramatically improve small-object detection performance in 4K and even higher-resolution footage.

The Problem with Small Object Detection in 4K

In a typical 4K (3840x2160) frame, small objects occupy very few pixels relative to the whole image. Models like YOLOv8 resize the entire frame down to a much smaller network input, so small targets can shrink to just a few pixels and get lost during downsampling and feature extraction.
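
To put rough numbers on this, here is a small sketch of the arithmetic. It assumes the detector takes a 640x640 input with an aspect-preserving (letterbox) resize; that input size is an assumption for illustration, not something stated above, and the helper names are made up.

```cpp
#include <algorithm>
#include <cassert>

// Scale factor applied when a src_w x src_h image is letterboxed
// (aspect-preserving resize) into a square net_size x net_size input.
// net_size = 640 is an assumed value, not taken from the post.
double letterboxScale(int src_w, int src_h, int net_size = 640) {
    assert(src_w > 0 && src_h > 0 && net_size > 0);
    return std::min(static_cast<double>(net_size) / src_w,
                    static_cast<double>(net_size) / src_h);
}

// Size, in network-input pixels, of an object that spans object_px
// pixels in the source image.
double sizeAtNetworkInput(double object_px, int src_w, int src_h,
                          int net_size = 640) {
    return object_px * letterboxScale(src_w, src_h, net_size);
}
```

Under that assumption, an object 30 pixels wide in the 3840x2160 frame ends up about 5 pixels wide at the network input, while the same object inside a 1920x1080 tile ends up about 10 pixels wide.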

The Tiling Solution

Our approach divides the 4K image into smaller, overlapping tiles. By processing each tile individually, we preserve the relative size of small objects in the input to the detector, improving detection accuracy.

In our tests, we used 4 tiles (2x2 grid), each 1920x1080 (1080p), with a 50-pixel overlap between neighbouring tiles to avoid missing objects at the tile edges.

Example Tiling Parameters (C++, OpenCV):

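#include <opencv2/core.hpp>
#include <vector>

// Build a 2x2 grid of overlapping tiles over the given frame (3840x2160 in our tests).
std::vector<cv::Rect> makeTiles(const cv::Mat& frame) {
    // Tile configuration
    const int offset_x = 50;      // grid origin, so the second column ends exactly at x = 3840
    const int offset_y = 50;      // grid origin, so the second row ends exactly at y = 2160
    const int tile_width = 1920;
    const int tile_height = 1080;
    const int overlap = 50;       // pixels shared by neighbouring tiles

    // Tiles start at x = 50 and 1920, y = 50 and 1080.
    // Note: with these values the leftmost 50 columns and topmost 50 rows
    // of the frame are not covered by any tile.
    std::vector<cv::Rect> tiles;
    for (int row = 0; row < 2; ++row) {
        for (int col = 0; col < 2; ++col) {
            int x = offset_x + col * (tile_width - overlap);
            int y = offset_y + row * (tile_height - overlap);
            // Skip any tile that would extend past the frame.
            if (x + tile_width <= frame.cols && y + tile_height <= frame.rows) {
                tiles.emplace_back(x, y, tile_width, tile_height);
            }
        }
    }
    return tiles;
}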

Each tile is then run through the YOLO model on the Hailo AI accelerator as an independent input, so the four tiles of a frame can be submitted together as a batch.
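
Because each tile is processed on its own, the boxes it returns are relative to that tile and need to be shifted back into full-frame coordinates before they can be drawn or merged. A minimal sketch of that step follows. It assumes the detections have already been rescaled to tile pixel coordinates; the `Detection` struct and `toFrameCoords` are hypothetical names for illustration, not Hailo or OpenCV types.

```cpp
#include <cassert>
#include <vector>

// Hypothetical detection type (not a Hailo or OpenCV structure).
struct Detection {
    float x, y, w, h;  // top-left corner and size, in pixels
    float score;       // confidence
    int class_id;
};

// Shift detections from tile-local pixel coordinates into full-frame
// coordinates. tile_x / tile_y are the tile's top-left corner in the
// frame (the x / y of the corresponding cv::Rect in the tiling code).
std::vector<Detection> toFrameCoords(const std::vector<Detection>& tile_dets,
                                     int tile_x, int tile_y) {
    assert(tile_x >= 0 && tile_y >= 0);
    std::vector<Detection> out;
    out.reserve(tile_dets.size());
    for (Detection d : tile_dets) {  // copy, then offset
        d.x += static_cast<float>(tile_x);
        d.y += static_cast<float>(tile_y);
        out.push_back(d);
    }
    return out;
}
```

Only the box origin changes; width, height, score, and class are left untouched because the tile is a crop, not a resize, of the frame.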

Detection Results

The following results compare detection performance using the same confidence threshold on:

YOLOv8s (small model) on the full 4K frame
YOLOv8m (medium model) on the full 4K frame
YOLOv8s with 4x tiling on the 4K frame

[Image: vlcsnap-2025-07-03-21h09m59s774, 1920×2160]

[Image: vlcsnap-2025-07-03-21h10m16s087, 1920×2160]

[Image: vlcsnap-2025-07-03-21h10m52s418, 1920×2160]

The results clearly show that tiling dramatically improves the detection of small objects, even when using the lighter YOLOv8s model. Many targets missed in the full-frame tests were successfully detected in the tiled approach.

Conclusion

If you are working on small object detection in high-resolution scenarios like drones, security cameras, or industrial inspection, tiling + efficient batch inference is a simple but powerful technique.

This approach enables small models to achieve excellent performance in high-resolution environments, reducing both compute load and power consumption—critical factors for edge deployments.

Happy to discuss further optimizations, such as dynamic tiling, adaptive overlap, or post-processing strategies to merge detections from tiles.
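
As one example of such a merge step, here is a minimal sketch of greedy, class-aware non-maximum suppression (NMS) over the detections from all tiles, once they are in full-frame coordinates. This is a common generic option, not necessarily what was used for the results above; the `Detection` struct, function names, and the 0.5 IoU threshold are all assumptions for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical detection type (not a Hailo or OpenCV structure).
struct Detection {
    float x, y, w, h;  // top-left corner and size, in full-frame pixels
    float score;       // confidence
    int class_id;
};

// Intersection-over-union of two axis-aligned boxes.
float iou(const Detection& a, const Detection& b) {
    const float x1 = std::max(a.x, b.x);
    const float y1 = std::max(a.y, b.y);
    const float x2 = std::min(a.x + a.w, b.x + b.w);
    const float y2 = std::min(a.y + a.h, b.y + b.h);
    const float inter = std::max(0.0f, x2 - x1) * std::max(0.0f, y2 - y1);
    const float uni = a.w * a.h + b.w * b.h - inter;
    return uni > 0.0f ? inter / uni : 0.0f;
}

// Greedy NMS: keep the highest-scoring box, drop same-class boxes that
// overlap it by more than iou_threshold, and repeat.
std::vector<Detection> mergeDetections(std::vector<Detection> dets,
                                       float iou_threshold = 0.5f) {
    assert(iou_threshold >= 0.0f && iou_threshold <= 1.0f);
    std::sort(dets.begin(), dets.end(),
              [](const Detection& a, const Detection& b) {
                  return a.score > b.score;
              });
    std::vector<Detection> kept;
    for (const Detection& d : dets) {
        bool duplicate = false;
        for (const Detection& k : kept) {
            if (k.class_id == d.class_id && iou(k, d) > iou_threshold) {
                duplicate = true;
                break;
            }
        }
        if (!duplicate) kept.push_back(d);
    }
    return kept;
}
```

One limitation of this sketch: an object cut in half by a tile border yields two partial boxes with low IoU, so plain NMS will not merge them. Handling that case needs a larger overlap or a different merge rule.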

yuri · July 3, 2025, 6:59pm

Out of curiosity, are you using a UDP RTSP stream? I was looking into the C++ API but it doesn’t support TCP out of the box and I’m not sure the Hailo team is open to PRs in that regard.

alex_zh · July 4, 2025, 1:53am

The host handles the streaming part. The Hailo-8/8L is used only for model inference.

Jonathan_F · July 10, 2025, 1:33pm

How do you handle duplicate detections occurring in the overlap areas?
