Text Detection + Recognition: PaddleOCR Pipelined on Hailo-8 and Hailo-10H Guide

Majed_Abu_Mokh · August 5, 2025, 4:21pm

This guide provides a high-level overview of the newly added PaddleOCR application, focusing on its internal structure and advanced functionality. The app performs end-to-end text recognition using a two-stage OCR pipeline accelerated by Hailo-8 and Hailo-10H devices.

The pipeline combines:

A text detector to locate text regions
A text recognizer to decode the text inside each region

Example Runs

Single Image:

python3 paddle_ocr.py -n ocr_det.hef ocr.hef -i ocr_img1.png

Folder of Images:

python3 paddle_ocr.py -n ocr_det.hef ocr.hef -i ./my_images/

Video File:

python3 paddle_ocr.py -n ocr_det.hef ocr.hef -i input.mp4

Camera:

python3 paddle_ocr.py -n ocr_det.hef ocr.hef -i camera

Optional: Spell Correction

You can optionally improve OCR text accuracy with dictionary-based spelling correction powered by symspellpy:

python3 paddle_ocr.py … --use-corrector

Full Pipeline Description

The PaddleOCR app uses a multi-threaded, queue-based pipeline to process input efficiently and asynchronously across multiple stages.

Preprocessing
   ↓
Text Detector (HEF 1)
   ↓
Detection Postprocess → [No Text] → Visualize
   ↓
Text Recognizer (HEF 2)
   ↓
OCR Postprocess
   ↓
Visualization
   ↓ [Output]

1. Preprocessing

Input source can be:
- A single image
- A folder of images
- A video file
- A live camera stream
Each frame is:
- Resized and padded to fit the detector’s input size (while preserving aspect ratio)
- Batched (if batch_size > 1)
Outputs:
- input_frame (for visualization)
- preprocessed_frame (ready for inference)
Sent to: detector_hailo_infer via det_input_queue
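
For illustration, the "resize and pad while preserving aspect ratio" step can be reduced to a small geometry calculation. This is only a sketch, not the app's actual code: the function name `letterbox_params`, the centered padding, and the 960x544 target size are assumptions made for the example.

```python
def letterbox_params(src_w, src_h, dst_w, dst_h):
    """Compute the scale, resized size, and padding needed to fit a
    src_w x src_h frame into a dst_w x dst_h input without distortion.

    Hypothetical helper: padding is split evenly on both sides here,
    which may differ from what the app does.
    """
    # One scale factor for both axes keeps the aspect ratio intact.
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = min(dst_w, max(1, round(src_w * scale)))
    new_h = min(dst_h, max(1, round(src_h * scale)))

    # Whatever space is left over is filled with padding.
    pad_w = dst_w - new_w
    pad_h = dst_h - new_h
    pad_left, pad_top = pad_w // 2, pad_h // 2
    padding = (pad_left, pad_top, pad_w - pad_left, pad_h - pad_top)
    return scale, (new_w, new_h), padding


# Example: a Full HD frame into an assumed 960x544 detector input.
scale, new_size, padding = letterbox_params(1920, 1080, 960, 544)
print(scale, new_size, padding)
```

The same `scale` and padding offsets are what you would need later to map detected boxes back onto the original frame.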

2. Text Detection (HEF 1)

Uses the first HEF model to detect text regions
Runs asynchronously using HailoInfer.run()
On inference completion, triggers a callback:
- Packs (original_frame, raw_output_tensor)
Sent to: det_postprocess_queue
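
The callback pattern described above can be sketched as follows. `FakeAsyncInfer` is a stand-in written for this example and does not reflect the real `HailoInfer.run()` signature; the point is only how `functools.partial` binds the original frames to the callback so they travel together with the raw outputs.

```python
import queue
import threading
from functools import partial


class FakeAsyncInfer:
    """Stand-in for an asynchronous inference engine (NOT the Hailo API)."""

    def run(self, preprocessed_batch, callback):
        def work():
            # Pretend the "raw output tensor" is the sum of each input.
            raw_outputs = [sum(frame) for frame in preprocessed_batch]
            callback(raw_outputs)

        worker = threading.Thread(target=work)
        worker.start()
        return worker  # caller can join() to wait for completion


def detection_callback(raw_outputs, original_frames, out_queue):
    # Pack (original_frame, raw_output_tensor) for the postprocess stage.
    for frame, raw in zip(original_frames, raw_outputs):
        out_queue.put((frame, raw))


det_postprocess_queue = queue.Queue()
original_frames = ["frame0", "frame1"]
preprocessed = [[1, 2, 3], [4, 5, 6]]

job = FakeAsyncInfer().run(
    preprocessed,
    partial(detection_callback,
            original_frames=original_frames,
            out_queue=det_postprocess_queue),
)
job.join()
items = [det_postprocess_queue.get() for _ in range(2)]
print(items)
```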

3. Detection Postprocessing

Converts the raw heatmap into bounding boxes using DBPostProcess
For each box:
- Crops the region from the original frame
- Resizes it to fit the OCR model’s expected input size (with padding)
- Attaches metadata: frame ID and box location
If no boxes are detected:
- Sends the original frame with empty OCR results directly to visualization
Otherwise:
- Sends: (frame, [resized_crop], (frame_id, box)) to ocr_input_queue
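
A minimal sketch of this routing logic, under some simplifying assumptions: boxes are axis-aligned `(x0, y0, x1, y1)` tuples (the real DBPostProcess output may be quadrilaterals), frames are plain nested lists, the resize-with-padding step is left out, and the "empty result" tuple shape is a guess. The helper names `crop` and `route_detections` are hypothetical.

```python
import queue


def crop(frame, box):
    """Cut an axis-aligned region out of a frame stored as rows of pixels."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in frame[y0:y1]]


def route_detections(frame_id, frame, boxes, ocr_input_queue, vis_output_queue):
    """Send each crop to recognition, or skip straight to visualization."""
    if not boxes:
        # No text found: forward the frame with empty results and boxes.
        vis_output_queue.put((frame, [], []))
        return 0
    for box in boxes:
        region = crop(frame, box)  # the app would also resize/pad here
        ocr_input_queue.put((frame, [region], (frame_id, box)))
    return len(boxes)  # number of OCR results to expect for this frame


ocr_input_queue, vis_output_queue = queue.Queue(), queue.Queue()
frame = [[r * 10 + c for c in range(4)] for r in range(3)]

sent = route_detections(0, frame, [(1, 0, 3, 2)], ocr_input_queue, vis_output_queue)
skipped = route_detections(1, frame, [], ocr_input_queue, vis_output_queue)
```

Returning the number of boxes is one way to let a later stage know how many OCR results to wait for before a frame is complete.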

4. Text Recognition (HEF 2)

Uses the second HEF model to recognize text in each cropped region
Also runs asynchronously using HailoInfer.run()
On completion, a callback sends:
- (frame_id, original_frame, ocr_result, box) to ocr_postprocess_queue

5. OCR Postprocessing

Collects all OCR outputs for a given frame (tracked by frame_id)
Keeps track of how many boxes are expected for that frame
Once all OCR results are collected:
- Groups them into one bundle: (frame, list_of_results, list_of_boxes)
- Sends to: vis_output_queue for visualization
Cleans up memory (removes processed frame_id entries)
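
The grouping step is essentially a per-frame counter plus a buffer. Here is a small sketch of that idea; the class `OcrResultCollector` and the way the expected box count is registered (`set_expected`) are assumptions for illustration, since the post does not say how the app passes that count along.

```python
from collections import defaultdict


class OcrResultCollector:
    """Buffers per-box OCR results until a frame is complete."""

    def __init__(self):
        self.expected = {}                # frame_id -> number of boxes
        self.pending = defaultdict(list)  # frame_id -> [(result, box), ...]

    def set_expected(self, frame_id, n_boxes):
        self.expected[frame_id] = n_boxes

    def add(self, frame_id, frame, ocr_result, box):
        """Store one result; return a bundle once the frame is complete."""
        self.pending[frame_id].append((ocr_result, box))
        if len(self.pending[frame_id]) < self.expected[frame_id]:
            return None  # still waiting for more boxes

        # All results are in: build the bundle and free the bookkeeping.
        entries = self.pending.pop(frame_id)
        del self.expected[frame_id]
        results = [result for result, _ in entries]
        boxes = [b for _, b in entries]
        return (frame, results, boxes)


collector = OcrResultCollector()
collector.set_expected(7, 2)
first = collector.add(7, "frame7", "HELLO", (0, 0, 10, 5))
bundle = collector.add(7, "frame7", "WORLD", (0, 6, 10, 11))
print(first, bundle)
```

Results from different frames can arrive interleaved, which is why the buffer is keyed by `frame_id` rather than being a single list.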

6. Visualization & Rendering

Uses inference_result_handler() to:
- Decode OCR model outputs into readable text
- (Optionally) apply spell correction using SymSpell if --use-corrector is set
Draws the results:
- Left side: original image
- Right side: same image with OCR results written inside white boxes
Saves each frame (image/video) to --output-dir
Optionally displays FPS if --show-fps is enabled

Threads Overview

Each of these stages runs in a separate thread:

Thread	Role
`preprocess_thread`	Prepares and resizes input
`det_thread`	Runs text detection HEF
`detection_postprocess`	Extracts boxes, crops, resizes
`ocr_thread`	Runs text recognition HEF
`ocr_postprocess`	Groups and synchronizes OCR results
`vis_postprocess`	Handles decoding, correction, and rendering

Internal Queues

Queue Name	Purpose
`det_input_queue`	Holds original + preprocessed frames for the detector inference engine
`det_postprocess_queue`	Receives detection outputs (raw tensors + original frames) for postprocessing
`ocr_input_queue`	Carries cropped text regions + metadata to the OCR inference engine
`ocr_postprocess_queue`	Receives OCR model outputs along with original frame and box info
`vis_output_queue`	Collects final grouped results (frame, texts, boxes) for visualization and output
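
To make the thread-plus-queue structure concrete, here is a stripped-down two-stage version using only the standard library. It is a generic sketch of the pattern, not the app's code: the `stage` helper, the `None` sentinel used for shutdown, and the queue sizes are all assumptions.

```python
import queue
import threading

SENTINEL = None  # assumed end-of-stream marker passed down the pipeline


def stage(fn, in_q, out_q):
    """Generic worker: read from in_q, apply fn, write to out_q."""
    while True:
        item = in_q.get()
        if item is SENTINEL:
            out_q.put(SENTINEL)  # tell the next stage to stop too
            break
        out_q.put(fn(item))


# Bounded queues give back-pressure: a fast stage blocks on put()
# instead of piling up frames in memory.
q_in = queue.Queue(maxsize=4)
q_mid = queue.Queue(maxsize=4)
q_out = queue.Queue()

threads = [
    threading.Thread(target=stage, args=(str.upper, q_in, q_mid)),
    threading.Thread(target=stage, args=(lambda s: s + "!", q_mid, q_out)),
]
for t in threads:
    t.start()

for item in ["a", "b", "c"]:
    q_in.put(item)
q_in.put(SENTINEL)

for t in threads:
    t.join()

results = []
while True:
    item = q_out.get()
    if item is SENTINEL:
        break
    results.append(item)
print(results)
```

With one thread per stage and FIFO queues, items leave the pipeline in the order they entered, which matters for video output.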

shashi · August 5, 2025, 4:26pm

Hi @Majed_Abu_Mokh

Is there a github repo or a shared folder that contains these hef models?

nina-vilela · August 5, 2025, 4:32pm

hey @shashi,

The example includes a bash script for downloading the relevant HEF files (download_resources.sh)

shashi · August 5, 2025, 4:35pm

@nina-vilela

Thanks. For some reason, I could not see the link to the github repo when I posted my message.

Aleksei_Markov · August 6, 2025, 2:32am

Hi @Majed_Abu_Mokh , do you have models for Hailo8l ?

Majed_Abu_Mokh · August 6, 2025, 9:55am

Hi Aleksei_Markov
I don’t have models for Hailo-8L at the moment.
I’ll update you as soon as they’re released.
