Exploring real-time neural inference on Hailo-8 for a multilingual cultural radio project

Hello everyone,

I’m developing a research-driven radio project that combines large language models, real-time dialogue generation, and audio pattern recognition.

The system runs on a TUXEDO Nano (Ryzen AI 7 + Hailo-8) and aims to create an interactive, multilingual radio environment that responds to listeners in a friendly, human-like, and culturally engaging way.

My focus is on achieving low-latency inference in a continuous audio interaction loop. Specifically, I’m exploring how to process short overlapping audio frames efficiently and how to schedule mixed workloads (speech recognition plus language-response generation) within the Hailo SDK.
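
To make the question concrete, here is a minimal sketch of the frame loop I have in mind (plain NumPy, nothing Hailo-specific; the window and hop sizes are just illustrative):

```python
import numpy as np

SAMPLE_RATE = 16_000
FRAME_LEN = int(0.5 * SAMPLE_RATE)  # 500 ms analysis window
HOP_LEN = int(0.1 * SAMPLE_RATE)    # 100 ms of new audio per step (80% overlap)

def frame_stream(chunks, frame_len=FRAME_LEN, hop_len=HOP_LEN):
    """Yield overlapping fixed-length frames from a stream of audio chunks."""
    buf = np.zeros(0, dtype=np.float32)
    for chunk in chunks:
        buf = np.concatenate([buf, chunk.astype(np.float32)])
        # Emit one frame for every hop_len of new samples collected.
        while len(buf) >= frame_len:
            yield buf[:frame_len].copy()
            buf = buf[hop_len:]
```

With a loop like this, every 100 ms hop triggers one inference call on a 500 ms window, which is exactly where buffer management on the Hailo side starts to matter.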

Has anyone here experimented with real-time pipelines like this, or found best practices for managing inference buffers and scheduling on the Hailo-8 under similar conditions?
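
For reference, the direction I’m currently looking at is HailoRT’s model scheduler, which time-slices one device between several configured networks. The sketch below reflects my reading of the HailoRT Python API, so please treat it as an assumption rather than verified code: the HEF file names are placeholders, and the exact class and parameter names (VDevice.create_params, HailoSchedulingAlgorithm) should be checked against the installed HailoRT version:

```python
from hailo_platform import (HEF, VDevice, ConfigureParams,
                            HailoSchedulingAlgorithm, HailoStreamInterface)

# Placeholder HEFs: one streaming ASR network, one language-response network.
HEF_PATHS = ["asr_streaming.hef", "response_head.hef"]

params = VDevice.create_params()
# Let the HailoRT model scheduler switch between networks automatically,
# instead of activating/deactivating network groups by hand.
params.scheduling_algorithm = HailoSchedulingAlgorithm.ROUND_ROBIN

with VDevice(params) as vdevice:
    network_groups = []
    for path in HEF_PATHS:
        hef = HEF(path)
        cfg = ConfigureParams.create_from_hef(
            hef, interface=HailoStreamInterface.PCIe)
        network_groups.append(vdevice.configure(hef, cfg)[0])
    # From here, each workload would create its own vstreams and submit
    # frames; the scheduler arbitrates access to the single Hailo-8.
```

If this is the wrong approach for overlapping-frame workloads, I’d be glad to hear what has worked better in practice.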

Any insights, references, or even cautionary notes are warmly appreciated.

Best regards,
D. T.