Is MPS on Hailo8L available?

Hello Hailo Support Team,

I’m trying to run MPS (Multi-Process Service) on a Raspberry Pi 5 + Hailo-8L setup, but there’s very little information available, so I’m posting here in case anyone has experience with this.

:pushpin: Environment

  • Raspberry Pi 5

  • Hailo-8L M.2 AI HAT (PCIe)

  • HailoRT 4.22.0

  • Python (multiprocessing) tests

I’m attempting to run two YOLOv8s models (instance segmentation + pose estimation) simultaneously in separate processes.
However, it always fails and exits with RPC errors.

From the hailort documentation:

  • Hailo-15 → explicitly uses multi-process service

  • Hailo-10H → multi-process support is enabled by default

But I cannot find any reference to MPS for Hailo-8 or Hailo-8L, which makes me wonder whether it is supported at all.

:red_question_mark: My questions

1) Does Hailo-8L actually support the Multi-Process Service?

There is no mention of it in the docs, so I’m not sure if it’s unsupported, hidden, or simply not documented.

2) Is MPS unsupported in the Python API?

The MPS usage examples in the docs only cover C++ (enabling the multi-process service, group IDs, etc.).
Python multiprocessing always results in RPC error 77 when trying to open a second VDevice.

Has anyone successfully:

  • run multiple processes on Hailo-8L at the same time, or

  • used Python to perform multi-process inference with HailoRT?

Any insight would be greatly appreciated. I’m trying to determine whether I’m hitting a software limitation, a hardware limitation, or simply missing something in setup.

Thanks!

Hey @minjoo_kim!

Yes, MPS is definitely supported on the Hailo-8L. You’ll just need to enable the HailoRT service and set up the configuration in your code or GStreamer pipeline.


Getting Started

First thing - let’s get the HailoRT service running on your Pi 5:

sudo systemctl enable --now hailort.service

You’ll need this running regardless of whether you’re using Python or GStreamer.
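If you want to sanity-check the service before touching any code, something like this works (the socket name under /tmp can vary between HailoRT versions, so the glob is intentional):

```shell
# Should print "active" if the service is running
systemctl is-active hailort.service

# The service communicates over a unix socket under /tmp
ls -l /tmp/hailort*
```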


If You’re Using Python

For the Python API, you’ll want to create each process with its own VDevice using these settings:

params = VDevice.create_params()
params.scheduling_algorithm = HailoSchedulingAlgorithm.ROUND_ROBIN
params.multi_process_service = True
params.group_id = "SHARED"  # Make sure all your processes use the same group ID
vdevice = VDevice(params)

A few things to keep in mind:

  • Each model needs its own process - don’t try to split one model across multiple processes
  • Make sure you’re using the Async InferModel API rather than the older sync InferPipeline
  • All your processes need to use the same group_id so they can share the device properly
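To make those points concrete, here is a minimal per-process sketch using the async InferModel API. The file names are placeholders, and the exact method signatures may differ slightly between HailoRT versions, so treat this as a starting point rather than a drop-in implementation:

```python
import multiprocessing as mp


def run_model(hef_path, group_id="SHARED"):
    """Run one model in its own process via the async InferModel API.

    Assumes hailo_platform (HailoRT 4.x) is installed and a Hailo-8L is
    visible to the process; imports are kept inside the function so each
    child process initializes its own HailoRT state.
    """
    import numpy as np
    from hailo_platform import VDevice, HailoSchedulingAlgorithm

    params = VDevice.create_params()
    params.scheduling_algorithm = HailoSchedulingAlgorithm.ROUND_ROBIN
    params.multi_process_service = True
    params.group_id = group_id  # same group ID in every process

    with VDevice(params) as vdevice:
        infer_model = vdevice.create_infer_model(hef_path)
        with infer_model.configure() as configured:
            bindings = configured.create_bindings()
            # Dummy input frame matching the model's input shape
            frame = np.zeros(infer_model.input().shape, dtype=np.uint8)
            bindings.input().set_buffer(frame)
            bindings.output().set_buffer(
                np.empty(infer_model.output().shape, dtype=np.uint8))
            configured.run([bindings], 10_000)  # timeout in ms


def launch(hef_paths):
    """Start one process per model, e.g. launch(["m1.hef", "m2.hef"])."""
    procs = [mp.Process(target=run_model, args=(p,)) for p in hef_paths]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```

The key point is that each process builds its own VDevice with `multi_process_service = True` and the shared `group_id`; the scheduler then time-slices the single Hailo-8L between them.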

If You’re Using GStreamer

Just add multi-process-service=true to each hailonet element and use a shared group ID:

... ! hailonet \
    hef-path=/path/model1.hef \
    batch-size=2 \
    multi-process-service=true \
    vdevice-group-id=1 ! ...

And in your second process:

... ! hailonet \
    hef-path=/path/model2.hef \
    batch-size=2 \
    multi-process-service=true \
    vdevice-group-id=1 ! ...

Quick heads up if you’re using Docker: You’ll need to mount the service socket into your container:

-v /tmp/hailort-service:/tmp/hailort-service
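For example, a container launch might look like this (the device node and image name are placeholders; adjust to your setup):

```shell
docker run -it \
  --device /dev/hailo0 \
  -v /tmp/hailort-service:/tmp/hailort-service \
  your-image:tag /bin/bash
```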

Hope this helps!

Hi omria,

Thanks so much for the helpful advice — really appreciate it!
I didn’t realize Hailo-8L supports MPS as well, so that information was extremely useful.

I’m sharing a few more details below in case anyone can help me understand what I might be missing.


1. HailoRT daemon status on the host

sudo systemctl status hailort
sudo systemctl status hailort.service

Output:

● hailort.service - HailoRT service
     Loaded: loaded (/lib/systemd/system/hailort.service; enabled; preset: enabled)
     Active: active (running) since Mon 2025-11-24 14:32:29 KST; 2 days ago

The service seems active and running without any issues.


2. I’m running everything inside Docker

Here is the exact command I use:

sudo docker run -it \
  --privileged \
  --ipc=host \
  --net=host \
  -v /dev:/dev \
  -v /lib/modules:/lib/modules:ro \
  -v /usr/src:/usr/src:ro \
  -v /dev/bus/pci:/dev/bus/pci \
  -v /home/rpi2/:/app/tappas/rpi2/ \
  -v /run/hailo:/run/hailo \
  -v /tmp/:/tmp/ \
  -e HAILO_SOCK_PATH=/tmp/hailort_uds.sock \
  --name hailo6 my-hailo-tappas:final /bin/bash

The HailoRT socket (/tmp/hailort_uds.sock) and /run/hailo are mounted correctly.


3. My Python MPS test code

import os
import numpy as np
from multiprocessing import Process
from hailo_platform import (HEF, VDevice, HailoStreamInterface, InferVStreams, ConfigureParams,
    InputVStreamParams, OutputVStreamParams, InputVStreams, OutputVStreams, FormatType, HailoSchedulingAlgorithm)


# Define the function to run inference on the model
def infer(network_group, input_vstreams_params, output_vstreams_params, input_data):
    rep_count = 100
    with InferVStreams(network_group, input_vstreams_params, output_vstreams_params) as infer_pipeline:
        for i in range(rep_count):
            infer_results = infer_pipeline.infer(input_data)


def create_vdevice_and_infer(hef):  # receives a loaded HEF object
    # Creating the VDevice target with scheduler enabled
    params = VDevice.create_params()
    params.scheduling_algorithm = HailoSchedulingAlgorithm.ROUND_ROBIN
    params.multi_process_service = True
    params.group_id = "SHARED"
    with VDevice(params) as target:
        configure_params = ConfigureParams.create_from_hef(hef=hef, interface=HailoStreamInterface.PCIe)
        model_name = hef.get_network_group_names()[0]
        batch_size = 2
        configure_params[model_name].batch_size = batch_size

        network_groups = target.configure(hef, configure_params)
        network_group = network_groups[0]

        # Create input and output virtual streams params
        input_vstreams_params = InputVStreamParams.make(network_group, format_type=FormatType.FLOAT32)
        output_vstreams_params = OutputVStreamParams.make(network_group, format_type=FormatType.UINT8)

        # Define dataset params
        input_vstream_info = hef.get_input_vstream_infos()[0]
        image_height, image_width, channels = input_vstream_info.shape
        num_of_frames = 10
        low, high = 2, 20

        # Generate random dataset
        dataset = np.random.randint(low, high, (num_of_frames, image_height, image_width, channels)).astype(np.float32)
        input_data = {input_vstream_info.name: dataset}

        infer(network_group, input_vstreams_params, output_vstreams_params, input_data)

# Loading compiled HEFs:
first_hef_path = '../yolov8s/yolov8s.hef'
second_hef_path = '../yolov8s/yolov8s_seg.hef'
first_hef = HEF(first_hef_path)
second_hef = HEF(second_hef_path)
hefs = [first_hef, second_hef]
infer_processes = []

# Configure network groups
for hef in hefs:
    # Create infer process
    infer_process = Process(target=create_vdevice_and_infer, args=(hef,))
    infer_processes.append(infer_process)

print('Starting inference on multiple models using scheduler')

infer_failed = False
for infer_process in infer_processes:
    infer_process.start()
for infer_process in infer_processes:
    infer_process.join()
    if infer_process.exitcode:
        infer_failed = True

if infer_failed:
    raise Exception("infer process failed")
print('Done inference')


When both processes run concurrently, I consistently get:

Starting inference on multiple models using scheduler
[HailoRT] [error] CHECK_GRPC_STATUS failed with error code: 14.
[HailoRT] [warning] Make sure HailoRT service is enabled and active!
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_RPC_FAILED(77)
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_RPC_FAILED(77)
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_RPC_FAILED(77)
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_RPC_FAILED(77)
Process Process-1:
[HailoRT] [error] CHECK_GRPC_STATUS failed with error code: 14.
[HailoRT] [warning] Make sure HailoRT service is enabled and active!
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_RPC_FAILED(77)
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_RPC_FAILED(77)
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_RPC_FAILED(77)
[HailoRT] [error] CHECK_SUCCESS failed with status=HAILO_RPC_FAILED(77)
Traceback (most recent call last):
Process Process-2:
  File "/usr/local/lib/python3.11/dist-packages/hailo_platform/pyhailort/pyhailort.py", line 3540, in _open_vdevice
    self._vdevice = _pyhailort.VDevice.create(self._params, device_ids)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
hailo_platform.pyhailort._pyhailort.HailoRTStatusException: 77


:red_question_mark: I’m not sure what I should fix — any suggestions would be appreciated.

Given that:

  • HailoRT daemon is active

  • The socket is visible inside the container

  • This issue appears only when using Python multiprocessing

I’m not sure whether the problem is related to Docker, the Python API, or something else.
So any guidance would be greatly appreciated.

Thanks again!

Best regards,
Minjoo

Not sure if you solved this problem, but I had a similar issue before. You may try this: call the close() method in the sample code (file hailo_inference.py) after finishing inference in each process/thread, then call these three lines again (effectively re-initializing the inference instance) in the same sample file:
self.config_ctx = self.infer_model.configure()
self.configured_model = self.config_ctx.enter()
self.configured_model.set_scheduler_priority(priority)

Hello TT2024,

Thank you so much for your suggestion! I really appreciate you taking the time to share your experience and advice.

I’m happy to report that I was able to resolve the issue by addressing the HailoRT socket connection between the Docker container and the host system.
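For anyone landing here with the same HAILO_RPC_FAILED(77) inside Docker: the thing to verify is that the socket path the client uses in the container matches the path the host service is actually listening on. One way to compare the two (assuming `ss` is available on the host):

```shell
# On the host: list unix sockets held open by the HailoRT service
sudo ss -xlp | grep -i hailo

# Inside the container: check what is actually mounted at the expected path
ls -l /tmp/hailort*
```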

Thank you again for your help, and I hope you have a great day!

Best regards,


You are welcome! Good to know you solved your issue.
