Difference between Multi Process Service and Scheduler

Hi everyone!

I’m currently trying to closely understand the difference between the multi process service and the model scheduler. Are they different? Does the multi process service just use the scheduler under the hood to distribute the compute time between the different processes?

I was running some tests using benchmarks executed in parallel with hailortcli run --multi-process-service and compared the results to a combined run using hailortcli run2. I understand that the model scheduler allocates compute time to each model in a way that equalizes the FPS throughput for each model in a pipeline. With the multi process service, I would have expected the compute time to be split 50/50 between the two processes, and therefore between the two pipelines.

However, executing the parallel run commands resulted in FPS throughput that was almost equalized, meaning the result was nearly identical whether I ran the models in 2 separate processes or via the scheduler. I assume that the difference between the 2 approaches comes down to I/O and small measurement differences. Is that correct?

A 50/50 split would require the service to measure inference time and distribute it somehow. It is much easier to just load a model and run inference whenever computation for the previous model is done.

That sounds like what I would expect: load a model, infer one image, then repeat. That results in equalized FPS unless parameters (e.g. batch-size, threshold, timeout and priority) are set to change the balance.
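To convince myself of the "load model, infer one image, repeat" intuition, here is a toy round-robin simulation (not HailoRT code; the model names and per-frame latencies are made up). Even when one model is twice as slow per frame, alternating one frame at a time yields nearly identical frame counts, i.e. equalized FPS:

```python
from collections import Counter

def simulate_round_robin(infer_times_ms, budget_ms):
    """Alternate between models one frame at a time until the time budget runs out."""
    frames = Counter()
    clock = 0.0
    names = list(infer_times_ms)
    i = 0
    while True:
        name = names[i % len(names)]
        cost = infer_times_ms[name]
        if clock + cost > budget_ms:
            break  # no time left for another frame of this model
        clock += cost
        frames[name] += 1
        i += 1
    return frames

# Hypothetical latencies: model_a takes 10 ms/frame, model_b only 5 ms/frame.
# Over a 1-second budget, round-robin gives both models (almost) the same
# number of frames, so their FPS is equalized despite the latency difference.
counts = simulate_round_robin({"model_a": 10.0, "model_b": 5.0}, budget_ms=1000.0)
print(counts)
```

This also matches why the two benchmarks above look so similar: whether the frames come from two processes through the service or from one run2 invocation, the device ends up alternating frames the same way.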

Yes, I expect minimal differences between the two approaches. Inference takes a long time compared to the scheduler/service runtime.