Can input be fed sub-frame (line-by-line) to start inference before the full frame arrives?

Hi Hailo team,

I’m running ResNet-18 on Hailo-8L over PCIe (HailoRT 4.23.0). Input comes from a line-scan camera that fills a 224×224 frame row by row in host DDR.

To be precise about the goal: this is not about transfer↔compute overlap. On my setup HW latency is ≈2.3 ms and end-to-end inference is ≈2.6 ms, so PCIe transfer plus host overhead accounts for only ~0.3 ms. I want to pace input at the camera’s line rate so that the ~2.3 ms of NN compute is hidden behind frame acquisition, i.e. the NPU starts the first conv on the first rows and emits a single result shortly after row 224 arrives.

Today InputStreamBase::write enforces buffer.size() == get_frame_size(), so partial frames are rejected.
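
To make the constraint concrete, here is a minimal host-side sketch of the full-frame workaround it implies: rows are collected in a staging buffer and handed over only once the whole frame is present. This is not HailoRT code. `RowAccumulator` and its `FrameSink` callback are hypothetical names standing in for whatever finally calls the real stream write, and the 3 bytes per pixel in the usage note is an assumption about my input format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical helper, not part of HailoRT: collects camera rows on the host
// and passes the buffer to a sink only when a full frame has been assembled.
class RowAccumulator {
public:
    // Stand-in for the code that would perform the actual full-frame write.
    using FrameSink = std::function<void(const uint8_t *data, size_t size)>;

    RowAccumulator(size_t rows, size_t row_bytes, FrameSink sink)
        : m_rows(rows), m_row_bytes(row_bytes),
          m_frame(rows * row_bytes), m_sink(std::move(sink)) {}

    size_t frame_size() const { return m_frame.size(); }

    // Copies one row into its slot. Returns true if this row completed the
    // frame, in which case the sink has been called with the whole buffer.
    bool push_row(const uint8_t *row) {
        assert(m_next_row < m_rows);
        std::memcpy(m_frame.data() + m_next_row * m_row_bytes, row, m_row_bytes);
        if (++m_next_row < m_rows) {
            return false;  // frame still incomplete, nothing is sent
        }
        m_sink(m_frame.data(), m_frame.size());
        m_next_row = 0;  // start assembling the next frame
        return true;
    }

private:
    size_t m_rows;
    size_t m_row_bytes;
    size_t m_next_row = 0;
    std::vector<uint8_t> m_frame;
    FrameSink m_sink;
};
```

With `RowAccumulator acc(224, 224 * 3, sink)`, the sink fires once per 224 rows with a 150528-byte buffer, so the device sees nothing until the last row has landed. That is exactly the serialization I am hoping to avoid.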

Questions:

  1. Does the firmware consume a frame’s input progressively (byte-credit, CREDIT_IN_BYTES) and begin compute as rows arrive, or wait for the full frame?
  2. If I issue multiple sub-frame H2D (host-to-device) transfers on one input channel, will the firmware accumulate them by byte-credit into a single frame and emit exactly one output?
  3. Is there any supported/internal way to do this, or is full-frame atomicity a hard requirement?

Thanks!