How should the calibration dataset be constructed when the model has multiple inputs rather than a single image input?

In the optimization stage, I can run calibration for a model with a single image input. How should calibration be handled when the model has multiple inputs, for example an image (640x640) and a text embedding vector (1x512)?

Maybe this helps:

Hailo Community - How can i optimize the model which have multi input
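
In case it is useful, here is a rough sketch of what a multi-input calibration set could look like. As far as I understand, the Dataflow Compiler's optimize step can take a dict that maps each input layer name to a NumPy array, with the same number of samples in every array, but please treat that as an assumption and check it against the docs for your DFC version. The layer names (`model/input_layer1`, `model/input_layer2`), the helper name, the channels-last image layout, and the `(N, 1, 512)` embedding shape are all placeholders; use the names and shapes your parsed model actually reports. Random data is used only to keep the sketch self-contained; real calibration should use representative images paired with their matching text embeddings.

```python
import numpy as np


def build_multi_input_calib_set(num_samples, image_name, text_name, seed=0):
    """Build a dict of calibration arrays, one entry per model input.

    Hypothetical helper: every array shares the same first dimension
    (num_samples), so sample i of the image goes with sample i of the text.
    """
    rng = np.random.default_rng(seed)

    # Placeholder images in [0, 255), channels-last layout (assumed).
    images = rng.random((num_samples, 640, 640, 3), dtype=np.float32) * 255

    # Placeholder text embeddings, one 1x512 vector per sample (assumed shape).
    text = rng.standard_normal((num_samples, 1, 512)).astype(np.float32)

    return {image_name: images, text_name: text}


calib_dataset = build_multi_input_calib_set(
    num_samples=4,
    image_name="model/input_layer1",  # placeholder layer name
    text_name="model/input_layer2",   # placeholder layer name
)

for name, array in calib_dataset.items():
    print(name, array.shape, array.dtype)

# Assumed usage with the Hailo SDK (not executed here):
# runner.optimize(calib_dataset)
```

The main point is that the samples stay aligned across inputs: the i-th image and the i-th embedding should come from the same real example, so the calibration statistics reflect how the inputs occur together.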