Tried splitting the model, but ran into a problem

Hi guys, when I tried to convert the “ppocr rec” ONNX model to HAR, I encountered two issues:

  1. Softmax nodes are not supported.
  2. Tensors with more than four dimensions cause computation problems.
    I raised this issue on the forum, and one of the suggested solutions was to split the model.
    Here is what I did:

    I removed the softmax and the nodes that operate on tensors with more than four dimensions from the model, which splits it into two sub-models. Fortunately, both problem spots are in the same block.
    [input] -> origin_model -> [output]
    [input] -> sub_model_1 -> (compute on cpu) -> sub_model_2 -> [output]
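To make the split flow concrete, here is a minimal sketch in plain NumPy. The names `run_split_pipeline`, `sub_model_1`, `cpu_step` and `sub_model_2` are placeholders for illustration only, not SDK calls, and the real CPU step would also contain the higher-than-4D operations that were removed along with the softmax.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax, computed on the CPU."""
    shifted = x - np.max(x, axis=axis, keepdims=True)  # avoid overflow in exp
    e = np.exp(shifted)
    return e / np.sum(e, axis=axis, keepdims=True)


def run_split_pipeline(x, sub_model_1, cpu_step, sub_model_2):
    """Chain the two sub-models with the removed block run on the CPU.

    sub_model_1, cpu_step and sub_model_2 are any callables that take and
    return a NumPy array (hypothetical stand-ins for the real inference calls).
    """
    mid = sub_model_1(x)       # first half of the network
    mid = cpu_step(mid)        # removed block, e.g. >4D reshapes plus softmax
    return sub_model_2(mid)    # second half of the network


if __name__ == "__main__":
    # Dummy 5-D tensor and trivial stand-in sub-models, only to show the flow.
    x = np.random.rand(1, 2, 3, 4, 5).astype(np.float32)
    out = run_split_pipeline(x, lambda t: t * 2.0, softmax, lambda t: t)
    print(out.shape)
```

The point of the sketch is that whatever feeds `sub_model_2` at inference time is the output of the CPU step, so the data used to calibrate `sub_model_2` should presumably come from that same point in the chain.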
    Everything runs well up to this point: both sub-models pass the SDK_NATIVE test.
    The problem appeared when I tried to quantize the sub-models. I used the intermediate output generated in SDK_NATIVE as my quantization (calibration) dataset, but after quantization the output error reached an unacceptable level.
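To put a number on the error and see which sub-model contributes most of it, one option is to compare the native (float) output of each sub-model against its quantized output on the same input. The sketch below uses only NumPy; `cosine_similarity` and `snr_db` are helper names made up for this post, and the two arrays are assumed to have been dumped from the native and quantized runs beforehand.

```python
import numpy as np


def cosine_similarity(ref, test):
    """Cosine similarity between two tensors, flattened to vectors."""
    ref = np.asarray(ref, dtype=np.float64).ravel()
    test = np.asarray(test, dtype=np.float64).ravel()
    denom = np.linalg.norm(ref) * np.linalg.norm(test)
    return float(np.dot(ref, test) / denom) if denom > 0 else 0.0


def snr_db(ref, test):
    """Signal-to-noise ratio in dB, treating (ref - test) as the noise."""
    ref = np.asarray(ref, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    noise = np.sum((ref - test) ** 2)
    signal = np.sum(ref ** 2)
    if noise == 0:
        return float("inf")  # outputs are identical
    return float(10.0 * np.log10(signal / noise))


if __name__ == "__main__":
    # Stand-in data: a float output and a copy with small added noise.
    rng = np.random.default_rng(0)
    native_out = rng.standard_normal((1, 40, 64))
    quantized_out = native_out + 0.01 * rng.standard_normal(native_out.shape)
    print(cosine_similarity(native_out, quantized_out))
    print(snr_db(native_out, quantized_out))
```

Running this separately for sub_model_1 and sub_model_2 would show whether the error is already large after the first half or only appears in the second half.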
    What should I do next?