@rosslote
Hey, I would like to share some thoughts. It seems you have an issue similar to one I had.
Low FPS - This can be improved by using C++-based post-processing. In Python I had two problems: I could not get more than 8-10 FPS per camera even with the CPU almost fully loaded, and the program crashed very often because memory usage kept increasing. Using C++-based post-processing raises this to 25-30 FPS per camera (RPi 5), but I limited it to 15 FPS to keep CPU usage reasonable.
Regarding your dequantization issue -
I believe the default quantization is 8-bit for the bounding boxes and 16-bit for the keypoints, and I was not able to figure out a solution for that in Python. (But if you set the quantization to fully 8-bit, it will work and the points will be correct.)
Reason - The Python APIs are just a wrapper around the C++ ones, and they only convert to uint8 (as far as I remember). You can take a look at them for more detailed info.
Solution - Convert your model to fully 8-bit.
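To illustrate why a mixed 8/16-bit model can give wrong points: if a 16-bit output is handed over as plain bytes, every value has to be rebuilt from two bytes before dequantizing. Here is a rough sketch of that, assuming the raw buffer stores the 16-bit values in the host's byte order. `read_as_u16` is a hypothetical helper name I made up for this post, not something from the SDK.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not part of any SDK): reinterpret a raw byte buffer
// as 16-bit quantized values. Assumes the buffer holds uint16 values in the
// host's native byte order.
std::vector<uint16_t> read_as_u16(const std::vector<uint8_t>& raw) {
    // Two raw bytes make up one 16-bit value; a trailing odd byte is ignored.
    std::vector<uint16_t> out(raw.size() / sizeof(uint16_t));
    // memcpy avoids the alignment and aliasing problems of a pointer cast.
    std::memcpy(out.data(), raw.data(), out.size() * sizeof(uint16_t));
    return out;
}
```

If the same buffer is read one byte per value instead, you get twice as many "values" and none of them match the real keypoint data, which looks a lot like the wrong-points symptom.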
In C++ post-processing you will have the same issue, but I was able to build a fully 16-bit post-processing by adjusting some of their conversion code from uint8 to uint16.
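Roughly, the change amounts to making the dequantization step independent of the integer width. This is only a minimal sketch, assuming the usual affine scheme (real = scale * (q - zero_point)); the names `dequantize` and `dequantize_buffer` are hypothetical and are not taken from the actual post-processing sources.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed affine dequantization: real = scale * (q - zero_point).
// T is the quantized storage type, e.g. uint8_t or uint16_t.
template <typename T>
float dequantize(T q, float zero_point, float scale) {
    return scale * (static_cast<float>(q) - zero_point);
}

// Dequantize a whole buffer of `count` quantized values of type T.
template <typename T>
std::vector<float> dequantize_buffer(const T* data, std::size_t count,
                                     float zero_point, float scale) {
    std::vector<float> out;
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        out.push_back(dequantize<T>(data[i], zero_point, scale));
    }
    return out;
}
```

With this, the 8-bit bounding-box output would go through `dequantize_buffer<uint8_t>` and the 16-bit keypoint output through `dequantize_buffer<uint16_t>`, each with the zero point and scale of its own output layer.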
Please take a look at this thread: Hey I want to build my own custom postprocessing .so - #8 by saurabh