Hi,
I would like to build my vector database without relying on the quantized version. Could you provide the teacher model (the original, before pruning, quantization, compression, etc.)?
I would like to run it on my GPU (in ONNX, TensorFlow, PyTorch, or any similar format) and verify that its embeddings match those of the quantized model to within a few percent.
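To make the comparison concrete, here is a rough sketch of the check I have in mind. The function names and the 0.99 cosine-similarity threshold are my own placeholders, not anything from the model's documentation. The two lists of embeddings would come from running the same texts through the teacher model and the quantized model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def embeddings_match(reference, quantized, min_similarity=0.99):
    """True if every (reference, quantized) pair is at least min_similarity.

    min_similarity is an arbitrary placeholder threshold, to be tuned.
    """
    if len(reference) != len(quantized):
        raise ValueError("both models must embed the same number of texts")
    return all(
        cosine_similarity(r, q) >= min_similarity
        for r, q in zip(reference, quantized)
    )


if __name__ == "__main__":
    # Toy vectors standing in for real model outputs.
    teacher = [[1.0, 2.0, 3.0], [0.5, -0.5, 0.25]]
    student = [[1.01, 1.98, 3.02], [0.49, -0.51, 0.26]]
    print(embeddings_match(teacher, student))
```

Cosine similarity seems the natural metric here since vector databases usually rank by it, but a per-dimension relative error would work too.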
Thanks