Comment by elif
Strictly speaking, you have not deployed any model on a 5090 because a 5090 card has never been produced.
And without specifying your quantization level, it's hard to know what you mean by "not usable".
Anyway, if you really wanted to try cheap distilled/quantized models locally, you would be using used Tesla V100s and not 4-year-old single-chip gaming GPUs.
Are you a time traveller from the past? https://www.nvidia.com/en-gb/geforce/graphics-cards/50-serie...