Comment by segmondy
You can already do this with -ctk and -ctv, so why would anyone need this?
-ctk, --cache-type-k TYPE   KV cache data type for K
                            allowed values: f32, f16, bf16, q8_0, q4_0, q4_1, iq4_nl, q5_0, q5_1
                            (default: f16) (env: LLAMA_ARG_CACHE_TYPE_K)
-ctv, --cache-type-v TYPE   KV cache data type for V
                            allowed values: f32, f16, bf16, q8_0, q4_0, q4_1, iq4_nl, q5_0, q5_1
                            (default: f16)
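
For a rough sense of what those cache types change, here is a minimal Python sketch that estimates KV cache size for a few of the listed values. The model shape (32 layers, 8 KV heads, head dimension 128, 8192-token context) is a made-up example, and `kv_cache_bytes` is a hypothetical helper, not part of llama.cpp. The per-block byte counts follow the ggml block layouts (32 elements per block), and the estimate assumes the head dimension is a multiple of 32 and ignores any allocator padding.

```python
# Bytes needed to store 32 elements of each cache type.
# Quantized types use ggml's 32-element blocks (scales + packed values);
# f32/f16/bf16 are unblocked and listed per 32 elements for uniformity.
BYTES_PER_32 = {
    "f32": 128, "f16": 64, "bf16": 64,
    "q8_0": 34, "q5_1": 24, "q5_0": 22,
    "q4_1": 20, "q4_0": 18, "iq4_nl": 18,
}


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx,
                   type_k="f16", type_v="f16"):
    """Estimate total K + V cache size in bytes (hypothetical helper)."""
    # Number of elements stored for K; V stores the same number.
    elems = n_layers * n_kv_heads * head_dim * n_ctx
    return elems * (BYTES_PER_32[type_k] + BYTES_PER_32[type_v]) // 32


# Made-up model shape, for illustration only.
shape = dict(n_layers=32, n_kv_heads=8, head_dim=128, n_ctx=8192)

for t in ("f16", "q8_0", "q4_0"):
    mib = kv_cache_bytes(**shape, type_k=t, type_v=t) / 2**20
    print(f"-ctk {t} -ctv {t}: {mib:.0f} MiB")
```

Under these assumptions the cache drops from 1024 MiB at f16 to 544 MiB at q8_0 and 288 MiB at q4_0.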