Comment by Aurornis 2 days ago

> Using `--flash-attn --cache-type-k q8_0 --cache-type-v q8_0`

I think you meant `--cache-type-v q4_0`.

I would also like an explanation of what this patch does differently from the standard command-line arguments.