Comment by johndough
I was wondering whether multiple GPUs make it go appreciably faster when limited by VRAM. Do you have some tokens/sec numbers for text generation?