Comment by danielhanchen
Comment by danielhanchen 11 hours ago
UD stands for "Unsloth-Dynamic" which upcasts important layers to higher bits. Non UD is just standard llama.cpp quants. Both still use our calibration dataset.
Comment by danielhanchen 11 hours ago
UD stands for "Unsloth-Dynamic" which upcasts important layers to higher bits. Non UD is just standard llama.cpp quants. Both still use our calibration dataset.
Oh good idea! In general UD-Q4_K_XL (Unsloth Dynamic 4bits Extra Large) is what I generally recommend for most hardware - MXFP4_MOE is also ok
It takes download time + 1 minute to test speed yourself, you can try different quants, it's hard to write down a table because it depends on your system ie. ram clock etc. if you go out of gpu.
I guess it would make sense to have something like max context size/quants that fit fully on common configs with gpus, dual gpus, unified ram on mac etc.
What is your definition of "important" in this context?
Oh we wrote about it here: https://unsloth.ai/docs/basics/unsloth-dynamic-2.0-ggufs
Please consider authoring a single, straightforward introductory-level page somewhere that explains what all the filename components mean, and who should use which variants.
The green/yellow/red indicators for different levels of hardware support are really helpful, but far from enough IMO.