NF4.rar
NF4 (4-bit NormalFloat): An information-theoretically optimal data type for normally distributed weights. It uses 16 quantization levels (2^4, one per 4-bit code) derived from the quantiles of a standard normal distribution.
Purpose: To reduce the memory footprint of LLMs (like Llama) enough to fit on a single GPU (e.g., a 24GB RTX 3090) while preserving the task performance of full 16-bit fine-tuning.
RNF4 (biology): RNF4 mediates the degradation of the PML-RARα fusion protein.
The term "NF4" is central to this "long paper", which revolutionized how large language models (LLMs) are fine-tuned on consumer hardware.
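
To make the "16 levels from normal quantiles" idea concrete, here is a minimal Python sketch. It is a simplified illustration, not the exact level table used by any library: the function names (`normal_quantile_levels`, `quantize_block`, `dequantize_block`) are hypothetical, the levels are taken at the midpoints of 16 equal-probability bins, and the per-block absmax scaling is an assumption about how such a data type is typically applied.

```python
from statistics import NormalDist


def normal_quantile_levels(bits=4):
    """Return 2**bits levels in [-1, 1] taken from standard-normal quantiles.

    Simplified assumption: use the midpoint of each equal-probability bin,
    then rescale so the outermost levels sit at -1 and 1.
    """
    n = 2 ** bits
    nd = NormalDist()  # mean 0, standard deviation 1
    quantiles = [nd.inv_cdf((i + 0.5) / n) for i in range(n)]
    scale = max(abs(q) for q in quantiles)
    return [q / scale for q in quantiles]


def quantize_block(weights, levels):
    """Map each weight to the index of the nearest level after absmax scaling."""
    absmax = max(abs(w) for w in weights) or 1.0  # avoid dividing by zero
    codes = [
        min(range(len(levels)), key=lambda k: abs(levels[k] - w / absmax))
        for w in weights
    ]
    return codes, absmax


def dequantize_block(codes, absmax, levels):
    """Recover approximate weights from 4-bit codes and the block's absmax."""
    return [levels[c] * absmax for c in codes]


if __name__ == "__main__":
    levels = normal_quantile_levels()
    block = [0.12, -0.40, 0.03, 0.85, -0.77, 0.00, 0.25, -0.10]
    codes, absmax = quantize_block(block, levels)
    restored = dequantize_block(codes, absmax, levels)
    print(codes)
    print([round(x, 3) for x in restored])
```

Each weight is stored as a code between 0 and 15 (4 bits) plus one shared scale per block, which is where the memory saving over 16-bit storage comes from. Because the levels are denser near zero, values drawn from a normal distribution land close to a level more often than they would on an evenly spaced grid.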