Ternary-weight models maintain quality on GSM8K and MMLU at a fraction of the memory and energy cost of full-precision models.
MIT CSAIL researchers report that ternary-weight (1.58-bit) language models can match full-precision baselines on GSM8K and MMLU when trained from scratch with their proposed gradient-scaled scheme. A 7B-parameter variant trained on 2T tokens fits in 1.5 GB. Weights and training code are public.
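For intuition, a common way to ternarize weights is absmean scaling, as popularized by BitNet b1.58-style models: scale each weight matrix by the mean absolute value, then round into {-1, 0, +1}. The sketch below illustrates that idea and the memory arithmetic behind the 1.5 GB figure; it is a generic illustration, not the paper's specific gradient-scaled scheme, and the function name `ternarize` is ours.

```python
import numpy as np

def ternarize(w: np.ndarray, eps: float = 1e-8):
    """Absmean ternarization (illustrative sketch, not the paper's exact method).

    Returns integer weights in {-1, 0, +1} plus a single float scale,
    so the dequantized matrix is approximately q * scale.
    """
    scale = np.mean(np.abs(w)) + eps          # one scale per matrix
    q = np.clip(np.round(w / scale), -1, 1)   # snap to {-1, 0, +1}
    return q, scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
q, s = ternarize(w)
assert set(np.unique(q)).issubset({-1.0, 0.0, 1.0})

# Memory arithmetic behind the headline number: three states need
# log2(3) ≈ 1.585 bits per weight, so 7e9 weights take about
# 7e9 * log2(3) / 8 bytes ≈ 1.39e9 bytes, leaving room under 1.5 GB
# for embeddings and per-matrix scales.
ideal_bytes = 7e9 * np.log2(3) / 8
assert ideal_bytes < 1.5e9
```

In practice, packed storage (e.g. several ternary weights per byte) lands slightly above the information-theoretic 1.585 bits/weight, which is consistent with the reported 1.5 GB footprint.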