Ternary-weight models maintain quality on GSM8K and MMLU at a fraction of the memory and energy cost of full-precision models.
MIT CSAIL researchers report that ternary-weight (1.58-bit) language models can match full-precision baselines on GSM8K and MMLU when trained from scratch with their proposed gradient-scaled scheme. A 7B-parameter variant trained on 2T tokens fits in 1.5 GB. Weights and training code are public.
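For intuition, a common way to ternarize weights is absmean scaling, as popularized by BitNet b1.58-style models: scale each weight matrix by the mean absolute value, then round into {-1, 0, +1}. The sketch below illustrates that idea and the memory arithmetic behind the 1.5 GB figure; it is a generic illustration, not the paper's specific gradient-scaled scheme, and the function name `ternarize` is ours.

```python
import numpy as np

def ternarize(w: np.ndarray, eps: float = 1e-8):
    """Absmean ternarization (illustrative sketch, not the paper's exact method).

    Returns integer weights in {-1, 0, +1} plus a single float scale,
    so the dequantized matrix is approximately q * scale.
    """
    scale = np.mean(np.abs(w)) + eps          # one scale per matrix
    q = np.clip(np.round(w / scale), -1, 1)   # snap to {-1, 0, +1}
    return q, scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
q, s = ternarize(w)
assert set(np.unique(q)).issubset({-1.0, 0.0, 1.0})

# Memory arithmetic behind the headline number: three states need
# log2(3) ≈ 1.585 bits per weight, so 7e9 weights take about
# 7e9 * log2(3) / 8 bytes ≈ 1.39e9 bytes, leaving room under 1.5 GB
# for embeddings and per-matrix scales.
ideal_bytes = 7e9 * np.log2(3) / 8
assert ideal_bytes < 1.5e9
```

In practice, packed storage (e.g. several ternary weights per byte) lands slightly above the information-theoretic 1.585 bits/weight, which is consistent with the reported 1.5 GB footprint.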