We built TurboQuant-MoE in 12 hours. The numbers surprised us too.
🧵 Thread:
1/ The problem: running Mixtral-8x7B in fp16 takes ~90 GB of GPU memory. Most teams can't afford that. We asked: what if compression didn't mean quality loss?
#TurboQuant #ML #github
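A quick back-of-envelope sketch of where that ~90 GB figure comes from, assuming Mixtral-8x7B's publicly reported ~46.7B total parameters (the 8 experts share attention weights, so it's less than 8 × 7B) and 2 bytes per weight in fp16; the int4 line is illustrative of what 4-bit quantization would shrink the weights to, not TurboQuant-MoE's exact footprint:

```python
# Rough weight-memory estimate for Mixtral-8x7B (assumed ~46.7B params).
params = 46.7e9
bytes_fp16 = params * 2    # fp16: 2 bytes per parameter
bytes_int4 = params * 0.5  # 4-bit quantized: 0.5 bytes per parameter

print(f"fp16 weights: {bytes_fp16 / 1e9:.0f} GB")  # ~93 GB, before KV cache
print(f"int4 weights: {bytes_int4 / 1e9:.0f} GB")  # ~23 GB
```

Activations and the KV cache add more on top, which is why ~90 GB is a floor for serving, not a ceiling.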