It's ok to be realistic about it
Local models are not as good as cloud models. Claiming Qwen 3.5 27B is "Sonnet 4.6" level is cope, and anyone who's used them side by side knows it.
It's ok to have a 128GB laptop to run local models for fun! I do it! I enjoy it! There is a time and a place where running things locally makes sense: it's fun to try, to see how well it does, etc. It's fun for sport, but not worth taking too seriously.
That does not mean it's a reasonable replacement for cloud models, far from it. Spending $50K on a bunch of GPUs to run models like Kimi K2.5 also makes no sense; tokens are dirt cheap.
There's hardware like the DGX Spark, which is great for experimenting, learning CUDA, and small-scale stuff. And M5 Max/M3 Ultra Macs, which aren't even GPU products per se, just machines with great GPUs that happen to be usable for AI.
The local model people have warped this into some weird war of principles, where model providers are evil and the only way to win is to buy an absurd number of GPUs and quantize models to run yourself. It's fun, it's for sport. It's the same thing as building a PC: yes, I could buy a better prebuilt for cheaper, but I want to do it myself.
Let's not try to convince ourselves this is anything but a hobby. Maybe it'll be something in the future, but I doubt it. It's unreasonable to expect everyone to have $4000+ laptops to do this. The people who care will; everyone else will just use whatever is cheap and online.
Also, most people just want whatever is good, not a specific checkpoint of a specific model. The keep4o people were freaks in that regard. This is all dumb.