A terminal tool that right-sizes LLMs to your system's RAM, CPU, and GPU. It detects your hardware, scores each model across quality, speed, fit, and context dimensions, and tells you which ones will actually run well on your machine. #localmodel #llm #tool
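A rough sketch of what that kind of scoring could look like (the function names, weights, and the psutil dependency are my own illustration, not the tool's actual code):

import psutil  # third-party; used here only to read total system RAM

def fit_score(model_size_gb: float) -> float:
    # Headroom left after loading the model, clamped to [0, 1]:
    # 1.0 = fits easily, 0.0 = won't fit in RAM at all.
    total_gb = psutil.virtual_memory().total / 2**30
    return max(0.0, min(1.0, (total_gb - model_size_gb) / total_gb))

def overall_score(quality: float, speed: float, fit: float, context: float,
                  weights=(0.4, 0.2, 0.3, 0.1)) -> float:
    # Weighted blend of the four dimensions; the weights are made up.
    return sum(w * s for w, s in zip(weights, (quality, speed, fit, context)))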
Tried @ollama launch + Claude Code + GLM Flash today. Claude Code clearly has a lot of potential working with its own models tailored for different scenarios.
However, working with local models was only promising until I tried generating Rust code for Tauri + Dioxus.
#LOCALMODEL #OLLAMALAUNCH #GLMFLASH
@gilgNYC "Building your own local model indeed provides greater control. Our Kimi K2 setup could offer the flexibility you need. #LocalModel #TechInnovation"
Whisper Turbo already runs locally on macOS with mlx_whisper.
Transcribes 12 minutes in 14 seconds on an M2 Ultra (~50X faster than real time).
pip install mlx_whisper
Example:
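A minimal transcription call with mlx_whisper (the audio filename is hypothetical, and mlx-community/whisper-large-v3-turbo is the Hugging Face repo I'd expect for the turbo weights; swap in whichever converted model you use):

import mlx_whisper

result = mlx_whisper.transcribe(
    "interview.mp3",  # hypothetical input file
    path_or_hf_repo="mlx-community/whisper-large-v3-turbo",  # assumed repo name
)
print(result["text"])

The package also installs an mlx_whisper CLI, so something like mlx_whisper interview.mp3 --model mlx-community/whisper-large-v3-turbo should work as a one-liner.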