5/5
What aspects of trillion-param MoE deployment interest you most? Memory offloading strategies? Dynamic routing budgets? Hierarchical expert organization? Drop your thoughts below 👇 #MoE #LLMs #SparseModels #AIResearch
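A minimal sketch of the "dynamic routing budget" idea mentioned above: a router scores each token against all experts and only the top-k experts per token are activated, so compute scales with the budget k rather than the total expert count. The names, shapes, and hyperparameters here are illustrative assumptions, not any specific system's API.

```python
import torch
import torch.nn.functional as F

num_experts, d_model, k = 8, 16, 2          # assumed toy sizes; k is the per-token budget
router = torch.nn.Linear(d_model, num_experts)

tokens = torch.randn(4, d_model)             # a tiny batch of token representations
logits = router(tokens)                      # (4, num_experts) routing scores
weights, expert_ids = logits.topk(k, dim=-1) # keep only the top-k experts per token
weights = F.softmax(weights, dim=-1)         # normalise mixing weights over the selected experts

print(expert_ids)  # which k experts each token would be dispatched to
print(weights)     # how their outputs would be combined
```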
Jonathan Schwarz et al. introduce #Powerpropagation, a new weight-parameterisation for #neuralnetworks that leads to inherently #sparsemodels. Exploiting the behavior of gradient descent, their method gives rise to weight updates exhibiting a "rich get richer" dynamic.
Powerpropagation: A sparsity inducing weight reparameterisation
pdf: arxiv.org/pdf/2110.00296…
abs: arxiv.org/abs/2110.00296
a new weight-parameterisation for neural networks that leads to inherently sparse models
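A hedged sketch (not the authors' code) of a Powerpropagation-style reparameterisation with exponent alpha > 1: the effective weight is w = theta * |theta|**(alpha - 1), so the chain rule scales theta's gradient by alpha * |theta|**(alpha - 1). Near-zero parameters receive vanishing updates while large ones move faster, which is the "rich get richer" dynamic and leaves the trained model easy to sparsify by magnitude pruning.

```python
import torch

alpha = 2.0                                   # assumed exponent for illustration
theta = torch.randn(5, requires_grad=True)    # trainable underlying parameter
x = torch.randn(5)                            # dummy input

w = theta * theta.abs().pow(alpha - 1)        # reparameterised (effective) weight
w.retain_grad()                               # keep dL/dw so we can compare below
loss = ((w * x).sum() - 1.0) ** 2             # toy scalar loss
loss.backward()

# dL/dtheta equals dL/dw scaled by alpha * |theta|**(alpha - 1):
# small-magnitude weights barely move, large ones grow faster.
print(theta.grad)
print(w.grad * alpha * theta.abs().pow(alpha - 1))  # matches theta.grad
```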
Tomorrow at the @ml_collective DLTC reading group, @KaliTessera will present our work on how initialization is only one piece of the puzzle for training sparse networks.
Can taking a wider view of model design choices unlock sparse training?
bit.ly/3xFtHKI