سيف محمد بن صفوان | Saif Bin Safwan @Saif_BinSafwan ·
Neural Magic has announced the Sparse Llama 3.1 8B language model, smaller and more efficient than its predecessor. The new model aims to make AI technology accessible to everyone, since it can run on less powerful hardware. #AI #MachineLearning #SparseModels #NeuralMagic #Llama_3_1_8B marktechpost.com/2024/11/25/neu…
FinSentim @FinSentim ·
Jonathan Schwarz et al. introduce #Powerpropagation, a new weight-parameterisation for #neuralnetworks that leads to inherently #sparsemodels. Exploiting the behavior of gradient descent, their method gives rise to weight updates exhibiting a "rich get richer" dynamic.
AK @_akhaliq ·
Powerpropagation: A sparsity inducing weight reparameterisation pdf: arxiv.org/pdf/2110.00296… abs: arxiv.org/abs/2110.00296 a new weight-parameterisation for neural networks that leads to inherently sparse models
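For intuition, here is a minimal sketch of the Powerpropagation idea as stated in the abstract: weights are reparameterised as w = phi·|phi|^(alpha−1), so the gradient with respect to phi is scaled by |phi|^(alpha−1) and small-magnitude parameters receive ever-smaller updates, the "rich get richer" dynamic. The layer class, alpha value, and initialisation below are illustrative assumptions, not the authors' reference implementation.

```python
# Sketch of the Powerpropagation reparameterisation (arxiv.org/abs/2110.00296).
# PowerpropLinear and alpha=2.0 are illustrative choices, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PowerpropLinear(nn.Module):
    """Linear layer whose effective weight is phi * |phi|**(alpha - 1)."""
    def __init__(self, in_features, out_features, alpha=2.0):
        super().__init__()
        self.alpha = alpha
        self.phi = nn.Parameter(torch.randn(out_features, in_features) * 0.1)
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        # The chain rule multiplies dL/dphi by alpha * |phi|**(alpha - 1),
        # so parameters near zero get vanishing updates and stay near zero,
        # which makes magnitude pruning cheap after training.
        w = self.phi * self.phi.abs().pow(self.alpha - 1)
        return F.linear(x, w, self.bias)

layer = PowerpropLinear(8, 4, alpha=2.0)
loss = layer(torch.randn(2, 8)).sum()
loss.backward()
print(layer.phi.grad.abs().mean())  # gradient magnitudes track |phi|**(alpha-1)
```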
FinSentim @FinSentim ·
Replying to @sarahookr
@sarahookr, @KaliTessera, and Benjamin Rosman take a broader view of training #sparsenetworks and consider the role of regularization, optimization, and architecture choices in #sparsemodels. They propose a simple experimental framework: #SameCapacitySparse vs #DenseComparison.
Sara Hooker @sarahookr ·
Tomorrow at @ml_collective DLTC reading group, @KaliTessera will be presenting our work on how initialization is only one piece of the puzzle for training sparse networks. Can taking a wider view of model design choices unlock sparse training? bit.ly/3xFtHKI
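A rough sketch of the same-capacity comparison the FinSentim tweet above refers to: prune a network by magnitude, count its remaining non-zero weights, and build a smaller dense baseline with a comparable parameter budget so both can be trained and evaluated under identical settings. The layer sizes, the 90% sparsity level, and the width heuristic are arbitrary illustrations, not the paper's configuration.

```python
# Sketch of a "same capacity" sparse-vs-dense comparison setup.
# Sizes and sparsity level are arbitrary examples.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

sparse_net = nn.Sequential(nn.Linear(784, 512), nn.ReLU(), nn.Linear(512, 10))
# Prune 90% of weights by magnitude in every Linear layer.
for m in sparse_net.modules():
    if isinstance(m, nn.Linear):
        prune.l1_unstructured(m, name="weight", amount=0.9)

nonzero = sum(int(m.weight.count_nonzero())
              for m in sparse_net.modules() if isinstance(m, nn.Linear))

# Dense baseline: choose a hidden width so the parameter count is comparable.
hidden = max(1, nonzero // (784 + 10))
dense_net = nn.Sequential(nn.Linear(784, hidden), nn.ReLU(), nn.Linear(hidden, 10))
dense_params = sum(p.numel() for p in dense_net.parameters())

print(f"sparse non-zeros: {nonzero}, dense params: {dense_params}")
# Both models would then be trained with the same recipe and compared on the same task.
```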