AIopenmind @AIopenmind ·
Jensen Huang argues that Nvidia is entering the "inference inflection" phase: the focus shifts from training to the operational use of AI, with a revenue opportunity of $1 trillion. #Nvidia #AI #Inference #JensenHuang #GTC2026 aiopenmind.it/ArtificialInte…
Nvidia and the "inference inflection": the new AI race

Jensen Huang doubles down on Nvidia's trajectory, arguing that the market is entering the era of the "inference inflection": the phase in which the value of artificial intelligence shifts…

From aiopenmind.it
SwiftInference.ai @swiftinference ·
AI inference is no longer experimental in telecoms - it's in the operational core. From real-time fraud detection to predictive network maintenance, the infrastructure decisions operators make now will define their competitive position for years. swiftinference.ai/blog/how-ai-in… #inference
How AI Inference Is Transforming Telecommunications in 2026 — SwiftInference Blog

AI inference is moving from the data centre to the network edge, fundamentally reshaping how telecoms operators manage infrastructure, reduce churn, and deliver service quality. Here is what the...

From swiftinference.ai
Groookounet @groookounet ·
Replying to @groookounet
2/10 The March 2026 shock: Nvidia has just acquired Groq's inference unit for $20B. 💰 Why? Because conventional chips are saturating. Groq brings the "real-time" speed that AI agents need. The inference war has been declared. ⚔️ #inference
Sarbjeet Johal @sarbjeetjohal ·
Disaggregated #inference is changing how teams design AI architectures on @Kubernetes — splitting prefill and decode into distinct services with different resource profiles and scaling needs. #kubecon @SantoshYadavDev @SaiyamPathak @dhinchcliffe @nyike @dvellante @furrier @rseroter @kaslinfields @virtualized6ix @IsForAt @NVIDIAAIDev @NVIDIADC @NVIDIAAI @NVIDIAAIDev
NVIDIA Data Center @NVIDIADC ·
💡 Disaggregated LLM inference is changing how teams design AI architectures on Kubernetes — splitting prefill and decode into distinct services with different resource profiles and scaling needs. Learn how to: ✅ Separate prefill and decode for better GPU utilization ✅ Use gang scheduling, hierarchical gang scheduling, and topology-aware placement ✅ Express multi-role inference pipelines with APIs like LeaderWorkerSet and NVIDIA Grove 🔗 Read the tech blog nvda.ws/4lK2nWluv
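The prefill/decode split in the NVIDIA post above can be illustrated with a toy Python sketch. This is not the NVIDIA Grove or LeaderWorkerSet API; all class and function names here are illustrative assumptions. It only shows why the two phases have different profiles: prefill processes whole prompts in large batches (compute-bound), while decode emits one token per request per step and grows the KV cache (memory-bound), so the two pools can be scaled independently.

```python
# Toy sketch of disaggregated inference (illustrative names, not a real API):
# prefill and decode run as separate workers with different batching behavior.
from dataclasses import dataclass
from collections import deque

@dataclass
class Request:
    prompt_tokens: int
    max_new_tokens: int
    generated: int = 0
    kv_cache: int = 0  # toy stand-in for KV-cache size in tokens

class PrefillWorker:
    """Compute-bound: ingests whole prompts in large batches."""
    def run(self, batch):
        for r in batch:
            r.kv_cache = r.prompt_tokens  # build the KV cache over the prompt
        return batch

class DecodeWorker:
    """Memory-bound: emits one token per request per step, KV cache grows."""
    def step(self, batch):
        done = []
        for r in batch:
            r.generated += 1
            r.kv_cache += 1
            if r.generated >= r.max_new_tokens:
                done.append(r)
        return done

def serve(requests, prefill_batch=4):
    """Route requests through the prefill pool, then the decode pool."""
    prefill_q = deque(requests)
    prefill, decode = PrefillWorker(), DecodeWorker()
    decoding, finished = [], []
    while prefill_q or decoding:
        if prefill_q:
            batch = [prefill_q.popleft()
                     for _ in range(min(prefill_batch, len(prefill_q)))]
            decoding.extend(prefill.run(batch))
        for r in decode.step(decoding):
            decoding.remove(r)
            finished.append(r)
    return finished

reqs = [Request(prompt_tokens=100 + i, max_new_tokens=3) for i in range(5)]
out = serve(reqs)  # every request finishes with prompt + generated KV entries
```

In a real deployment the two worker types would be distinct Kubernetes services with different GPU, memory, and replica settings, which is the point the post makes.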
Deva @DevaCodeX ·
Replying to @DevaCodeX
2️⃣ Gimlet Labs raised $80M to crack AI's inference bottleneck. Their multi-silicon cloud splits workloads across CPUs, GPUs & custom chips for 3-10x faster performance. Already at 8-figure revenue. techcrunch.com/2026/03/23/sta… #AI #Inference
Startup Gimlet Labs is solving the AI inference bottleneck in a surprisingly elegant way | TechCr...

Gimlet Labs just raised an $80 million Series A for tech that lets AI run across NVIDIA, AMD, Intel, ARM, Cerebras and d-Matrix chips, simultaneously.

From techcrunch.com
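The multi-silicon idea in the Gimlet Labs post can be sketched as a toy scheduler. This is an assumption-laden illustration, not Gimlet's actual system: the device names and relative throughputs are invented, and the cost model is a bare greedy heuristic. It only shows the principle of placing each operation on whichever silicon would finish it earliest, so work naturally spreads across heterogeneous chips.

```python
# Toy heterogeneous-silicon scheduler (illustrative only, not Gimlet's system).
# Relative throughputs below are assumed numbers for the sketch.
DEVICES = {"cpu": 1.0, "gpu": 8.0, "accel": 20.0}  # relative ops per ms

def assign(ops):
    """Greedily place each (name, cost) op on the device that finishes it
    earliest, given the work already queued on that device."""
    busy_until = {d: 0.0 for d in DEVICES}
    placement = {}
    for name, cost in ops:
        # estimated finish time = device's current queue + this op's runtime
        best = min(DEVICES, key=lambda d: busy_until[d] + cost / DEVICES[d])
        busy_until[best] += cost / DEVICES[best]
        placement[name] = best
    return placement, max(busy_until.values())

# A toy four-op inference graph: heavy attention/MLP ops land on the fast
# accelerator until it is loaded, then lighter ops spill to the GPU.
ops = [("embed", 10), ("attn", 80), ("mlp", 80), ("head", 10)]
placement, makespan = assign(ops)
```

A production scheduler would also model transfer costs between chips and op-to-silicon compatibility, which this sketch deliberately omits.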
Harry He @harryheisme ·
Office Hour Time! I've been flooded with messages asking: "I just opened the SGLang repo. How do I not waste my first day?" Join this Office Hour and get your answers directly! #sglang #inference #inferenceframework #lmsys #AI
LMSYS Org @lmsysorg ·
📣 New to SGLang? No problem — our Office Hours have you covered 👌 This week's session is built for beginners: "New to SGLang: What I Learned & What I Wish I Knew on Day 1." 👉 Alex Nails (@alxnails), MTS at @radixark, is sharing what it's actually like to onboard into SGLang — what took some time to click, and his ideas on what could be better. Join us for a mental model walkthrough of SGLang and an open discussion on making the dev and learning experience better. 📅 March 25 | 6:00 PM PST Register on Luma: luma.com/87xexrbgbWjM