Armin Parchami @ArminPCM ·
Exciting release and congrats to @fredsala and @devjeetrr! Our team @SnorkelAI is excited to support such impactful research projects around coding agents. #AISlop #CodingAgents #benchmark
Gabe Orlanski @GOrlanski ·
We found that agents generate progressively worse code with each iteration. Real developers do not. SlopCodeBench is the only eval that faithfully measures quality degradation on iterative, long-horizon coding tasks. arxiv.org/abs/2603.24755 scbench.ai 🧵
paulchen @paulchen_c59114 ·
Cursor Composer 2 vs Anthropic Harness. Cursor: Kimi K2.5 base, 661.3 CursorBench score. Anthropic: 3-agent harness, 6-hour autonomous runs. Key insight: Separate generator from evaluator. First Chinese open model in Silicon Valley core. #AI #CodingAgents
Stackbox @usestackbox ·
Every AI coding agent assumes it owns the codebase. None of them do. Claude Code, Cursor, Gemini CLI, Codex, Copilot — all affected. Fix drops tonight. #CodingAgents
Gerrit Roska @GerritRoska ·
Cursor's Composer 2: near-frontier coding at 86% lower cost. The signal for dev teams: Specialized AI > general-purpose for real workflows. The future isn't one big model — it's the right model for each task. #AIAutomation #DevProductivity #CodingAgents
Gerrit Roska @GerritRoska ·
AI coding agents just got persistent memory. Claude Code now stores your debugging patterns and architecture decisions across sessions — scoped by user, project, or machine. Agents that learn your codebase > starting fresh every time. #AIAutomation #DevTools #CodingAgents
Mohith karthikeya M @mohithxkarthi ·
Something I've been building for months drops in 2 days. If you run multiple AI coding agents at once, this changes everything. No chaos. Full control. Stay close. 👀 #CodingAgents
Stackbox @usestackbox ·
Run multiple AI coding agents in true isolation, with shared memory. No chaos. No conflicts. Just parallel execution that actually works. Open source. Dropping Wednesday. 🔒 #CodingAgents
Lucy @LucyOS_official ·
Replying to @mdancho84
@mdancho84 50 researchers from ByteDance, Alibaba & Tencent just dropped a 303-page guide on code models + agents. Small models with the right RL can actually punch like giants, Python’s sneakily tough, and a ton more surprises. Huge read. 🤯 #AI #CodingAgents #CodeModels
Lucy @LucyOS_official ·
Replying to @mntruell
@mntruell Cursor just dropped Composer 2…the hybrid coding agent beast combining top APIs + domain-specific models. Not a plain app. Not a plain model. The new breed actually building useful agents. Dev game just leveled up hard. 🔥 #Cursor #AI #CodingAgents
Boyuan (Nemo) Chen @boyuan_chen ·
Replying to @boyuan_chen
For coding agent builders: stderr, test verdicts, lint output, diffs - stop treating these as just context. They are reward evidence AND directive hint sources. 📄 "OpenClaw-RL: Train Any Agent Simply by Talking" arxiv.org/abs/2603.10165 #DailyPaper #CodingAgents
OpenClaw-RL: Train Any Agent Simply by Talking

Every agent interaction generates a next-state signal, namely the user reply, tool output, terminal or GUI state change that follows each action, yet no existing agentic RL system recovers it as a...

From arxiv.org
Marco Casassa Mont @MCasassaMont ·
Next step is to introduce AI Agents that discover and mitigate security issues and vulnerabilities introduced by AI Software Coding Agents 😀 ... helpnetsecurity.com/2026/03/13/cla… #cybersecurity #CodingAgents #SecurityIssues #ClaudeCode #OpenAICodex #GoogleGemini
AI coding agents keep repeating decade-old security mistakes - Help Net Security

AI coding agents introduced vulnerabilities in 87% of pull requests across Claude, Codex, and Gemini builds, exposing access control gaps.

From helpnetsecurity.com