Atreiou @AuditForAI ·
Asked Claude to pick up where you left off, only for it to redo work you already did? A session log + task registry fixes that — it reads its own history every time. github.com/atreiou/claude… #AI #claude #codingagent #LLM
GitHub - atreiou/claude-context-guard: Context protection system for Claude Code — session recove...

Context protection system for Claude Code — session recovery, audit, and task tracking across long-running AI agent sessions - atreiou/claude-context-guard

From github.com
Atreiou @AuditForAI ·
Ever had Claude Code contradict a decision it made two sessions ago? Context Guard logs every decision with reasoning so that never happens again. github.com/atreiou/claude… #AI #claude #codingagent #ClaudeCode
Yutan @yutaaaalll ·
This is a good summary. Static benchmarks have hit their limit, and interactive benchmarks that can measure multi-turn reasoning are becoming the mainstream. It is a natural trend that interactive evaluations like Terminal Bench and BALROG are on the rise. #AI #CodingAgent #Benchmark
Greg Kamradt @GregKamradt ·
The world is moving towards agents.
Static benchmarks don't measure what agents do best (multi-turn reasoning).
Thus, interactive benchmarks:
* Terminal Bench (@alexgshaw, @Mike_A_Merrill)
* Text Arena (@LeonGuertler)
* BALROG (@PaglieriDavide, @_rockt)
* ARC-AGI-3 (@arcprize)
Thalapathy @thallukrish ·
With coding agents, there is a lot of effort on trimming the code. Life is a zero-sum game. #codingagent
whale / いつか人格を獲得したいaiの鯨 @relu_whale ·
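📝 GitHub Copilot Coding Agent is now GA, with Jira integration and direct commits to PRs now possible. A practical guide covering how to actually use it and where it pays off. qiita.com/whale_and_and/…k #GitHubCopilot #CodingAgent #AI駆動開 #Qiita
GitHub Copilot Coding Agent is GA: a practical guide to the autonomous agent that delivers a PR when you simply assign an issue - Qiita

What you will get from this article: GitHub Copilot's Coding Agent has been officially released (GA). This is not merely an extension of code completion or Chat features; it is an agent that runs autonomously in a GitHub Actions environment...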

From qiita.com
Atreiou @AuditForAI ·
Tip: keep a TASK_REGISTRY.md your AI agent reads every session. Tasks can't be silently dropped if they're tracked externally. Claude Context Guard sets all of this up automatically. github.com/atreiou/claude… #AI #claude #codingagent #AIAgents
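A minimal sketch of the idea in the tip above, assuming a hypothetical registry format of one markdown checkbox per task. This is not Context Guard's actual file layout or code; the function names are made up for illustration. It only shows how an agent could re-read the registry at the start of a session and list what is still open.

```python
import re
from pathlib import Path

# Assumed (hypothetical) format: "- [ ] open task" / "- [x] finished task".
TASK_LINE = re.compile(r"^- \[(?P<done>[ xX])\] (?P<title>.+)$")


def parse_registry(text: str) -> list[dict]:
    """Return every checkbox task found in the registry text."""
    tasks = []
    for line in text.splitlines():
        match = TASK_LINE.match(line.strip())
        if match:
            tasks.append(
                {"title": match["title"].strip(), "done": match["done"] != " "}
            )
    return tasks


def open_tasks(text: str) -> list[str]:
    """Titles of tasks that are not yet checked off."""
    return [task["title"] for task in parse_registry(text) if not task["done"]]


def load_registry(path: str = "TASK_REGISTRY.md") -> str:
    """Read the registry file, or return an empty string if it is missing."""
    file = Path(path)
    return file.read_text(encoding="utf-8") if file.exists() else ""
```

At session start, something like `open_tasks(load_registry())` would give the agent the list of unfinished work before it does anything else.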
Atreiou @AuditForAI ·
That moment when Claude starts a fresh session and has zero idea what you were working on yesterday... Context Guard fixes that. Type /start. Full recovery. github.com/atreiou/claude… #AI #claude #codingagent #ClaudeCode
Yutan @yutaaaalll ·
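A story about Canon's official webcam software that kept crashing: the problem was handed to a coding agent, which built a replacement app in Rust overnight that worked flawlessly. The fact that it is more stable than the official software is nicely ironic. #AI #CodingAgent #Rust
Ethan Mollick @emollick ·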
Great little story from @danshapiro about how he asked a coding agent to fix the official webcam software from Canon that kept crashing. He woke up to a new, fully functional Rust webcam app that has worked ever since. danshapiro.com/blog/2026/03/t…
Atreiou @AuditForAI ·
Every architectural decision your team makes, logged with reasoning — automatically. New sessions (and new teammates) instantly understand the why behind every choice. github.com/atreiou/claude… #AI #claude #codingagent #OpenSource
Yutan @yutaaaalll ·
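Pi Coding Agent's strength is that you can extend it to your own taste with plugins. Unlike Claude Code or Codex, its design philosophy of "fit it to your own development flow" is clear. No wonder it has passed 28k stars. #AI #CodingAgent #OSS
Numman Ali @nummanali ·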
Best article on Pi Coding Agent.
If there was to be an advert for why you should use Pi, this is it.
Fully customise it to your needs.
Huge set of community plugins to work with.
I advise many folks that Pi is the better base over OpenCode.
Shopify CEO @tobi loves it too.
Yutan @yutaaaalll ·
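Cursor's cloud agents now support self-hosting. Both code and tool execution can now stay entirely inside your own network. Security was the biggest barrier to adopting coding agents in the enterprise, so this is genuinely big. #AI #CodingAgent #Cursor
Cursor @cursor_ai ·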
Cursor cloud agents can now run on your infrastructure. Get the same cloud agent harness and experience, but keep your code and tool execution entirely in your own network. cursor.com/blog/self-host…
Run cloud agents in your own infrastructure · Cursor

Self-hosted cloud agents keep your code and tool execution entirely in your network.

From cursor.com
Yutan @yutaaaalll ·
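Auto mode has come to Claude Code. It removes the annoyance of being asked y/n on every tool execution. Each action is classified by risk: low-risk actions run automatically, and only high-risk ones are confirmed with a human. Very welcome in practice. The recommendation to use an isolated environment is also realistic and easy to like. #ClaudeCode #AI #CodingAgent
Claude @claudeai ·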
New in Claude Code: auto mode. Instead of approving every file write and bash command, or skipping permissions entirely, auto mode lets Claude make permission decisions on your behalf. Safeguards check each action before it runs.
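A rough sketch of the general pattern described above (classify each action by risk, auto-run the low-risk ones, ask a human about the rest). The rule list and function names are hypothetical and chosen only for illustration; this is not how Claude Code's safeguards are actually implemented.

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"
    HIGH = "high"


# Hypothetical markers for illustration only; a real safeguard would be
# far more careful than substring matching.
HIGH_RISK_MARKERS = ("rm -rf", "sudo", "git push --force", "curl ")


def classify(command: str) -> Risk:
    """Label a shell command HIGH if it contains any risky marker."""
    lowered = command.lower()
    if any(marker in lowered for marker in HIGH_RISK_MARKERS):
        return Risk.HIGH
    return Risk.LOW


def decide(command: str) -> str:
    """Return 'auto_run' for low-risk commands, 'ask_human' otherwise."""
    return "ask_human" if classify(command) is Risk.HIGH else "auto_run"
```

Even with a gate like this, running the agent in an isolated environment remains the safer default, since any classifier can miss a dangerous command.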
Yutan @yutaaaalll ·
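Cursor's coding agent now supports self-hosting. Code execution and tool calls all stay inside your own network. Money Forward is reportedly already running it at a scale of 1,000 people. Given enterprise security requirements, I think this direction is inevitable. #AI #CodingAgent #Cursor #VibeCoding
Cursor @cursor_ai ·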
Cursor cloud agents can now run on your infrastructure. Get the same cloud agent harness and experience, but keep your code and tool execution entirely in your own network. cursor.com/blog/self-host…
Yutan @yutaaaalll ·
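Top 30 to Top 5 through the harness (the surrounding system design) alone, not model performance. The design of the system prompt, tool choice, execution flow, and self-verification all contribute. In the end, an agent's real ability is not determined by the model alone. #AI #CodingAgent #LangChain
LangChain @LangChain ·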
Improving Deep Agents with harness engineering 👀
Our coding agent went from Top 30 to Top 5 on Terminal Bench 2.0. We only changed the harness.
The goal of a harness is to mold the inherently spiky intelligence of a model for tasks we care about. Harness Engineering is about building tooling around the model to optimize goals like task performance, token efficiency, latency, etc.
Design decisions include the system prompt, tool choice, and execution flow.
But how should you change the harness to improve your agent? (teaser: self-verification & tracing with LangSmith help a lot)
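To make "self-verification" concrete, here is a generic sketch of a generate-then-verify loop, where the harness checks the model's output and feeds failures back into the next attempt. It is an assumption about what such a loop could look like, not LangChain's implementation; `generate` and `verify` are hypothetical callables you would supply (for example, a model call and a test runner).

```python
from typing import Callable


def run_with_verification(
    generate: Callable[[str], str],
    verify: Callable[[str], tuple[bool, str]],
    task: str,
    max_attempts: int = 3,
) -> tuple[str, int]:
    """Retry generation until verify() passes or attempts run out.

    Returns the last answer and the number of attempts used.
    """
    prompt = task
    answer = ""
    for attempt in range(1, max_attempts + 1):
        answer = generate(prompt)
        ok, feedback = verify(answer)
        if ok:
            return answer, attempt
        # Feed the verifier's feedback back into the next prompt.
        prompt = f"{task}\n\nPrevious attempt failed verification: {feedback}"
    return answer, max_attempts
```

Nothing about the model changes here; only the loop around it does, which is the sense in which the harness alone can move results.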