2026-04-26

Denoise · Twitter

The AI agent is now a shipping primitive, with Anthropic's terminal agent and OpenAI's SDK moving the battleground to developer workflow and orchestration.

Pay attention to the convergence on agent infrastructure: Anthropic's Claude Code is reframing the IDE as the terminal, while OpenAI, Vercel, and Replit are shipping the corresponding deployment and orchestration tools.

2026-04-26T09:58:39Z · rules twitter-v1 · Healthy · tweets 26 · signals 26

Top 3 changes

@AnthropicAI / Claude Code 1.5: A terminal-native coding agent is released, challenging the visual IDE as the primary developer interface.
@OpenAI / Agent SDK: A new SDK provides protocol-level primitives for tool calling and multi-worker orchestration, signaling a platform shift.
@karpathy / Developer Experience: Articulates the structural shift from IDEs to terminal agents as a fundamental change in coding workflows.

Strategic insights

#01 A new platform layer is solidifying around agent orchestration. OpenAI's SDK, Vercel's edge runtime, Replit's harness, and Temporal's workflows all target the same 'durable, multi-worker agent' primitive.

#02 The terminal is being seriously positioned as the next AI-native IDE. Anthropic's Claude Code launch, coupled with commentary from @karpathy and adoption reports from @levelsio, suggests a potential displacement of GUI-based tools like VS Code or Cursor.

#03 Security concerns are moving up the stack from prompt injection to the agent orchestration layer. Google DeepMind's red team framework and analysis from @AlexAlbert__ identify inter-tool communication and deployment configurations as the new critical vulnerabilities.

#04 The market is bifurcating between general-purpose agent frameworks and highly specialized, embedded AI features. While Anthropic and OpenAI build platforms, products like Linear are winning with narrow, high-value automations like issue triage.

#05 Data and systematic optimization are becoming the key differentiators in prompt engineering. The focus is shifting from manual 'tricks' (@dotey) to industrial-scale dataset releases (MistralAI), large-scale benchmarking (Weights & Biases), and programmatic optimization (DSPy).

Categories

Terminal Agents & AI Coding (5)

The release of Claude Code 1.5 sets up a direct competitor to OpenAI's Codex, and initial benchmarks from @swyx show Pass@1 diverging on long-context editor tasks.

Anthropic released Claude Code 1.5, a terminal-native agent positioned as a new primary coding interface, sparking discussion on the shift away from traditional IDEs.

Anthropic · @AnthropicAI · rising
Responsible disclosure on a Claude jailbreak chain we patched last week. Full write-up including our red team timeline.
♥ 5.2k · ↻ 910 · " 160 · ⟲ 220 · score 7.5k · +1 related
Anthropic · @AnthropicAI · rising
Claude Code 1.5 is live. Terminal-native coding agent with full Claude Opus reasoning, file-ops sandbox, and session replay.
♥ 4.8k · ↻ 820 · " 140 · ⟲ 190 · score 6.9k · +1 related
Andrej Karpathy · @karpathy · rising
The developer-experience shift from IDE to terminal agent is underrated. Coding workflows are about to look nothing like 2024.
♥ 3.4k · ↻ 510 · " 30 · ⟲ 140 · score 4.5k
swyx · @swyx · rising
Codex vs Claude Code terminal agent benchmarks. Pass@1 diverges more than I expected on the long-context editor tasks.
♥ 1.1k · ↻ 180 · " 22 · ⟲ 60 · score 1.6k
@levelsio · @levelsio · rising
Switched my whole editor setup to Claude Code this week. Shipping faster than when I used Cursor + Copilot.
♥ 580 · ↻ 40 · " 6 · ⟲ 80 · score 678

AI Infra & Protocols (5)

A clear convergence is happening around agent deployment primitives. OpenAI, Vercel, Replit, and Temporal are all independently building towards a common abstraction of durable, replayable, multi-worker agents.

Major infrastructure providers including OpenAI, Vercel, and Replit all released new tools and runtimes specifically for deploying and orchestrating AI agents.

OpenAI · @OpenAI · rising
New agent SDK: protocol-level tool calling, deployment harness, and multi-worker orchestration primitives. Docs live.
♥ 4.2k · ↻ 680 · " 75 · ⟲ 180 · score 5.8k
Vercel · @vercel · rising
Edge runtime for agent workers is live. Spawn durable background agents from any serverless deployment.
♥ 540 · ↻ 80 · " 6 · ⟲ 22 · score 718
Replit · @replit · rising
New agent deployment harness. One command to go from local orchestration to hosted agent worker.
♥ 380 · ↻ 55 · " 5 · ⟲ 18 · score 505
Temporal · @temporalio · repeated
Orchestrating agents with durable workflows: replayable, resumable, and multi-worker by default. Walkthrough from our infra team.
♥ 310 · ↻ 48 · " 4 · ⟲ 14 · score 418
Jerry Liu · @jerryjliu0 · repeated
Dataset curation for agent training: how we filter synthetic data that looks good but poisons generalization.
♥ 260 · ↻ 36 · " 2 · ⟲ 11 · score 338

Prompt Engineering & Data (5)

There's a shift from artisanal prompt crafting (@dotey) to systematic, data-driven optimization, with tools like DSPy and platforms like Weights & Biases enabling large-scale, automated benchmarking and search.

The field is moving towards industrial-scale practices, highlighted by MistralAI's 100M-row dataset release and new frameworks for automated prompt optimization from DSPy.

Mistral AI · @MistralAI · rising
Open dataset release: 100M-row web OCR dataset. Cleaned, licensed, ready to train.
♥ 2.6k · ↻ 390 · " 30 · ⟲ 88 · score 3.5k
DSPy · @dspy_ai · rising
DSPy 3.0: prompt optimization via compile-time search over system prompt variations. Benchmarks inside.
♥ 960 · ↻ 150 · " 12 · ⟲ 42 · score 1.3k
Notion · @NotionHQ · rising
Notion workspace automation is out of beta. Auto-fill tables, chained updates across databases, and a new audit log surface.
♥ 820 · ↻ 125 · " 12 · ⟲ 38 · score 1.1k
dotey · @dotey · rising
Five prompt tricks learned this week from reviewing 200 production prompts. Short thread.
♥ 510 · ↻ 88 · " 8 · ⟲ 30 · score 710
Weights & Biases · @weights_biases · rising
System prompt benchmarking at scale: we ran 40k variants across 6 frontier models. The efficient frontier is not where you think.
♥ 420 · ↻ 55 · " 6 · ⟲ 20 · score 548

Memory & Knowledge Management (5)

The vocabulary is changing: @GregKamradt declares 'RAG is dead', while @mem0ai proposes layered memory models. Both argue that simple vector retrieval is insufficient for complex agentic workflows.

Discourse is evolving from simple RAG to more sophisticated 'context engineering', addressing failure modes in large context windows and proposing layered memory architectures.

Vaibhav Srivastav · @reach_vb · rising
Tested the new 10M context memory window end to end. Surprising failure modes around rag retrieval cache invalidation, thread below.
♥ 1.9k · ↻ 260 · " 22 · ⟲ 75 · score 2.5k
LangChain · @LangChainAI · rising
MCP protocol integration thread. How to wire existing LangGraph agents into the Anthropic Model Context Protocol server spec.
♥ 920 · ↻ 145 · " 14 · ⟲ 48 · score 1.3k
Greg Kamradt · @GregKamradt · rising
RAG is dead, long live context engineering. My framework for when to cache, when to retrieve, and when to just dump memory into the prompt.
♥ 820 · ↻ 130 · " 16 · ⟲ 54 · score 1.1k
mem0 · @mem0ai · rising
Memory layer for agents: differentiating working memory from the subconscious store. Vector index isn't enough anymore.
♥ 480 · ↻ 72 · " 5 · ⟲ 25 · score 639
Jason Liu · @jxnlco · repeated
Vector db beauty contest. Ran 50k RAG queries against 6 vendors. Results inside, free.
♥ 360 · ↻ 48 · " 3 · ⟲ 14 · score 465

Autonomous Security (3)

The critical vulnerability is migrating from the model to the orchestration layer. Both Google DeepMind and @AlexAlbert__ now point to tool interaction and deployment logic as the primary areas for agent jailbreaks.

Security research is shifting focus to the unique vulnerabilities of autonomous agents, with Google DeepMind releasing a red teaming framework for this new attack surface.

Google DeepMind · @GoogleDeepMind · rising
New red team framework for prompt injection in autonomous agents. Covers cross-tool leakage, scanner evasion, and sandbox escape patterns.
♥ 880 · ↻ 140 · " 18 · ⟲ 38 · score 1.2k
Alex Albert · @AlexAlbert__ · rising
When your security scanner finds nothing scary on an agent deploy, check the orchestration layer again. That's usually where the jailbreak sneaks through.
♥ 420 · ↻ 60 · " 8 · ⟲ 35 · score 564
MalwareTech · @MalwareTechBlog · repeated
Autonomous agent running pentest flows against a real SaaS. First real-world run: fewer false positives than I expected on the vulnerability surface.
♥ 180 · ↻ 28 · " 3 · ⟲ 15 · score 245

Productivity & Specialized Apps (3)

While headlines focus on general agents, the most immediate user value comes from specialized, embedded AI. Linear's auto-triage is a prime example of a non-obvious, workflow-integrated feature that quietly delivers significant productivity gains.

AI is being integrated into established SaaS tools for narrow, high-leverage automation, such as Linear's new auto-triage feature for issues.

Linear · @linear · rising
Linear now auto-triages incoming issues. Quiet launch, but already our favorite workspace feature of the year.
♥ 460 · ↻ 70 · " 6 · ⟲ 24 · score 618
James Clear · @jamesclear · repeated
The best habit tracker is the one you actually open. Three open-source alternatives worth trying.
♥ 280 · ↻ 42 · " 3 · ⟲ 18 · score 373
Ben Andrew · @BenAAndrew · repeated
The sourdough app I shipped in 48 hours is now my most-used side project. Source and write-up linked.
♥ 140 · ↻ 22 · " 2 · ⟲ 10 · score 190