Skip to Content

Claude Opus 4.8 Dynamic Workflows Explained: How Anthropic's Multi-Agent System Transforms Large-Scale Coding

Plan, fan out, and verify — the new orchestrator that rewrote 750,000 lines of Bun in days
Sk Jabedul Haque
Jun 1, 2026 5 min read 356 views
Claude Opus 4.8 Dynamic Workflows Explained: How Anthropic's Multi-Agent System Transforms Large-Scale Coding
Navigation
10 Sections
    Claude Opus 4.8 Dynamic Workflows is Anthropic's new research-preview feature, shipped on May 28, 2026, that lets Claude Code plan a large coding task, fan the work out across hundreds of parallel subagents in a single session, and adversarially verify the converged result. The flagship case study is Bun: Jarred Sumner used Dynamic Workflows to port roughly 750,000 lines from Zig to Rust with 99.8% of the existing test suite still passing.

    What You'll Learn

    • What Dynamic Workflows are, and how they differ from subagents and Agent Teams in Claude Code
    • How the Plan → Assign → Verify three-step lifecycle actually executes a codebase-scale migration
    • Inside the Bun case study: 750,000 lines, 99.8% test parity, and the 6-day (or 11-day) wall-clock claim
    • The five Effort Control levels (low, medium, high, xhigh, max) and the ~2.7× token-cost spread
    • Plan availability, the enterprise off-by-default governance decision, and the 1,024-token prompt cache minimum change

    What Anthropic Shipped on May 28, 2026

    On Thursday, May 28, 2026, Anthropic released Claude Opus 4.8 — its new flagship reasoning model — alongside a research-preview feature called Dynamic Workflows for Claude Code. The release came just 41 days after Opus 4.7 and exactly six weeks after Opus 4.6, continuing the cadence Anthropic has been running since March. Same $5 per million input tokens and $25 per million output tokens pricing as Opus 4.7, same 1M-token context window, same model ID claude-opus-4-8 across the API, Claude Code, claude.ai, AWS, Google Cloud, and Microsoft Foundry. The launch also brought a 2.5× faster Fast Mode at 3× lower cost, an Effort Control selector (low, medium, high, xhigh, max), an honest-improvement push that Anthropic says leaves roughly 4× fewer code flaws unflagged than GPT-5.5, and the Messages API changes that let applications update the system prompt mid-conversation without busting the prompt cache.

    But every serious analyst covering the launch landed on the same conclusion: the model itself is a "modest but tangible improvement" over 4.7, and the feature that actually changes how production teams will build software in 2026 is Dynamic Workflows. Anthropic's own framing — "Claude Code has a new dynamic workflows feature that allows it to tackle very large-scale problems" — undersells the architectural shift. The model is now the runtime, not the bottleneck. The plan is the artifact. And the agents are the workers.

    What Dynamic Workflows Actually Are (and What They Are Not)

    Dynamic Workflows is a multi-agent orchestration system built into Claude Code that lets a single Claude session plan a complex job, decompose it into independent units of work, fan that work out across tens to hundreds of parallel subagents in a single session, and iteratively verify the converged result. Per the official Claude Code documentation, workflows are "orchestrate many subagents from a script Claude writes and you can rerun," and they target "codebase audits, large migrations, [and] parallel research."

    The crucial architectural decision is where the plan lives. In a normal Claude Code session, the plan lives inside the model's context window — the orchestrator thinks about the steps, picks the next step, executes it, observes the result, and updates the plan. With Dynamic Workflows, Claude writes a JavaScript orchestration script that the runtime executes. The plan moves from LLM memory into executable code. The model's context window only ever sees the final converged answer, not the intermediate results of hundreds of steps.

    What Dynamic Workflows are not is also worth being precise about. They are not an autonomous background agent (Claude is not running a hidden process while you close the laptop). They are not an open-ended long-running "always-on" mode. They are not a replacement for a human in the loop on high-stakes decisions — Anthropic recommends reviewing the execution plan on first trigger, because "a poorly scoped prompt will fan out agents unnecessarily." And they are not a single thing: the feature is opt-in, plan-gated, and capped at 1,000 total subagents per workflow, with up to 16 running concurrently.

    The Bun Case Study: 750,000 Lines of Zig to Rust in Days

    Anthropic's flagship example — and the case study the company led the launch with — is Bun, the high-performance JavaScript runtime originally written in Zig. Bun's creator Jarred Sumner used Dynamic Workflows to port the project from Zig to Rust. The numbers Anthropic, Bun, and the trade press all converge on are these: roughly 750,000 lines of Rust produced, with 99.8% of the existing Bun test suite still passing, and the rewrite merged to main in the Bun 1.3.14 release (the last Zig-only release) on May 14, 2026, two weeks before Opus 4.8 was announced.

    The exact wall-clock time is reported in two ways. Anthropic's framing of the demo is "11 days." A more granular AIWeekly reconstruction of the actual pull request counts 960,000 lines of Zig converted to Rust across 6,755 commits in 6 days, with 13,044 unsafe-block migrations left for manual review. The discrepancy is mostly a question of what counts as "the rewrite" — the prep work, the test fixups, and the manual unsafe-block audit take the 11-day version; the raw translation step that Dynamic Workflows itself executed is closer to 6 days.

    The pull request itself, Bun PR #30412 "Rewrite Bun in Rust", is the kind of artifact that resets what a software migration looks like in 2026. Two reviewers per file. Hundreds of agents in parallel. Adversarial verification catching the cases where the Rust translation subtly changed the semantics of memory management. The Zig version of Bun had known segfaults and other memory-safety errors; Dynamic Workflows could not have carried those bugs over because the Rust translation was type-checked and MIRI-verified end to end.

    The Bun rewrite is not just a marketing demo. It is a stress test of the entire Dynamic Workflows stack under production conditions: a real codebase with 960,000 lines of memory-managed systems code, an existing test suite that can validate the output, and a maintenance burden that the open-source community would otherwise have absorbed over months of human time. The fact that 99.8% of tests passed on the first iteration is the empirical evidence that the Plan → Assign → Verify loop actually converges on correct code at scale, not just on plausible-looking code.

    How Plan → Assign → Verify Works in Practice

    The Dynamic Workflows lifecycle is a three-step loop that Anthropic describes as Plan, Assign, and Verify. It is not a loop in the traditional sense — the loop is the part Claude writes into the orchestration script, not the part the LLM iterates in its head.

    Plan. When a workflow kicks off, Claude plans dynamically based on your prompt, breaks the task into independent subtasks, and writes the orchestration script that will execute them. You see the plan before any agent runs. The plan is reviewable, version-controllable, and reusable — you can rerun the same workflow on a different repository or a different branch.

    Assign. The JavaScript runtime executes the script and fans the work out across subagents. Up to 16 subagents run concurrently; up to 1,000 subagents can participate in a single workflow. Each subagent does a focused unit of work — migrating one file from Zig to Rust, verifying one test, auditing one security control — and reports its result back to the orchestrator.

    Verify. Other subagents independently review each agent's output, attempt to refute the result, and iterate until the answers converge. This is the adversarial verification step that distinguishes Dynamic Workflows from a simple "fire 100 subagents in parallel" script. Subagents in a static pipeline just report back. Agent Teams collaborate but don't adversarially verify. Dynamic Workflows explicitly try to break each other's work until the result stabilizes.

    The fourth property — and the one that makes long-running workflows viable — is resumability. Progress saves continuously. If Claude crashes mid-task, the workflow picks up where it left off. Agent Teams die with the session. Dynamic Workflows survive interruptions, which is the only reason a 6-day, 6,755-commit migration like Bun is operationally tractable.

    Subagents, Agent Teams, and Dynamic Workflows: The Multi-Agent Hierarchy

    Before Opus 4.8, Claude Code already had two multi-agent primitives. Understanding all three side-by-side is the only way to figure out when each is the right tool — a hierarchy that mirrors the tradeoffs you see when comparing Claude Opus against open-weight models like GLM-4.7.

    Subagents are lightweight workers spawned from a main session. They do a focused task and report back. They cannot talk to each other, and the main agent is still the orchestration bottleneck: every result routes through one context window. Subagents are the right primitive for a single dev who wants to delegate a sub-step — "audit this file for security issues" or "write tests for this function" — while staying in control of the overall task.

    Agent Teams shipped with Opus 4.6 in March 2026. Multiple Claude instances coordinate through a shared task list and message each other directly — Shift+Up/Down or tmux let you watch each teammate's terminal. Agent Teams remove the orchestrator bottleneck that subagents impose. The tradeoff is that Agent Teams top out at 3–5 teammates in practice, sessions don't survive interruptions, and you still need to design the orchestration up front.

    Dynamic Workflows sit above both. The plan moves out of the LLM's context into a JS script the runtime executes. Scale jumps to 16 concurrent / 1,000 total subagents. Adversarial verification replaces single-pass reporting. Resumability replaces the session-bound constraint. And the orchestration is generated, not designed — you describe the task, Claude decides how to split it.

    Primitive Max agents Plan location Resumable? Verification Best for
    SubagentsA handfulIn main agent's contextNoSingle-pass report-backDelegating a sub-step
    Agent Teams3–5 in practiceShared task listNo (session-bound)Peer collaborationMulti-angle exploration
    Dynamic Workflows16 concurrent / 1,000 totalExecutable JS scriptYes (continuous save)Adversarial, iterativeCodebase-scale migrations, audits, sweeps

    The choice of primitive is not ideological. A 200-line refactor is a subagent job. A multi-file architectural exploration is an Agent Teams job. A 750,000-line language migration is a Dynamic Workflows job. The cost profile is also different: workflows consume "orders of magnitude" more tokens than a normal session, and Anthropic recommends starting with a scoped task to calibrate usage before going full-scale.

    Effort Control: Five Levels of Compute vs Cost

    Opus 4.8 also ships a new Effort Control selector that lives in the model picker in claude.ai, Claude Code, and Cowork. There are five levels: low, medium, high, xhigh (also called "extra-high" in some surfaces), and max. The default for new sessions is medium. Higher-effort settings make Claude think more frequently and more deeply before producing a response. Lower-effort settings trade depth for speed and token efficiency.

    The token-cost spread is roughly 2.7× between the cheapest level (low) and the most expensive (max), based on community benchmarks on Opus 4.7 with the same five-tier selector. That ratio is the practical reason the Effort Control menu exists: same model, same context window, same pricing per token — but the user controls how much of the model's reasoning budget the question is allowed to consume. If you ask a question where the answer is in the first paragraph of the context, low is correct. If you ask Claude to redesign a microservices topology that touches 40 services, max is correct.

    For Dynamic Workflows specifically, the recommended pairing is effort = xhigh (sometimes called "ultracode" in community examples). The reasoning is that the orchestrator has to plan a multi-agent fan-out, decide how to split work, and coordinate the adversarial verification loop. Skimping on the orchestrator's reasoning budget is a false economy: a bad plan executed at scale is more expensive than a good plan executed at low cost. The activation recipe in community demos is consistent: /model opus 4.8 + /effort ultracode + include the word "workflow" in the prompt — and the same cost-vs-quality tradeoffs that drive the 2.7× spread are visible in any frontier coding model (see our M3 pricing guide for the parallel).

    If you run Opus 4.8 at max or xhigh, Anthropic also recommends a large output token budget. The reason is that high-effort runs need more output headroom: the model will produce longer, more detailed reasoning chains before committing to an answer, and a tight max_tokens cap can clip the response mid-thought. Practical ceiling: 16K output tokens for typical max-effort runs, more for workflows that generate substantial intermediate artifacts.

    Availability, Plans, and the Enterprise Off-by-Default Decision

    Dynamic Workflows are available in research preview across the surfaces where Claude Code runs. The full distribution list, per the Week 22 changelog and Anthropic's launch post: Claude Code, the Claude Agent SDK, the Claude API, Amazon Bedrock, and Google Cloud Vertex AI. For claude.ai, the surfaces are Claude Code, Cowork, and the chat interface. For end users, the relevant plan tiers are Pro, Max, Team, and Enterprise.

    The default-on vs. default-off split is the governance decision worth paying close attention to. Dynamic Workflows are on by default on Max and Team plans. They are off by default on Enterprise plans, and an organization admin has to opt the workspace in through the Claude Code managed settings. The reason is the same reason any new agent primitive ships disabled in enterprise: an organization running Dynamic Workflows against a real codebase is committing substantial compute, and the blast radius of a poorly-scoped prompt fanning out across 1,000 subagents is significantly larger than the blast radius of a normal Claude Code session.

    The early reactions in enterprise admin channels reflect the tradeoff. A Reddit admin post in r/ClaudeAI on launch day reported: "I asked our company's Anthropic account admin about this, and apparently it was enabled by default in our account. That seems...not great." Anthropic's stance is that admins retain the kill switch via managed settings, and the research-preview label is the explicit signal that the team will iterate on the defaults as enterprise usage data accumulates.

    For anyone running workflows against a production codebase today, the operational checklist is straightforward: (1) start with a scoped task to calibrate token usage (a useful baseline is comparing against the workflow ergonomics of M3 in your own coding tools); (2) enable auto mode so Claude decides when a workflow is appropriate vs. a simpler approach; (3) review the execution plan on the first trigger; (4) set output token budgets to at least 16K when running at xhigh or max effort; (5) confirm your admin policy explicitly allows workflows on Enterprise before assuming you can use them.

    Messages API Changes: Mid-Conversation System Messages and the 1,024-Token Cache

    Two Messages API changes ship with Opus 4.8 that are easy to miss next to the Dynamic Workflows headline, and they are the changes that production teams will feel first.

    1. Mid-conversation system messages. Before Opus 4.8, the system parameter was a static instruction set: whatever you passed at the start of a conversation stayed in effect for the rest of the conversation, and you could not change it without invalidating the prompt cache. Per the official mid-conversation system messages documentation, Opus 4.8 lets you put a system entry inside the messages array and update the system instruction mid-task. The change is designed to be cache-preserving: the prompt prefix that the cache hashes is in the order tools, then system, then messages. Inserting a system entry inside the messages array does not invalidate the cache the way that re-writing the system parameter at the top of the request would.

    The practical impact: a long-running agent that discovers it needs to switch from "be concise" to "be exhaustive, you missed an edge case" can do so without busting the cache. The same pattern works for security-sensitive applications that want to inject new instructions mid-conversation in a cache-safe, prompt-injection-resistant way. RLance Martin's launch-day tip on X: "you can now update the system prompt mid-conversation w/o breaking the prompt cache" — that one line is the entire change for most users.

    2. Prompt cache minimum lowered to 1,024 tokens. Per the What's new in Claude Opus 4.8 documentation, the minimum cacheable prompt length on Opus 4.8 is 1,024 tokens, down from 2,048 on Opus 4.7. Conversations that were too short to cache on 4.7 are now cacheable on 4.8. For applications running thousands of short-conversation agents — customer support chat, in-app assistants, code-completion suggestions — this is the change that unlocks prompt caching on use cases that previously could not justify it.

    The two changes are related. Mid-conversation system messages are the feature that lets an agent change its mind about how to behave as it accumulates evidence. The lower prompt cache minimum is the infrastructure change that makes the behavior change affordable. Together they turn Opus 4.8 into the first Anthropic model that genuinely supports long-running, mid-task-steered agents without the cost penalty of cache invalidation on every adjustment.

    What This Means for the 2026 Agentic Coding Race

    Dynamic Workflows do not just give Claude Code a feature. They change the unit of competition. Through 2024 and most of 2025, coding-agent benchmarks were "can the model pass this SWE-Bench issue." Through early 2026, the question became "can the model finish a multi-file PR with a real test suite." With Opus 4.8 and Dynamic Workflows, the question is "can the model orchestrate 1,000 parallel subagents against a 750,000-line codebase and converge on a correct rewrite with adversarial verification." The frontier is moving from agent to orchestrator.

    The competitive pressure on OpenAI, Google DeepMind, and the open-source agent frameworks (LangGraph, AutoGen, CrewAI) is real — especially against the bar set by recent Claude-vs-GPT comparison benchmarks. None of them ship an out-of-the-box Plan → Assign → Verify primitive with the same scale, resumability, and adversarial verification properties. LangGraph users have to assemble the equivalent stack from graph nodes, custom code, and external state stores. AutoGen has multi-agent patterns but no built-in adversarial verification. CrewAI emphasizes role-based collaboration but does not have a 1,000-subagent cap or a Bun-class case study to point to. Dynamic Workflows is the first production multi-agent system where the orchestration is generated by the model, executed by the runtime, and validated by the model's own adversarial review loop.

    For individual developers, the takeaway is more practical than philosophical. The activation recipe is three lines: switch to Opus 4.8, set effort to xhigh (or "ultracode"), and include "workflow" in the prompt. The first 10 minutes of use should be on a scoped task, not a 750,000-line codebase. The token cost is real and the 2.7× spread across effort tiers means effort selection is a financial decision, not just a quality decision. And the Bun rewrite is the proof point that Plan → Assign → Verify converges on correct code at production scale — not just plausible code, but code that passes 99.8% of an existing test suite on the first iteration.

    Conclusion

    Claude Opus 4.8 Dynamic Workflows is the most consequential Claude Code release since the original launch. The model itself is a steady improvement on Opus 4.7 — same price, same context window, stronger coding and honesty scores, and a 2.5× faster Fast Mode at 3× lower cost. The architectural shift is the multi-agent orchestration system: Plan → Assign → Verify, with up to 16 concurrent and 1,000 total subagents per workflow, adversarial verification, resumability across interruptions, and the Bun case study as the empirical proof. The Messages API changes (mid-conversation system messages and a 1,024-token prompt cache minimum) are the quiet infrastructure shift that makes the agent shift affordable. The five-level Effort Control (low, medium, high, xhigh, max) is the user-facing knob that lets every team tune compute vs cost. The research-preview label and the enterprise off-by-default decision are the honest signals that the team is still iterating on defaults.

    The open question for the rest of 2026 is whether the rest of the agent ecosystem catches up. If you are building on LangGraph, AutoGen, or CrewAI, the bar Dynamic Workflows sets is: generated orchestration, runtime execution, adversarial verification, resumable across interruptions, and a flagship case study that demonstrates convergence at codebase scale. If you are buying agentic-coding infrastructure, the procurement question is no longer "which model writes the best single-file fix" — it is "which platform can orchestrate hundreds of agents against a real codebase with verifiable correctness." Claude Opus 4.8 Dynamic Workflows is the first answer to that question that ships as a default. Treat the launch date — May 28, 2026 — as the dividing line.

    Last Updated: June 01, 2026 | Source: Anthropic Opus 4.8 Announcement (Official), Claude Code Workflows Documentation (Official), and The Register Bun Coverage (Authoritative News)

    Frequently Asked Questions

    Dynamic Workflows is a research-preview feature in Claude Code, released on May 28, 2026, that lets a single Claude session plan a large task, decompose it into independent units of work, fan the work out across up to 16 concurrent and 1,000 total parallel subagents in one session, and adversarially verify the converged result. The plan is written as an executable JavaScript orchestration script so the orchestrator's context window only sees the final converged answer, not the intermediate steps of hundreds of subagents.
    A single Dynamic Workflow on Claude Opus 4.8 can spawn up to 16 subagents running concurrently and up to 1,000 subagents in total. Each subagent does a focused unit of work, and the workflow is resumable across interruptions, which is what makes codebase-scale migrations like the Bun Zig-to-Rust port operationally tractable.
    Anthropic's flagship Dynamic Workflows demo is the port of the Bun JavaScript runtime from Zig to Rust, run by Bun's creator Jarred Sumner. The result was roughly 750,000 lines of Rust with 99.8% of the existing Bun test suite still passing. The work was merged in Bun 1.3.14 on May 14, 2026. Anthropic frames the wall-clock time as 11 days; a granular AIWeekly reconstruction of pull request #30412 counts 960,000 lines converted across 6,755 commits in 6 days, with 13,044 unsafe-block migrations left for manual review.
    Subagents are lightweight workers spawned from a main session who report back; they cannot talk to each other and the main agent is still the orchestration bottleneck. Agent Teams, shipped with Opus 4.6 in March 2026, let multiple Claude instances coordinate through a shared task list but top out at 3-5 teammates in practice and die with the session. Dynamic Workflows sit above both: the plan moves from LLM context into a JavaScript script, scale jumps to 16 concurrent and 1,000 total subagents, the workflow is resumable across interruptions, and an adversarial verification step is built in.
    Opus 4.8 introduces an Effort Control selector with five levels: low, medium, high, xhigh (also called "extra-high" or "ultracode" in some surfaces), and max. The default for new sessions is medium. The token-cost spread between the cheapest (low) and the most expensive (max) level is roughly 2.7x, based on community benchmarks of the same five-tier selector on Opus 4.7. For Dynamic Workflows, the recommended pairing is effort = xhigh, because the orchestrator has to plan a multi-agent fan-out and coordinate the adversarial verification loop.
    Dynamic Workflows are on by default for Max and Team plans, and off by default for Enterprise plans, where an organization admin has to opt the workspace in through Claude Code managed settings. The off-by-default Enterprise decision is a governance choice: an organization running Dynamic Workflows against a real codebase is committing substantial compute, and the blast radius of a poorly scoped prompt fanning out across 1,000 subagents is significantly larger than the blast radius of a normal Claude Code session.
    Two Messages API changes ship with Opus 4.8. First, mid-conversation system messages: you can now insert a 'system' entry inside the 'messages' array to update the system instruction mid-task, and the change is cache-preserving because the cache prefix order is tools, then system, then messages. Second, the minimum prompt-cache length is now 1,024 tokens, down from 2,048 on Opus 4.7, which unlocks prompt caching on shorter conversations that previously could not justify it.
    Claude Opus 4.8 has the same $5 per million input tokens and $25 per million output tokens pricing as Opus 4.7, the same 1M-token context window, and the same model ID 'claude-opus-4-8' across the API, Claude Code, claude.ai, AWS, Google Cloud, and Microsoft Foundry. A 2.5x faster Fast Mode is also available at 3x lower cost. The 2.7x token-cost spread across Effort Control levels is the main additional cost dimension.
    The community activation recipe is three lines: switch to Opus 4.8 with '/model opus 4.8', set effort to xhigh with '/effort ultracode', and include the word 'workflow' in the prompt. Anthropic recommends starting with a scoped task to calibrate token usage before going full-scale, enabling auto mode so Claude decides when a workflow is appropriate, reviewing the execution plan on the first trigger, and setting output token budgets to at least 16K when running at xhigh or max effort.
    Anthropic says Opus 4.8 leaves roughly 4x fewer code flaws unflagged than GPT-5.5 in remote-execution coding tests. The independent signal is the Bun port: 99.8% of the existing test suite still passing on the first iteration of a 750,000-line rewrite is a stronger empirical result than any single benchmark. That said, model-vs-model coding comparisons are still environment-specific, and the right comparison is total cost-to-correctness for the task, not raw benchmark score.
    Sk Jabedul Haque

    Sk Jabedul Haque

    Founder & Chief Editor

    Building India's most trusted finance education platform — simplifying news, calculators, and market trends so anyone can understand and invest confidently.