What task did Claude Code complete for Rakuten?

Rakuten's ML engineer Kenta Naruse gave Claude Code a single task: implement an activation vector extraction method inside vLLM — an open-source library with 12.5 million lines of code in multiple programming languages. Claude Code finished autonomously in 7 hours with 99.9% numerical accuracy.

How much did Rakuten's feature delivery time improve?

Rakuten's average feature delivery time dropped from 24 working days to 5 working days — a 79% reduction. This was achieved by using Claude Code to handle complex implementation tasks autonomously while engineers focused on architecture and review.

Is Claude Code suitable for large enterprise codebases?

Yes. The Rakuten case study specifically validates this. vLLM (the library Claude Code worked on) has 12.5 million lines of code across multiple languages. Claude Code navigated, understood, and wrote correct implementations at 99.9% numerical accuracy — proving it handles production-scale complexity.

How many Fortune 100 companies use Claude in 2026?

According to 2025 industry data, 70% of Fortune 100 companies use Claude, and as of February 2026, 8 of the Fortune 10 are Claude customers. Over 1,000 enterprise customers spend $1 million or more annually with Anthropic as of May 2026.

Does using Claude Code mean engineers lose their jobs?

Not according to the evidence. Engineers shift from implementers to orchestrators. At Rakuten, engineers used Claude Code to run 4-5 tasks in parallel instead of one. Anthropic's research shows engineers produce more output volume overall — more features shipped, not the same number of features done faster by fewer people.

Claude Code Rakuten Case Study: 79% Faster Feature Delivery

How Rakuten reduced development time from 24 days to 5 days using AI coding

Apr 27, 2026, 16:57 Eastern Daylight Time, by Sk Jabedul Haque

Rakuten's ML engineer Kenta Naruse gave Claude Code one task: implement an activation vector extraction method inside vLLM — a codebase with 12.5 million lines of code. Claude Code finished autonomously in 7 hours with 99.9% numerical accuracy. Naruse wrote zero lines of code. Feature delivery time dropped from 24 days to 5 days — a 79% reduction.

What You'll Learn

Exact details of Rakuten's Claude Code implementation — what was tested, how long it took, and the results
How Claude Code's performance compares to broader enterprise AI adoption data in 2026
What the 79% reduction in delivery time actually means for software teams
Why the Rakuten case study is reshaping how Fortune 500 companies think about AI coding agents

The Rakuten Challenge: 12.5 Million Lines of Code

The Claude Code Rakuten case study is documented in Anthropic's 2026 Agentic Coding Trends Report and has become one of the most-cited examples of enterprise AI coding in production. It's not a promotional benchmark — it's a real engineering problem that a real team faced.

Kenta Naruse, a Machine Learning Engineer at Rakuten, needed to implement a specific activation vector extraction method inside vLLM — a massive open-source inference library. vLLM contains 12.5 million lines of code written across multiple programming languages. Just navigating such a codebase manually takes days. Writing correct new code on top of it takes longer.

Naruse handed the task to Claude Code and stepped back. What followed was 7 hours of sustained autonomous work: Naruse wrote no code and provided only occasional guidance. When Claude Code finished, the implementation achieved 99.9% numerical accuracy compared to the reference method.

"I didn't write any code during those seven hours," Naruse said. "I just provided occasional guidance."

The Numbers: What 79% Faster Actually Means

Before Claude Code, Rakuten's engineering teams averaged 24 working days from feature request to delivery. After integrating Claude Code into their workflow, that number dropped to 5 working days. That's not a small improvement — it's a structural change in how fast software can ship.

Metric	Before Claude Code	After Claude Code
Feature Delivery Time	24 working days	5 working days (−79%)
vLLM Task Duration	Estimated days of human work	7 hours (autonomous)
Code Accuracy	Human-written baseline	99.9% numerical accuracy
Parallel Task Capacity	1 task per engineer	5 tasks (4 delegated to Claude)
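
As a quick check on the table, the 79% figure follows from the two delivery times alone. The sketch below recomputes it in Python; the function names are illustrative and do not come from Rakuten or Anthropic.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the percentage drop from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


def speedup(before: float, after: float) -> float:
    """Return how many times faster `after` is than `before`."""
    if after <= 0:
        raise ValueError("after must be positive")
    return before / after


# Delivery times from the table above, in working days.
reduction = percent_reduction(24, 5)
factor = speedup(24, 5)
print(f"{reduction:.1f}% shorter, {factor:.1f}x faster")
# prints "79.2% shorter, 4.8x faster"
```

Read the other way, a 79% reduction means features ship in roughly a fifth of the original time.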

The parallel execution capacity change is especially significant. Instead of one engineer working one task sequentially, each engineer could now delegate four tasks to Claude Code simultaneously while personally handling one high-priority item. This is the shift from "AI helps you type faster" to "AI multiplies your output capacity." Check our analysis of Claude Code's 7-hour autonomous test for the full timeline breakdown.

Why the vLLM Task Was the Perfect Test

Critics of AI coding agents often argue that they work only on toy problems — isolated functions, simple utilities, or well-documented APIs. The Rakuten vLLM task was the opposite of that. Here's why it's a meaningful benchmark:

Scale: 12.5 million lines of code — not a toy repo, a production-grade system used by ML teams at major companies worldwide.
Multi-language: vLLM is not a single-language codebase. Claude Code had to navigate Python, C++, CUDA, and configuration files.
Domain specificity: Activation vector extraction is a niche ML engineering task, not a generic CRUD operation.
No shortcuts: The implementation had to achieve near-perfect numerical accuracy against a reference method. Garbage output would have been immediately obvious.
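
The source does not say how the 99.9% figure was computed. One plausible reading, shown here purely as an assumption, is one minus the relative L2 error between the new implementation's activation vector and the reference vector. The metric, the function name, and the sample values below are all hypothetical and are not Rakuten's data or method.

```python
import math
from typing import Sequence


def relative_accuracy(candidate: Sequence[float], reference: Sequence[float]) -> float:
    """Return 1 minus the relative L2 error of `candidate` against `reference`.

    This is an assumed metric for illustration only.
    """
    if len(candidate) != len(reference):
        raise ValueError("vectors must have the same length")
    error = math.sqrt(sum((c - r) ** 2 for c, r in zip(candidate, reference)))
    scale = math.sqrt(sum(r ** 2 for r in reference))
    if scale == 0.0:
        raise ValueError("reference vector must be non-zero")
    return 1.0 - error / scale


reference = [1.0, 2.0, 2.0]    # made-up activation values, L2 norm = 3
candidate = [1.0, 2.0, 2.003]  # differs by 0.003 in one component
print(f"{relative_accuracy(candidate, reference):.3f}")
# prints "0.999"
```

Under a metric like this, a broken implementation would score far from 1.0, which is why the article describes garbage output as immediately obvious.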

The task validated something that many in the industry had doubted: AI coding agents can reason across enormous, multi-language codebases and produce mathematically correct implementations with minimal human intervention. For the broader discussion on AI coding cost vs. value, see our AI Coding Agent Cost Analysis 2026.

Claude Code Adoption: Enterprise Numbers in 2026

Rakuten is not an isolated example. The broader enterprise adoption of Claude Code in 2026 shows the same pattern — faster delivery, higher output, with humans shifting to oversight and orchestration rather than implementation.

✓ 8 of the Fortune 10 are now Claude customers (Feb 2026)
✓ 70% of Fortune 100 companies use Claude (2025 industry data)
✓ 1,000+ enterprise customers spending $1M+ annually by May 2026 — doubled from February 2026
✓ $500M+ revenue run rate for Claude Code (surpassed this in 2025)
✓ 45% productivity gains for 6,000 developer participants in IBM's AI-driven IDE, built with Anthropic collaboration
✓ 90-95% complete: Anthropic CFO Krishna Rao reported that finance outputs reach this level of completion before human review

The macro picture from Anthropic's research is even bigger: if current-generation AI models such as Claude were adopted universally across the US economy, Anthropic estimates this would add 1.8% to annual labor productivity growth, roughly double the recent baseline rate. That is an economic projection from Anthropic's own research paper, not a product benchmark. For career implications of this shift, see Agentic AI Engineer Salary 2026: $240K–$325K+.

The Human Role After Claude Code

What happened to Kenta Naruse during those 7 hours? He provided "occasional guidance." This is the new engineering paradigm — not replaced, but repositioned. Engineers in the Claude Code era are becoming orchestrators and reviewers, not just implementers.

Anthropic's internal research reveals a pattern: engineers report a net decrease in time spent per individual task, but a much larger net increase in overall output volume. More features shipped, more bugs fixed, more experiments run. The gain comes not only from each task being faster, but from engineers running more tasks in parallel. The bottleneck shifts from implementation capacity to review and decision-making capacity.

For Rakuten, this meant restructuring how engineers plan their days. Instead of: "I'll implement feature A this sprint," the workflow becomes: "I'll start Claude Code on features A, B, C, and D — and review the outputs while I handle the architecture decisions for feature E." The human becomes the highest-leverage node in the system, not the bottleneck.
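
The delegation pattern described above can be sketched with Python's asyncio. Both `delegate_to_agent` and `engineer_handles` are hypothetical placeholders that only sleep briefly. They are not a real Claude Code interface, and the feature names A through E simply mirror the example in the text.

```python
import asyncio


async def delegate_to_agent(feature: str) -> str:
    """Placeholder for handing a feature to a coding agent (not a real API)."""
    await asyncio.sleep(0.01)  # stands in for hours of autonomous work
    return f"{feature}: draft ready for review"


async def engineer_handles(feature: str) -> str:
    """Placeholder for the work the engineer keeps for themselves."""
    await asyncio.sleep(0.01)
    return f"{feature}: architecture decided"


async def plan_sprint() -> list[str]:
    delegated = [delegate_to_agent(name) for name in ("A", "B", "C", "D")]
    # gather runs all five coroutines concurrently and returns results
    # in the same order as its arguments.
    return await asyncio.gather(*delegated, engineer_handles("E"))


results = asyncio.run(plan_sprint())
for line in results:
    print(line)
```

The four delegated tasks and the engineer's own task overlap in time, so the total wall-clock time is close to that of the slowest single task. The review step that follows is still sequential human work, which is where the article places the new bottleneck.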

What This Means for Software Teams in India and Asia

Rakuten's case study has particular relevance for software development teams in India, Japan, and Southeast Asia, where large engineering teams handle complex enterprise codebases for global clients. The ability to multiply per-engineer output without adding headcount is especially valuable in markets where talent costs are rising and project timelines are shortening.

Indian IT firms — many of which maintain multi-million-line legacy codebases — face exactly the kind of challenge where Claude Code's strength in navigating large, multi-language repositories would provide the most value. The vLLM task (12.5M lines, multi-language) is a reasonable proxy for the scale many Indian IT teams manage daily.

Conclusion: One Case Study, Many Implications

The Rakuten + Claude Code case study matters because it's specific, verifiable, and replicable. It's not "AI helped our team feel more productive." It's: a real engineer, a real codebase, a real task, completed in 7 hours, producing 99.9% accurate output, and shortening delivery cycles from 24 days to 5.

As enterprise AI adoption accelerates — with 1,000+ companies spending $1M+ annually on Claude by May 2026 — the Rakuten benchmark will serve as a template for how serious engineering teams integrate AI agents into production workflows. Not as a replacement for engineers, but as a force multiplier that changes what a small team can accomplish in a sprint.

Last Updated: May 17, 2026 | Source: Anthropic Agentic Coding Trends Report 2026 (Official), Rakuten Today (Official Blog)

