Skip to Content

xAI Model Retirement May 15, 2026

Complete Migration Guide for Grok-3 and Fast Model Users 2026
May 19, 2026, 03:52 Eastern Daylight Time by
xAI Model Retirement May 15, 2026
On May 15, 2026, xAI officially retired eight legacy models, including Grok-3 and all Grok-4-fast variants. Requests to these deprecated slugs are now automatically redirected to Grok 4.3. While this ensures service continuity, developers using the "Fast" tier will face a 600% cost increase, as all traffic is now billed at the higher Grok 4.3 standard rates.

What You’ll Learn in This Guide

  • Full list of the 8 xAI models retired on May 15, 2026.
  • Understanding the Grok 4.3 redirect behavior and reasoning parameters.
  • Detailed pricing breakdown and the impact of the 6x "Fast" tier cost hike.
  • Step-by-step migration checklist for xAI API users.

The xAI Model Retirement event on May 15, 2026, represents a significant consolidation of Elon Musk’s AI offerings. Following the trend set by the recent OpenAI reorganization, xAI is killing off fragmented side projects to focus entirely on its new "agentic" flagship: Grok 4.3. While consolidation simplifies the developer experience, it has also introduced a massive "cost shock" for users who relied on the highly subsidized prices of the Grok-4-fast and Grok-code-fast tiers. As the AI Token Pricing War intensifies, xAI is moving away from competing on pure budget and toward high-value Agentic AI capabilities. In this guide, we break down the technical redirects, the financial implications, and the necessary steps to ensure your production apps remain stable and cost-efficient.

The Retirement List: 8 Models Deprecated

Effective May 15 at 12:00 PM PT, xAI has removed eight model slugs from its active directory. These models were primarily previous-generation experimental tiers or specialized sub-models that have now been integrated into the unified Grok 4.3 architecture.

Retired Model Slug New Replacement (Redirect) Reasoning Effort Parameter
grok-4-1-fast-reasoninggrok-4.3Low
grok-4-fast-non-reasoninggrok-4.3None
grok-code-fast-1grok-4.3Medium (Coding Optim.)
grok-3grok-4.3Low
grok-imagine-image-progrok-imagine-imageN/A

The Cost Shock: Legacy Fast vs. Grok 4.3

The most critical impact of this retirement is the end of the "Fast" subsidy. Previously, `grok-4-fast` models were priced aggressively at approximately $0.20 per million input tokens and $0.50 per million output tokens to gain market share.

Grok 4.3, however, is a significantly larger and more capable model, priced at $1.25 per 1M input and $2.50 per 1M output. If you continue to send requests to a deprecated "Fast" slug, your traffic will not break, but it will be billed at the new 4.3 rates—a 6.25x increase for input and a 5x increase for output tokens. For high-volume agent fleets, this could result in thousands of dollars in unexpected monthly overages.

Understanding the Grok 4.3 Redirect Logic

To prevent immediate service outages, xAI has implemented a dynamic redirect logic that maps legacy calls to specific reasoning efforts within Grok 4.3.

  • Reasoning Redirection: Legacy requests to reasoning models are served by `grok-4.3` with the `low` reasoning effort parameter. This maintains the "fast reasoning" behavior while using the newer engine.
  • Non-Reasoning Redirection: Calls to standard fast models are served by `grok-4.3` with `none` reasoning effort, maximizing speed and minimizing latency.
  • Imagine Redirects: The `imagine-image-pro` model now defaults to the standard `grok-imagine-image` endpoint, which has been upgraded to match the "Pro" quality by default.
While these redirects maintain uptime, xAI explicitly recommends updating your codebases to the new standard model slugs to ensure you are utilizing the full 1 million token context window and new agentic tool-calling features.

The 3-Step Migration Checklist

If your applications are still using the deprecated model slugs, follow this checklist immediately to avoid billing surprises and technical debt:

  1. Update Model Slugs: Replace all instances of `grok-4-fast`, `grok-3`, and `grok-code-fast-1` with the unified `grok-4.3` identifier.
  2. Implement Reasoning Parameters: Explicitly set your `reasoning_effort` (low, medium, or high) in your API headers. Do not rely on the default redirect behavior, as it may change in future point releases.
  3. Audit Your Context Windows: Grok 4.3 supports a massive 1 million token context. If your previous workflows were restricted to the 128k limits of legacy models, you can now optimize your RAG (Retrieval-Augmented Generation) pipelines for better accuracy.

Why Grok 4.3? The Strategic Pivot

The retirement of these 8 models signals xAI's transition from a "quantity" approach to a "quality-first" agentic model. Grok 4.3 currently tops leaderboards in instruction following and multi-step tool calling. For businesses building autonomous AI systems, the unified architecture of 4.3 provides a more reliable foundation. It allows a single model to act as both the "planner" (using high reasoning) and the "executor" (using standard mode), significantly reducing the complexity of model-routing logic.

Conclusion: Preparing for the Unified AI Era

In conclusion, while the May 15 model retirement may feel like a disruption, it is a necessary step for the growth of the xAI ecosystem. By streamlining its offerings into Grok 4.3, xAI is ensuring that all developers have access to its fastest and most capable technology. The cost increase is a reflection of the significant compute resources required for high-reasoning models, and the "Fast" tier has essentially been replaced by the superior speed-to-intelligence ratio of the new generation. Update your API calls today, audit your billing alerts, and leverage the new 1M context window to build the next generation of intelligent agents.

Last Updated: May 19, 2026 | Source: xAI Developer Documentation & Official API Migration Bulletins

Frequently Asked Questions

The 8 retired models are grok-4-1-fast-reasoning, grok-4-1-fast-non-reasoning, grok-4-fast-reasoning, grok-4-fast-non-reasoning, grok-4-0709, grok-code-fast-1, grok-3, and grok-imagine-image-pro.
Yes, xAI has implemented a redirect logic. Requests to legacy text models are served by Grok 4.3 with appropriate reasoning effort, while grok-imagine-image-pro redirects to the standard imagine endpoint.
The 'Fast' tier was previously highly subsidized. Grok 4.3 is a more capable model priced at $1.25/M input and $2.50/M output. Continuing to use legacy slugs will result in being billed at these new, higher rates.
You can set the 'reasoning_effort' parameter to 'low', 'medium', or 'high'. For legacy 'Fast' behavior, xAI recommends setting this to 'none' or 'low' to prioritize speed.
Grok 4.3 supports a massive 1 million token context window, significantly larger than the 128k limit of previous generations, making it ideal for deep RAG and long-document analysis.
Explicitly update your model slugs to 'grok-4.3' and configure the 'reasoning_effort' parameter in your code. Relying on default redirects is not recommended for long-term production stability.
# AI