What You’ll Learn in This Guide
- ✓ Full list of the 8 xAI models retired on May 15, 2026.
- ✓ Understanding the Grok 4.3 redirect behavior and reasoning parameters.
- ✓ Detailed pricing breakdown and the impact of the 6x "Fast" tier cost hike.
- ✓ Step-by-step migration checklist for xAI API users.
The xAI Model Retirement event on May 15, 2026, represents a significant consolidation of Elon Musk’s AI offerings. Following the trend set by the recent OpenAI reorganization, xAI is killing off fragmented side projects to focus entirely on its new "agentic" flagship: Grok 4.3. While consolidation simplifies the developer experience, it has also introduced a massive "cost shock" for users who relied on the highly subsidized prices of the Grok-4-fast and Grok-code-fast tiers. As the AI Token Pricing War intensifies, xAI is moving away from competing on pure budget and toward high-value Agentic AI capabilities. In this guide, we break down the technical redirects, the financial implications, and the necessary steps to ensure your production apps remain stable and cost-efficient.
The Retirement List: 8 Models Deprecated
Effective May 15 at 12:00 PM PT, xAI has removed eight model slugs from its active directory. These models were primarily previous-generation experimental tiers or specialized sub-models that have now been integrated into the unified Grok 4.3 architecture.
| Retired Model Slug | New Replacement (Redirect) | Reasoning Effort Parameter |
|---|---|---|
| grok-4-1-fast-reasoning | grok-4.3 | Low |
| grok-4-fast-non-reasoning | grok-4.3 | None |
| grok-code-fast-1 | grok-4.3 | Medium (Coding Optim.) |
| grok-3 | grok-4.3 | Low |
| grok-imagine-image-pro | grok-imagine-image | N/A |
The Cost Shock: Legacy Fast vs. Grok 4.3
The most critical impact of this retirement is the end of the "Fast" subsidy. Previously, `grok-4-fast` models were priced aggressively at approximately $0.20 per million input tokens and $0.50 per million output tokens to gain market share.
Grok 4.3, however, is a significantly larger and more capable model, priced at $1.25 per 1M input and $2.50 per 1M output. If you continue to send requests to a deprecated "Fast" slug, your traffic will not break, but it will be billed at the new 4.3 rates—a 6.25x increase for input and a 5x increase for output tokens. For high-volume agent fleets, this could result in thousands of dollars in unexpected monthly overages.
Understanding the Grok 4.3 Redirect Logic
To prevent immediate service outages, xAI has implemented a dynamic redirect logic that maps legacy calls to specific reasoning efforts within Grok 4.3.
- Reasoning Redirection: Legacy requests to reasoning models are served by `grok-4.3` with the `low` reasoning effort parameter. This maintains the "fast reasoning" behavior while using the newer engine.
- Non-Reasoning Redirection: Calls to standard fast models are served by `grok-4.3` with `none` reasoning effort, maximizing speed and minimizing latency.
- Imagine Redirects: The `imagine-image-pro` model now defaults to the standard `grok-imagine-image` endpoint, which has been upgraded to match the "Pro" quality by default.
The 3-Step Migration Checklist
If your applications are still using the deprecated model slugs, follow this checklist immediately to avoid billing surprises and technical debt:
- Update Model Slugs: Replace all instances of `grok-4-fast`, `grok-3`, and `grok-code-fast-1` with the unified `grok-4.3` identifier.
- Implement Reasoning Parameters: Explicitly set your `reasoning_effort` (low, medium, or high) in your API headers. Do not rely on the default redirect behavior, as it may change in future point releases.
- Audit Your Context Windows: Grok 4.3 supports a massive 1 million token context. If your previous workflows were restricted to the 128k limits of legacy models, you can now optimize your RAG (Retrieval-Augmented Generation) pipelines for better accuracy.
Why Grok 4.3? The Strategic Pivot
The retirement of these 8 models signals xAI's transition from a "quantity" approach to a "quality-first" agentic model. Grok 4.3 currently tops leaderboards in instruction following and multi-step tool calling. For businesses building autonomous AI systems, the unified architecture of 4.3 provides a more reliable foundation. It allows a single model to act as both the "planner" (using high reasoning) and the "executor" (using standard mode), significantly reducing the complexity of model-routing logic.
Conclusion: Preparing for the Unified AI Era
In conclusion, while the May 15 model retirement may feel like a disruption, it is a necessary step for the growth of the xAI ecosystem. By streamlining its offerings into Grok 4.3, xAI is ensuring that all developers have access to its fastest and most capable technology. The cost increase is a reflection of the significant compute resources required for high-reasoning models, and the "Fast" tier has essentially been replaced by the superior speed-to-intelligence ratio of the new generation. Update your API calls today, audit your billing alerts, and leverage the new 1M context window to build the next generation of intelligent agents.
Last Updated: May 19, 2026 | Source: xAI Developer Documentation & Official API Migration Bulletins