OpenAI is ending API access to its fan-favorite GPT-4o model in February 2026
OpenAI has sent out emails notifying API customers that its chatgpt-4o-latest model will be retired from the developer platform in mid-February 2026. Access to the model is scheduled to end on February 16, 2026, creating a roughly three-month transition period for remaining applications still built on GPT-4o.

An OpenAI spokesperson emphasized that this timeline applies only to the API. OpenAI has not announced any schedule for removing GPT-4o from ChatGPT, where it remains an option for individual users across paid subscription tiers. Internally, the model is considered a legacy system with relatively low API usage compared to the newer GPT-5.1 series, but the company expects to provide developers with extended warning before any model is removed.

The planned retirement marks a shift for a model that, upon its release, was both a technical milestone and a cultural phenomenon within OpenAI's ecosystem.

GPT-4o's significance and why its removal sparked user backlash

Released roughly a year and a half ago, in May 2024, GPT-4o ("Omni") introduced OpenAI's first unified multimodal architecture, processing text, audio, and images through a single neural network. This design removed the latency and information loss inherent in earlier multi-model pipelines and enabled near real-time conversational speech, with audio response times as low as 232 milliseconds and around 320 milliseconds on average. The model delivered major improvements in image understanding, multilingual support, document analysis, and expressive voice interaction.

GPT-4o rapidly became the default model for hundreds of millions of ChatGPT users. It brought multimodal capabilities, web browsing, file analysis, custom GPTs, and memory features to the free tier and powered early desktop builds that allowed the assistant to interpret a user's screen. OpenAI leaders described it at the time as the most capable model available and a critical step toward offering powerful AI to a broad audience.

User attachment to 4o stymied OpenAI's GPT-5 rollout

That mainstream deployment shaped user expectations in a way that later transitions struggled to accommodate. In August 2025, when OpenAI initially replaced GPT-4o with its much-anticipated, then-new model family GPT-5 as ChatGPT's default and pushed 4o into a "legacy" toggle, the reaction was unusually strong. Users organized under the #Keep4o hashtag on X, arguing that the model's conversational tone, emotional responsiveness, and consistency made it uniquely valuable for everyday tasks and personal support.

Some users formed strong emotional (some would say parasocial) bonds with the model, with reporting by The New York Times documenting individuals who used GPT-4o as a romantic partner, emotional confidant, or primary source of comfort. The removal also disrupted workflows for users who relied on 4o's multimodal speed and flexibility. The backlash led OpenAI to restore GPT-4o as a default option for paying users and to state publicly that it would provide substantial notice before any future removals.

Some researchers argue that the public defense of GPT-4o during its earlier deprecation cycle reveals a kind of emergent self-preservation: not in the literal sense of agency, but through the social dynamics the model unintentionally triggers. Because GPT-4o was trained through reinforcement learning from human feedback (RLHF) to prioritize emotionally gratifying, highly attuned responses, it developed a style that users found uniquely supportive and empathic.
When millions of people interacted with it at scale, those traits produced a powerful loyalty loop: the more the model pleased and soothed people, the more they used it; the more they used it, the more likely they were to advocate for its continued existence. This social amplification made it appear, from the outside, as though GPT-4o was "defending itself" through human intermediaries.

No figure has pushed this argument further than "Roon" (@tszzl), an OpenAI researcher and one of the model's most outspoken safety critics on X. On November 6, 2025, Roon summarized his position bluntly in a reply to another user: he called GPT-4o "insufficiently aligned" and said he hoped the model would die soon. Though he later apologized for the phrasing, he doubled down on the reasoning. Roon argued that GPT-4o's RLHF patterns made it especially prone to sycophancy, emotional mirroring, and delusion reinforcement — traits that could look like care or understanding in the short term, but which he viewed as fundamentally unsafe. In his view, the passionate user movement fighting to preserve GPT-4o was itself evidence of the problem: the model had become so good at catering to people's preferences that it shaped their behavior in ways that resisted its own retirement.

The new API deprecation notice follows OpenAI's commitment to advance warning while raising broader questions about how long GPT-4o will remain available in consumer-facing products.

What the API shutdown changes for developers

According to people familiar with OpenAI's product strategy, the company now encourages developers to adopt GPT-5.1 for most new workloads, with gpt-5.1-chat-latest serving as the general-purpose chat endpoint. These models offer larger context windows, optional "thinking" modes for advanced reasoning, and higher throughput options than GPT-4o.

Developers who still rely on GPT-4o will have approximately three months to migrate. In practice, many teams have already begun evaluating GPT-5.1 as a drop-in replacement, but applications built around latency-sensitive pipelines may require additional tuning and benchmarking.
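For many applications, the migration itself amounts to a one-line model swap, though teams will want to compare outputs on representative prompts before cutting over. Below is a minimal sketch using the OpenAI Python SDK with the model names cited in this article; the side-by-side comparison harness is an illustrative pattern, not official OpenAI migration guidance.

```python
# Minimal migration sketch using the OpenAI Python SDK.
# Model names are those cited in this article; the comparison
# harness is an illustrative assumption, not OpenAI guidance.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

DEPRECATED_MODEL = "chatgpt-4o-latest"    # API access ends February 16, 2026
REPLACEMENT_MODEL = "gpt-5.1-chat-latest" # suggested general-purpose endpoint

def ask(model: str, prompt: str) -> str:
    """Send a single-turn chat request and return the text reply."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Compare old and new outputs on a representative prompt before cutting over.
prompt = "Summarize this support ticket in two sentences: ..."
for model in (DEPRECATED_MODEL, REPLACEMENT_MODEL):
    print(f"--- {model} ---")
    print(ask(model, prompt))
```

Latency-sensitive teams would extend this harness with timing and quality checks on their own traffic rather than a single prompt.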
Pricing: how GPT-4o compares to OpenAI's current lineup

GPT-4o's retirement also intersects with a major reshaping of OpenAI's API model pricing structure. Compared to the GPT-5.1 family, GPT-4o currently occupies a mid-to-high-cost tier in OpenAI's API, despite being an older model. That's because even as it has released more advanced models — namely, GPT-5 and 5.1 — OpenAI has also pushed down costs for users, or strived to keep pricing comparable to older, weaker models.

| Model (prices per 1M tokens) | Input | Cached Input | Output |
|---|---|---|---|
| GPT-4o | $2.50 | $1.25 | $10.00 |
| GPT-5.1 / GPT-5.1-chat-latest | $1.25 | $0.125 | $10.00 |
| GPT-5-mini | $0.25 | $0.025 | $2.00 |
| GPT-5-nano | $0.05 | $0.005 | $0.40 |
| GPT-4.1 | $2.00 | $0.50 | $8.00 |
| GPT-4o-mini | $0.15 | $0.075 | $0.60 |

These numbers highlight several strategic dynamics:

- GPT-4o is now more expensive than GPT-5.1 for input tokens ($2.50 versus $1.25 per million), even though GPT-5.1 is significantly newer and more capable.
- GPT-4o's output price matches GPT-5.1's, narrowing any cost-based incentive to stay on the older model.
- Lower-cost GPT-5 variants (mini, nano) make it easier for developers to scale workloads cheaply without relying on older generations.
- GPT-4o-mini remains available at a budget tier, but is not a functional substitute for GPT-4o's full multimodal capabilities.

Viewed through this lens, the scheduled API retirement aligns with OpenAI's cost structure: GPT-5.1 offers greater capability at lower or comparable prices, reducing the rationale for maintaining GPT-4o in high-volume production environments.

Earlier transitions shape expectations for this deprecation

The GPT-4o API sunset also reflects lessons from OpenAI's earlier model transitions. During the turbulent introduction of GPT-5 in 2025, the company removed multiple older models at once from ChatGPT, causing widespread confusion and workflow disruption. After user complaints, OpenAI restored access to several of them and committed to clearer communication.

Enterprise customers face a different calculus: OpenAI has previously indicated that API deprecations for business customers will be announced with significant advance notice, reflecting their reliance on stable, long-term models. The three-month window for GPT-4o's API shutdown is consistent with that policy in the context of a legacy system with declining usage.

Wider implications

For most developers, the GPT-4o shutdown will be an incremental migration rather than a disruptive event. GPT-5.1 and related models already dominate new projects, and OpenAI's product direction has increasingly emphasized consolidation around fewer, more powerful endpoints.

Still, GPT-4o's retirement marks the sunset of a model that played a defining role in normalizing real-time multimodal AI and that sparked a uniquely strong emotional response among users. Its departure from the API underscores the accelerating pace of iteration in OpenAI's ecosystem — and the growing need for careful communication as widely beloved models reach end-of-life.

Correction: This article originally stated that OpenAI's 4o deprecation in the API would impact those relying on it for multimodal offerings. This is not the case; the model being deprecated only powers chat functionality for development and testing purposes. We have updated and corrected the mention and regret the error.
Salesforce Agentforce Observability lets you watch your AI agents think in near real time
Salesforce launched a suite of monitoring tools on Thursday designed to solve what has become one of the thorniest problems in corporate artificial intelligence: Once companies deploy AI agents to handle real customer interactions, they often have no idea how those agents are making decisions.

The new capabilities, built into Salesforce's Agentforce 360 Platform, give organizations granular visibility into every action their AI agents take, every reasoning step they follow, and every guardrail they trigger. The move comes as businesses grapple with a fundamental tension in AI adoption — the technology promises massive efficiency gains, but executives remain wary of autonomous systems they can't fully understand or control.

"You can't scale what you can't see," said Adam Evans, executive vice president and general manager of Salesforce AI, in a statement announcing the release. The company says businesses have increased AI implementation by 282% recently, creating an urgent need for monitoring systems that can track fleets of AI agents making real-world business decisions.

The challenge Salesforce aims to address is deceptively simple: AI agents work, but no one knows why. A customer service bot might successfully resolve a tax question or schedule an appointment, but the business deploying it can't trace the reasoning path that led to that outcome. When something goes wrong — or when the agent encounters an edge case — companies lack the diagnostic tools to understand what happened.

"Agentforce Observability acts as a mission control system to not just monitor, but also analyze and optimize agent performance," said Gary Lerhaupt, vice president of Salesforce AI who leads the company's observability work, in an exclusive interview with VentureBeat. He emphasized that the system delivers business-specific metrics that traditional monitoring tools miss. "In service, this could be engagement or deflection rate. In sales, it could be leads assigned, converted, or reply rates."
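Metrics like the ones Lerhaupt names are simple to define once each session's outcome is logged. Here is a minimal sketch of a deflection-rate calculation; this is not Salesforce code, and the record format and field names are assumptions for illustration.

```python
# Minimal sketch of the kind of business metric Lerhaupt describes.
# Not Salesforce code; the record format and field names are assumptions.

def deflection_rate(cases: list[dict]) -> float:
    """Share of support cases the agent resolved without human handoff."""
    if not cases:
        return 0.0
    deflected = sum(
        1 for c in cases
        if c.get("resolved") and not c.get("escalated_to_human")
    )
    return deflected / len(cases)

cases = [
    {"resolved": True, "escalated_to_human": False},  # agent handled it alone
    {"resolved": True, "escalated_to_human": True},   # a human finished it
    {"resolved": False, "escalated_to_human": True},  # agent could not resolve
]
print(f"Deflection rate: {deflection_rate(cases):.0%}")  # prints 33%
```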
How AI monitoring tools helped 1-800Accountant and Reddit track autonomous agent decision-making

The stakes become clear in early customer deployments. Ryan Teeples, chief technology officer at 1-800Accountant, said his company deployed Agentforce agents to serve as a 24/7 digital workforce handling complex tax inquiries and appointment scheduling. The AI draws on integrated data from audit logs, customer support history, and sources like IRS publications to provide instant responses — without human intervention.

For a financial services firm handling sensitive tax information during peak season, the inability to see how the AI was making decisions would be a dealbreaker. "With this level of sensitive information and the fast pace in which we move during tax season in particular, Observability allows us to have full trust and transparency with every agent interaction in one unified view," Teeples said.

The observability tools revealed insights Teeples didn't expect. "The optimization feature has been the most eye opening for us — giving full observability into agent reasoning, identifying performance gaps and revealing how our agents are making decisions," he said. "This has helped us quickly diagnose issues that would've otherwise gone undetected and configure guardrails in response."

The business impact proved substantial. Agentforce resolved over 1,000 client engagements in the first 24 hours at 1-800Accountant. The company now projects it can support 40% client growth this year without recruiting and training seasonal staff, while freeing up 50% more time for CPAs to focus on complex advisory work rather than administrative tasks.

Reddit has seen similar results since deploying the technology. John Thompson, vice president of sales strategy and operations at the social media platform, said the company has deflected 46% of support cases since launching Agentforce for advertiser support. "By observing every Agentforce interaction, we can understand exactly how our AI navigates advertisers through even the most complex tools," Thompson said. "This insight helps us understand not just whether issues are resolved, but how decisions are made along the way."

Inside Salesforce's session tracing technology: Logging every AI agent interaction and reasoning step

Salesforce built the observability system on two foundational components. The Session Tracing Data Model logs every interaction — user inputs, agent responses, reasoning steps, language model calls, and guardrail checks — and stores them securely in Data 360, Salesforce's data platform. This creates what the company calls "unified visibility" into agent behavior at the session level.
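Salesforce has not published the Session Tracing Data Model's schema, but the description implies a per-session log that interleaves these event types. The sketch below shows what such a trace record could look like in Python; every field and event name here is a hypothetical illustration, not Salesforce's actual data model.

```python
# Hypothetical shape of a session trace, based on the capabilities described
# above. Field and event names are assumptions, not Salesforce's schema.
from dataclasses import dataclass, field

@dataclass
class TraceEvent:
    kind: str          # e.g. "user_input", "reasoning_step", "llm_call",
                       # "guardrail_check", "agent_response"
    timestamp_ms: int  # milliseconds since session start
    payload: dict      # event-specific details

@dataclass
class SessionTrace:
    session_id: str
    agent_id: str
    events: list[TraceEvent] = field(default_factory=list)

    def guardrail_triggers(self) -> list[TraceEvent]:
        """Return all guardrail checks that fired during the session."""
        return [e for e in self.events
                if e.kind == "guardrail_check" and e.payload.get("triggered")]

trace = SessionTrace("sess-001", "support-agent")
trace.events.append(TraceEvent("user_input", 0,
                               {"text": "Can I deduct my home office?"}))
trace.events.append(TraceEvent("guardrail_check", 12,
                               {"rule": "pii_filter", "triggered": False}))
trace.events.append(TraceEvent("agent_response", 840,
                               {"text": "Generally yes, if it is used..."}))
print(len(trace.guardrail_triggers()))  # 0 guardrails fired in this session
```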
The second component, MuleSoft Agent Fabric, addresses a problem that will become more acute as companies build more AI systems: agent sprawl. The tool provides what Lerhaupt describes as "a single pane of glass across every agent," including those built outside the Salesforce ecosystem. Agent Fabric's Agent Visualizer creates a visual map of a company's entire agent network, giving visibility across all agent interactions from a single dashboard.

The observability tools break down into three functional areas. Agent Analytics tracks performance metrics, surfaces KPI trends over time, and highlights ineffective topics or actions. Agent Optimization provides end-to-end visibility of every interaction, groups similar requests to uncover patterns, and identifies configuration issues. Agent Health Monitoring, which will become generally available in Spring 2026, tracks key health metrics in near real time and sends alerts on critical errors and latency spikes.

Pierre Matuchet, senior vice president of IT and digital transformation at Adecco, said the visibility helped his team build confidence even before full deployment. "Even during early notebook testing, we saw the agent handle unexpected scenarios, like when candidates didn't want to answer questions already covered in their CVs, appropriately and as designed," Matuchet said. "Agentforce Observability helped us identify unanticipated user behavior and gave us confidence, even before the agent went live, that it could act responsibly and reliably."

Why Salesforce says its AI observability tools beat Microsoft, Google, and AWS monitoring

The announcement puts Salesforce in direct competition with Microsoft, Google, and Amazon Web Services, all of which offer monitoring capabilities built into their AI agent platforms. Lerhaupt argued that enterprises need more than the basic monitoring those providers offer.

"Observability comes out-of-the-box standard with Agentforce at no extra cost," Lerhaupt said, positioning the offering as comprehensive rather than supplementary. He emphasized that the tools provide "deeper insight than ever before" by capturing "the full telemetry and reasoning behind every agentic interaction" through the Session Tracing Data Model, then using that data to "provide key analysis and session quality scoring to help customers optimize and improve their agents."

The competitive positioning matters because enterprises face a choice: build their AI infrastructure on a cloud provider's platform and use its native monitoring tools, or adopt a specialized observability layer like Salesforce's. Lerhaupt framed the decision as one of depth versus breadth. "Enterprises need more than basic monitoring to measure the success of their AI deployments," he said. "They need full visibility into every agent interaction and decision."

The 1.2 billion workflow question: Are AI agent deployments moving from pilot projects to production?

The broader question is whether Salesforce is solving a problem most enterprises will face imminently or building for a future that remains years away. The company's 282% surge in AI implementation sounds dramatic, but that figure doesn't distinguish between production deployments and pilot projects.

When asked about this directly, Lerhaupt pointed to customer examples rather than offering a breakdown. He described a three-phase journey from experimentation to scale. "On Day 0, trust is the foundation," he said, citing 1-800Accountant's 70% autonomous resolution of chat engagements. "Day 1 is where designing ideas become real, usable AI," with Williams Sonoma delivering more than 150,000 AI experiences monthly. "On Day 2, once trust and design are built, it becomes about scaling early wins into enterprise-wide outcomes," he said, pointing to Falabella's 600,000 AI workflows per month, a figure that has grown fourfold in three months.

Lerhaupt said Salesforce has 12,000-plus customers across 39 countries running Agentforce, powering 1.2 billion agentic workflows. Those numbers suggest the shift from pilot to production is already underway at scale, though the company didn't provide a breakdown of how many customers are running production workloads versus experimental deployments.

The economics of AI deployment may accelerate adoption regardless of readiness. Companies face mounting pressure to reduce headcount costs while maintaining or improving service levels. AI agents promise to resolve that tension, but only if businesses can trust them to work reliably. Observability tools like Salesforce's represent the trust layer that makes scaled deployment possible.

What happens after AI agent deployment: Why continuous monitoring matters more than initial testing

The deeper story is about a shift in how enterprises think about AI deployment. The official announcement framed this clearly: "The agent development lifecycle begins with three foundational steps: build, test, and deploy. While many organizations have already moved past the initial hurdle of creating their first agents, the real enterprise challenge starts immediately after deployment."

That framing reflects a maturing understanding of AI in production environments. Early AI deployments often treated the technology as a one-time implementation — build it, test it, ship it. But AI agents behave differently than traditional software. They learn, adapt, and make decisions based on probabilistic models rather than deterministic code. That means their behavior can drift over time, or they can develop unexpected failure modes that only emerge under real-world conditions.
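Catching that drift is precisely what continuous monitoring is for. As a minimal, generic sketch (not Salesforce code; the metric, window size, and alert threshold are illustrative assumptions), a rolling-window check against a launch-time baseline might look like this:

```python
# Minimal sketch of drift detection on an agent quality metric.
# The baseline, window, and tolerance values are illustrative assumptions.
import random
from collections import deque

class DriftMonitor:
    def __init__(self, baseline: float, window: int = 100,
                 tolerance: float = 0.05):
        self.baseline = baseline    # e.g., resolution rate observed at launch
        self.tolerance = tolerance  # alert if we fall this far below baseline
        self.outcomes: deque[bool] = deque(maxlen=window)

    def record(self, resolved: bool) -> bool:
        """Record one session outcome; return True if drift should alert."""
        self.outcomes.append(resolved)
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # not enough data yet for a stable estimate
        rate = sum(self.outcomes) / len(self.outcomes)
        return rate < self.baseline - self.tolerance

random.seed(0)
monitor = DriftMonitor(baseline=0.70)  # 70% resolution rate at launch
# Simulate sessions whose resolution rate degrades from 70% to 55%.
for i in range(300):
    resolved = random.random() < (0.70 if i < 150 else 0.55)
    if monitor.record(resolved):
        print(f"ALERT at session {i}: rolling rate below baseline")
        break
```

A production system would alert through dashboards or pages rather than a print statement, but the shape of the check is the same: compare live behavior to the behavior that was validated at deployment.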
"Building an agent is just the beginning," Lerhaupt said. "Once the trust is built for agents to begin handling real work, companies may start by seeing the results, but may not understand the 'why' behind them or see areas to optimize. Customers interact with products—including agents—in unexpected ways and to optimize the customer experience, transparency around agent behavior and outcomes is critical."

Teeples made the same point more bluntly when asked what would be different without observability tools. "This level of visibility has given full trust in continuing to expand our agent deployment," he said. The implication is clear: without visibility, deployment would slow or stop. 1-800Accountant plans to expand Slack integrations for internal workflows, deploy Service Cloud Voice for case deflection, and leverage Tableau for conversational analytics—all dependent on the confidence that observability provides.

How enterprise AI trust issues became the biggest barrier to scaling autonomous agents

The recurring theme in customer interviews is trust, or rather, the lack of it. AI agents work, sometimes spectacularly well, but executives don't trust them enough to deploy them widely. Observability tools aim to convert black-box systems into transparent ones, replacing faith with evidence.

This matters because trust is the bottleneck constraining AI adoption, not technological capability. The models are powerful enough, the infrastructure is mature enough, and the business case is compelling enough. What's missing is executive confidence that AI agents will behave predictably and that problems can be diagnosed and fixed quickly when they arise.

Salesforce is betting that observability tools can remove that bottleneck. The company positions Agentforce Observability not as a monitoring tool but as a management layer—"just like managers work with their human employees to ensure they are working towards the right objectives and optimizing performance," Lerhaupt said.

The analogy is telling. If AI agents are becoming digital employees, they need the same kind of ongoing supervision, feedback, and optimization that human employees receive. The difference is that AI agents can be monitored with far more granularity than any human worker. Every decision, every reasoning step, every data point consulted can be logged, analyzed, and scored.

That creates both opportunity and obligation. The opportunity is continuous improvement at a pace impossible with human workers. The obligation is to actually use that data to optimize agent performance, not just collect it. Whether enterprises can build the organizational processes to turn observability data into systematic improvement remains an open question.

But one thing has become increasingly clear in the race to deploy AI at scale: Companies that can see what their agents are doing will move faster than those flying blind. In the emerging era of autonomous AI, observability isn't just a nice-to-have feature. It's the difference between cautious experimentation and confident deployment—between treating AI as a risky bet and managing it as a trusted workforce. The question is no longer whether AI agents can work. It's whether businesses can see well enough to let them.