OpenAI Enhances Codex and Agent APIs with Internet Access and Voice Agent Features

Published by Leah Han on June 4, 2025

tl;dr

OpenAI has expanded Codex accessibility to ChatGPT Plus users and introduced controlled internet access during task execution to enhance functionality. Additionally, the Agents SDK now supports TypeScript and features for building advanced voice agents with improved monitoring and speech-to-speech capabilities.

OpenAI has recently rolled out significant updates to its Codex model and Agent APIs, aimed at broadening user access and deepening development capabilities. One key change makes Codex available to ChatGPT Plus subscribers, extending its reach beyond Enterprise, Team, and Pro users and marking a meaningful step toward democratizing access to advanced coding assistance.

Moreover, users on the Plus, Pro, and Team plans can enable internet connectivity during task runs. The new capability lets Codex install dependencies and run scripts that depend on external resources, such as staging servers, without manual environment setup. Importantly, internet access is disabled by default and guarded by domain restrictions, HTTP method limits, and prompt-injection monitoring to mitigate security risks. Users retain full control over when and how the connection is used, striking a balance between added utility and protective measures.
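These controls are configured per Codex environment rather than through a public API, but conceptually the policy resembles the sketch below; the interface and field names are purely illustrative, not OpenAI's actual schema:

```typescript
// Hypothetical sketch of a per-environment internet-access policy for Codex.
// Field names are illustrative only; the real settings live in the Codex
// environment UI and do not follow this schema.
interface InternetAccessPolicy {
  enabled: boolean;              // disabled by default
  allowedDomains: string[];      // domain allowlist, e.g. package registries
  allowedHttpMethods: string[];  // restrict to read-only methods if desired
}

const examplePolicy: InternetAccessPolicy = {
  enabled: true,
  allowedDomains: ["registry.npmjs.org", "pypi.org"],
  allowedHttpMethods: ["GET", "HEAD"],
};
```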

On the agent side, OpenAI has brought the Agents SDK to TypeScript with feature parity with the existing Python version, supporting critical functionality such as handoffs, guardrails, tracing, and human-in-the-loop approvals. The approvals feature lets developers pause agent actions for manual verification before execution, strengthening oversight and reliability.
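Using the @openai/agents package, a handoff combined with a human-in-the-loop approval can be sketched roughly as follows; the exact option names, notably the needsApproval flag used here to pause for approval, are assumptions drawn from the SDK documentation and may differ in detail:

```typescript
import { Agent, run, tool } from '@openai/agents';
import { z } from 'zod';

// A tool that should never execute without a human sign-off.
const issueRefund = tool({
  name: 'issue_refund',
  description: 'Issue a refund for an order.',
  parameters: z.object({ orderId: z.string(), amount: z.number() }),
  needsApproval: true, // assumed flag: pause the run until a human approves
  execute: async ({ orderId, amount }) =>
    `Refunded ${amount} for order ${orderId}`,
});

// Specialist agent that owns the refund tool.
const supportAgent = new Agent({
  name: 'Support agent',
  instructions: 'Handle order issues and refunds.',
  tools: [issueRefund],
});

// Triage agent that can hand the conversation off to the specialist.
const triageAgent = Agent.create({
  name: 'Triage agent',
  instructions: 'Route the request to the right specialist agent.',
  handoffs: [supportAgent],
});

async function main() {
  const result = await run(
    triageAgent,
    'My order 1234 arrived broken, please refund it.',
  );
  // If the refund tool was reached, the run pauses awaiting approval;
  // otherwise finalOutput holds the agent's answer.
  console.log(result.finalOutput);
}

main();
```

When an approval is pending, the SDK is designed to surface the interrupted tool call so a reviewer can inspect it, approve or reject it, and resume the run.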

Complementing this, OpenAI introduced a new RealtimeAgent capability that enables voice agents to run either client-side or on private servers through the Realtime API. These agents mirror text-based workflows, supporting tool calls, handoffs, and guardrails, and automatically handling audio input and interruptions. To aid developers, the updated Traces dashboard now visualizes Realtime API sessions, including audio streams, tool interactions, and interruption events, whether or not the session was created with the Agents SDK.
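A minimal browser voice agent built on the SDK's realtime entry point might look like the sketch below; the import path and connect options follow the SDK's documented quickstart but should be treated as assumptions rather than a definitive recipe:

```typescript
import { RealtimeAgent, RealtimeSession } from '@openai/agents/realtime';

// A voice agent is declared much like a text agent.
const assistant = new RealtimeAgent({
  name: 'Voice assistant',
  instructions: 'Answer questions out loud, briefly and politely.',
});

// In the browser the session transports audio over WebRTC and handles
// microphone input, playback, and user interruptions automatically.
const session = new RealtimeSession(assistant);

// Connect with an ephemeral client token minted by your backend,
// not a raw API key.
await session.connect({ apiKey: '<ephemeral-client-token>' });
```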

Finally, OpenAI’s speech-to-speech model has been substantially improved to follow instructions more reliably, stay consistent during tool calls, and handle interruptions more naturally. The updated model is available in the Realtime API and the Chat Completions API as gpt-4o-realtime-preview-2025-06-03.
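Selecting the new snapshot from a plain WebSocket client is a matter of passing the model name in the connection URL, as in this sketch (error handling and audio streaming omitted):

```typescript
import WebSocket from 'ws';

// Pin the Realtime API session to the updated speech-to-speech snapshot.
const url =
  'wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview-2025-06-03';

const ws = new WebSocket(url, {
  headers: {
    Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    'OpenAI-Beta': 'realtime=v1',
  },
});

ws.on('open', () => {
  // Ask the model to speak and return a text transcript alongside the audio.
  ws.send(
    JSON.stringify({
      type: 'response.create',
      response: { modalities: ['audio', 'text'], instructions: 'Greet the user.' },
    }),
  );
});

ws.on('message', (data) => console.log(data.toString()));
```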

Together, these changes demonstrate OpenAI’s focused effort to enhance agent development frameworks and user capabilities while embedding robust security and monitoring. They offer promising tools for building more interactive, reliable, and context-aware AI agents across text and voice.