OpenAI Launches Flex Processing API Option to Compete with Google
In a move to stay competitive with AI industry giants like Google, OpenAI has introduced Flex processing, a new API option that offers lower prices for AI model usage in exchange for slower response times and occasional resource unavailability.
Flex processing is currently in beta for OpenAI’s o3 and o4-mini reasoning models, targeting lower-priority and non-production tasks such as model evaluations, data enrichment, and asynchronous workloads.
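Because Flex requests can fail when spare capacity is unavailable, batch and asynchronous jobs are typically wrapped in retry logic with backoff. A minimal sketch of that pattern (the `make_request` callable and the `RuntimeError` it raises are stand-ins for a real SDK call and its capacity-error type, which are assumptions here, not OpenAI's documented API):

```python
import time

def call_with_retry(make_request, max_attempts=5, base_delay=1.0):
    """Retry a request that may fail while Flex capacity is unavailable.

    `make_request` is any zero-argument callable; a real integration would
    wrap an SDK call here and catch its specific capacity/rate-limit error.
    A generic RuntimeError is used as a placeholder for that error type.
    """
    for attempt in range(max_attempts):
        try:
            return make_request()
        except RuntimeError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts; surface the error to the caller.
            # Exponential backoff between attempts: 1s, 2s, 4s, ...
            time.sleep(base_delay * (2 ** attempt))
```

For the low-priority workloads Flex targets, waiting out a capacity blip this way is usually an acceptable trade for the lower price.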
The feature cuts per-token API prices exactly in half. For o3, Flex processing is priced at $5 per million input tokens (roughly 750,000 words) and $20 per million output tokens, down from the standard rates of $10 and $40. For o4-mini, Flex brings prices down to $0.55 per million input tokens and $2.20 per million output tokens, from $1.10 and $4.40 respectively.
The introduction of Flex processing comes as the cost of cutting-edge AI continues to rise and competitors release cheaper, more efficient models. Google, for example, recently unveiled Gemini 2.5 Flash, a reasoning model that offers comparable or superior performance to DeepSeek's R1 at a lower input token cost.
In a communication to customers about the Flex launch, OpenAI also announced that developers in tiers 1-3 of its usage-tier hierarchy will need to complete the newly implemented ID verification process to access o3. Verification is likewise required to unlock reasoning summaries and streaming API support for certain other models.
OpenAI has stated that the ID verification process is designed to prevent misuse of its services by unauthorized individuals.