On May 20, 2026, OpenAI published a customer story about Ramp, a fintech company where engineering teams restructured their code review process around Codex with GPT-5.5. The results were immediate: engineers who previously waited hours for a first review now received substantive feedback in minutes.
The interesting part is not that AI can review code. It is that Ramp identified code review as the correct bottleneck to optimize, and in doing so, revealed a pattern that most engineering teams are missing.
The Bottleneck That Everyone Sees and Nobody Optimizes
Modern software delivery has a pipeline: write code, submit a pull request, wait for review, address feedback, merge, deploy. Over the past two years, AI coding tools have dramatically accelerated the first step. Cursor, Copilot, Codex, and Claude Code have made it possible to generate substantial code changes in minutes rather than hours.
But here is the constraint: faster code generation means more pull requests. More pull requests mean more review load. More review load means longer wait times. The pipeline's throughput has not improved; the queue has simply moved from one stage to the next.
Ramp's engineering team recognized this. Austin Ray, who leads AI developer experience at Ramp, observed that engineers were writing code faster but still waiting hours for meaningful feedback. The bottleneck was no longer code production. It was code review.
This is a textbook case of a principle that applies far beyond software: in a serial pipeline, sustained throughput is capped by the slowest stage. Speeding up any stage other than the bottleneck creates inventory (here, unreviewed PRs) without improving delivery speed. Engineers get faster at writing code, only to wait longer for someone to read it.
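To make the principle concrete, here is a minimal sketch of a three-stage delivery pipeline. The stage names and the per-day capacities are hypothetical numbers chosen for illustration; they are not Ramp's figures. The model assumes each stage processes pull requests at a fixed daily rate.

```python
def throughput(stage_rates):
    """Sustained throughput of a serial pipeline: the rate of its slowest stage."""
    return min(stage_rates.values())


def bottleneck(stage_rates):
    """Name of the slowest stage."""
    return min(stage_rates, key=stage_rates.get)


def review_backlog_growth(stage_rates):
    """PRs per day that pile up unreviewed when writing outpaces review."""
    return max(0, stage_rates["write"] - stage_rates["review"])


# Hypothetical capacities in pull requests per day (illustrative only).
before = {"write": 10, "review": 6, "merge_and_deploy": 20}
faster_writing = {**before, "write": 25}          # AI speeds up code generation
faster_review = {**faster_writing, "review": 30}  # AI also speeds up review

for label, rates in [
    ("before", before),
    ("faster writing", faster_writing),
    ("faster review", faster_review),
]:
    print(label, throughput(rates), bottleneck(rates), review_backlog_growth(rates))
```

In this toy model, speeding up writing alone leaves throughput at 6 PRs per day and grows the unreviewed backlog from 4 to 19 PRs per day. Speeding up review raises throughput to 20, at which point the constraint moves to the merge-and-deploy stage.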
What Ramp Actually Did
Ramp integrated Codex with GPT-5.5 into their review workflow. The specifics matter less than the structural change.
Before Codex, the review process was: developer submits PR, waits for a human reviewer to have bandwidth, reviewer reads the full diff, provides feedback, developer iterates. The total cycle time was dominated by the wait for human attention, not the review itself.
After Codex, the process became: developer submits PR, Codex immediately performs a thorough review against the full codebase context, developer receives and addresses automated feedback, then a human reviewer focuses on architectural and business logic concerns that Codex cannot evaluate.
The structural insight is that Codex did not replace human reviewers. It filtered the review workload. By handling mechanical issues, style violations, potential bugs, and consistency checks, Codex freed human reviewers to focus on judgment-intensive work. The constraint moved from "how fast can we read all this code" to "how fast can we make good architectural decisions about this code."
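The filtering idea can be sketched as a simple routing step. Everything here is an assumption made for illustration: the article does not describe the format of Codex's review output, so the `Finding` type, the category labels, and the `split_review_work` function are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical category labels for mechanical issues; not Codex's real output format.
MECHANICAL = {"style", "consistency", "potential_bug"}


@dataclass
class Finding:
    category: str
    message: str


def split_review_work(findings):
    """Partition findings into mechanical ones (addressed from automated feedback)
    and judgment-intensive ones (left for a human reviewer)."""
    automated = [f for f in findings if f.category in MECHANICAL]
    needs_human = [f for f in findings if f.category not in MECHANICAL]
    return automated, needs_human


# Made-up example findings.
findings = [
    Finding("style", "inconsistent naming in helper"),
    Finding("potential_bug", "possible None dereference"),
    Finding("architecture", "new service duplicates an existing one"),
    Finding("business_logic", "rounding rule may not match policy"),
]
automated, needs_human = split_review_work(findings)
print(len(automated), "handled before human review;", len(needs_human), "need human judgment")
```

The point of the sketch is the shape of the workflow, not the classification rule: the human reviewer's queue shrinks to the items that require architectural or business judgment.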
Austin Ray's assessment was direct: "Codex code review catches things that I miss and that other engineers miss and that other AI code reviewers definitely miss." The key phrase is "catches things that I miss." This is not about replacing human attention. It is about supplementing it with a different kind of attention, one that can stay consistent across thousands of lines while the human reviewer focuses on the parts that actually require judgment.
The On-Call Agent: When the Bottleneck Moves Again
Ramp did not stop at code review. Ray used Codex to build an internal tool called On-Call Assistant, an agent that handles the burden of on-call rotations: investigating incidents, correlating events, tracking down root causes across complex distributed systems.
Ray described the challenge: "a ton of complexity including concurrency bugs, external/internal event balance, long-running incident investigations." His assessment of Codex with GPT-5.5 on this task: "it handles it like it's nothing."
This is the bottleneck principle in action again. Once you remove the code review constraint, the next constraint appears. For Ramp, it was on-call incident response. The engineering team had already seen that AI could handle structured, context-heavy tasks like code review. On-call investigation is a natural extension: it requires reading logs, correlating events, and forming hypotheses, all tasks where exhaustive coverage matters more than creative insight.
The pattern is worth noting: each time you optimize a bottleneck, the next one becomes visible. Teams that adopt AI tools for code generation but stop there will find their delivery pipeline constrained by the next slowest stage. The teams that get the most value are the ones that keep identifying and addressing constraints as they appear.
Three Lessons from Ramp's Approach
Ramp's experience offers three practical lessons for engineering teams considering AI adoption.
Demonstrate value in the first session. Ray's advice: "Get your engineers to install Codex, sit down with them, and guide them through a really solid first session." This is not about training. It is about creating a positive experience before skepticism sets in. Engineers who have been burned by overhyped AI tools will not invest time in learning a new one unless the first interaction is compelling. One good session beats any amount of documentation.
Build trust through iteration, not mandates. "Most engineers don't fully understand or trust that they're going to have a good experience with this," Ray observed. "By guiding them through that first experience, you change their perspective." Top-down adoption mandates for AI tools fail because they skip the trust-building phase. The engineers who use the tool most effectively are the ones who discovered its value themselves, with guidance but not coercion.
Invest in the feedback loop with vendors. Ray emphasized that Ramp works directly with the Codex team on feedback: "That feedback loop is what makes a vendor relationship worth investing in." This is a point that procurement-focused organizations often miss. The value of an AI tool is not static. It improves with feedback. Teams that treat AI tools as finished products miss the compounding returns that come from active engagement with the vendor.
The Bigger Picture: Reviewers Become Orchestrators
Ray's forward-looking observation deserves attention: "Engineers are going to become orchestrators. The skill is no longer writing every line of code yourself. It's knowing how to direct AI tools like Codex, when to trust them, and when to push back."
This is a description of a role transition, from individual contributor to manager of AI capabilities. The reviewer who used to read every line of a diff now reads what the AI flagged, evaluates whether the flags are correct, and makes architectural decisions about the overall change. The on-call engineer who used to spend hours tracing logs now reviews the AI's investigation and decides where to focus human attention.
The skill that becomes more valuable is judgment: knowing when the AI is right, when it is wrong, and where human attention is most needed. The skill that becomes less valuable is mechanical thoroughness: reading every line, checking every convention, tracing every log entry. This is the same transition that happens whenever automation enters a profession. The work does not disappear. It shifts up the abstraction stack.
FAQ
What is OpenAI Codex?
Codex is OpenAI's AI coding agent powered by models like GPT-5.5. It can read code, write code, review pull requests, and perform multi-step engineering tasks within a sandboxed environment.
How did Ramp use Codex for code review?
Ramp integrated Codex with GPT-5.5 into their review workflow. Codex performs an immediate, thorough review against the full codebase context when a PR is submitted, providing substantive feedback in minutes rather than the hours typically required for human review.
Did Codex replace human code reviewers at Ramp?
No. Codex handles mechanical review tasks (style, consistency, potential bugs), freeing human reviewers to focus on architectural decisions and business logic. The structural change was in how review workload was distributed, not in removing human judgment.
What is Ramp's On-Call Assistant?
An internal tool built by Ramp engineers using Codex. It automates much of the incident investigation work during on-call rotations: reading logs, correlating events, and tracing root causes across distributed systems.
What model does Codex use at Ramp?
Ramp uses Codex powered by GPT-5.5, according to OpenAI's customer story.
How can other teams replicate Ramp's approach?
Start by identifying your actual delivery bottleneck. If code review is the constraint, introduce AI review to handle mechanical checks while human reviewers focus on judgment. Then measure whether the bottleneck moves. Adopting AI tools at the wrong stage of the pipeline produces activity without improvement.
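One way to check where the bottleneck sits is to compare how long pull requests spend in each stage. The sketch below assumes you can export three timestamps per PR (`opened`, `first_review`, `merged`); those field names and the sample data are hypothetical, and a real team would pull them from its own code-hosting platform.

```python
from datetime import datetime
from statistics import median


def hours(delta):
    """Convert a timedelta to hours."""
    return delta.total_seconds() / 3600


def median_stage_hours(prs):
    """Median time spent in each stage, in hours."""
    waits = {"wait_for_first_review": [], "first_review_to_merge": []}
    for pr in prs:
        waits["wait_for_first_review"].append(hours(pr["first_review"] - pr["opened"]))
        waits["first_review_to_merge"].append(hours(pr["merged"] - pr["first_review"]))
    return {stage: median(values) for stage, values in waits.items()}


def slowest_stage(prs):
    """Stage with the largest median duration."""
    medians = median_stage_hours(prs)
    return max(medians, key=medians.get)


# Made-up sample data for illustration.
prs = [
    {"opened": datetime(2026, 1, 5, 9), "first_review": datetime(2026, 1, 5, 15),
     "merged": datetime(2026, 1, 5, 17)},
    {"opened": datetime(2026, 1, 6, 10), "first_review": datetime(2026, 1, 6, 14),
     "merged": datetime(2026, 1, 6, 15)},
    {"opened": datetime(2026, 1, 7, 9), "first_review": datetime(2026, 1, 7, 17),
     "merged": datetime(2026, 1, 7, 20)},
]

print(median_stage_hours(prs))
print(slowest_stage(prs))
```

Running the same measurement before and after a change shows whether the slowest stage has actually moved, rather than relying on impressions.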
References
- OpenAI. "How Ramp engineers accelerate code review with Codex." May 20, 2026. https://openai.com/index/ramp/
- OpenAI. "Introducing Codex." https://openai.com/index/introducing-codex/
- OpenAI. "Running Codex Safely at OpenAI." https://openai.com/index/running-codex-safely-at-openai/
- Ramp. https://ramp.com