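"GPT-5.5 is OpenAI's first fully retrained foundation model since GPT-4.5. It scores 88.7% on SWE-bench Verified and 82.7% on Terminal-Bench 2.0, and its 1M-context retrieval quality jumps from 36.6% to 74.0%. This article breaks down the benchmark data, the pricing strategy, and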
"GPT-5.5 is OpenAI's first fully retrained foundation model since GPT-4.5. It delivers 88.7% on SWE-bench Verified, 82.7% on Terminal-Bench 2.0, and m
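"Claude Sonnet 4.6 reaches 79.6% on SWE-bench Verified at $3/$15 per million tokens, only 1.2 points behind Opus 4.6 at just 60% of the cost. An in-depth look at how Anthropic achieves frontier coding and agent performance in a mid-tier model."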
"Claude Sonnet 4.6 delivers 79.6% on SWE-bench Verified at $3/$15 per million tokens — within 1.2 points of Opus 4.6 at 60% of the cost. A technical d