
OpenAI's new o1-preview model brings System 2-style deliberate reasoning to large language models, significantly improving performance on math and coding tasks.
OpenAI has released its latest model series, dubbed o1-preview, which focuses on advanced reasoning. Whereas earlier models commit to an answer as soon as they begin generating tokens, o1 runs a chain-of-thought process during inference, spending extra compute to "think" through a complex problem step by step before producing its response.
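The difference is easiest to see with a toy example. The sketch below is purely illustrative, not OpenAI's implementation: a hypothetical solver that, instead of emitting an answer directly, records intermediate reasoning steps before committing to a result.

```python
def solve_with_steps(a: int, b: int) -> tuple[list[str], int]:
    """Toy stand-in for chain-of-thought: multiply a * b via partial
    products, recording each intermediate step before answering."""
    tens, ones = b // 10, b % 10
    p1 = a * tens * 10  # partial product from the tens digit
    p2 = a * ones       # partial product from the ones digit
    steps = [
        f"Split {b} into {tens * 10} and {ones}.",
        f"{a} * {tens * 10} = {p1}",
        f"{a} * {ones} = {p2}",
        f"{p1} + {p2} = {p1 + p2}",
    ]
    return steps, p1 + p2

trace, answer = solve_with_steps(17, 24)
```

The "answer" is the same either way; what changes is that the intermediate work is made explicit before the final result is committed, which is the intuition behind letting a model reason before it responds.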
In OpenAI's benchmarks, o1 performed at the level of PhD students on difficult physics, biology, and chemistry tasks. It also showed a remarkable jump in coding and mathematics proficiency, solving 83% of the problems on a qualifying exam for the International Mathematics Olympiad, compared with GPT-4o's 13%.
This shift toward inference-time compute points to a new scaling axis in AI development: by spending more time processing a query, a model can achieve better results without adding billions of training parameters. The trade-off is higher latency and higher API pricing for end users.
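One way to see why extra inference-time compute can pay off is majority voting over repeated samples (sometimes called self-consistency). The sketch below uses an invented stochastic "solver" stub, not a real model: each sample is right only 40% of the time, yet voting across many samples recovers the correct answer far more often than a single draw.

```python
import random
from collections import Counter

TRUTH = 42  # the (hypothetical) correct answer to some query

def noisy_solver(rng: random.Random) -> int:
    """Stub for a stochastic model: correct 40% of the time,
    otherwise a near-miss spread across four wrong answers."""
    if rng.random() < 0.4:
        return TRUTH
    return TRUTH + rng.choice([-2, -1, 1, 2])

def answer_with_voting(rng: random.Random, n_samples: int) -> int:
    """Sample the solver n_samples times and return the plurality answer."""
    votes = Counter(noisy_solver(rng) for _ in range(n_samples))
    return votes.most_common(1)[0][0]

def accuracy(n_samples: int, trials: int = 200) -> float:
    rng = random.Random(0)  # fixed seed so the comparison is repeatable
    hits = sum(answer_with_voting(rng, n_samples) == TRUTH for _ in range(trials))
    return hits / trials

print(accuracy(1), accuracy(15))  # more samples per query -> higher accuracy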


