1: AI is advancing fast
AI has existed since the 1950s, but in the 2010s, and especially the 2020s, “deep learning” systems have made great strides in capability. The most prominent deep learning systems are “large language models” (LLMs) trained to complete human text, such as GPT-4 (which powers ChatGPT). These models turn out to be able to perform a surprisingly wide range of tasks.
This progress has been largely driven by rapid scaling of the resources devoted to AI. Several factors are growing exponentially:1
- The amount of money spent (2.6x growth per year), now on the order of a hundred million dollars for the largest training runs.
- The quality of computing hardware technology (1.35x growth per year), measured in computations available per dollar (see also “Moore’s Law”).
- The quality of software algorithms (3x growth per year), measured by how much computing power it takes to reach a given level of performance.
- The amount of data used to train language models (3x growth per year).
The total computing power applied to AI increases exponentially based on the first two points, and once you include software improvement, what you might call the “effective computing power” increases at an even faster exponential rate.
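As a back-of-the-envelope illustration, the listed rates simply multiply together. This is a stylized sketch using the approximate figures above, not Epoch AI's actual methodology:

```python
# Stylized annual growth rates from the list above (Epoch AI estimates).
money_growth = 2.6      # spending on the largest training runs
hardware_growth = 1.35  # computations available per dollar
software_growth = 3.0   # algorithmic efficiency

# Raw compute grows with spending times hardware price-performance.
compute_per_year = money_growth * hardware_growth        # ~3.5x per year
# "Effective" compute also folds in software improvements.
effective_per_year = compute_per_year * software_growth  # ~10.5x per year

print(f"Raw compute grows ~{compute_per_year:.1f}x per year")
print(f"Effective compute grows ~{effective_per_year:.1f}x per year")
```

Even small differences in the per-year rate matter enormously once compounded over a decade.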
This growth compounds quickly. The computing power used to train the largest models in 2024 was billions of times greater than in 2010 — and that’s before taking into account better software.
As a result of this scaling, AI has become smarter. Various AI systems can now:
- Drive cars. (For real this time.)
- Generate realistic images, speech, music, and even video from a short text prompt.
- Operate robots in factories and on construction sites.
- Beat the best humans at many board and video games.
- Predict the structure of proteins.
Scaling has also made some AI systems more general. A single language model can:
- Hold a conversation like a human.
- Use its reasoning to play video games like Pokémon without specific training.
- Play chess at the level of an intermediate human player, just from having seen lots of game transcripts.
- Give sensible advice on a wide range of topics.
- Tutor users competently.
- Search the web for new information.
- Translate between most languages, even if it hasn’t been trained on the specific pair.
To get a sense of the capabilities of current language models, you can try them for yourself — you may notice that current freely-available systems are much better at reliably answering complex questions than anything that existed two or three years ago.
You can also see AI progress on quantitative benchmarks. AI has rapidly improved on measures of its ability to:
- Understand written text.2
- Answer science questions suited to graduate students.3
- Solve coding competition problems.4
- Solve Olympiad-level problems in mathematics.5
The “horizon length” of AI has also been doubling every seven months.6 That is, newer AI systems can stay coherent through longer coding tasks, succeeding at new types of tasks that take humans more and more time to complete.
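To get a feel for what a seven-month doubling time implies, here is a quick sketch (the 24-month window is purely illustrative):

```python
# METR's estimate: the task-length "horizon" doubles roughly every 7 months.
DOUBLING_MONTHS = 7

def horizon_multiplier(months: float) -> float:
    """How many times longer the achievable tasks get after `months`."""
    return 2 ** (months / DOUBLING_MONTHS)

# Illustrative: two years at this rate is roughly an order of magnitude.
print(f"After 24 months: ~{horizon_multiplier(24):.0f}x longer tasks")
```

So if this trend holds, an AI that can handle hour-long tasks today would handle tasks taking most of a workday about two years later.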
In fact, benchmarks are being saturated so quickly that researchers are struggling to keep up by creating new ones. One benchmark, called “Humanity’s Last Exam”7, is a collection of problems from various fields that are hard even for experts. As of April 2025, AI progress on solving it has been only modest — although we haven’t checked in the past hour.
If progress is so fast, it’s natural to ask: will such trends take AI all the way to the human level?
1. These numbers are from Epoch AI (retrieved April 2025, but will probably change in the future). ↩︎
6. https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/ ↩︎
7. https://paperswithcode.com/sota/humanity-s-last-exam-on-humanity-s-last-exam ↩︎