The article is quite general. Here are some notes on how AI is being used to do AI research at frontier labs specifically. It's not the singularity (yet?) but it's heading in that direction.
Most training is now actually inference rather than gradient descent directly. Reinforcement learning requires generating lots of 'rollouts', which are then compared with each other via an algorithm like GRPO, or scored by a critic model: AI judging AI and causing it to self-improve. Generating a rollout means running inference.
There's also lots of data cleaning by older models. In the past this has been called 'textbook' or 'curriculum' learning; I'm not sure what it's called now. AI is also used for things like data/document labelling, transcription of videos, detection of images/videos with watermarks or subtitles, elimination of content that shouldn't be in the dataset, creation of new content that should be, and so on.
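To make the "rollouts compared with each other" part concrete, here's a rough sketch of the group-relative scoring idea behind GRPO: each rollout's reward is normalized against the mean and spread of the other rollouts for the same prompt. The reward values and the function name are made up for illustration, and using the population standard deviation plus a small epsilon is my assumption, not a claim about any lab's implementation.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-6):
    """Score each rollout relative to the other rollouts for the same prompt.

    rewards: one scalar reward per rollout in the group (hypothetical values).
    eps: small constant (an assumption) to avoid dividing by zero when
         every rollout in the group got the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four rollouts for one prompt: two judged correct (1.0), two incorrect (0.0).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)

# Rollouts above the group mean get positive advantages, the rest negative.
for reward, advantage in zip(rewards, advantages):
    print(f"reward={reward:.1f} advantage={advantage:+.3f}")
```

Note that nothing here needs a gradient: producing the rewards means sampling and grading rollouts, which is all inference. The gradient step only happens afterwards, using these advantages as weights.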
AI has proven capable of some routine work, like brute-force optimizing GPU kernels or doing hyperparameter sweeps.
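For anyone unfamiliar, a hyperparameter sweep is about as mechanical as research gets, which is roughly why it's easy to hand off. A minimal sketch, assuming a toy objective in place of a real training run (the `toy_validation_loss` function, the grid values and the parameter names are all hypothetical):

```python
from itertools import product

def toy_validation_loss(lr, batch_size):
    # Hypothetical stand-in for an actual training run; a real sweep would
    # train a model with this config and return its validation loss.
    return (lr - 3e-4) ** 2 * 1e6 + abs(batch_size - 64) / 64

def sweep(grid, objective):
    """Try every combination in the grid and return (best_loss, best_config)."""
    names = list(grid)
    results = []
    for values in product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        results.append((objective(**config), config))
    # Compare on the loss only, so the config dicts are never compared.
    return min(results, key=lambda item: item[0])

grid = {"lr": [1e-4, 3e-4, 1e-3], "batch_size": [32, 64, 128]}
best_loss, best_config = sweep(grid, toy_validation_loss)
print(best_config, best_loss)
```

The loop itself requires no insight at all; the only judgment is in choosing which knobs and ranges go into the grid.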
Obviously, researchers are all using coding agents too.
So that's a few ways AI is self-improving. But there are lots of other ways in which even frontier models are still beaten by human researchers. Experiments in closing the loop have failed. For instance, people have tried giving the latest models access to some GPUs and an old version of an AI codebase that was recently optimized by human researchers (a NanoChat speed run goal, I believe). Could the models match the performance of the AI researchers? Nope. They only got 10% as far as the humans did, mostly because their approach was uninspired. They wasted a lot of time and budget doing low-IQ stuff like hyperparameter tuning. The humans had many other tactics like studying the research literature and inventing new algorithms that the models didn't even attempt.
The bottleneck is therefore currently the level of insight and inspiration the models are capable of. I've also seen this in my own work. I come up with an idea I think is novel and see if I can get a frontier model to reach the same idea. It never works unless my questions are so leading that the exercise is more or less pointless.
It's very unclear why AI struggles so much with innovation yet can invent new songs, poems, etc. without apparent difficulty. Obvious answers like "it's not in the training set" don't feel right to me; the issue seems deeper.