Software automation will make us into Crazed-Super-Scientist Barons
10x cheaper and faster software development will be here in ~1 year via AI agents. How will the world change?
We’re quickly moving up the abstraction ladder for software development! Claude 3.5 Sonnet is more evidence that the cutting edge is continuing to improve.
I propose 5 levels of automation for software development, akin to the levels used for self-driving cars.
Levels of Software Automation
In the last few years, we’ve moved from no automation, to auto-completing lines of code, to writing whole functions:
For experienced engineers, levels I and II are tools that yield only a modest speedup, currently somewhere between 1x and 2x.
However, I think we’re on the brink of a major shake-up.
Level III automation
The next level of automation, where a human guides the AI toward implementing whole features, will be totally different. Software will become much, much cheaper.
A feature that might take 3 hours of concentrated work today could be done in 15 minutes by spec’ing it out in a paragraph and leaving a few comments on an AI agent’s proposed changes.
This is the vision of the startup Mentat.ai, which claims the highest score on a software engineering benchmark.
I tried it out yesterday, with little success.
First, I created an issue on our open source repo, and tagged “@MentatBot” to trigger it:
The resulting Pull Request seemed impressive at first, but on closer inspection, almost every change it made was a little bit wrong. The bot edited some of the right files, but didn’t call the helper function that it had created elsewhere. It also edited some wrong files that were more like library code. It created type errors that it couldn’t fix, and didn’t always follow my instructions.
Still, MentatBot is a promising early stab at the problem, currently powered by GPT-4o (they hope to upgrade to Claude 3.5 Sonnet soon).
With another year of improvements to base LLMs, plus further unhobbling via efforts of startups to chain LLM calls productively, I can imagine us at automation level III in a year (50% chance) or two years (75% chance).
Below is a market I created on roughly the criteria for level III automation by July 2025:
The world will change appreciably with a 10x speedup in software creation.
There are 4.4M software engineers in the US. They collectively earn approximately $500B per year. If we’re able to do all that work with 10% of the engineers, the other 90% of that payroll is freed up, which naively implies ~$450B in value created per year.
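As a rough sanity check on that arithmetic, here is a minimal sketch in Python. It uses only the figures quoted above; the function and variable names are made up for illustration, and the "value created" is simply the payroll freed up, which is the naive assumption the paragraph itself makes.

```python
# Back-of-the-envelope estimate using the figures quoted in the post.
US_ENGINEERS = 4_400_000       # software engineers in the US
TOTAL_PAY_USD = 500e9          # approximate collective annual pay
FRACTION_STILL_NEEDED = 0.10   # same output with 10% of the engineers


def naive_value_created(total_pay: float, fraction_needed: float) -> float:
    """Payroll freed up if the same work needs only a fraction of the people."""
    return total_pay * (1 - fraction_needed)


# Implied average pay per engineer, as a plausibility check on the inputs.
average_pay = TOTAL_PAY_USD / US_ENGINEERS
savings = naive_value_created(TOTAL_PAY_USD, FRACTION_STILL_NEEDED)

print(f"Implied average pay: ${average_pay:,.0f}")
print(f"Naive value created: ${savings / 1e9:,.0f}B per year")
```

The implied average pay comes out to roughly $114K per engineer, so the two headline inputs are at least consistent with each other. The estimate ignores any change in demand, which is the point of the next paragraph.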
Of course, decreasing the cost of software by 90% will dramatically increase the demand, as economists know. That’s why the value created from level III automation is likely much larger, though hard to predict. With an explosion in use cases for cheap software, the value created could be in the trillions annually.
Suddenly, software will become more polished. Bank apps will take less time to load. There will be fewer bugs in day-to-day usage.
Most importantly, there’ll be an explosion of startups. Niches that were previously unprofitable will be filled. We’ll have more personalized software, generated even for individuals. And of course, faster software development will feed into better AI.
However, the key unlock of level III automation is not cost savings. It’s iteration speed.
Crazed-Super-Scientist Barons
It’s well-known Manifold lore that we push changes at a breakneck pace, sometimes to the detriment of our users.
For moving this quickly as a small team of 6 full-time employees, Manifold was said to be a “fiefdom run by crazed scientist barons.”
I ran with this idea and proposed that all organizations would be more effective if they operated on this model. See my “Mad Scientists Theory of Governance” market for further elaboration!
Now, imagine what will happen when you hand us another 10x speedup. I added “Super-” into the phrase, but it’s hard to envision where exactly that will lead us.
In day-to-day work, we receive a giant volume of requests for bug fixes and features. There are so many ideas and experiments to try that we are incredibly bottlenecked on execution.
Going ten times faster could change this bottleneck from execution to getting feedback (and deciding what to do next). You need to test product changes on users to see whether your idea was good, and that takes time in the real world.
Running A/B tests can take a while to give statistically significant results. However, qualitative feedback can be richer and faster. I predict startups like ours will collect more individual feedback from users, because they will have the capacity to act on it. (Just like Discord has been critical for increasing user feedback in our journey so far.)
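To make the A/B-testing delay concrete, here is a minimal sketch of the standard sample-size approximation for comparing two conversion rates. Everything in it is an assumption for illustration: the function names, the 10% baseline conversion rate, the 10% relative lift, and the 1,000 daily users are hypothetical and are not Manifold's numbers.

```python
from math import ceil
from statistics import NormalDist


def users_per_arm(baseline: float, lift: float,
                  alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate users needed per variant for a two-sided two-proportion z-test.

    baseline: conversion rate of the control (e.g. 0.10)
    lift:     relative improvement to detect (e.g. 0.10 for +10%)
    """
    p1 = baseline
    p2 = baseline * (1 + lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_power = NormalDist().inv_cdf(power)          # desired statistical power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)


def days_to_result(n_per_arm: int, daily_users: int) -> int:
    """Days needed to fill both arms when traffic is split evenly."""
    return ceil(2 * n_per_arm / daily_users)


# Hypothetical scenario: 10% baseline, detect a 10% relative lift,
# with 1,000 users entering the experiment each day.
n = users_per_arm(baseline=0.10, lift=0.10)
print(f"Users per arm: {n:,}")
print(f"Days to result: {days_to_result(n, daily_users=1_000)}")
```

Under these made-up inputs, the test needs on the order of fifteen thousand users per arm, which is about a month of traffic. Smaller lifts push the requirement up quadratically, which is why a handful of conversations with users can be the faster signal.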
If today it takes two years to find product-market fit for a new product, then crazed-super-scientist barons should be able to do it in a few months.
We’ll thus see an acceleration of the serial-entrepreneur phenomenon, including more parent companies that spin up dozens of products. Such could be the future of Manifold. Our name allows for it, at least!
The speed limit of progress
Innovation is currently driven by small teams pushing hard against the frontier. This sets the global speed limit for progress.
In the next 1-2 years, AI will increase that speed limit by a factor of ten, at least for software startups. Exciting times!
As AI continues to develop, and the level of automation increases, the speed limit will continue to be pushed back across all fields. I look forward to this world of abundant frontier advances!
Are you unironically enthusiastic about it? If so, is this because you believe that the progress will stop there/plateau there? Or just that this won't lead to any kind of catastrophic scenario?