Software automation will make us into Crazed-Super-Scientist Barons

10x cheaper and faster software development will be here in ~1 year via AI agents. How will the world change?

Jul 01, 2024

We’re quickly moving up the abstraction ladder for software development! Claude 3.5 Sonnet is more evidence that the cutting edge is continuing to improve.

I propose 5 levels of automation for software development, akin to the self-driving cars levels.

Levels of Software Automation

In the last few years, we’ve moved from no automation, to auto-completing lines of code, to writing whole functions.

We are currently at level II. See Cursor for the state of the art.

For experienced engineers, levels I and II yield only a modest speedup, currently between 1x and 2x.

However, I think we’re on the brink of a major shake-up.

Level III automation

The next level of automation, where a human guides the AI toward implementing whole features, will be totally different. Software will become much, much cheaper.

A feature that might take 3 hours of concentrated work today could be done in 15 minutes by spec’ing it out in a paragraph and leaving a few comments on an AI agent’s proposed changes.

This is the vision of the startup Mentat.ai, which claims the highest score on a software engineering benchmark.

MentatBot’s impressive benchmark result.

I tried it out yesterday, with little success.

First, I created an issue on our open source repo, and tagged “@MentatBot” to trigger it:

I asked the AI agent bot to do some work for me.

The resulting pull request seemed impressive at first, but on closer inspection, almost every change it made was a little bit wrong. The bot edited some of the right files, but didn’t call the helper function that it had created elsewhere. It also edited some wrong files that were more like library code. It created type errors that it couldn’t fix, and didn’t always follow my instructions.

Still, MentatBot is a promising early stab at the problem, currently powered by GPT-4o (they hope to upgrade to Claude 3.5 Sonnet soon).

With another year of improvements to base LLMs, plus further unhobbling via startups’ efforts to chain LLM calls productively, I can imagine us reaching automation level III within a year (50% chance) or within two years (75% chance).

I created a market on roughly these criteria: level III automation by July 2025.

The world will change appreciably with a 10x speedup in software creation.

There are 4.4M software engineers in the US. They collectively earn approximately $500B per year. If we’re able to do all that work with 10% of the engineers, that naively implies ~$450 billion in value created.
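The back-of-the-envelope arithmetic above can be written out as a short sketch. The inputs are the rough figures quoted in this post. The variable and function names are illustrative assumptions, and the model deliberately holds demand fixed.

```python
# Naive estimate of the value created if 10% of engineers could do all the work.
# Inputs are the rough figures quoted in the post; names are illustrative only.

US_ENGINEERS = 4_400_000       # software engineers in the US
TOTAL_PAY_USD = 500e9          # approximate collective annual earnings
FRACTION_STILL_NEEDED = 0.10   # share of engineers needed after a 10x speedup


def naive_value_created(total_pay: float, fraction_needed: float) -> float:
    """Pay no longer required for the same output, holding demand fixed."""
    return total_pay * (1.0 - fraction_needed)


average_pay = TOTAL_PAY_USD / US_ENGINEERS
value_created = naive_value_created(TOTAL_PAY_USD, FRACTION_STILL_NEEDED)

print(f"Implied average pay: ${average_pay:,.0f}")
print(f"Naive value created: ${value_created / 1e9:,.0f}B per year")
```

This reproduces the ~$450 billion figure and implies an average pay of roughly $114,000 per engineer. It says nothing about how demand responds to cheaper software.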

Of course, decreasing the cost of software by 90% will dramatically increase demand, as economists know. That’s why the value created from level III automation is likely much larger, though hard to predict. With an explosion in use cases for cheap software, the value created could be in the trillions annually.

Suddenly, software will become more polished. Bank apps will take less time to load. There will be fewer bugs in day-to-day usage.

Most importantly, there’ll be an explosion of startups. Niches will be filled where it was not profitable previously. We’ll have more personalized software, generated even for individuals. And of course faster software development will feed into better AI.

However, the key unlock of level III automation is not cost savings. It’s iteration speed.

Crazed-Super-Scientist Barons

It’s well-known Manifold lore that we push changes at a breakneck pace, sometimes to the detriment of our users.

We printed this on a poster in our office. It’s become a source of pride.

For moving this quickly with a small full-time team of six, Manifold was said to be a “fiefdom run by crazed scientist barons.”

I ran with this idea and proposed that all organizations would be more effective if they operated on this model. See my “Mad Scientists Theory of Governance” market for further elaboration!

Now, imagine what will happen when you hand us another 10x speedup. I added “Super-” into the phrase, but it’s hard to envision where exactly that will lead us.

In day-to-day work, we receive a giant volume of requests for bug fixes and features. There are so many ideas and experiments to try that we are incredibly bottlenecked on execution.

Going ten times faster could change this bottleneck from execution to getting feedback (and deciding what to do next). You need to test product changes on users to see whether your idea was good, and that takes time in the real world.

Running A/B tests can take a while to give statistically significant results. However, qualitative feedback can be richer and faster. I predict startups like ours will collect more individual feedback from users, because they will have the capacity to act on it. (Just like Discord has been critical for increasing user feedback in our journey so far.)
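To give a rough sense of why A/B tests are slow for a small product, here is a minimal sketch using the standard normal-approximation sample-size formula for comparing two conversion rates. Every number in it (baseline rate, target lift, daily traffic) is a hypothetical assumption, not Manifold data.

```python
import math
from statistics import NormalDist


def users_per_arm(p_control: float, p_variant: float,
                  alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate users needed in each arm of a two-sided two-proportion test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_power = NormalDist().inv_cdf(power)          # desired statistical power
    variance = p_control * (1 - p_control) + p_variant * (1 - p_variant)
    effect = p_variant - p_control
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


def days_to_run(n_per_arm: int, daily_users: int) -> int:
    """Days needed to fill both arms, assuming every daily user is enrolled."""
    return math.ceil(2 * n_per_arm / daily_users)


# Hypothetical scenario: 10% baseline conversion, hoping to detect 11%,
# with 1,000 users entering the experiment per day.
n = users_per_arm(0.10, 0.11)
print(f"Users per arm: {n}")
print(f"Days to run:   {days_to_run(n, daily_users=1_000)}")
```

Under these assumed numbers, detecting a one-point lift takes roughly a month of traffic. A faster build cycle does not shorten that wait, which is why qualitative feedback becomes more attractive.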

If today it takes two years to find product-market fit for a new product, then crazed-super-scientist barons should be able to do it in a few months.

We’ll thus see an acceleration of the serial-entrepreneur phenomenon, including more parent companies that spin up dozens of products. Such could be the future of Manifold. Our name allows for it, at least!

The speed limit of progress

Innovation is currently driven by small teams pushing hard against the frontier. This sets the global speed limit for progress.

In the next 1-2 years, AI will increase that speed limit by a factor of ten, at least for software startups. Exciting times!

As AI continues to develop, and the level of automation increases, the speed limit will continue to be pushed back across all fields. I look forward to this world of abundant frontier advances!

Dan Franks

Jul 7

It's fun to read posts like this a year on. You were pretty accurate in your predictions!

In March of 2023, I told my boss at the time that I predicted software engineers wouldn't be directly coding within a few years, at least for simpler systems. A little over two years in, I've got six terminal windows open: deploying one website, refactoring a client app, deploying infra on AWS using Terraform, building implementation and pricing strategy for a startup nonprofit, building the most "enterprise" grade home lab ever, and constantly iterating on a repo framework I can use to bootstrap new projects. No direct coding in any of them.

Not only am I not coding directly anymore, I'm not even writing business development and strategy directly anymore. I've been waiting for this my entire life, watching compute and storage come down in cost until this moment inevitably occurred. Exciting times.

rational_hippy

Jul 11, 2024

Are you unironically enthusiastic about it? If so, is this because you believe that the progress will stop there/plateau there? Or just that this won't lead to any kind of catastrophic scenario?
