data Blog = Blog { me :: Programmer, posts :: [Opinion] }

Generative AI and the Programmer

If you’re one of my loyal readers(?), you’ve probably noticed that I don’t write much anymore. It’s a common fate for blogs, but in this case I can point to a specific cause that has sapped my willingness to write: large language models (LLMs). Since ChatGPT burst onto the scene and demonstrated the capabilities of generative AI, I’ve found it impossible to finish drafts because I hesitate to throw more information into the datavore. A sense of dread accompanies the feeling that all of my creativity will simply feed the beast.

That’s not an idle fear, either. Generative AI relies on consuming human creativity in order to recombine it into fundamentally uncreative output. Without new human creativity, tools like ChatGPT and MidJourney will gradually experience a phenomenon known as “model collapse” because they recursively feed on their own output. Another term for this is a “degenerative quality cascade.”

There’s no putting the genie back in the bottle, though, and we have to live within the world as it is, so here I am, faced with the same question that all writers, artists, musicians—and programmers—are facing. Since the last of those identities pays my bills and is presumably the primary reason anyone reads this blog, I’ll offer my current perspective here (from a more personal point of view than I usually use) and accept its digestion into the machine as a fait accompli:

  1. LLMs as they currently exist cannot make programmers obsolete. Even the simplest of software systems exceed their current capabilities in terms of context capacity and analysis. I am confident in this prediction: without a serious change in approach, not just scaling, programmers are here to stay.
  2. Generative AI is a valuable tool for enhancing programming work, but it should not be used to generate entire systems, or even subsystems and modules.
  3. Current LLMs cannot be trusted as a learning aid or a replacement for search engines. This misunderstanding, and its impact on the talent pipeline, represents the most imminent threat to the industry from AI tools.

Let’s examine each of these points in detail.

Programmers are Here to Stay

There’s hype around the idea that you can “just ask” an LLM to generate a program for you, and it will produce something that resembles a working program. The first examples I saw of this phenomenon were Pong games, where ChatGPT produced Python code that could run and, in fact, could display “paddles” and a “ball.” If programming is under no threat, why did those demos work, and why were people so quick to seize on them as evidence of our forthcoming obsolescence?

Those examples, along with “create a web app to add numbers together” and other such trivia, are prominent among tutorials and learning materials that AI has already digested. Furthermore, they are the first things that non-programmers tend to think about when considering an “app.” They’re interactive, visual, and self-contained.

As a proportion of all the software written, though, almost nothing significant is “interactive, visual, and self-contained.” Most of the software that makes the world work is buried somewhere deep on the back end, visualized only through user interfaces that interact with it indirectly, and functioning as a cog in the vast machine of our technome. As frustrating and uninspiring as that may sometimes be, it’s a moat: it takes human cognition to comprehend the things we build.

Nothing makes this clearer than trying to use generative AI tools to expand upon the examples that they so deftly disgorge. Yes, a tool can create a calculator web app, but asking it to add the capability to graph a function that the user provides won’t work: it’ll try to re-emit the app from first principles, with new opportunities for error at every token. If you find an error and tell the tool to correct itself, it will say “I’m sorry! I will fix the problem” and immediately re-emit the entire thing again, sometimes with new errors or even with the same error untouched.

This breakdown occurs just one step beyond canned example code, at the simplest levels of real work.

Generative AI is Still Valuable

We don’t say hammers are useless because you can’t use them to saw a plank in half, and the inability of current AI technology to completely replace programmers doesn’t mean that it’s worthless. Instead, we should apply the tool in a way that plays to its strengths. It can’t understand complete modules, let alone systems, so a tool like GitHub Copilot, which operates on individual lines and short blocks, occupies a strong middle ground. When configured correctly, it can digest your own code and provide generative suggestions in short bursts, letting you avoid the tedium of writing things like guard clauses or switch statements with clear patterns.
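For example, this is the sort of patterned guard-clause code (the function and its checks are hypothetical, invented for illustration) where a short-burst completion is easy to predict and, just as importantly, easy to review at a glance:

    def create_user(name: str, email: str, age: int) -> dict:
        # Guard clauses: after the first one or two, the rest follow an
        # obvious pattern that a completion tool can usually finish correctly.
        if not name:
            raise ValueError("name must not be empty")
        if "@" not in email:
            raise ValueError("email must contain '@'")
        if age < 0:
            raise ValueError("age must be non-negative")
        return {"name": name, "email": email, "age": age}

Each suggestion arrives as a line or two that you can read in full before accepting, which keeps you in control of the result.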

To make this effective, it’s absolutely necessary that you understand the tools you’re using separately from the AI. Set up your editor so that the generative output is presented as an autocompletion option, and make sure that you have to explicitly “accept” the completion. Don’t allow it to fully replace your editor’s existing autocompletion, because sometimes Copilot will misunderstand your intent and generate nonsense. Make sure that accepting a suggestion does not break your editor’s “undo” feature.

Furthermore, you must always read the generated code fully, preferably before accepting the suggestion at all. Treat it as the output of a junior developer who you don’t trust.

With this approach, I estimate that I’ve improved the speed at which I can implement clearly defined features by around 30%. That’s nothing close to what the “10X your productivity” marketing trumpets, but it’s not insignificant, either. If you’re getting much more of a boost than that, you should either reexamine your use of the tool and make sure that you’re rigorously vetting its output, or step back and improve your core system design skills: you’re likely producing architectures with too much boilerplate, which generative AI can tirelessly extrude. That tirelessness provides the illusion of productivity and hides core improvements that would simplify the code.
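To make that concrete, here’s a hypothetical sketch (the names and the stub database are invented for illustration) of the kind of near-identical accessor code an assistant will extrude indefinitely, next to the small design change that removes the need to generate it at all:

    # Hypothetical sketch: FakeDB stands in for any data-access layer.
    class FakeDB:
        def fetch(self, table: str, record_id: int) -> dict:
            return {"table": table, "id": record_id}

    db = FakeDB()

    # The boilerplate an assistant will happily extrude, one function
    # per table, forever:
    def get_user(user_id: int) -> dict:
        return db.fetch("users", user_id)

    def get_order(order_id: int) -> dict:
        return db.fetch("orders", order_id)

    # A core improvement that makes the boilerplate, and its generator,
    # unnecessary:
    def get_record(table: str, record_id: int) -> dict:
        return db.fetch(table, record_id)

Of course, your learning environment isn’t free from the influence of AI, either.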

AI, Learning, and the “Talent Pipeline”

The danger of learning in an era of LLMs is that it’s easy to conflate their functions with those of Internet technologies that emerged earlier. They produce verbiage that looks like books and essays. They claim to know about resources, the way a search engine does. They will answer questions, like a forum or Stack Overflow. They are none of these things, and you must understand their relationship with truth and reality if you are going to use them effectively.

I have found that the best pattern for me is to explain something to the chatbot and ask it to critique my understanding. It is usually decent at this, and this mode of operation dovetails with what an LLM is actually doing: it can tell you how close you are to the consensus understanding of a topic and point out areas that might be of interest. It can also give you “related terms” or “alternate terms” that might yield better results on search engines or allow you to read more deeply on a subject.
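As a rough sketch of that pattern against the openai Python package (1.x): the model name, the prompt wording, and the sample explanation below are all my assumptions, and a chat UI works just as well as the API.

    # A minimal sketch of the "explain, then ask for critique" pattern.
    # Assumes the openai package (>= 1.0) and OPENAI_API_KEY in the environment.
    from openai import OpenAI

    client = OpenAI()

    my_explanation = (
        "As I understand it, a B-tree keeps its keys sorted within each node "
        "and rebalances itself by splitting nodes that exceed a fixed capacity."
    )

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any chat-capable model will do
        messages=[{
            "role": "user",
            "content": "Critique my understanding of this topic. Point out "
                       "anything wrong or missing, and suggest related or "
                       "alternate terms I could search for:\n" + my_explanation,
        }],
    )
    print(response.choices[0].message.content)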

You cannot trust it to fully explain a topic to you, though, and it’s likely to invent resources if you ask it for citations. Code that it generates for a specific library is always riddled with errors unless that library is so old and stable that the model has digested a large corpus of examples.

The best paper that I’ve seen discussing these phenomena is called “ChatGPT is Bullshit.” I won’t reiterate all of its points here (it’s free to read, so I suggest that you do so), but its key insight is that LLMs aren’t “lying” to you, and they aren’t “hallucinating.” Their output has no relationship to objective truth; instead, the model produces “stuff” that looks plausible in relation to its digested corpus.

The “bullshit nature” of LLM output is where the largest threat to the technology industry lies. Using LLMs as tools for programming and learning, as I’ve outlined above, is effective. Both contexts require deep knowledge of the subject matter in order to fact check the output, though, and that means that they are unsuited to beginners and outsiders. LLMs are giving you information that seems plausible in relation to their corpus. As a human user, you are responsible for judging its plausibility in relation to reality. How can a beginner provide that service?

In other words, as LLMs become more prominent and produce larger proportions of available technical content, the ability of new programmers to bootstrap themselves into competence diminishes because they lack the context necessary to judge the content that they are reading. In effect, only content that is free of LLM influence is suitable for beginner use.

How many aspiring programmers will possess the caution and insight necessary to realize that, and the humility required to accept that older resources might be preferable?

The answer to that question will determine the fate of the industry.