My Mostly Positive Experience with GitHub Copilot

You might assume I’m entirely against the use of AI as it relates to building software. But I’m actually relatively bullish on the tech. I have subscriptions to ChatGPT and Saner.ai; have tabs for ChatGPT, Microsoft Copilot, and Saner.ai always open; and have ChatGPT and Gemini apps on my phone’s home screen.

As mentioned in part 1, I’ve been a GitHub Copilot customer since sometime in 2022 or 2023. I was at my last day job, and while I had the normal array of executive and leadership responsibilities, I also got to code more. The work was almost all standard Ruby on Rails, and Ruby is a delightful language to code in. But I was rusty. After I subscribed to Copilot and got it running in VS Code, I had a nice little coding buddy right there who could answer my stupid questions about syntax or some Rails-specific methodology. Here are the specific ways I’ve found Copilot and other GenAI tools useful for software development:

  1. As intimated above, I find Copilot helpful when returning to a language or framework I have some familiarity with but haven’t touched in a while. I’m sure the same could be said of a new framework that closely resembles another framework a developer is familiar with. Often in my career I’ve thought, “I know how to do this in Framework of Lore, surely it is similar here…” and after scratching my head and searching the web, discovered, yes it was, except for a few important details. Copilot often just suggests a code completion block, which I will read and either think, “Yep, that’s it!” or “That’s super close, I’ll just have to change these two things.” About 20% of the time, I know, “Nope, I can see why you think that’s what I want to do here, but it’s not.” If the code completion isn’t adequate, I just chat with my coding buddy in the sidebar and usually arrive at enough to get me started. Writing good comments seems to help Copilot make better suggestions as well, which carries the added benefit of encouraging me to comment my code.

    It’s important to realize that I have over two decades of coding experience at this point, so I have enough exposure to different languages and frameworks, and enough gut instinct, to make these determinations about what the AI is suggesting.

  2. Similarly, I recently picked up TypeScript. I’ve been a JavaScript developer since the mid 00’s, but was out of individual contributor roles by the time TS came around. The dev environment already helps out when working in a typed language, but with the addition of Copilot, I could just hover over the red squiggly underlines and have the issue explained to me, with a suggestion of how to fix it. It accelerated my learning and understanding of how TypeScript works, and is a little bit nicer than the built-in type references available in an IDE without it.

    I imagine a similar technique could be used when picking up any new language or framework. Clone a repo, then start selecting chunks of code you don’t quite grok and ask Copilot to simply explain it to you. Again, it’s less useful if you’re brand new to coding, but if you’ve already worked with a half dozen languages and at least as many frameworks, it’s handy. It beats flipping to the docs constantly.

  3. Tests. This one is controversial. But I found Copilot very useful for writing a certain type of unit test in Ruby. One example was a method I wrote for normalizing various string inputs. This type of code does not lend itself to manual testing, and trying to think of all the edge cases while actually coding the method is impossible. This is when TDD becomes even more valuable. But it is still up to you to think of all those inputs, even the edge cases. Copilot quickly got the gist of what I was doing, and suggested numerous tests, one right after the other. I wrote one, maybe two, then Copilot started suggesting variations and I thought, “Yes, thanks…Yep, that one…Oh! Yes, thanks, I hadn’t thought of that one but we are going to get it!”
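
To make the string-normalization case concrete, here is a minimal sketch in plain Ruby. Everything in it is an assumption for illustration: the method name `normalize_input`, its rules (trim, collapse whitespace, downcase, treat nil as empty), and the table of cases are hypothetical, not the code from that job. The first entry or two are the kind I would write by hand; the rest are the kind of variations a completion tool tends to offer once it sees the pattern.

```ruby
# Hypothetical normalizer. The rules here are assumed for illustration only.
def normalize_input(value)
  text = value.to_s               # nil becomes "", symbols and numbers become strings
  text = text.strip               # drop leading and trailing whitespace
  text = text.gsub(/\s+/, " ")    # collapse internal runs of whitespace to one space
  text.downcase
end

# Input => expected output. Each pair stands in for one unit test.
edge_cases = {
  "Hello"          => "hello",
  "  Hello  "      => "hello",
  "Hello   World"  => "hello world",
  "Hello\tWorld\n" => "hello world",
  ""               => "",
  "   "            => "",
  nil              => "",
  :Symbol          => "symbol",
  42               => "42"
}

# Keep only the pairs where the method disagrees with the expectation.
failures = edge_cases.reject { |input, expected| normalize_input(input) == expected }

if failures.empty?
  puts "all #{edge_cases.size} cases pass"
else
  puts "failing inputs: #{failures.keys.inspect}"
end
```

In a real project each pair would be its own Minitest or RSpec example; the table form just keeps the sketch self-contained. Whether nil or a number should be accepted at all is a design decision, and reading each suggested case is how those decisions surface.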

GenAI will probably never write entire systems for you based on your description of your domain problems. But here are the other things it doesn’t yet do well enough, or never will because it’s not its job:

  1. Let you skip actually understanding the code. It can help explain it to you, as I suggested above, but at the end of the day the developer has to understand everything that’s happening. Honestly, if your hope is to develop software without having to understand how it works, become a product manager.
  2. Provide accurate results around new APIs, or APIs that are changing frequently. The foundational model can’t keep up. I mentioned the specific example of Copilot’s (or ChatGPT’s) attempts to provide working code for using ChatGPT’s API, which recently underwent some significant changes between v1 and v2, to the point that their own docs are hard to follow, and forum posts from last year are confusingly out of date.

    If you’re a large enterprise developing APIs for your customers, it would benefit you to partner with GitHub and the like to get your new docs into these GenAI code buddies as quickly as possible. I haven’t seen it yet, but I wouldn’t be surprised if there is already a way to augment Copilot with your own API docs.

  3. Know when the code in your codebase is bad. At the aforementioned job, we inherited the largest, most poorly designed and written legacy codebase from a well-intentioned but under-equipped solo dev who had built the app from rails new on up for a number of years. Almost all of the extensive and complex business logic was contained in a single controller. I want to say it was in the neighborhood of 30,000 lines long. The flog scores on most of the main files in the repo were multiples higher than the highest score described as “Someone please think of the children.”

    What this meant is that Copilot often suggested new methods and patterns that repeated these same mistakes and antipatterns it “learned” from the project it was “living” in. So instead of, “Yes, please” it was, “No, no, no!” Classic garbage in, garbage out.

There are also a few general areas I would be interested in learning how GenAI could be applied for good. I assume people are already working on these, so if you are reading this and know of them, please get in touch to let me know:

  1. QA. We already automate a lot of QA, but I imagine GenAI would be good at writing the code or instructions that automate clicking buttons and observing the results. We’ll still need quality QA engineers who think of all the weird ways to break software, though. I’m also unsure, given the state of the art in QA automation, if AI would really speed things up in any significant way.

    Random find while still writing this post: Flowtest.ai: AI agent for end-to-end QA testing

  2. Scraping. We actually started on such a project at my last company. Scraper development is generally understood to be one of the worst domains to find oneself in. The tools used require the structure of the web pages they are interfacing with to remain constant. This might be one of those rare cases where the ability to give less specific instructions to an AI-augmented scraping bot would actually create more resilience in the system.
  3. Initial code review, along the lines of linters.
  4. Helpers around the occasional friction of collaboration on the same areas of the code. A PR bot that just gives a heads up that Barbara from Team Thunder also just pushed a branch that is going to cause merge conflicts. That particular use case might not require any AI, but something more elaborate might.
  5. Refactoring. The opposite of the antipattern regurgitation described above. The AI could suggest, “Here’s a more common way of doing this.” Or we could ask of it, “Write a PR that refactors all instances of this pattern with the pattern as demonstrated below.” Or, “Write a PR that moves these two methods into their own class.” Or, and God help me I’m not making this example up, “Write a PR that refactors these 50 lines of string concatenation by using a template.”
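
For a sense of what that last refactoring request amounts to, here is a small before-and-after sketch using ERB from Ruby’s standard library. The method names and the message being built are hypothetical stand-ins, much shorter than the real 50 lines, and this is only one reasonable way to do it.

```ruby
require "erb"

# Before (hypothetical): output assembled by string concatenation.
def summary_by_concatenation(name, items)
  out = ""
  out += "Hello " + name + ",\n"
  out += "You have " + items.size.to_s + " items:\n"
  items.each do |item|
    out += "- " + item + "\n"
  end
  out
end

# After: the same output from a template.
# trim_mode "-" lets <%- and -%> suppress the whitespace and newline
# around control lines, so the loop adds no blank lines.
def summary_by_template(name, items)
  template = <<~TEMPLATE
    Hello <%= name %>,
    You have <%= items.size %> items:
    <%- items.each do |item| -%>
    - <%= item %>
    <%- end -%>
  TEMPLATE
  ERB.new(template, trim_mode: "-").result_with_hash(name: name, items: items)
end

items = ["apples", "pears"]
puts summary_by_template("Ada", items)
puts summary_by_template("Ada", items) == summary_by_concatenation("Ada", items)
```

Comparing the two outputs, as the last line does, is the sort of check I would want in place before accepting a refactoring PR from an AI or a human.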

AI as a tool for developing software can be very useful in the hands of a reasonably experienced developer who understands the business and the principles of good software design, and who knows deep in their soul that the best code is that which is never written. Like an expensive, sharp knife in the hands of a chef. You wouldn’t hand one of those to your kid when they’re helping out in the kitchen. You get those plastic ones for them, the ones that don’t cut off fingers.
