Coding with Copilot Part 2 - Agent Edition

The Paradigm Shift: Coding agents transform your todo list from a planning tool into an execution engine.

Generated by ChatGPT 5

I recently read a blog post lauding the experience of using Claude Code for Web. One quote in the post stood out:

I’ve been using it as a “to-do list that does itself” — when I think of something small that I want to tweak, across a variety of projects (work, work-related side project, side project, open source project) I just throw it into a thread. Then I come back, sometimes later in the day and sometimes days later, to see what Claude did and to finish things up.

I read this and thought "This is EXACTLY how I use AI for coding today" and it prompted me to start this post. In this post, I'll share how coding agents have fundamentally changed not just my productivity, but my entire approach to personal project management. Consider this a follow-up to my previous post, outlining my current workflow for getting the most out of the latest and greatest AI tools available for software engineering.

It's been almost 6 months since I started practicing AI-assisted coding and wrote about my very positive experience with it. At the time I had just started with a basic setup: GitHub Copilot in VS Code with Claude 3.5 Sonnet as the driving model.

Using GitHub Copilot alongside my standard setup in VS Code to work on coding projects (primarily this website) was a significant jump in my overall productivity and ability to deliver working software. However, it didn't fundamentally change the workflow or cadence of how I came up with ideas, tracked them, worked on them, and eventually (maybe) shipped a finished version of them. Having AI in the passenger seat cut the time needed to build and ship a feature, but plenty of ideas still languished on my "coding projects todo list" until I could find even 30 minutes to sit down and focus on a coding session.

Enter coding agents. My first exposure to this paradigm came when Jules was modestly announced at Google I/O 2025. I'm not deeply familiar with the history of "who came first", but as far as I know Google Labs was the first to ship an asynchronous coding agent as a free, albeit "experimental," product. The user experience was incredibly simple: in a few clicks I had the agent hooked up to the repo for my personal website and was off sending it prompts to work on for me.

After a few sessions with it, I could confidently say it worked 100% of the time...about 80% of the time. But I'm more than satisfied with that kind of hit rate on translating a few lines of plain English into working code. From there I threw it a few random tasks that I otherwise would have handed to Copilot. It was simple enough to open up a browser, submit the prompt, and move on to something else without having to keep an editor open or monitor the agent's progress.

The paradigm shift happened after a few months of light usage. I thought up a new enhancement I wanted to add to the site. I reflexively reached for my todo list to add it, but paused when I realized I could just pull up Jules in my browser and send it the todo instead. And so I did. I typed out my todo item in a few sentences, clicked submit, and watched Jules spin up and get to work. Suddenly, I felt like I had leveled up. I had just created a "todo list that does itself".

The next "aha" moment happened after I received this lovely email from Mozilla about the sunsetting of Pocket.
Pocket retirement email

I used Pocket to bookmark links to interesting things I found online, and I had built a GitHub Actions-based integration that hooked into the Pocket API and populated the Daily Reads section of my website homepage with links I specifically tagged to share. With Pocket going away in the next few months, I had to find an alternative quickly, or else some core website functionality would break.

So I gave it to my self-doing todo list: Jules. I started with the following prompt:

Propose 3 plans to convert the pocket integration to other services that can be used to save and retrieve links via an api to populate the latest reads section

After some research and consideration with other AI tools, including chats with ChatGPT and Gemini, I landed on Raindrop as the replacement. And so I instructed Jules in the same task context (yes, I deliberately kept the typo from the prompt I originally sent):

make the code changes to covert the site features that rely on pocket to raindrop io as outlined

Jules did it on the first try, nearly flawlessly: a full refactor to a completely new bookmark manager and API integration, seamlessly swapped into my existing GitHub Actions workflows. The only real update I had to make was swapping out an API key. You can see the full Pull Request here. In total, Jules edited 8 files with around 100 lines updated. Unlike most other PRs opened by Jules, this one required zero fixes or additional edits after my review. This will probably remain the example that cemented my optimism that this is the next step in how AI will continue to change the game for Software Engineers.
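To make the shape of that integration concrete, here is a minimal Python sketch of the kind of step such a workflow might run: filter saved bookmarks down to the ones tagged for sharing, then render them as a list for the Daily Reads section. This is not the code from the PR. The field names (`title`, `link`, `tags`, `created`), the `share` tag, and both function names are assumptions made for illustration, and the sketch works on already-fetched data rather than calling any bookmark service.

```python
from datetime import datetime

# Hypothetical tag used to mark a bookmark as "share this on the site".
SHARE_TAG = "share"


def select_daily_reads(items, tag=SHARE_TAG, limit=5):
    """Return the newest `limit` bookmarks that carry `tag`.

    `items` is a list of dicts with assumed keys:
    title, link, tags (list of str), created (ISO 8601 string).
    """
    shared = [item for item in items if tag in item.get("tags", [])]
    # Newest first, based on the assumed `created` timestamp.
    shared.sort(key=lambda item: datetime.fromisoformat(item["created"]), reverse=True)
    return shared[:limit]


def render_markdown(items):
    """Render bookmarks as a Markdown bullet list of links."""
    return "\n".join(f"- [{item['title']}]({item['link']})" for item in items)


if __name__ == "__main__":
    # Made-up sample data standing in for a bookmark service response.
    sample = [
        {"title": "A", "link": "https://example.com/a",
         "tags": ["share"], "created": "2025-06-01T09:00:00"},
        {"title": "B", "link": "https://example.com/b",
         "tags": ["private"], "created": "2025-06-02T09:00:00"},
    ]
    print(render_markdown(select_daily_reads(sample)))
```

Keeping the filtering and rendering separate from the fetch is one reason a provider swap like this can stay small: only the code that talks to the service (and its API key) has to change.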

Since then I've tried other hosted agentic coding products like OpenAI's Codex and the recently released Claude Code for Web. Even though I find Claude's models to be the best at coding via GitHub Copilot or Claude Code, nothing I've used has matched the level of polish, quality, and accuracy that Jules has achieved for me thus far. I'm far from a power user, clocking maybe 1 or 2 hours of coding time a week, but in my experience it's usually the product that gets it right quickly and early that ends up holding its own.

Me and my agents

As with most things, it's not all rainbows and sunshine. As I mentioned, Jules gets it right about 80% of the time on good days. Even then, most PRs it opens require at least a couple of review passes from me and usually some final polish and tweaks via a local coding session with Copilot. In some cases the feature is simply too complex for Jules to handle, and it ships a completely broken PR that I end up trashing before starting over from scratch. Even with these pitfalls, my overall productivity is up another 2x. Tasks that would have taken 2 hours with no AI went down to less than 1 hour with Copilot, and have in some cases gone down to minutes with Jules. My trusty todo list has literally been outclassed.

As I continue dabbling in these tools on weekends and free evenings, I'm looking forward to the next paradigm shift that is inevitably on the way that will let me do even more with less.