Finally using Claude Code

I’ve been a Claude Pro user in my personal life for more than 15 months now. I started a serious side project around that time, and I wanted Claude to drive the frontend for it, since my React skills were prehistoric back then.

At work, I am also allowed to use Copilot and have a Gemini Pro subscription, although I almost never use Copilot in Agent mode. I only use it occasionally, for the convenience of not copy-pasting files back and forth between the IDE and a browser. Since writing baler, I almost always use Gemini.

I also wrote about using baler last year, and things have changed since. Claude Code was introduced and a lot of people like it. Not me, nuh uh. My initial impressions were the following:

The problems

  1. I was, and still am, a big fan of Claude models for coding-related tasks. I believed that Claude Code could produce great results provided the context was small and the solutions repetitive. For small technical questions like “How to enable Gitea shell runners?”, it often answers the question or leaves me in a position where I can easily find the appropriate section of the Gitea documentation. For a larger task, I could manage context easily with something like baler, conveniently including or excluding files and directories depending on whether they seemed useful for the question.
  2. I was afraid that giving Claude agency would hide the critical technical aspects of my project behind “English” abstractions.
  3. I was also afraid of the security implications of running a third-party tool on my computer and giving it access:
    1. It can edit my files without consulting me.
      1. Yes, I give it permission to do this repeatedly within a session; if I don’t, it’s unusable.
      2. Yes, I have git, but if you’re also used to committing after a reasonable chunk of work that takes an hour on average, it’s too much of a context switch to
    2. If I connect to MCPs, I am also exposed to injection attacks in my otherwise isolated system.
  4. Not being able to apply minor corrections to the suggestions easily.
    1. Yes, I can ask for a fix in a corrective follow-up prompt, but I run the same risk of not getting the expected output, and it’s fatiguing when errors multiply across a multi-turn conversation.

The solution

After a few months of being on the fence about using Claude Code for larger projects, these were the answers I could think of:

  1. Find a way to sandbox Claude Code:
    1. I know that Claude Code asks for permission before doing things, but that soon turns into fatigue; one starts “blanket approving” changes, and the codebase may get real smelly real soon.
    2. There wasn’t a popular container sandbox for Claude Code, and if one used an unpopular one, sooner or later a sandbox update could create a container that runs as root, mounts /var, and removes everything inside it. So containers aren’t a solution.
  2. Have local, centralized code storage and another computer at home, which I use only for ssh and Claude Code. This gives me an isolated runtime for Claude Code while still letting me verify changes quickly on my development machine and keep a fast iteration cycle in general.
    1. But maintaining another computer is hard, especially without a wired LAN connection: the wifi disconnects often, and then I have to connect the machine to a display and restart it.
  3. Have a VM running on my primary development machine. Last year, I tried UTM and unfortunately didn’t have a great experience running arm64. I tried again in September 2025, and I was able to spawn a VM that didn’t impact the performance of my base system too heavily. So I settled on this solution.

What does the current workflow look like?

Let’s call my primary development machine X1, and the VM running Claude Code X2.

This is the workflow I am aiming for.

For new features:

  1. I create a branch named feature/new-feature-y on both X1 and X2 (git does not allow a colon in branch names, so the separator is a slash).
  2. I use X2 to prompt Claude Code and generate code. I push the suboptimal and sometimes unverified changes to a branch. I pull this branch on X1, make the changes I want (which is easier than prompting Claude to get it right), and push it to the git server.
  3. If the feature needs more changes, like integration tests, I use X2 again with the updated code and repeat the validation cycle.
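
The three steps above can be sketched as plain git command sequences. This is only an illustration, not my actual tooling: the remote name origin, the commit messages, and the helper function names are all assumptions, and the branch is written as feature/new-feature-y because git ref names cannot contain a colon. The helpers only build the commands; nothing is executed.

```python
import shlex

REMOTE = "origin"  # assumption: the name of the shared git remote


def start_feature(branch):
    """Step 1, run on both X1 and X2: create the feature branch."""
    return [["git", "switch", "-c", branch]]


def x2_push_generated(branch, message="WIP: unverified Claude Code output"):
    """Step 2 on X2: commit whatever Claude Code produced and push it."""
    return [
        ["git", "add", "--all"],
        ["git", "commit", "-m", message],
        ["git", "push", "-u", REMOTE, branch],
    ]


def pull_latest(branch):
    """Run on X1 before reviewing, and on X2 before the next prompt (step 3)."""
    return [["git", "pull", "--ff-only", REMOTE, branch]]


def x1_push_reviewed(branch, message="Fix up generated changes by hand"):
    """Step 2 on X1: commit the manual corrections and push them back."""
    return [
        ["git", "add", "--all"],
        ["git", "commit", "-m", message],
        ["git", "push", REMOTE, branch],
    ]


if __name__ == "__main__":
    branch = "feature/new-feature-y"
    plan = [
        ("X1 and X2", start_feature(branch)),
        ("X2", x2_push_generated(branch)),
        ("X1", pull_latest(branch) + x1_push_reviewed(branch)),
        ("X2", pull_latest(branch)),
    ]
    # Print the plan instead of running it, so the sketch has no side effects.
    for machine, commands in plan:
        for command in commands:
            print(f"[{machine}] {shlex.join(command)}")
```

The --ff-only flag makes the pull fail instead of creating a merge commit if both machines have committed to the branch, which is a useful signal that the two sides have drifted apart.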

For bug fixes:

  1. I try the Chat interface with baler to check if the fix is small. If yes, I don’t have to involve X2.
  2. Once I have the fix, I write the unit test I want myself and ask Claude in the chat interface to write more tests.
  3. If the fix takes longer than 15 minutes, I move the context to X2 through a branch and try with Claude Code, sometimes including test logs.

In both cases, I almost never trust the output Claude gives me. I am rarely (2-5% of the time) able to one-shot a change to my projects, whatever its size (5 to 1000 lines).

But I do ignore the risk of hallucinations in repetitive changes. For example, if I ask Claude to change all references to the schema1.users table in the SQL code to schema2.users, and it has changed the first occurrence in a file’s select statements correctly, I don’t check the following ones. I still keep an eye out for places where it touches indices or foreign keys.
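
A mechanical check can cover the occurrences that get skipped during review. The following is a minimal sketch, assuming the SQL is available as plain text; the function name find_stale_references and the sample statements are hypothetical and only illustrate the schema1.users to schema2.users rename described above.

```python
import re


def find_stale_references(sql_text, old_table="schema1.users"):
    """Return (line_number, line) pairs that still mention old_table."""
    # re.escape keeps the dot literal; the \b anchors avoid matching longer
    # names such as schema1.users_archive. Unquoted SQL identifiers are
    # case-insensitive, hence IGNORECASE.
    pattern = re.compile(r"\b" + re.escape(old_table) + r"\b", re.IGNORECASE)
    return [
        (number, line.strip())
        for number, line in enumerate(sql_text.splitlines(), start=1)
        if pattern.search(line)
    ]


if __name__ == "__main__":
    # Hypothetical output of a rename that missed an index and a foreign key.
    sample = """\
SELECT id FROM schema2.users WHERE active = 1;
CREATE INDEX idx_users_email ON schema1.users (email);
ALTER TABLE orders ADD FOREIGN KEY (user_id) REFERENCES schema1.users (id);
"""
    for number, line in find_stale_references(sample):
        print(f"line {number}: {line}")
```

An empty result only says that the old name is gone. It does not show that the new references are correct, so the index and foreign key statements still deserve a manual look.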

For technical questions unrelated to the project context (like “Foreign keys syntax sqlite3”), I still use the Chat interface to get a stub that I can verify on the project’s documentation site, unless I’m already familiar with the structure of that site, like https://pkg.go.dev or https://docs.python.org/3/.

What can be improved?

  1. As it’s a VM with limited resources, the UX isn’t fast. The ideal setup would be a second computer with a GUI, but I don’t have the space for one.
  2. For frontend tasks, it’d be ideal if Claude Code could look at the React application and iterate on the visual output. But to achieve this, I first have to turn the VM into a proper dev environment. Right now, it’s only 60% there.