Finally using Claude Code
I’ve been a Claude Pro user in my personal life for more than 15 months now. I started a serious side project around that time, and I wanted Claude to drive the frontend for it since my React skills were prehistoric back then.
At work, I am also allowed to use Copilot, and I have a Gemini Pro subscription, although I almost never use the Copilot interface in Agent mode. I only use it sometimes for the convenience of not copy-pasting files back and forth between the IDE and a browser. Since I wrote baler, I almost always use Gemini.
I also wrote about using baler last year, and things have changed since. Claude Code was introduced and a lot of people like it. Not me, nuh uh. My initial impressions were the following:
The problems
- I was and still am a big fan of Claude models for coding-related tasks. I believed that Claude Code could produce great results provided the context was small and the solutions repetitive. For small technical questions like “How to enable Gitea shell runners?”, it often answers the question or leaves me at a point where I can easily find the appropriate Gitea documentation section. Given a larger task, I could manage context easily with something like `baler` – conveniently including the relevant files and directories and excluding the ones that did not seem useful for the question.
- I was afraid to give agency to Claude such that the critical technical aspects of my project were hidden behind “English” abstractions.
- I was also afraid of the security aspects of running a third-party tool on my computer and giving it access to
  - editing my files without consulting me
    - Yes, I give permission to do it repeatedly in the session – if I don’t, it’s unusable.
    - Yes, I have git, but if you’re also used to committing on a reasonable chunk of work that takes an hour on average, it’s too much context switch to
  - If I connect to MCPs, I can also have injection attacks – in my otherwise isolated system.
- Not being able to apply minor corrections to the suggestions easily.
  - Yes, I can follow up with a corrective prompt – but I run the same risk of not getting the expected output – and it’s fatiguing when errors multiply over a multi-turn conversation.
The solution
After a few months of being on the fence about using Claude Code for larger projects, these were the answers I could think of to this problem:
- Find a way to sandbox Claude Code:
  - I know that Claude Code asks for permissions for things, but it soon turns into fatigue, one “blanket approves” changes, and the codebase may get real smelly real soon.
  - There wasn’t a popular container sandbox for Claude Code. If one used an unpopular one, soon enough a sandbox update would try to create a container running as root that mounts `/var` and removes everything inside it. So containers aren’t a solution.
- Have a local centralized code storage and another computer in my home, which I only use for ssh and Claude Code. This gives me an isolated runtime for Claude Code while still letting me verify changes quickly on my development machine and keep a fast iteration cycle in general.
  - But maintaining another computer, especially without a LAN connection, is hard: the wifi disconnects often, and I have to connect it to a display and restart the system.
- Have a VM running in my primary development machine.
  - Last year, I tried UTM and unfortunately didn’t have a great experience running arm64. I tried again in September '25 and was able to spawn a VM that didn’t impact the performance of my base system so heavily. So I settled on this solution.
What does the current workflow look like?
Let’s call my primary development machine X1, and the VM with Claude Code running X2.
This is the workflow I’m aiming for.
For new features:
- I create a branch named `feature:new-feature-y` in both X1 and X2.
- I use X2 to prompt Claude Code and generate code. I push the suboptimal and sometimes unverified changes to a branch. I pull this branch in X1, make the changes I desire (which is easier than prompting Claude to get it right), and push it to the git server.
- If there are more changes required for this feature like writing integration tests, I use X2 again with updated code – and repeat the validation cycle.
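The round trip between X2 and X1 can be sketched as two short command sequences. This is only an illustration under stated assumptions: the remote name `origin`, the commit message, and the helper names are hypothetical, and the branch is spelled `feature/new-feature-y` because Git ref names cannot contain a colon.

```python
import shlex

# Hypothetical placeholders, not taken from the actual repositories.
BRANCH = "feature/new-feature-y"  # Git rejects ":" in ref names, so "/" is used here
REMOTE = "origin"


def x2_commands(branch: str = BRANCH, remote: str = REMOTE) -> list[list[str]]:
    """Commands run in the VM (X2) once Claude Code has generated changes."""
    return [
        ["git", "switch", branch],                      # branch was created up front
        ["git", "add", "--all"],                        # stage the generated changes
        ["git", "commit", "-m", "WIP: generated, unverified"],
        ["git", "push", "--set-upstream", remote, branch],
    ]


def x1_commands(branch: str = BRANCH, remote: str = REMOTE) -> list[list[str]]:
    """Commands run on the development machine (X1) around the manual review."""
    return [
        ["git", "fetch", remote],
        ["git", "switch", branch],
        ["git", "pull", "--ff-only", remote, branch],   # bring in X2's work
        # ... hand edits and a normal commit happen here ...
        ["git", "push", remote, branch],                # publish the reviewed result
    ]


if __name__ == "__main__":
    for title, commands in (("X2", x2_commands()), ("X1", x1_commands())):
        print(f"# on {title}")
        for command in commands:
            print(shlex.join(command))
```

The script only prints the commands instead of running them, so nothing touches a repository until the sequence has been read through.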
For bug-fixes:
- I try the Chat interface with `baler` to check if the fix is small. If yes, I don’t have to involve X2.
- Once I have the fix, I write a desired unit test myself and ask Claude in the chat interface to write more tests.
- If the fix takes longer than 15 minutes, I move the context to X2 through a branch, and try with Claude Code, sometimes with logs of tests.
In both cases, I almost never trust the output Claude gives me. Across all sizes of change (5 to 1000 lines), I am rarely (2-5% of the time) able to one-shot a requirement for my projects.
But I don’t check for hallucinations exhaustively. For example, if I ask Claude to change all references to the `schema1.users` table in the SQL code to use `schema2.users`, and it has changed the first occurrence in a file’s `select` statements, I don’t check the following ones. I still keep an eye out for places where it references indices or foreign keys.
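A quick textual check can cover the occurrences that are skipped during review. The sketch below is an assumption about how such a check might look, not part of the workflow described here: `find_references` and the sample SQL are hypothetical, and the match is a plain regular expression rather than a real SQL parser, so comments and string literals are matched too.

```python
import re


def find_references(sql: str, qualified_name: str) -> list[tuple[int, str]]:
    """Return (line_number, line) for each line that still mentions qualified_name."""
    # The lookarounds keep "schema1.users" from matching "schema1.users_archive"
    # or "other_schema1.users".
    pattern = re.compile(
        r"(?<![\w.])" + re.escape(qualified_name) + r"(?!\w)",
        re.IGNORECASE,
    )
    return [
        (number, line.strip())
        for number, line in enumerate(sql.splitlines(), start=1)
        if pattern.search(line)
    ]


# Hypothetical file contents after asking for schema1.users -> schema2.users.
sample = """\
SELECT id FROM schema2.users WHERE active = 1;
CREATE INDEX idx_users_email ON schema1.users (email);
ALTER TABLE orders ADD FOREIGN KEY (user_id) REFERENCES schema1.users (id);
SELECT * FROM schema1.users_archive;
"""

leftovers = find_references(sample, "schema1.users")
for number, line in leftovers:
    print(f"line {number}: {line}")
```

In this made-up sample the `select` was updated while the index and foreign key lines were not, which is exactly the kind of spot worth a second look.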
For technical questions unrelated to the project context (like “Foreign keys syntax sqlite3”), I still use the Chat interface to get a stub that I can verify on the project’s documentation site, unless I’m already familiar with the structure of the documentation site, like that of https://pkg.go.dev or https://docs.python.org/3/.
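For the foreign key question, the kind of stub worth verifying against the documentation might look like the following. It is a minimal sketch using Python’s standard `sqlite3` module; the `users` and `posts` tables are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this is enabled on each connection.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE posts (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        FOREIGN KEY (user_id) REFERENCES users (id) ON DELETE CASCADE
    )
    """
)

conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")
conn.execute("INSERT INTO posts (user_id) VALUES (1)")

# A post pointing at a user that does not exist should be rejected.
try:
    conn.execute("INSERT INTO posts (user_id) VALUES (999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Deleting the user should cascade to the user's posts.
conn.execute("DELETE FROM users WHERE id = 1")
remaining_posts = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
print(rejected, remaining_posts)
```

The `PRAGMA` line is the detail most easily missed in a generated answer, which is why a stub like this is still checked against the documentation.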
What can be improved?
- As it’s a VM with limited resources, the UX isn’t fast. The ideal thing would be a second GUI computer to do this, but I don’t have that space.
- For frontend tasks, it’d be ideal if Claude Code could look at the running React application and iterate on the visual output. But I have to at least improve the VM enough to make it a dev environment to achieve this. Right now, it’s only 60% there.