Monday, February 16, 2026

I Don't Write YAML Anymore: How an AI Agent Runs My Home Lab

The Problem With YAML

I run over 20 services on a Kubernetes cluster at home. Photo management, vehicle tracking, DNS filtering, home automation, monitoring, a self-hosted AI chatbot, a public website — the usual collection that starts with "I'll just run one thing" and ends with a four-node cluster.

For a long time, the bottleneck wasn't Kubernetes itself. It was me, writing YAML. Every new service meant a deployment, a service, an ingress, network policies, persistent volumes, backup cronjobs, maybe a HorizontalPodAutoscaler. All following the same conventions. All tedious to get right. All boring after the first dozen times.

So I stopped writing it. I taught an AI agent to do it instead.

The Pipeline: Git In, Cluster Out

Before we get to the AI part, the foundation matters. The entire cluster is managed through GitOps — every manifest lives in a Git repository, and a GitOps controller watches for changes and reconciles the cluster state automatically.
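As a concrete sketch of what that reconciliation loop can look like, here is a Flux-style Kustomization resource. The post doesn't name the GitOps controller, so Flux, the repository name, the path, and the interval are all illustrative assumptions:

```yaml
# Illustrative only — the post doesn't say which GitOps controller is used.
# This Flux-style resource tells the controller to pull a Git repo and
# reconcile the cluster to match the manifests under ./apps.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: cluster-apps
  namespace: flux-system
spec:
  interval: 5m            # how often to re-check and reconcile
  sourceRef:
    kind: GitRepository
    name: homelab         # assumed repo name
  path: ./apps
  prune: true             # delete cluster resources removed from Git
```

The `prune: true` flag is what makes Git the single source of truth: deleting a manifest from the repo removes the resource from the cluster.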

The pipeline looks like this:

  1. A change gets pushed to a feature branch
  2. A pull request is opened
  3. CI runs schema validation and a Kubernetes linter
  4. The PR gets reviewed and squash-merged
  5. The GitOps controller picks up the change and applies it within minutes
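Step 3 can be sketched as a CI job. The post only says "schema validation and a Kubernetes linter," so the specific tools (kubeconform, kube-linter), the workflow syntax (GitHub Actions), and the directory layout are assumptions:

```yaml
# Hypothetical CI workflow — tool choices and paths are assumptions.
# Install steps for the binaries are omitted for brevity.
name: validate-manifests
on: [pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Schema validation          # catches malformed manifests
        run: kubeconform -strict -summary apps/
      - name: Kubernetes lint            # catches bad practices
        run: kube-linter lint apps/
```

The point isn't the specific tools; it's that every PR, human- or AI-authored, passes the same gates before merge.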

On top of that, a dependency bot watches for container image updates and opens PRs automatically. Minor and patch updates get auto-merged after CI passes. I don't touch those at all.

The key insight: once you trust the pipeline, you don't care who writes the YAML. Human or AI — the same CI gates apply, the same review process, the same GitOps reconciliation. The pipeline is the safety net.

The Builder: AI as Infrastructure Engineer

Here's what deploying a new service looks like now. I open a terminal, start an AI coding agent, and say something like:

Deploy a time-series database exposed to the LAN on port 8086
with persistent storage and automated first-boot setup.

The agent reads the existing repository — the directory structure, the naming conventions, how other services handle networking, storage, and security policies. Then it writes a complete set of manifests: namespace, deployment, service, persistent volume claims, network policies, and a kustomization file that ties it all together. It opens a PR with a clear description of what it did and why.
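The glue file the agent produces might look like this minimal kustomization sketch. The namespace and file names are illustrative, not taken from the actual repository:

```yaml
# Sketch of the generated kustomization.yaml tying the manifests together.
# Namespace and file names are illustrative assumptions.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: timeseries-db
resources:
  - namespace.yaml
  - deployment.yaml
  - service.yaml
  - pvc.yaml
  - networkpolicy.yaml
```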

I review the PR. CI has already validated the schemas and linting. If it looks good, I merge. Five minutes later, the service is running.

What makes this work isn't magic — it's conventions. The repository has consistent patterns: every app lives in its own directory, network policies follow a default-deny model, secrets use a specific format, kustomizations follow the same template. The agent picks up on these patterns and replicates them. It's essentially doing what I would do, minus the part where I mistype an indentation level and spend twenty minutes debugging.
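The default-deny convention mentioned above is one of the easiest patterns for an agent to replicate, because it's identical in every namespace. A minimal sketch (the namespace name is illustrative):

```yaml
# Default-deny baseline: block all ingress and egress for every pod in the
# namespace. Each app then gets explicit allow policies on top of this.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: timeseries-db   # illustrative namespace
spec:
  podSelector: {}            # empty selector matches every pod
  policyTypes:
    - Ingress
    - Egress
```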

Real Examples

A recent favorite: I discovered that four backup cronjobs scheduled between 2:00 and 2:59 AM were silently skipping during the spring DST transition. That hour simply doesn't exist on the last Sunday of March. I told the agent to fix it. It shifted all four schedules to 1:00 AM, kept the 15-minute stagger between jobs, wrote a commit message explaining the DST edge case, and opened a PR. Total time from "huh, backups didn't run" to merged fix: about three minutes.
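The fix itself is a one-field change per job. A sketch of one of the four CronJobs, with assumed names and images (the post doesn't show the actual manifests):

```yaml
# One of the four backup jobs after the fix. The 01:00-01:45 window always
# exists; 02:00-02:59 vanishes on the EU spring DST transition.
# Job name, image, and args are illustrative assumptions.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-photos       # siblings run at 1:15, 1:30, 1:45
spec:
  schedule: "0 1 * * *"     # was "0 2 * * *" — silently skipped each spring
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: backup
              image: backup-tool:latest   # illustrative image
              args: ["--target", "photos"]
```

Note that CronJob schedules are evaluated in the controller's local time zone unless `spec.timeZone` is set, which is exactly why a wall-clock hour that doesn't exist means a skipped run.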

Another one: deploying a self-hosted AI chatbot with a messaging sidecar. That one was... less smooth. It took around 35 commits to get right — OOM kills, authentication mode confusion, init container issues, gateway binding problems, and a long detour through model provider configurations. The agent wrote every commit. But I was the one saying "no, try this instead" and "that's still not working, check the logs." The AI was fast at iterating, but it needed a human who understood the actual runtime behavior.

That's an honest picture of the dynamic. Some tasks are five-minute slam dunks. Others are collaborative debugging sessions where the AI does the typing and you do the thinking.

The Operator: AI for Cluster Diagnostics

Building infrastructure is one half. Keeping it healthy is the other.

I built a custom AI command — a single slash command in the terminal — that acts as a cluster management assistant. When I type it and ask "what's the cluster health status?", it spawns multiple sub-agents in parallel: one checks node health, another validates the control plane, another scans pod status across all namespaces, another verifies backups, another checks certificate expiration. They all run simultaneously and the results get synthesized into a single report.

It's not just a dashboard. I can ask it questions like "why is the photo service using so much memory?" and it'll pull metrics, check logs, review resource limits, compare to historical patterns, and give me a diagnosis with recommendations. I can ask it to troubleshoot a crash-looping pod, and it'll trace through the events, check for common causes, and suggest specific fixes.

For OS-level upgrades, the agent follows a strict safety protocol: preflight checks, backup verification, one node at a time, health gates between each step, mandatory soak time between control plane nodes, and automatic rollback triggers if something goes wrong. It offers dry-run mode by default before any destructive operation.

On top of that, there's a small custom monitoring service running in the cluster that exposes health data as a JSON API. An ESP32 microcontroller with a tiny display polls this endpoint and shows real-time cluster health — a physical dashboard on my desk. When the overall health score drops below a threshold, I know to open a terminal and ask the AI what's going on.

What Works, What Doesn't

What works well:

  • Convention enforcement. The agent is better than me at following the repository's own patterns consistently. It doesn't forget network policies or skip liveness probes because it's Friday afternoon.
  • Speed of iteration. Going from intent to PR in minutes instead of half an hour of YAML wrangling.
  • Parallel diagnostics. The operator command checking six things at once instead of me running kubectl commands one by one.
  • Knowledge retention. The agent remembers past deployment patterns, known gotchas, and operational procedures across sessions.

What doesn't work well:

  • Runtime awareness. The agent can read manifests and git history, but it doesn't inherently know what's happening in the cluster right now. You have to tell it to check, or give it access to the right tools.
  • Over-engineering. Left unchecked, it'll add three layers of abstraction to a problem that needed two lines of config.
  • Novel problems. When something genuinely new goes wrong — like a NAS outage cascading into postgres startup failures and pod scheduling issues — the agent helps execute the recovery, but the human still has to understand the failure mode and direct the response.

The Human's Job Now

I still make every architectural decision. I decide what services to run, how they should be networked, what security model to use, when to upgrade. I review every PR before it merges. I'm the one who notices that DST is eating my backups in the first place.

What I don't do anymore is translate those decisions into YAML by hand. I don't copy-paste network policy boilerplate. I don't look up the kustomization schema for the fourth time this month. I don't manually run health checks across a dozen namespaces.

The AI handles the mechanical parts. I handle the parts that require judgment, context, and understanding of what the infrastructure is actually for.

Is this overkill for a home lab? Maybe. But a home lab was always about learning more than it was about the services themselves. And learning how to work effectively with AI agents on real infrastructure — with real consequences when things break — feels like exactly the right thing to be doing right now.

The YAML still gets written. I just don't write it anymore.
