The Refactored Road

Monday, April 6, 2026

Solar-Aware Appliance Scheduling with Homey — How I Built It

Since last time

A week ago I wrote about teaching my washing machine to pick its own schedule — one of my favorite smart home projects so far. At the end of that post I mentioned wanting cumulative savings tracking. That shipped in v1.3.0 about two days later — the device dashboard now shows per-schedule and lifetime savings.

A few other things landed between then and now:

v1.3.0 — Savings tracking, Frank Energie all-in pricing (spot + tax + markup + VAT), and a bunch of bug fixes including a UTC offset issue that served yesterday's prices during summer time.
v1.4.0 — Flow triggers for price changes and status changes, so you can build automations like "notify me when the price drops below X."
v1.5.0 — A "schedule cheapest start by time" action card. Instead of "within 6 hours," you say "before 14:00" and the scheduler figures out the window.
v1.5.1 — The rounding bug from the last post came back in disguise: low-power appliances like a phone charger showed EUR 0.00 savings because the rounding happened too early in the pipeline. Fixed by deferring all rounding to the display layer.

All useful, all driven by actually living with the app. But then I looked at my roof.

The solar problem

I have solar panels. When the sun is shining, I'm generating electricity that I either use or export to the grid. The feed-in tariff — what my energy company pays me for exported power — is about EUR 0.07/kWh. Meanwhile, buying from the grid costs EUR 0.15–0.35/kWh depending on the hour.

The old Power Profiler didn't know about any of this. It looked at grid prices and scheduled my washing machine at 3 AM because that's when electricity is cheapest. Technically correct. But if the sun is going to produce 3 kW of surplus at noon, running the washing machine then effectively costs me EUR 0.07/kWh (the feed-in I'm giving up) instead of EUR 0.25/kWh from the grid at 3 AM.

Simple fix, right? Just tell each device "solar hours are cheap." Except there's a catch.

I have a washing machine, a dryer, and a dishwasher. If all three independently decide that the noon slot is basically free, they'll all schedule into it. Their combined demand — maybe 4.5 kW — exceeds my 3 kW solar surplus, and the difference comes from the grid at full price. None of them accounted for the others. I call this the stampede problem.

Effective pricing

The solution is a centralized pricing service — a Virtual Energy Supplier — that computes an effective price per time slot. The effective price blends what your cheap sources (solar) and the grid contribute, weighted by how much capacity is actually available.

The math works like a waterfall:

Take all cheap sources (PV panels, future: battery), sorted by cost
Subtract base load — your fridge, router, standby devices eat into PV first
Subtract capacity already reserved by other scheduled appliances
Whatever cheap capacity is left goes to this appliance
The rest comes from the grid

The blended price is then:

effectivePrice = (cheapKwh × cheapCost + gridKwh × gridPrice) / totalKwh

For a zero-cost source like solar, the "cost" is the feed-in tariff — the opportunity cost of not exporting. This is economically correct: using your own solar isn't free, it's worth whatever you'd have earned by selling it.

One guard: the effective price never exceeds the grid price. If the math somehow produces a higher number (it shouldn't, but floating point), we cap it.
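Here is a minimal Python sketch of that waterfall for a single time slot. It is not the app's actual code: the `CheapSource` type, the function name, and the parameters are made up for illustration. It works in kW rather than kWh, which gives the same blend as long as every slot has the same length.

```python
from dataclasses import dataclass


@dataclass
class CheapSource:
    name: str
    capacity_kw: float   # forecast output for this slot
    cost_per_kwh: float  # for PV: the feed-in tariff (opportunity cost)


def effective_price(sources, base_load_kw, reserved_kw, appliance_kw, grid_price):
    """Blend cheap-source and grid cost for one appliance in one slot."""
    if appliance_kw <= 0:
        return grid_price
    # Base load and earlier reservations eat into the cheapest capacity first.
    already_taken = base_load_kw + reserved_kw
    still_needed = appliance_kw
    cheap_cost = 0.0
    for src in sorted(sources, key=lambda s: s.cost_per_kwh):
        used_by_others = min(src.capacity_kw, already_taken)
        already_taken -= used_by_others
        free = src.capacity_kw - used_by_others
        take = min(free, still_needed)
        cheap_cost += take * src.cost_per_kwh
        still_needed -= take
    # Whatever is still uncovered comes from the grid.
    blended = (cheap_cost + still_needed * grid_price) / appliance_kw
    return min(blended, grid_price)  # guard: never above the grid price
```

With 3 kW of PV at a EUR 0.07 feed-in, 0.5 kW of base load and nothing reserved, a 2 kW appliance pays the feed-in price. Reserve 2 kW for another device first and the same appliance gets only 0.5 kW of solar, so the blend moves most of the way toward the grid price.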

Knowing your base load

Here's something I didn't think about initially: not all solar production is available for scheduled appliances. Your house has a base load — the fridge cycling, the router, standby power, HVAC — that eats into PV output before anything else gets a turn.

My house draws about 200–400W overnight, 500–800W during the day. A fixed estimate of "500W base load" works okay, but it's wrong for every specific half-hour of the day.

If you have a smart meter connected to Homey (most Dutch homes have a P1 dongle), the app now automatically detects it and learns your actual base load pattern. The formula is simple:

baseLoad = gridPower + pvPower

Where gridPower is your smart meter reading (positive = importing, negative = exporting) and pvPower is your inverter reading. This works regardless of whether you're importing or exporting. The app builds a 48-slot profile (one per 30 minutes) using an exponential moving average that adapts over about two weeks of observations.

No smart meter? The app falls back to a configurable constant (default 400W). Less accurate, but the system still works.
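A rough Python sketch of that learning loop, under a few assumptions of my own: the class and method names are hypothetical, the smoothing factor is the textbook `2 / (N + 1)` for a 14-observation EMA, and negative readings are clamped to zero. Only the formula, the 48 slots, and the 400W fallback come from the description above.

```python
from datetime import datetime

SLOTS_PER_DAY = 48
DEFAULT_BASE_LOAD_W = 400.0
ALPHA = 2 / (14 + 1)  # assumed smoothing factor for a ~14-observation EMA


class BaseLoadProfile:
    def __init__(self):
        # None means "no observation yet for this half-hour"
        self.slots = [None] * SLOTS_PER_DAY

    @staticmethod
    def slot_index(ts: datetime) -> int:
        return ts.hour * 2 + ts.minute // 30

    def observe(self, ts, grid_power_w, pv_power_w):
        # grid_power_w: positive = importing, negative = exporting
        base_load = max(0.0, grid_power_w + pv_power_w)  # clamp is an assumption
        i = self.slot_index(ts)
        prev = self.slots[i]
        self.slots[i] = base_load if prev is None else prev + ALPHA * (base_load - prev)

    def estimate(self, ts):
        value = self.slots[self.slot_index(ts)]
        return DEFAULT_BASE_LOAD_W if value is None else value
```

Exporting 2,400W while the inverter reports 3,000W gives a 600W base load for that slot; slots without any observation keep returning the 400W default.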

Solcast and correction factors

For solar forecasts, I integrated Solcast. Their Hobbyist plan gives you 10 API calls per day for free — enough for two forecast fetches and two actuals fetches, with budget to spare.

The forecasts are good, but they're not perfect for your specific installation. Maybe your panels face slightly north-west and get shaded by a tree after 15:00. Solcast doesn't know that.

So the app learns. For each 30-minute slot, it compares what Solcast predicted with what your inverter actually produced:

correctionFactor = actualYield / solcastEstimate

This ratio is smoothed with an EMA (window of 14 observations, so roughly two weeks). After a few days, the app knows that Solcast overestimates your afternoon production by 15% and adjusts accordingly. The correction factor is capped at 3.0× so that a single outlier slot, where the inverter produced far more than Solcast predicted, can't make the forecast wildly optimistic.
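As a sketch, the update for one slot could look like this in Python. The function names, the `2 / (N + 1)` smoothing factor, and the minimum-estimate guard against dividing by near-zero dawn and dusk values are my assumptions; the ratio, the 14-observation window, and the 3.0 cap are from the text. I apply the cap to each observed ratio here, which is one of several reasonable places to put it.

```python
MAX_FACTOR = 3.0
ALPHA = 2 / (14 + 1)    # assumed smoothing factor for a 14-observation EMA
MIN_ESTIMATE_W = 50.0   # assumed: skip slots where the forecast is near zero


def update_correction(previous, actual_w, estimate_w):
    """Return the new correction factor for one 30-minute slot."""
    if estimate_w < MIN_ESTIMATE_W:
        return previous  # ratio would be meaningless, keep what we had
    ratio = min(actual_w / estimate_w, MAX_FACTOR)
    return previous + ALPHA * (ratio - previous)


def corrected_forecast(estimate_w, factor):
    return estimate_w * factor
```

Starting from a neutral factor of 1.0, one observation of 850W against a 1,000W estimate nudges the factor to 0.98; it takes repeated shortfalls to pull it down to 0.85.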

One thing I got wrong initially: I recorded an API call against the daily budget before the fetch completed. If the request failed due to a network error, the call was wasted. Now it only counts on success, or on a 429 rate limit response, since that means the request did reach the API and the limit has been hit.

The stampede problem

Back to our three appliances all wanting the noon slot. The Virtual Energy Supplier solves this with a reservation ledger.

When prices or forecasts update, the app triggers a full replan:

Release all existing capacity reservations
Sort all scheduled devices by deadline (earliest first — highest priority)
For each device, compute effective prices with the current reservations factored in, find the optimal window, and reserve that capacity

The key insight: device #2 sees device #1's reservation already deducted from the available surplus. If the noon slot only has 1.5 kW left after the washing machine reserved its share, the dryer computes a higher effective price for that slot and might pick a different time instead.

Sequential scheduling by deadline priority means the most urgent device gets first pick of the cheap slots. Less urgent devices adapt. No coordination protocol, no negotiation — just a deterministic ordering that produces good results.
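The replan loop can be sketched in a few lines of Python. This is a simplified model, not the app's code: slots are plain list indexes, each device draws a constant power, `surplus_kw` is assumed to already have base load subtracted, and all names and dictionary keys are invented for the example.

```python
def slot_price(free_kw, demand_kw, feed_in, grid_price):
    """Blended price for one slot given the solar that is still unreserved."""
    cheap = min(max(free_kw, 0.0), demand_kw)
    blended = (cheap * feed_in + (demand_kw - cheap) * grid_price) / demand_kw
    return min(blended, grid_price)


def replan(devices, surplus_kw, grid_prices, feed_in):
    reserved = [0.0] * len(surplus_kw)  # step 1: release all reservations
    plan = {}
    # step 2: earliest deadline first
    for dev in sorted(devices, key=lambda d: d["deadline"]):
        best_start, best_cost = None, float("inf")
        # "deadline" is the slot index by which the run must be finished
        for start in range(0, dev["deadline"] - dev["slots"] + 1):
            cost = sum(
                slot_price(surplus_kw[t] - reserved[t], dev["kw"], feed_in, grid_prices[t])
                for t in range(start, start + dev["slots"])
            )
            if cost < best_cost:
                best_start, best_cost = start, cost
        if best_start is None:
            continue  # no window fits before the deadline
        # step 3: reserve capacity so later devices see it as taken
        for t in range(best_start, best_start + dev["slots"]):
            reserved[t] += dev["kw"]
        plan[dev["name"]] = best_start
    return plan
```

With 3 kW of surplus in slot 2 only, a 2 kW washer with the earlier deadline takes the sunny slot. The 2 kW dryer then sees just 1 kW left there, computes a blended price above the cheapest grid slot, and schedules itself elsewhere.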

The dashboard

All this math is invisible to the user if they don't care about it. The scheduler just picks better times. But if you do want to see what's happening, there's a new Virtual Energy Provider device — a dashboard that shows:

Current effective price vs. grid price
Feed-in tariff
PV surplus and available capacity
Household base load
Number of scheduled appliances

It also comes with flow cards. You can trigger automations when surplus exceeds a threshold ("turn on the EV charger when surplus > 2000W"), when the effective price drops below a value, or when the spot price goes negative (yes, that happens in the Netherlands — you get paid to consume).

What I learned

The biggest lesson: solar scheduling isn't a pricing problem — it's a resource allocation problem. Giving each device a "solar discount" is simple but wrong. You need a central arbiter that knows total supply, base demand, and existing commitments.

The second lesson: base load matters more than you'd think. Without accounting for the 600W my house draws at noon, I was overestimating available PV by about 20%. That's the difference between "washing machine runs on solar" and "washing machine runs on solar plus a bit of grid at full price."

Third: correction factors are essential. Weather-based forecasts for solar are impressively good at the regional level and consistently wrong for your specific roof. The EMA correction is cheap to compute and makes the scheduling decisions noticeably better after just a few days.

v1.6.0 is currently in the Test channel. If you have a Homey Pro with solar panels and want to try it, I'd love feedback. Next up: battery storage support and getting this through Homey's app certification.

Saturday, April 4, 2026

I Built a $15 Smart Home Controller (and Why Phones Are Bad Dashboards)

ESP32 Cheap Yellow Display smart home controller showing appliance scheduling cards

In my previous post I wrote about how my washing machine and dryer pick their own schedule based on energy prices. That post was about the concept — a Homey app that finds the cheapest window to run your appliances. What I didn't mention was the thing on the kitchen wall that makes it actually usable.

Because here's the truth about smart home automation: if the only way to interact with it is through an app on your phone, it won't survive contact with your household.

The Problem With Apps

I call it the spouse test. If your partner needs to unlock their phone, find the right app, navigate to the right screen, and tap three buttons just to start the dryer at a cheap time — they're going to press the button on the dryer instead. And they'd be right to.

A physical device on the wall changes that dynamic entirely. It's always on, always showing the current state, and requires exactly one tap to do the thing. No login, no loading spinner, no "update available" popup. It's the difference between a light switch and a lighting app.

So I decided to build one. A small touch screen near the laundry area that shows energy prices, appliance status, and lets you schedule a run with a couple of taps.

$15 of Hardware, Infinite Ambition

The ESP32-2432S028R, affectionately known as the "Cheap Yellow Display" or CYD, is one of those products that shouldn't exist at its price point. For about $15, you get an ESP32 microcontroller, a 2.8-inch color TFT display with touch input, WiFi, and enough GPIO pins to feel dangerous.

The screen is 320 by 240 pixels. That's not a lot. For context, the icon for your weather app is probably bigger than this entire display. But for a single-purpose device that shows two appliance cards and a price indicator, it's plenty.

The ESP32 handles WiFi, MQTT communication with my Homey hub, NTP time sync, and over-the-air firmware updates. All on a chip that draws about half a watt. The whole thing runs off a USB-C cable.

The First Pivot

I didn't start with custom firmware. Like any reasonable person, I started with ESPHome — the YAML-based framework that lets you configure ESP32 devices without writing C++. Define your sensors, your display layout, your automations, and ESPHome generates the firmware for you.

It worked. For about two hours.

The problem was MQTT topic structure. My Homey app publishes appliance data on specific topics with JSON payloads — state, pricing, scheduling, savings data. ESPHome's MQTT integration is designed for Home Assistant's auto-discovery format, and bending it to work with custom topic structures felt like writing C++ with extra steps. Worse steps, actually, because you're debugging generated code you didn't write.

So I pivoted to PlatformIO with the Arduino framework. Full C++ control, direct access to proven libraries like TFT_eSPI for the display and espMqttClient for MQTT, and — crucially — the ability to structure my code the way the problem demanded rather than the way a YAML schema allowed.

Was it more work? Absolutely. Was it the right call? Without question. Some projects fit neatly into a configuration-driven framework. This one needed a real codebase.

Building a UI on 320 by 240 Pixels

Designing a touch interface for a 2.8-inch screen is an exercise in brutal prioritization. There's no room for nice-to-haves. Every pixel has a job.

The layout I landed on has three elements. A top bar showing the current time, date, energy price, and WiFi status. Two appliance cards — one for the washer, one for the dryer — each showing their current state with a color-coded badge, key stats, and an action button. That's it.

The action button is context-sensitive. If the appliance is idle, it says PLAN and opens a scheduling modal. If it's already scheduled, it says CANCEL. The modal lets you pick a deadline: 4, 8, 12, or 24 hours from now. Tap your choice and the system finds the cheapest slot within that window.

Color does a lot of heavy lifting on a small screen. Green badge means idle and ready. Blue means scheduled. Yellow means the system is still learning this appliance's power profile. Red means something needs attention. You can read the state of both appliances from across the room without your glasses.

One design choice I'm particularly happy with: when an appliance is scheduled, the card shows the start time, the expected price per cycle, and how much you're saving compared to running it right now. That last number — "Saving €0.38" — turns out to be incredibly motivating. It makes the abstract concept of energy optimization tangible.

The Bugs That Taught Me Things

Embedded development has a way of humbling you. Here are three problems that took longer to solve than I'd like to admit.

Dual SPI buses. The CYD board has two SPI peripherals: one for the display (ILI9341) and one for the touch controller (XPT2046). Most example code assumes they share a bus. They don't — and they can't, because the display's SPI runs at 40 MHz while the touch controller maxes out at 2.5 MHz. Sharing a bus means reconfiguring speed on every swap, which causes timing glitches. The fix was putting them on separate hardware SPI buses (HSPI and VSPI), each with their own pins. Obvious in hindsight. Took a full afternoon to figure out.

SPI reentrancy crashes. This one was subtle. MQTT messages arrive asynchronously via callbacks. My first implementation updated the display directly from the MQTT callback — parse the JSON, update the card, done. It worked great until a message arrived while the display was mid-draw. Two SPI transactions on the same bus at once: instant crash, no useful stack trace.

The solution is almost embarrassingly simple: the MQTT callback sets a boolean flag. The main loop checks the flag, and if it's set, redraws the screen. No concurrent SPI access, no crashes. It's the embedded equivalent of "don't update the DOM from a web worker" — except the consequence isn't a console warning, it's a hard reset.

Touch calibration. Every CYD unit has slightly different touch calibration values. The raw coordinates from the XPT2046 don't map 1:1 to screen pixels — they need to be scaled and offset. My first unit worked perfectly with the default calibration. My second unit registered taps about 30 pixels to the left. The fix was a calibration routine and storing per-unit values, but the debugging process involved a lot of tapping one spot and watching a dot appear somewhere else. It felt like playing Operation with an invisible board.
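The scale-and-offset mapping itself is simple arithmetic. Here it is as a Python sketch rather than firmware code; the function name and the raw min/max numbers in the usage note are hypothetical stand-ins for whatever a per-unit calibration routine measures.

```python
def make_touch_mapper(raw_min_x, raw_max_x, raw_min_y, raw_max_y, width=320, height=240):
    """Build a function that maps raw touch readings to screen pixels."""

    def to_pixels(raw_x, raw_y):
        # Linear scale and offset per axis
        x = (raw_x - raw_min_x) * (width - 1) / (raw_max_x - raw_min_x)
        y = (raw_y - raw_min_y) * (height - 1) / (raw_max_y - raw_min_y)
        # Clamp so readings just outside the calibrated range stay on screen
        return (
            min(max(round(x), 0), width - 1),
            min(max(round(y), 0), height - 1),
        )

    return to_pixels
```

A unit calibrated as `make_touch_mapper(200, 3800, 300, 3700)` maps its raw corner readings to the pixel corners (0, 0) and (319, 239). A second unit would get its own four numbers, which is all "per-unit values" needs to mean.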

The Result

The finished device sits on the wall near our washing machine. It shows the current energy price, the status of both appliances, and lets you schedule a run with two taps: hit PLAN, pick your deadline. The system finds the cheapest slot within your window and confirms the schedule.

It updates over-the-air, reconnects automatically if WiFi drops, and uses about as much power as a phone charger. The total hardware cost was under €20.

But the real measure of success isn't technical. It's that my partner uses it without thinking about it. There's no app to open, no concept to explain. Tap the card, pick when you need it done. The rest happens automatically.

Sometimes the best smart home upgrade isn't smarter software — it's a simple screen on the wall that does exactly one thing well.

Monday, March 30, 2026

My Washing Machine Picks Its Own Schedule (and Saves Money)

Power Profiler app dashboard showing appliance energy scheduling

My electricity price changes every hour. Some hours it’s 0.05 EUR/kWh. Other hours it’s 0.38. The difference between running the dishwasher at 4 PM versus 2 AM can easily be half a euro — per cycle, every day.

I have a washing machine, a dishwasher, and a dryer. They run almost daily. None of them came with a “wait for cheap power” button.

So I built one.

The setup

I run a Homey smart home hub. Each appliance is plugged into a smart plug that reports real-time power consumption. The idea was simple: if I know how much power an appliance uses and when electricity is cheapest, I can schedule the run automatically.

The result is Power Profiler, a Homey app — part of my ongoing home lab builds — that watches your appliances, learns their patterns, and triggers a flow at the cheapest moment. Here’s how it works — and what I learned building it.

Teaching a smart plug to recognize a wash cycle

A smart plug gives you one number: watts. Right now, my dishwasher is drawing 3 watts. Boring. But when it starts a cycle, that number jumps to 2,200 during the heating phase, drops to 50 during a pause, spikes again for the rinse, and eventually settles back to idle.

The challenge is knowing when a cycle starts and — more importantly — when it actually ends. My first attempt was simple: if power goes above 50 watts, the cycle started. If it drops below 50, it ended.

That lasted about one wash.

The problem is mid-cycle dips. A washing machine drops to near-zero between the wash and spin phases. The dishwasher pauses between wash and rinse. My naive detector saw these pauses and thought each one was a separate cycle.

The fix was a cooldown period: a 2-minute grace window after power drops. If the appliance kicks back in during those 2 minutes, it’s still the same cycle. If it stays quiet, the cycle is done. This one change turned messy data into clean recordings.

The finite state machine has three states: idle (waiting), active (recording), and cooldown (waiting to see if it’s really over). Simple enough to reason about, robust enough to handle the real world.
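A minimal Python sketch of that state machine, assuming the 50-watt threshold from my first attempt is kept and readings arrive with a timestamp in seconds. The class and method names are illustrative, not the app's API.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    ACTIVE = "active"
    COOLDOWN = "cooldown"


THRESHOLD_W = 50   # assumed: same threshold as the naive detector
COOLDOWN_S = 120   # the 2-minute grace window


class CycleDetector:
    def __init__(self):
        self.state = State.IDLE
        self.quiet_since = None
        self.completed_cycles = 0

    def on_reading(self, t_seconds, watts):
        running = watts > THRESHOLD_W
        if self.state is State.IDLE:
            if running:
                self.state = State.ACTIVE
        elif self.state is State.ACTIVE:
            if not running:
                self.state = State.COOLDOWN
                self.quiet_since = t_seconds
        elif self.state is State.COOLDOWN:
            if running:
                self.state = State.ACTIVE  # mid-cycle dip, still the same cycle
            elif t_seconds - self.quiet_since >= COOLDOWN_S:
                self.state = State.IDLE
                self.completed_cycles += 1
        return self.state
```

Feed it a run with a one-minute dip in the middle and it counts one cycle, not two: the dip moves it to cooldown, the next spike moves it straight back to active.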

Three cycles and you’re profiled

After three complete runs, the app has enough data to build a power profile — a minute-by-minute picture of what the appliance does during a full cycle.

And this is where the appliances show their personalities.

The dishwasher runs for about 3 hours and 20 minutes. That surprised me; I always thought of it as a quick appliance. It draws 2,200 watts in bursts to heat water, runs pumps at moderate power, pauses, rinses, and dries. Three profiled cycles averaging 199 minutes each, using about 1 kWh per run.

The washing machine is faster but more dramatic. About 2 hours and 20 minutes on a standard program, peaking at 2,285 watts when the heating element kicks in. The power curve looks like a mountain range — big spikes for heating, quiet valleys during soaking, and a final burst for the spin cycle.

The dryer is the gentle one. A steady 550 watts for the duration of the run — no dramatic spikes, just a long, patient hum. It’s still being profiled, so it’s not scheduling yet. Three cycles and it’ll join the team.

The profile captures more than just averages. For each minute, the app records the minimum, maximum, and average power across all recorded cycles. And it keeps learning — every new cycle gets added to a rolling buffer of the last 20 runs. Run your dishwasher on eco mode a few times, then switch to intensive? The profile gradually shifts to reflect your actual usage, not just the first three cycles you happened to record. Three cycles gets you started. Twenty keeps you accurate.
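In Python, the rolling buffer and the per-minute statistics could be sketched like this. The names are hypothetical, and how cycles of different lengths are handled (each minute only averages the cycles that lasted that long) is my assumption, not something the app necessarily does.

```python
from collections import deque

MAX_CYCLES = 20  # rolling buffer size
MIN_CYCLES = 3   # cycles needed before the profile is usable


class PowerProfile:
    def __init__(self):
        # deque with maxlen drops the oldest cycle automatically
        self.cycles = deque(maxlen=MAX_CYCLES)

    def add_cycle(self, watts_per_minute):
        self.cycles.append(list(watts_per_minute))

    @property
    def ready(self):
        return len(self.cycles) >= MIN_CYCLES

    def per_minute_stats(self):
        """Return a list of (min, max, average) watts, one entry per minute."""
        length = max((len(c) for c in self.cycles), default=0)
        stats = []
        for minute in range(length):
            samples = [c[minute] for c in self.cycles if minute < len(c)]
            stats.append((min(samples), max(samples), sum(samples) / len(samples)))
        return stats
```

Because the buffer is capped, a switch from eco to intensive gradually pushes the old eco recordings out, which is the "profile gradually shifts" behavior described above.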

Finding the cheapest hour

Here’s where the money comes in. In the Netherlands, day-ahead electricity prices are published every afternoon. Prices for each hour of the next day, straight from the EPEX spot market. Some hours are 5 cents per kWh. Others are 35. Occasionally they go negative — yes, you can get paid to use electricity.

The algorithm is a sliding window. Take the power profile and slide it across every possible start time within your deadline. For each position, multiply the minute-by-minute power draw by the energy price at that moment. The cheapest total wins.

Here’s what that looks like in practice, step by step:

You trigger “Schedule cheapest start within 12 hours” — say, at 8 PM. That gives the app a window from 8 PM tonight until 8 AM tomorrow.
The app grabs the power profile — for the dishwasher, that’s 199 minutes of minute-by-minute power data, learned from three previous cycles.
It grabs tonight’s energy prices — a price for each hour (or quarter-hour, depending on your provider) within the 12-hour window.
It tries every possible start time. “What if I start at 8:00 PM? 8:01? 8:02?” For each one, it overlays the power profile onto the price timeline and calculates the total cost: watts times price, minute by minute, summed up.
The cheapest start time wins. Say starting at 1:15 AM costs EUR 0.18, while starting at 8 PM would have cost EUR 0.34. The app picks 1:15 AM.
A timer is set. The app counts down to 1:15 AM. When it fires, it triggers your Homey flow — which turns on the smart plug, and the dishwasher starts.

The dishwasher’s cycle of roughly 3 hours and 20 minutes needs a big window; it can’t just squeeze into any single cheap hour. The algorithm has to find the best stretch of hours, weighing the expensive heating minutes against the cheaper idle phases. The washing machine’s cycle of about 2 hours and 20 minutes has a bit more flexibility, but its high-power heating spikes mean the price during those specific minutes matters a lot.

The math is almost embarrassingly simple. No machine learning, no neural networks. Just a loop that tries every start time and picks the cheapest one. It runs in milliseconds. Sometimes brute force is the right answer.
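For the curious, here is that loop as a Python sketch. The function name and arguments are illustrative, and I assume the whole cycle has to finish inside the price window.

```python
def cheapest_start(profile_watts, prices_per_kwh, slot_minutes=60):
    """Brute-force the cheapest start minute.

    profile_watts: average watts for each minute of the cycle.
    prices_per_kwh: one price per slot, the first slot starting now.
    Returns (start_minute, total_cost), or (None, inf) if the cycle doesn't fit.
    """
    horizon = len(prices_per_kwh) * slot_minutes
    best_start, best_cost = None, float("inf")
    for start in range(0, horizon - len(profile_watts) + 1):
        cost = 0.0
        for offset, watts in enumerate(profile_watts):
            price = prices_per_kwh[(start + offset) // slot_minutes]
            cost += watts / 1000 / 60 * price  # kWh used in this minute times price
        if cost < best_cost:
            best_start, best_cost = start, cost
    return best_start, best_cost
```

A flat 1,000-watt, one-hour profile against hourly prices of 0.30, 0.10, and 0.20 lands exactly on the start of the second hour at a cost of EUR 0.10. A 199-minute profile over a 12-hour window is about 500 start times with 199 multiplications each, which is why this runs in milliseconds.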

The things that bit me

The algorithm worked quickly. Getting the details right took longer. During testing, I displayed estimated cost with two decimal places — sensible for money, except when the dishwasher runs at 2 AM during cheap hours and the total cost is EUR 0.003, which rounds to EUR 0.00. Looks like the calculation is broken. Then there was the schedule that showed “Tomorrow, 23:00” for tonight’s run — a timezone comparison bug where UTC and local time disagreed about which day it was. Three lines to fix, two hours to find. The kind of bugs that only show up at midnight, in a timezone you didn’t consider.

What happens when you press “go”

Here’s the actual user experience.

You install the app, pick your energy provider (EasyEnergy, EnergyZero, or one of three others), and add a device. The app shows all your smart plugs — pick the one under your dishwasher. Done. You now have a “Dishwasher Profiler” on your Homey dashboard.

For the next few days, just use your appliances normally. The app watches. After three cycles, your profile is ready. The dashboard shows average cycle duration, energy per run, and a cycle count.

Now you create a Homey flow: “Schedule cheapest start within 12 hours.” The app slides your profile across tonight’s prices and picks the winner. Your dashboard shows “Next start: 02:15” and “Estimated cost: EUR 0.1825.”

At 2:15 AM, the app fires a trigger. Your flow turns on the dishwasher. The dishwasher starts. You’re asleep.

Is it life-changing money? No. But it’s money I save by doing absolutely nothing. The app watches, learns, waits, and acts. I just load the dishes.

What I’d do differently

If I started over, I’d track cumulative savings from day one. Right now, users can see their total energy and cost, but not “how much you saved compared to running at peak.” That comparison would make the value instantly visible. This will be a future improvement.

But the core idea — watch, learn, schedule — holds up. It’s the kind of automation that disappears into the background, which is exactly where the best smart home tech should be.

Update: I've since added solar-aware scheduling to Power Profiler — it now accounts for PV production, household base load, and prevents appliances from fighting over the same sunny slots. Read how I built it.

Monday, March 16, 2026

My Solar Panels Only Work 30% — Why Air Conditioning Beats a Battery

Solar panel energy analysis showing production vs consumption mismatch

The Measuring Part

Last year I wired up my Dutch household with more sensors than a hospital ICU. A HomeWizard P1 meter on the smart meter, a separate kWh meter on the solar inverter, a Homey Pro hub tracking every smart plug, and an InfluxDB instance running on Kubernetes because apparently I can't do anything casually.

The house: a regular family home in Limburg, Netherlands. Solar panels on the roof that produce roughly 1,400 kWh a year, a gas boiler for heating and hot water, and roughly €3,100 per year in energy costs. After twelve months of collecting data at 10-second intervals, I sat down to figure out where the money was actually going.

The answer was not what I expected.

The 30% Problem

Here's the uncomfortable truth about my solar panels: only 30% of the electricity they generate is actually used by my household. The other 70% gets exported to the grid.

The reason is timing. My panels produce peak power between 11:00 and 15:00 — up to 1,169 watts on a sunny day. But my household's peak consumption is between 16:00 and 21:00, hitting 1,600-2,500 watts when everyone's home, cooking, and running appliances. By then, the sun has moved on.

During the night, we're pulling 300-540 watts of baseload from the grid with zero solar contribution. In the morning, we're consuming 270-400 watts while the panels are barely waking up.

In the Netherlands, this hasn't mattered much until now. A policy called saldering (net metering) lets you offset exported kWh against imported kWh at the same rate. Export a kilowatt-hour at noon, get credited the full €0.32 when you import one at dinner time. Free battery, essentially.

That ends January 1, 2027. After that, exported power earns roughly €0.09/kWh. Imported power still costs €0.32. Same solar panels, same house, but now that 70% export becomes a real problem.

Everyone Says: Buy a Battery

The obvious fix is a home battery. Store surplus solar during the day, use it in the evening. I looked at the Marstek Venus E 3.0: 5 kWh capacity, lithium iron phosphate, €1,224.

The math:

Store ~3.8 kWh/day of solar surplus
90% round-trip efficiency, so 3.4 kWh usable
Shift from €0.09 export to €0.22 evening import (on dynamic tariff)
Add some price arbitrage on cheap night hours
Total savings: ~€149/year
Payback: just over 8 years
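The parts of that list that can be checked with plain arithmetic, as a small Python sketch. The annual savings figure is taken as given, since it includes seasonal effects and arbitrage that a one-liner can't reproduce.

```python
surplus_kwh_per_day = 3.8
round_trip_efficiency = 0.90
battery_price_eur = 1224
annual_savings_eur = 149  # total from the list above, taken as given

usable_kwh_per_day = surplus_kwh_per_day * round_trip_efficiency
gain_per_shifted_kwh = 0.22 - 0.09  # evening import avoided minus export forgone
payback_years = battery_price_eur / annual_savings_eur

print(f"usable: {usable_kwh_per_day:.1f} kWh/day")
print(f"gain per shifted kWh: EUR {gain_per_shifted_kwh:.2f}")
print(f"payback: {payback_years:.1f} years")
```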

Fine. Not terrible. But not exactly thrilling either, for a device with a 10-year warranty. I kept digging.

The Counterintuitive Fix

What if instead of storing solar energy, I could just use more of it — right when it's being generated?

This is where air conditioning enters the picture, and where most people's eyebrows go up.

Think about when you want cooling. Hot, sunny days. Now think about when solar panels produce the most. Hot, sunny days. The correlation between cooling demand and solar generation is almost perfect. June through August, my AC would consume roughly 160 kWh — nearly 100% coincident with peak PV output.

But here's the number that actually matters. When AC uses 1 kWh of solar electricity for cooling at a SEER of 6.1, that's 6.1 kWh of cooling delivered. Useful, but cooling doesn't displace another fuel. The real magic happens with heating.

Wait, Heating With AC?

Modern split-unit air conditioners aren't the window-rattling boxes from the '90s. A unit like the TCL Elite F2 is a full heat pump: it heats down to -15°C and carries a heating efficiency (SCOP) of 4.0. That means for every 1 kWh of electricity, it delivers 4 kWh of heat.

Compare that to my gas boiler, which converts 1 m³ of gas (9.77 kWh) into about 8.8 kWh of heat at 90% efficiency. At current prices:

Gas heating: 1 m³ × €1.54 = €1.54 for 8.8 kWh of heat → €0.175/kWh_thermal
AC heating (dynamic tariff): 1 kWh × €0.22 = €0.22 for 4.0 kWh of heat → €0.055/kWh_thermal
AC heating (on solar): 1 kWh × €0.00 = free for 4.0 kWh of heat → €0.00/kWh_thermal

Even buying electricity at dynamic grid rates, the AC heats at a third of the cost of gas. On solar power, it's free heat.
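The same comparison as a few lines of Python, using only the prices and efficiencies quoted above (the function name is just for illustration):

```python
def cost_per_kwh_heat(price_paid_eur, heat_delivered_kwh):
    """Cost of one kWh of delivered heat."""
    return price_paid_eur / heat_delivered_kwh


# 1 m3 of gas holds 9.77 kWh; the boiler turns 90% of that into heat
gas = cost_per_kwh_heat(1.54, 9.77 * 0.90)
# 1 kWh of grid electricity at SCOP 4.0 delivers 4 kWh of heat
ac_grid = cost_per_kwh_heat(0.22, 4.0)

print(f"gas: EUR {gas:.3f}/kWh heat")
print(f"AC on grid power: EUR {ac_grid:.3f}/kWh heat")
print(f"AC cost as a share of gas: {ac_grid / gas:.0%}")
```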

Spring and fall are the sweet spot. March through May and September through October, outdoor temperatures hover between 5°C and 15°C — exactly where heat pump efficiency is highest (COP 4.0-4.5) and where there's still meaningful solar generation. About 70% of spring/fall AC consumption can run directly on PV. In those months, I'm effectively converting surplus solar into free warmth.

Winter is a different calculation. November through February there's barely any solar to work with, and the COP drops to 2.2-3.5 as temperatures dip. But even at COP 2.5 on grid electricity at €0.22/kWh, that's €0.088/kWh of heat — still half the price of gas. The AC won't replace the boiler in a cold snap, but it can handle the mild winter days (above 5°C) and shave 45-55% off annual gas consumption.

The branding is the problem, really. Call it "air conditioning" and people think summer luxury. Call it what it is — a multi-split heat pump that also cools — and suddenly it makes sense year-round.

Now compare the value of a single solar kilowatt-hour through each path:

Battery route: 1 kWh → battery (90% efficient) → 0.9 kWh evening electricity. Value: €0.20
AC heating route: 1 kWh → heat pump (spring/fall COP 4.5) → 4.5 kWh heat, replacing 0.46 m³ gas. Value: €0.71

AC extracts 3.5× more value from the same solar kilowatt-hour. Self-consumption jumps from 30% to 47%.
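Worked out in Python, with the numbers from earlier in this post. As the 0.46 m³ figure implies, the gas volume is computed from the gas energy content (9.77 kWh per m³) without the boiler's 90% efficiency, so the AC figure is if anything slightly conservative.

```python
evening_price = 0.22   # EUR/kWh on the dynamic tariff
gas_price = 1.54       # EUR per m3
gas_kwh_per_m3 = 9.77

# Battery route: 1 kWh in, 90% comes back out in the evening
battery_value = 1.0 * 0.90 * evening_price

# AC route: 1 kWh in, 4.5 kWh of heat out, replacing gas
heat_kwh = 1.0 * 4.5
gas_replaced_m3 = heat_kwh / gas_kwh_per_m3
ac_value = gas_replaced_m3 * gas_price

print(f"battery: EUR {battery_value:.2f}")
print(f"AC heating: EUR {ac_value:.2f} ({gas_replaced_m3:.2f} m3 gas)")
print(f"ratio: {ac_value / battery_value:.1f}x")
```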

Why a Hybrid Heat Pump Makes It Worse

Here's where it gets properly counterintuitive. If I told you I was considering a €6,000 hybrid heat pump (Daikin Altherma, €3,875 after subsidy), you'd nod approvingly. Responsible. Green. Sensible.

Except for the timing problem.

A hybrid heat pump for central heating consumes most of its electricity between October and March — roughly 1,100 kWh in the cold months. My solar panels produce about 250 kWh in those same months. That's 850 kWh of extra grid imports, piled onto the season when electricity is already most expensive.

It saves gas, absolutely. Around 450 m³/year, worth €390-433 in savings. But it actively deepens the solar mismatch. You're swapping one fuel bill for another while making your solar panels even less relevant.

The multi-split AC, by contrast, spreads its load across the year. Summer cooling consumes power when solar is abundant. Spring and fall heating hits the PV sweet spot. Even winter heating at least benefits from dynamic tariff optimization. The hybrid heat pump's consumption pattern is the wrong shape for a solar household.

The Actual Plan

After staring at spreadsheets long enough, the boring-but-correct answer emerged. Not one silver bullet, but three steps in the right order:

AC this spring (€3,500-4,000) — Start using surplus solar for heating and cooling immediately. Gas bill drops 45%, self-consumption jumps to 47%.
Dynamic electricity tariff in October 2026 (€0) — When my fixed contract ends, switch to a dynamic rate. In all twelve months of data I analyzed, dynamic was cheaper than my fixed rate. Saves ~€441/year for zero investment.
Home battery in early 2027 (€1,224) — Now the battery math improves. It captures what AC couldn't use, and does arbitrage on dynamic pricing. Pushes self-consumption to 55%.

Total investment: ~€5,200. Annual savings: €900-1,000. Payback in about 5 years. Annual energy costs drop from €3,100 to roughly €2,100-2,200.

The hybrid heat pump? Maybe in 2028 or 2029, once the easier wins are captured and gas prices give a clearer signal. It's not a bad investment — it's just not the first investment.

What I Actually Learned

The lesson isn't "buy an AC." The lesson is that twelve months of real data told a completely different story than the one I'd assumed walking in. I was ready to order a battery. The data said: wait, there's a better sequence.

Good measurement beats good intuition. The best energy investment isn't always the one that sounds greenest, and the right order matters more than the right equipment.

If you've got solar panels and a timing problem, run the numbers on your own household before listening to anyone — including me. Your mismatch might look nothing like mine. That's the whole point of measuring.

Monday, March 2, 2026

Building a NAS With AI: What Claude Got Right, Wrong, and Hilariously Confused About

Custom 3D-printed NAS enclosure with ZimaBlade and dual hard drives

Last month, I built a home NAS infrastructure from scratch: two TrueNAS servers, ZFS mirrored pools, automated replication, encrypted cloud backups, 28 monitoring checks, and a failover system that can switch servers in five minutes. The whole thing — design, documentation, configuration, migration — was done in collaboration with an AI coding assistant.

It was genuinely, surprisingly useful. It also tried to make my backup server unwritable, which would have broken the entire replication chain. So let's talk about what actually happened.

What I Built (and Why)

I wanted my family's data — photos, documents, project files — on infrastructure I control. Not on Google Drive, not on iCloud, not on any service that can change its terms, raise its prices, or hand my data to a government I didn't elect. That ruled out cloud-only storage. It also made me skeptical of vendor-locked NAS solutions like Synology, where you're one firmware update away from features disappearing or subscriptions appearing.

So I built it myself, using two small single-board servers (a ZimaBlade 7700 and a ZimaBoard 832) running TrueNAS — open-source, ZFS-based, no license fees. I already had one of the boards and one hard drive from a previous setup. The new hardware — a ZimaBlade, three 4TB drives — came to around €400. That's less than a single Synology DS224+ with equivalent drives, and I got two fully redundant NAS units out of it. A single Synology gives you one copy of your data in one location — you'd need a second unit plus a cloud subscription to achieve 3-2-1. I got both local copies covered for less than most people spend on their first NAS.

Both boards are tiny — credit-card-sized — so off-the-shelf cases weren't an option. I designed custom enclosures in FreeCAD that house a board and two 3.5" drives each, and printed them on my 3D printer. It's not the prettiest rack in the world, but it fits neatly in a utility closet and keeps everything ventilated.

The project follows the 3-2-1 backup rule: three copies of data, two different media, one off-site. Alpha is the primary NAS serving files via SMB and NFS to a Kubernetes cluster. Beta receives ZFS replication every six hours. Beta then uploads encrypted backups to a European cloud provider daily. Three copies, two locations, one off-site — with the off-site copy encrypted before it leaves my network.

There are 16 Architecture Decision Records documenting choices like "why are the pools unencrypted?" and "why does cloud backup run from Beta instead of Alpha?" There are 24 Standard Operating Procedures covering everything from drive replacement to disaster recovery. An 11-phase migration plan moved ~565 GB from the old NAS to the new setup.

All of that was produced in conversation with Claude Code, Anthropic's AI coding assistant. Not by typing prompts into a chat window — by working in a terminal where the AI could read files, run commands, SSH into the NAS boxes, and execute ZFS operations directly.

What AI Got Right

Documentation was the killer feature. I'm an engineer who knows what he wants but doesn't love writing 24 SOPs. Each procedure has YAML frontmatter (category, trigger, risk level, approval requirements), numbered steps, verification checks, and rollback instructions. Claude generated these from our conversations about how each operation should work. I described the intent, reviewed the output, and corrected the details. What would have taken me a weekend of reluctant writing happened naturally as a byproduct of the design process.

The Architecture Decision Records were similar. I'd explain a trade-off — "should I encrypt the ZFS pools or just encrypt the cloud backup?" — and get back a structured ADR with considered options, pros/cons, and a clear decision rationale. Sixteen of these, each capturing a decision I'd otherwise have kept in my head and forgotten the reasoning for six months later.

Architecture review caught real gaps. During one session, Claude pointed out that my monitoring system (Uptime Kuma, running on the Kubernetes cluster) depends on NFS storage from the very NAS it's monitoring. If Alpha dies, Kubernetes loses storage, Uptime Kuma goes down, and nobody gets alerted. I knew this intellectually, but having it surfaced during design — not during a 2 AM outage — meant I could add TrueNAS native email alerts as a fallback layer before it mattered.
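That kind of circular dependency can be checked mechanically once the dependencies are written down. A minimal sketch, assuming a hand-maintained dependency map; the component names and the structure of `depends_on` are illustrative, not taken from the actual setup:

```python
# Hypothetical dependency map: each component lists what it needs to run.
depends_on = {
    "uptime-kuma": ["kubernetes"],
    "kubernetes": ["alpha-nfs"],
    "alpha-nfs": ["alpha"],
    "alpha": [],
}

# Hypothetical map of which monitor watches which target.
monitors = {"uptime-kuma": "alpha"}


def transitive_deps(component, graph):
    """Return everything `component` needs, directly or indirectly."""
    seen = set()
    stack = list(graph.get(component, []))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(graph.get(dep, []))
    return seen


def blind_monitors(monitors, graph):
    """Monitors that go down together with the thing they watch."""
    return [m for m, target in monitors.items()
            if target in transitive_deps(m, graph)]


print(blind_monitors(monitors, depends_on))
```

Any monitor this reports needs a second alerting path that does not share the dependency, which is the role the TrueNAS native email alerts play here.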

Another catch: Beta wasn't actually ready for failover. The services and snapshot tasks hadn't been pre-configured. It would have taken 30-60 minutes of manual configuration during an actual failure. Claude flagged this during verification, we pre-configured everything in disabled state, and failover time dropped to five minutes.

Migration execution was where things got wild. Claude had SSH access to both NAS units via MCP (Model Context Protocol) servers. It could run zpool status, create datasets, transfer data, configure NFS exports, and verify replication — all while tracking progress in a migration document. Eleven phases — hardware build over a week, then software configuration through migration in a single intense day. The AI handled the tedious parts (creating 27 ZFS datasets with consistent properties, generating 60+ NFS subdirectory exports) while I made the judgment calls (when to cut over, whether the NFS mount errors were acceptable).

What AI Got Wrong

Here's where it gets interesting.

The readonly incident. During failover procedure design, Claude suggested setting zfs set readonly=on on Beta's datasets to "protect" the replication targets from accidental writes. Sounds reasonable, right? ZFS readonly blocks all writes — including zfs recv, the command that receives replication data. If I'd applied this, Beta would have silently stopped accepting replicas while looking perfectly healthy. My backup copy would have grown staler by the day with no alerts.

This is the kind of mistake that's terrifying precisely because it's plausible. An engineer who doesn't know ZFS internals might accept that suggestion. The fix was simple (Beta's write protection comes from not having active SMB/NFS shares, not from a ZFS property), but finding this in production instead of in review would have been ugly.
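A replica that silently stops updating is detectable if something compares the newest snapshot on the backup against the replication interval. A minimal sketch, assuming the input is the tab-separated output of `zfs list -H -p -t snapshot -o name,creation` run on Beta; the dataset names, the sample output, and the two-interval threshold are assumptions for illustration:

```python
from datetime import datetime, timedelta, timezone

REPLICATION_INTERVAL = timedelta(hours=6)  # replication cadence from the post


def newest_snapshot(zfs_output):
    """Parse 'name<TAB>epoch' lines and return (name, creation) of the newest."""
    newest = None
    for line in zfs_output.strip().splitlines():
        name, epoch = line.split("\t")
        created = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
        if newest is None or created > newest[1]:
            newest = (name, created)
    return newest


def is_stale(zfs_output, now, max_missed=2):
    """True if no snapshot arrived within `max_missed` replication intervals."""
    newest = newest_snapshot(zfs_output)
    if newest is None:
        return True
    return now - newest[1] > max_missed * REPLICATION_INTERVAL


# Hypothetical sample output: two snapshots, six hours apart.
sample = "tank/photos@auto-1\t1772400000\ntank/photos@auto-2\t1772421600\n"
last = datetime.fromtimestamp(1772421600, tz=timezone.utc)

print(is_stale(sample, last + timedelta(hours=7)))   # one run late: tolerated
print(is_stale(sample, last + timedelta(hours=13)))  # two runs missed: stale
```

Checking snapshot age instead of pool health is the point: a backup that has stopped receiving can still report a perfectly healthy pool.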

The Time Machine saga. Claude helped set up a Time Machine dataset on the NAS for Mac backups, complete with snapshot tasks. Then I realized a local USB drive was simpler for my one-laptop setup. Removing it took three fix commits: removing the snapshot tasks, destroying the dataset, and then chasing down Time Machine references scattered across multiple SOPs and documentation files. The AI was solving the general problem ("Mac users need backups") instead of my specific problem ("one Mac laptop that's already backed up to iCloud").

Ghost references from training data. Nextcloud kept appearing in suggestions and documentation, even though I never planned to run Nextcloud. Claude's training data is full of home lab setups that use it, so it kept assuming I would too. Similarly, I'd occasionally get suggestions referencing FreeBSD behavior: reasonable for older TrueNAS versions, but wrong for the Debian-based TrueNAS 25.10. The NFS subdirectory export issue during migration (75 minutes of unexpected downtime) happened partly because the AI initially suggested approaches that work on FreeBSD but not on Linux.

Over-engineering suggestions. An AI assistant has no sense of "this is a home lab with two users." It treats your project with the same architectural rigor it would apply to a production system serving thousands. I had to repeatedly push back on suggestions for automated failover (I have two NAS boxes in the same house — I can walk to them), complex monitoring dashboards (Uptime Kuma is fine), and elaborate access control (it's my family's files).

The Guardrails That Saved Me

After the readonly incident, I got serious about constraints.

A safety rules file lives in the repository: never run zfs destroy without approval, never modify Beta's datasets directly, never change the primary NAS IP without coordinating with the Kubernetes cluster. Claude reads this file at the start of every conversation and respects it. It's the equivalent of putting "DANGER: HIGH VOLTAGE" signs on the electrical panel — obvious to humans, genuinely useful for AI.

ADRs as guardrails, not just documentation. The instruction "before proposing changes, check whether an ADR covers that area" turns 16 decision records into active constraints. When Claude suggests encrypting the ZFS pools, ADR-0001 stops it. When it suggests running cloud backup from Alpha, ADR-0002 stops it. The decisions compound — each one narrows the space of bad suggestions.

A memory system records mistakes and lessons across conversations. The readonly bug is in there. The Time Machine saga is in there. "CRC errors on Beta sdb suggest a possible SATA cable issue" is in there. This means the AI doesn't repeat known mistakes and carries forward context that would otherwise be lost between sessions.

Explicit approval gates for anything destructive. The AI can read pool status and list snapshots all day long, but it cannot destroy a dataset or roll back a snapshot without me typing "yes." This isn't a technical limitation — it's a convention enforced by the safety rules and, honestly, by me paying attention.
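The gate described here is a convention, not a technical control. If one wanted to back the convention with code, a wrapper around command execution could look something like this. A minimal sketch; the list of destructive subcommands and the function names are assumptions, not the contents of the actual rules file:

```python
import shlex

# Assumed set of destructive subcommands; a real rules file may differ.
DESTRUCTIVE = {("zfs", "destroy"), ("zfs", "rollback"), ("zpool", "destroy")}


def needs_approval(command):
    """True if the command's first two words match a destructive pair."""
    words = shlex.split(command)
    return tuple(words[:2]) in DESTRUCTIVE


def gate(command, ask=input):
    """Return True if the command may run; destructive ones need a typed 'yes'."""
    if not needs_approval(command):
        return True
    answer = ask(f"About to run: {command!r}. Type 'yes' to continue: ")
    return answer.strip().lower() == "yes"


print(gate("zpool status"))                              # read-only: allowed
print(gate("zfs destroy tank/old", ask=lambda _: "no"))  # refused
```

A prefix match like this is easy to bypass (a `sudo` in front is enough), so it complements a human paying attention and does not replace one.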

Would I Do It Again?

Without hesitation. But with different expectations than when I started.

AI is excellent at:

Documentation — generating structured, consistent docs from conversational design sessions
Consistency checking — finding mismatches between SOPs, ADRs, and README sections
Tedious execution — creating 27 datasets, 60 NFS exports, 28 monitoring checks without typos
Gap detection — "you have a failover procedure but no failback procedure" is exactly the kind of thing humans miss

AI is unreliable at:

Platform-specific behavior — ZFS on FreeBSD vs. Linux, TrueNAS GUI limitations, hardware quirks
Knowing when to stop — it will happily over-engineer a home lab into a production data center
Physical context — it doesn't know your NAS boxes are in the same house, your only Mac is already backed up, or that you have two users not two thousand

The key insight I keep coming back to: AI doesn't replace knowing your system — it amplifies what you already know. I understood ZFS, the 3-2-1 backup strategy, and what my family actually needs from a NAS. Claude helped me document that understanding, catch my blind spots, and execute the boring parts at speed. When it suggested something wrong, I caught it because I understood the domain.

If I'd been a complete beginner trusting the AI to design my backup strategy, the readonly bug would have made it to production. The Time Machine detour would have stayed. The FreeBSD assumptions would have caused more than 75 minutes of downtime.

The best use of AI in infrastructure isn't "build this for me." It's "I know what I want — help me document it, verify it, and catch what I missed." That framing kept the project on track through 82 commits, 16 decisions, 24 procedures, and one very educational mistake about ZFS write semantics.

Monday, February 16, 2026

I Don't Write YAML Anymore: How an AI Agent Runs My Home Lab

AI agent managing Kubernetes home lab infrastructure through GitOps

The Problem With YAML

I run over 20 services on a Kubernetes cluster at home. Photo management, vehicle tracking, DNS filtering, home automation, monitoring, a self-hosted AI chatbot, a public website — the usual collection that starts with "I'll just run one thing" and ends with a four-node cluster.

For a long time, the bottleneck wasn't Kubernetes itself. It was me, writing YAML. Every new service meant a deployment, a service, an ingress, network policies, persistent volumes, backup cronjobs, maybe a HorizontalPodAutoscaler. All following the same conventions. All tedious to get right. All boring after the first dozen times.

So I stopped writing it. I taught an AI agent to do it instead.

The Pipeline: Git In, Cluster Out

Before we get to the AI part, the foundation matters. The entire cluster is managed through GitOps — every manifest lives in a Git repository, and a GitOps controller watches for changes and reconciles the cluster state automatically.

The pipeline looks like this:

A change gets pushed to a feature branch
A pull request is opened
CI runs schema validation and a Kubernetes linter
The PR gets reviewed and squash-merged
The GitOps controller picks up the change and applies it within minutes

On top of that, a dependency bot watches for container image updates and opens PRs automatically. Minor and patch updates get auto-merged after CI passes. I don't touch those at all.
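The auto-merge rule amounts to classifying each version bump. A minimal sketch, assuming plain `MAJOR.MINOR.PATCH` image tags; real dependency bots handle many more tag formats, and the function names here are hypothetical:

```python
def parse(tag):
    """Split 'v1.4.2' or '1.4.2' into a tuple of three integers."""
    major, minor, patch = tag.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def bump_type(old, new):
    """Classify the update from `old` to `new` as major, minor, or patch."""
    o, n = parse(old), parse(new)
    if n[0] != o[0]:
        return "major"
    if n[1] != o[1]:
        return "minor"
    return "patch"


def auto_merge(old, new, ci_passed):
    """Minor and patch bumps merge automatically once CI is green."""
    return ci_passed and bump_type(old, new) in {"minor", "patch"}


print(bump_type("1.4.2", "1.5.0"), auto_merge("1.4.2", "1.5.0", ci_passed=True))
print(bump_type("1.4.2", "2.0.0"), auto_merge("1.4.2", "2.0.0", ci_passed=True))
```

Major bumps fall through to a normal reviewed PR, which keeps breaking changes in front of a human.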

The key insight: once you trust the pipeline, you don't care who writes the YAML. Human or AI — the same CI gates apply, the same review process, the same GitOps reconciliation. The pipeline is the safety net.

The Builder: AI as Infrastructure Engineer

Here's what deploying a new service looks like now. I open a terminal, start an AI coding agent, and say something like:

Deploy a time-series database exposed to the LAN on port 8086
with persistent storage and automated first-boot setup.

The agent reads the existing repository — the directory structure, the naming conventions, how other services handle networking, storage, and security policies. Then it writes a complete set of manifests: namespace, deployment, service, persistent volume claims, network policies, and a kustomization file that ties it all together. It opens a PR with a clear description of what it did and why.

I review the PR. CI has already validated the schemas and linting. If it looks good, I merge. Five minutes later, the service is running.

What makes this work isn't magic — it's conventions. The repository has consistent patterns: every app lives in its own directory, network policies follow a default-deny model, secrets use a specific format, kustomizations follow the same template. The agent picks up on these patterns and replicates them. It's essentially doing what I would do, minus the part where I mistype an indentation level and spend twenty minutes debugging.

Real Examples

A recent favorite: I discovered that four backup cronjobs scheduled between 2:00 and 2:59 AM were silently skipping during the spring DST transition. That hour simply doesn't exist on the last Sunday of March. I told the agent to fix it. It shifted all four schedules to 1:00 AM, kept the 15-minute stagger between jobs, wrote a commit message explaining the DST edge case, and opened a PR. Total time from "huh, backups didn't run" to merged fix: about three minutes.
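The missing hour is easy to demonstrate. A minimal sketch using Python's standard `zoneinfo`; the `Europe/Amsterdam` time zone and the individual job times are assumptions, since the post only says four jobs ran between 2:00 and 2:59 with a 15-minute stagger:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

TZ = ZoneInfo("Europe/Amsterdam")  # assumed time zone for the cluster


def exists_locally(naive, tz=TZ):
    """A wall-clock time exists if it survives a round trip through UTC."""
    local = naive.replace(tzinfo=tz)
    round_trip = local.astimezone(timezone.utc).astimezone(tz)
    return round_trip.replace(tzinfo=None) == naive


# Last Sunday of March 2026: clocks jump from 02:00 straight to 03:00.
for hour in (1, 2):
    for minute in (0, 15, 30, 45):
        t = datetime(2026, 3, 29, hour, minute)
        status = "runs" if exists_locally(t) else "skipped"
        print(t.strftime("%H:%M"), status)
```

Every slot in the 01:00 hour survives the round trip and every slot in the 02:00 hour does not, which is why moving the jobs one hour earlier avoids the gap.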

Another one: deploying a self-hosted AI chatbot with a messaging sidecar. That one was... less smooth. It took around 35 commits to get right — OOM kills, authentication mode confusion, init container issues, gateway binding problems, and a long detour through model provider configurations. The agent wrote every commit. But I was the one saying "no, try this instead" and "that's still not working, check the logs." The AI was fast at iterating, but it needed a human who understood the actual runtime behavior.

That's an honest picture of the dynamic. Some tasks are five-minute slam dunks. Others are collaborative debugging sessions where the AI does the typing and you do the thinking.

The Operator: AI for Cluster Diagnostics

Building infrastructure is one half. Keeping it healthy is the other.

I built a custom AI command — a single slash command in the terminal — that acts as a cluster management assistant. When I type it and ask "what's the cluster health status?", it spawns multiple sub-agents in parallel: one checks node health, another validates the control plane, another scans pod status across all namespaces, another verifies backups, another checks certificate expiration. They all run simultaneously and the results get synthesized into a single report.
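The fan-out-and-synthesize pattern is not specific to AI agents. A minimal sketch with a thread pool, where each check is a stand-in function returning canned results; the check names mirror the list above, but the functions and the report format are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor


# Stand-ins for real checks; each returns (ok, detail).
def check_nodes():
    return True, "4/4 nodes Ready"


def check_control_plane():
    return True, "API server healthy"


def check_pods():
    return False, "1 pod in CrashLoopBackOff"


def check_backups():
    return True, "last backup succeeded"


def check_certificates():
    return True, "no certificate expires within 30 days"


CHECKS = {
    "nodes": check_nodes,
    "control-plane": check_control_plane,
    "pods": check_pods,
    "backups": check_backups,
    "certificates": check_certificates,
}


def run_health_report(checks=CHECKS):
    """Run all checks concurrently and merge the results into one report."""
    with ThreadPoolExecutor(max_workers=len(checks)) as pool:
        futures = {name: pool.submit(fn) for name, fn in checks.items()}
        results = {name: f.result() for name, f in futures.items()}
    healthy = all(ok for ok, _ in results.values())
    return {"healthy": healthy, "checks": results}


report = run_health_report()
print("HEALTHY" if report["healthy"] else "DEGRADED")
for name, (ok, detail) in report["checks"].items():
    print(f"  {'ok  ' if ok else 'FAIL'} {name}: {detail}")
```

Because the checks are independent and mostly wait on the network, running them concurrently makes the report take roughly as long as the slowest check, not the sum of all of them.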

It's not just a dashboard. I can ask it questions like "why is the photo service using so much memory?" and it'll pull metrics, check logs, review resource limits, compare to historical patterns, and give me a diagnosis with recommendations. I can ask it to troubleshoot a crash-looping pod, and it'll trace through the events, check for common causes, and suggest specific fixes.

For OS-level upgrades, the agent follows a strict safety protocol: preflight checks, backup verification, one node at a time, health gates between each step, mandatory soak time between control plane nodes, and automatic rollback triggers if something goes wrong. It offers dry-run mode by default before any destructive operation.
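The shape of that protocol can be sketched as a loop with gates. This is a minimal sketch, not the agent's actual procedure: `upgrade`, `healthy`, and `rollback` are hypothetical callables supplied by the caller, backup verification is left out, and the soak time is a placeholder:

```python
import time


def rolling_upgrade(nodes, upgrade, healthy, rollback,
                    soak_seconds=0, dry_run=True):
    """Upgrade nodes one at a time, stopping at the first failed health gate.

    Returns the list of nodes that were upgraded successfully.
    """
    done = []
    if not all(healthy(n) for n in nodes):  # preflight: start from a good state
        return done
    for node in nodes:
        if dry_run:
            print(f"[dry-run] would upgrade {node}")
            continue
        upgrade(node)
        if not healthy(node):               # health gate after each node
            rollback(node)
            break
        done.append(node)
        time.sleep(soak_seconds)            # soak before touching the next node
    return done
```

Defaulting `dry_run` to `True` mirrors the dry-run-first behaviour described above: the caller has to opt in explicitly before anything changes.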

On top of that, there's a small custom monitoring service running in the cluster that exposes health data as a JSON API. An ESP32 microcontroller with a tiny display polls this endpoint and shows real-time cluster health — a physical dashboard on my desk. When the overall health score drops below a threshold, I know to open a terminal and ask the AI what's going on.

What Works, What Doesn't

What works well:

Convention enforcement. The agent is better than me at following the repository's own patterns consistently. It doesn't forget network policies or skip liveness probes because it's Friday afternoon.
Speed of iteration. Going from intent to PR in minutes instead of half an hour of YAML wrangling.
Parallel diagnostics. The operator command checking six things at once instead of me running kubectl commands one by one.
Knowledge retention. The agent remembers past deployment patterns, known gotchas, and operational procedures across sessions.

What doesn't work well:

Runtime awareness. The agent can read manifests and git history, but it doesn't inherently know what's happening in the cluster right now. You have to tell it to check, or give it access to the right tools.
Over-engineering. Left unchecked, it'll add three layers of abstraction to a problem that needed two lines of config.
Novel problems. When something genuinely new goes wrong (like a NAS outage cascading into Postgres startup failures and pod scheduling issues), the agent helps execute the recovery, but the human still has to understand the failure mode and direct the response.

The Human's Job Now

I still make every architectural decision. I decide what services to run, how they should be networked, what security model to use, when to upgrade. I review every PR before it merges. I'm the one who notices that DST is eating my backups in the first place.

What I don't do anymore is translate those decisions into YAML by hand. I don't copy-paste network policy boilerplate. I don't look up the kustomization schema for the fourth time this month. I don't manually run health checks across a dozen namespaces.

The AI handles the mechanical parts. I handle the parts that require judgment, context, and understanding of what the infrastructure is actually for.

Is this overkill for a home lab? Maybe. But a home lab was always about learning more than it was about the services themselves. And learning how to work effectively with AI agents on real infrastructure — with real consequences when things break — feels like exactly the right thing to be doing right now.

The YAML still gets written. I just don't write it anymore.