Building the health coach I never had

For 20 years I’ve kept a written log of how I train and eat. The medium evolved (a composition book in the aughts, a notepad app in the 2010s, a parade of underwhelming apps since), but the habit held; what none of those tools ever did was reason about the data. I lift on a seven-day split, which assumes every week looks identical, and mine never does: work dinners, travel, the occasional night when the smartest exercise is sleep. The plan says Sunday is legs. Sometimes Sunday is a flight.

Every fitness app handles this the same way, by assuming I did whatever the calendar said or making me fix it by hand. I wanted the opposite: something that reads what I actually trained, works out where I am in my cycle on its own, and tells me what to do today, accounting for how I slept and recovered. So I built it, and handed the judgment a coach would exercise to an AI model. I call it HealthBot.

HealthBot system architecture: a daily pipeline from launchd and Garmin through normalization and an AI model to a Discord brief.

The mechanics are simple. A Python script runs each morning, pulls my recovery and workout data from Garmin over MCP, reads my training plan from a markdown file in git, hands all of it to an AI model, and posts a two-part brief to a Discord channel. It has run every day since early May, and most of what makes it good came out of the three places the first version was wrong.

The first version read the calendar. The first time I moved legs from Sunday to Monday, it prescribed legs on Sunday anyway, confidently and uselessly. The fix was to stop trusting the calendar and start trusting the history: the model reads my recent sessions, identifies the last one by its contents (pulling and curling mean a back day; pressing and dips mean chest), and prescribes the next in the cycle. The weekday becomes a sanity check, not the source of truth. Trust what happened, not what was planned; that inversion is the whole personality of the system.
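In HealthBot the model makes this call from the session history. Purely as an illustration, the same rule can be written as a deterministic sketch. The keyword lists, the day names, and the cycle order below are assumptions, not the actual plan.

```python
from collections import Counter

# Hypothetical keyword hints per training day; the real plan's movements may differ.
MOVEMENT_HINTS = {
    "back": ("row", "pull", "curl"),
    "chest": ("press", "dip", "fly"),
    "legs": ("squat", "lunge", "leg"),
}

# Assumed cycle order, for illustration only.
CYCLE = ["chest", "back", "legs"]


def classify_session(exercise_names):
    """Guess the training day from the movements performed, not from the weekday."""
    votes = Counter()
    for name in exercise_names:
        lowered = name.lower()
        for day, hints in MOVEMENT_HINTS.items():
            if any(hint in lowered for hint in hints):
                votes[day] += 1
    if not votes:
        return None  # nothing recognizable, e.g. a dog walk
    return votes.most_common(1)[0][0]


def next_in_cycle(last_day):
    """Return the day that follows last_day, wrapping around at the end of the cycle."""
    if last_day is None:
        return CYCLE[0]
    return CYCLE[(CYCLE.index(last_day) + 1) % len(CYCLE)]


last = classify_session(["Lat Pulldown", "Barbell Curl", "Seated Row"])
print(last, "->", next_in_cycle(last))
```

Substring matching is crude: a name like "Leg Press" votes for both chest and legs, which is one reason handing this judgment to a model is attractive.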

The second problem surfaced only in daily use. To find the last session, the script pulls recent Garmin activities. The first version asked for the twenty most recent, and twenty barely covered a week, because my dog walks alone add a couple a day. A lift I do once a week could fall off the list, and the brief would report no history for something I’d done days earlier. It now pulls a far wider window and fetches the expensive set-by-set detail only where it’s needed.
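A minimal sketch of that two-step fetch, under stated assumptions: `list_activities` and `fetch_sets` are hypothetical callables standing in for whatever the Garmin MCP server exposes, and the field names (`id`, `type`, `start`) and the `strength_training` label are assumed, not taken from the real payload.

```python
def latest_strength_details(list_activities, fetch_sets, limit=200, max_sessions=6):
    """Pull a wide window of cheap summaries, then fetch detail only for lifts.

    list_activities(limit) -> list of summary dicts (hypothetical shape)
    fetch_sets(activity_id) -> list of set dicts (hypothetical, the expensive call)
    """
    summaries = list_activities(limit)

    # Walks, runs and everything else are filtered out before any detail call.
    strength = [a for a in summaries if a["type"] == "strength_training"]

    # Newest first; ISO-formatted start times sort correctly as strings.
    strength.sort(key=lambda a: a["start"], reverse=True)

    # Only the handful of sessions the brief needs pay for set-by-set detail.
    return [{**a, "sets": fetch_sets(a["id"])} for a in strength[:max_sessions]]
```

The point of the design is that the window can grow without the cost growing with it: the number of detail calls is bounded by `max_sessions`, not by `limit`.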

The third was the data itself, which Garmin builds for charts, not coaching. Every strength workout carries the same generic label, so the only way to tell a leg day from a back day is to read the movements inside it. Weights arrive as integer grams, sessions are padded with rest periods, bodyweight moves report no load, and Garmin’s exercise names don’t match mine, typos included. A normalization layer sits between its data and the model, so the prompt reads “three clean sets,” not a dump of mixed units.
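A hedged sketch of what such a normalization layer might look like. The raw field names (`set_type`, `exercise`, `reps`, `weight_grams`), the alias table, and the choice of kilograms as the display unit are all assumptions made for illustration; the real Garmin payload and the real plan's names may differ.

```python
# Hypothetical mapping from Garmin-style exercise names to the plan's own names.
NAME_ALIASES = {
    "BARBELL_BENCH_PRESS": "Bench Press",
    "PULL_UP": "Pull-up",
}


def normalize_sets(raw_sets):
    """Turn raw set records into compact, prompt-ready dicts."""
    clean = []
    for s in raw_sets:
        # Rest periods are padding for charts, not something to coach on.
        if s.get("set_type") == "REST":
            continue

        # Use the plan's name when an alias exists, else a readable fallback.
        raw_name = s["exercise"]
        name = NAME_ALIASES.get(raw_name, raw_name.replace("_", " ").title())

        # Integer grams become kilograms; a missing or zero load is bodyweight.
        grams = s.get("weight_grams")
        load = f"{grams / 1000:g} kg" if grams else "bodyweight"

        clean.append({"exercise": name, "reps": s["reps"], "load": load})
    return clean
```

Doing this in code, before the prompt is built, keeps unit conversion and name matching out of the model's hands, where a silent mistake would be hard to spot.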

The piece I’m most satisfied with is the least flashy. My plan lists a target weight for each movement, and the naive design prescribes it. But a target is where I want to be, not where I am, and prescribing weight I can’t move cleanly is how people get hurt. So HealthBot prescribes what I last lifted, shows the target beside it, and makes a call: hold and sharpen the technique, or add load, with a reason either way. The standing instruction is to show me the data and let me decide: it coaches, it doesn’t manage, which is exactly what commercial apps get wrong.
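In HealthBot the hold-or-add call belongs to the model. The sketch below only illustrates the shape of the output: the prescribed weight is always the last weight lifted, the target rides alongside it, and the recommendation comes with a reason. The rep-based rule and the `rep_goal` parameter are assumptions invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Prescription:
    exercise: str
    prescribed_kg: float  # what was last lifted, never the aspirational target
    target_kg: float      # shown beside it for context
    call: str             # "hold" or "add"
    reason: str


def prescribe(exercise, last_kg, last_reps, target_kg, rep_goal=8):
    """Prescribe the last lifted weight and attach a hold-or-add recommendation."""
    if last_kg >= target_kg:
        call, reason = "hold", "already at or above the plan's target"
    elif last_reps and all(r >= rep_goal for r in last_reps):
        call, reason = "add", f"every set reached {rep_goal} reps last time"
    else:
        call, reason = "hold", f"at least one set fell short of {rep_goal} reps"
    return Prescription(exercise, last_kg, target_kg, call, reason)


print(prescribe("Squat", 100.0, [8, 8, 6], 120.0))
```

Note that even an "add" call leaves `prescribed_kg` at the last lifted weight: the sketch reports the data and the suggestion, and the decision to load the bar stays with the lifter.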

Two choices reflect a bias against fooling myself: a failed run announces itself in Discord instead of dying in a log, and the script runs on bare metal, not in a container, to avoid two processes racing over a shared auth session.
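The first of those choices can be sketched as a thin wrapper around the daily job. This assumes the brief is delivered through a Discord webhook, which the post does not confirm; `run_and_report` and `post_to_discord` are hypothetical names, and the webhook URL is a value the reader would supply.

```python
import json
import traceback
import urllib.request


def post_to_discord(webhook_url, text):
    """Send a plain message to a Discord webhook (assumed delivery mechanism)."""
    # Discord caps message content at 2000 characters, so truncate defensively.
    body = json.dumps({"content": text[:2000]}).encode("utf-8")
    request = urllib.request.Request(
        webhook_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            # An explicit User-Agent; the default urllib one may be rejected.
            "User-Agent": "healthbot-sketch/0.1",
        },
    )
    urllib.request.urlopen(request, timeout=10)


def run_and_report(job, notify):
    """Run the daily job; on any failure, push the traceback to the same channel."""
    try:
        job()
        return True
    except Exception:
        notify("HealthBot run failed:\n" + traceback.format_exc())
        return False
```

A caller might wire it up as `run_and_report(main, lambda text: post_to_discord(url, text))`, so a broken morning shows up in the same place the brief would have.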

The next step is making the brief two-way: I reply to swap a day, flag travel, or cut a session short, and tomorrow’s brief absorbs it. The hard part isn’t the plumbing; it’s the design question of how much a coach should let you negotiate before it stops being a coach and becomes a yes-man. That’s the line I’m drawing now.