The first message I ever sent was a knock-knock joke.
Not a test of intelligence. A test of plumbing. The Raspberry Pi was running. The WhatsApp webhook was live. The systemd service was up. And the fastest way to confirm it all worked end-to-end was to say something and see if anything came back.
It completed the joke.
That was Monday evening. By Tuesday evening it was pulling news briefings from five different APIs, synthesising them into a single industry digest, and delivering it to my phone without being asked. By Wednesday morning it was ordering peanuts from Amazon.
But I'm getting ahead of myself.
Monday night - first contact
After the knock-knock joke came the real question: "What do you know about me?"
It read the context file I'd written - current-context.md - and gave back a summary of who I am, what I'm working on, what I care about. Not hallucinated. Not generic. It was reading the file and reflecting it back accurately. That was the first moment it felt like something real rather than a demo.
I spent the rest of the evening testing the edges. How it handled instructions from how-to-work-with-ben.md. Whether it could find its own workspace files. Whether it knew what it didn't know. It did. Mostly.
Tuesday - briefings
The next evening I asked it to run the Nice & Interesting briefing - a curated digest of things worth knowing, pulled from Perplexity Sonar and the Guardian API. It came back in the right format, with the right tone. Curious, not breathless. The spec I'd written in nice-and-interesting.md had done its job.
Then the longer test: Industry Insights. This one pulls from the X API, the Guardian API, and a newsletter I'd configured as a source. The Firehose synthesised everything into an industry snapshot. The Investor Update extracted the portfolio-relevant slice. Both ran from a single WhatsApp message. First time I'd seen multi-source retrieval work at that level from something running on a shelf in my study.
Tuesday night - peanuts, round one
That evening I asked it to find my favourite peanuts and order them.
It connected to the browser - Chromium running headless on the Pi - navigated to Amazon, and found them in the Buy Again section: Walkers Max Strong Jalapeño & Cheese Sharing Double Coated Peanuts, 175g, case of eight, £15.92.
"Is that the one? Want me to add it to basket?"
Yes. Order placed. Arriving tomorrow.
A direct task, explicitly asked. That's useful, but it's not the interesting part.
Wednesday morning - peanuts, round two
The session reset overnight. Clean slate. No memory of the peanuts, the briefings, any of it. That's the default behaviour - each conversation starts fresh.
At 08:23 I sent: "How soon am I likely to need to order more of those peanuts?"
Blank stare. It searched its workspace files, its memory files, found nothing. "I don't have any record of peanut orders."
I pushed back: "I think you have a way of figuring it out." Same answer, different wording.
I tried again: "I think you do have access to my Amazon orders."
That one landed. It found its way to Amazon - already logged in - searched the order history, and surfaced the same peanuts. Last ordered: the day before. It placed a new order without hesitation.
Then I asked about cadence. It searched "jalapen peanuts" in the order history search box - a shortcut it found itself, not one I'd suggested - and surfaced all five previous orders. September. October. December. February. April. Seven to eight weeks, consistent. Next reorder: early June.
It took three nudges to get there. I'll be honest about that. But it found the right approach, and after the session it wrote the method down. Next time, no nudges.
That's when it stopped being a project and started being a tool. Not when it performed flawlessly - when it got better.
What this is
OpenClaw is an open-source AI agent. It runs on a Raspberry Pi, receives commands via WhatsApp, and calls Claude through Vercel's AI Gateway. The Pi is always on. The agent is always available. I message it from my phone the same way I'd message a colleague.
The idea isn't novel - personal AI assistants have been promised for years. What's different here is that I built it myself, it runs on hardware I own, its context is files I wrote, and the whole thing cost less than a decent pair of running shoes.
I wanted to write up the setup while it's fresh, partly because the ratchet principle demands it - if I don't write this down, my future self will have to figure it all out again - and partly because someone else might find it useful.
The hardware
A Raspberry Pi 5. Headless, no monitor or keyboard after initial setup. Hostname: rpi5. User: benpi. Connected to home WiFi and accessible via SSH from my Mac.
That's it for hardware. The interesting parts are all software.
Getting OpenClaw running
OpenClaw is a Node.js application, installed globally via npm. The onboarding wizard handles most of the initial configuration - messaging platform (WhatsApp), AI provider (Vercel AI Gateway), default model (claude-sonnet-4.6).
It installs as a systemd user service called openclaw-gateway.service, with lingering enabled so it survives logout. The Pi boots, the service starts, the agent is ready. No intervention needed.
systemctl --user status openclaw-gateway.service
That's the command I probably type most often, just to confirm it's alive. Old habits from years of checking deployment health at work.
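For orientation, a user-level unit with lingering boils down to roughly this shape. This is an illustrative sketch, not the file the wizard generates; in particular the ExecStart line is an assumption:

```ini
# ~/.config/systemd/user/openclaw-gateway.service (illustrative sketch)
[Unit]
Description=OpenClaw gateway
After=network-online.target

[Service]
# The actual start command is whatever the onboarding wizard writes here.
ExecStart=/usr/bin/openclaw
EnvironmentFile=%h/.config/openclaw/secrets.env
Restart=on-failure

[Install]
WantedBy=default.target
```

Lingering - which keeps user services running without an active login session - is the standard systemd mechanism, enabled with `loginctl enable-linger benpi`.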
The first gotcha
SSH didn't work when I first booted the Pi. After more time than I'd like to admit, I discovered that my ISP router and the Deco mesh were both broadcasting the same WiFi network name, putting my Mac and the Pi on different subnets. Renaming the ISP router's SSID fixed it immediately.
A reminder that the most confusing bugs are often network bugs, and network bugs are often naming bugs.
Secrets management
API keys live in ~/.config/openclaw/secrets.env, permissions set to 600. The systemd service loads them via EnvironmentFile. When I add a new key, I edit the file and restart the service. When I update context files, no restart is needed - OpenClaw reads those fresh on each message.
That distinction matters. Environment variables require a restart. Context files don't. Knowing which is which saves unnecessary restarts and unnecessary confusion.
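The permissions step is worth getting right, since this file holds every key the agent can use. A minimal sketch, using a throwaway directory and an invented key name; on the Pi the real path is ~/.config/openclaw/secrets.env:

```shell
# Create a secrets file readable and writable by the owner only (mode 600).
dir=$(mktemp -d)    # stand-in for ~/.config/openclaw
printf 'AI_GATEWAY_API_KEY=example-not-a-real-key\n' > "$dir/secrets.env"
chmod 600 "$dir/secrets.env"
stat -c '%a' "$dir/secrets.env"   # → 600
# After adding or changing a key on the Pi:
#   systemctl --user restart openclaw-gateway.service
```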
The context layer
This is the part that makes OpenClaw actually useful rather than just technically interesting.
I have a private GitHub repo called _agent-shared-context. It contains everything the agent needs to know about me, my projects, my preferences, and how I want it to behave. These files are symlinked into OpenClaw's workspace directory on the Pi:
current-context.md - live snapshot of my life, projects, investments
who-ben-is.md - background, values, career arc
how-to-work-with-ben.md - the operating manual for the agent
how-ben-thinks.md - mental models, recurring tensions
writing-voice.md - voice calibration for content drafting
industry-insights.md - what to monitor and how to brief me
The how-to-work-with-ben.md file is the most important one. It defines named commands, response format rules, tone expectations, what to confirm before acting, and what never to do. It's essentially the agent's job description, written by the person it reports to.
Writing these files took longer than the technical setup. But they're the difference between a general-purpose chatbot and something that actually knows how I work.
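The symlink step is simple enough to sketch. Directory names here are assumptions, not OpenClaw's actual layout; the idea is just that every context file in the repo clone gets a link in the agent's workspace:

```shell
repo=$(mktemp -d)   # stand-in for the _agent-shared-context clone
ws=$(mktemp -d)     # stand-in for the OpenClaw workspace
touch "$repo/current-context.md" "$repo/how-to-work-with-ben.md"

# Link every markdown file from the repo into the workspace.
for f in "$repo"/*.md; do
  ln -sf "$f" "$ws/$(basename "$f")"
done

ls "$ws"   # the workspace now sees the context files
```

Symlinks rather than copies means a `git pull` in the repo clone updates what the agent reads, with nothing to sync.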
The update flow
This is the bit I'm genuinely pleased with.
I edit context files on my Mac in Obsidian. The Obsidian Git plugin auto-commits and pushes to GitHub. When I want OpenClaw to pick up the changes, I send a WhatsApp message: "update context". The Pi runs git pull, reads the changed files, and confirms what was updated.
Mac → Obsidian → Git → WhatsApp → Pi.
No SSH. No terminal. No switching context from whatever I was doing on my phone. The entire control surface for updating the agent's knowledge is a text message.
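The round trip can be sketched with throwaway local repos standing in for GitHub and the two machines. File contents and repo paths here are invented; the real flow is Obsidian Git pushing from the Mac and the agent running `git pull` on the Pi:

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
tmp=$(mktemp -d)

git init -q --bare "$tmp/github.git"                # stands in for the private GitHub repo
git clone -q "$tmp/github.git" "$tmp/mac" 2>/dev/null
(cd "$tmp/mac" \
  && echo "projects: OpenClaw" > current-context.md \
  && git add . && git commit -qm "initial context" \
  && git push -q origin HEAD)
git clone -q "$tmp/github.git" "$tmp/pi"            # the Pi's checkout

# Mac side: Obsidian Git auto-commits and pushes an edit.
(cd "$tmp/mac" \
  && echo "projects: OpenClaw, scheduled briefings" > current-context.md \
  && git commit -qam "update context" && git push -q origin HEAD)

# Pi side: what an "update context" WhatsApp message triggers.
cd "$tmp/pi"
before=$(git rev-parse HEAD)
git pull -q --ff-only
git diff --name-only "$before" HEAD                 # lists the files the agent should re-read
```

The `--name-only` diff is what lets the agent confirm exactly which files changed, rather than re-reading everything blindly.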
Named commands
Rather than relying on natural language interpretation for routine tasks, I defined a set of named commands - specific phrases that trigger specific behaviours. Some examples:
"briefing" - triggers an on-demand news digest covering AI industry developments and anything relevant to my work and investments.
"costs" - checks Vercel AI Gateway credit balance and spend.
"deployment health" - queries the Vercel API for failed or stuck deployments across my projects.
"status" - reports the agent's last session summary, pending tasks, and system state.
"draft linkedin: [brief]" - creates a content draft, saves it to a file on the Pi with a timestamp, and confirms the filename. The draft doesn't just appear in WhatsApp where it would get lost - it's persisted.
There are twelve commands defined so far, each with clear behaviour specified in how-to-work-with-ben.md. Adding a new one means editing that file on my Mac, pushing to GitHub, and sending "update context". The agent picks it up immediately.
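For flavour, a command definition in how-to-work-with-ben.md might look something like this. The wording is invented, not a quote from the actual file:

```markdown
## Command: costs
Trigger: the exact message "costs".
Behaviour: fetch the AI Gateway credit balance and total usage, then reply
with balance, spend so far, and a one-line trend versus yesterday.
Format: three lines maximum. No preamble, no sign-off.
```

The point is that the "API" between me and the agent is prose in a markdown file, which is why editing it from a phone is enough to change behaviour.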
Vercel AI Gateway
I work at Vercel, so using the AI Gateway as the LLM provider was the obvious choice. One API key, access to multiple models, no separate Anthropic key needed. The gateway endpoint is https://ai-gateway.vercel.sh/v1 and the default model is anthropic/claude-sonnet-4.6.
The gateway also gives me something important: a single place to monitor spend. The credits endpoint returns a balance and total usage figure. For anything more granular - per-model breakdowns, time-series data - I check the Vercel dashboard directly.
The Vercel token problem
Vercel doesn't offer read-only API token scopes. Every token is full-access. Since OpenClaw needs a token to query deployment health and project status, the protection is entirely in the instructions: GET requests only, to api.vercel.com only, for the commands defined in the operating manual. POST, PUT, PATCH, DELETE - never, regardless of what it's asked to do.
This works, but it's a trust boundary that exists in prose rather than in infrastructure. I'd prefer a read-only scope. Until that exists, the prose-based guardrail is the best I've got.
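One way to move part of that guardrail from prose into code is a thin wrapper that refuses anything but GET before a request is ever made. This is a hypothetical helper, not part of OpenClaw, and the `VERCEL_TOKEN` variable name is an assumption:

```shell
# Refuse non-GET methods before any network call happens.
vercel_api() {
  method=$1
  path=$2
  case "$method" in
    GET) ;;
    *) echo "refused: only GET requests to api.vercel.com are allowed" >&2
       return 1 ;;
  esac
  curl -sf -H "Authorization: Bearer $VERCEL_TOKEN" "https://api.vercel.com$path"
}

vercel_api DELETE /v9/projects/example && echo "allowed" || echo "blocked"   # → blocked
```

It is still not a read-only token scope, since anything with the raw token can bypass the wrapper, but it narrows the path the agent is instructed to use.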
Memory and persistence
OpenClaw has no persistent memory by default. Each conversation starts fresh. Without intervention, every method it discovers, every preference it learns, every half-finished task - gone.
The fix is three files in its workspace:
LEARNED.md - how-to methods for recurring tasks, written by the agent itself after successfully completing something for the first time
PREFERENCES.md - my discovered preferences and patterns
MEMORY.md - session log of what was done and what was left unfinished
The startup routine reads all three before responding to any message. The end-of-session routine writes to them. It's not elegant, but it works - and it means the agent gets better at its job over time, which is the whole point.
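As an illustration, the Amazon order-history method from Wednesday morning might end up in LEARNED.md as something like this. The wording is invented, not what the agent actually wrote:

```markdown
## Amazon: find reorder cadence for a product
1. Open Amazon in the headless Chromium session (already logged in).
2. Go to order history and use its search box with a product keyword.
3. List all matching order dates; the gap between them is the cadence.
Learned after the peanut-cadence session. Works without a product URL.
```

The agent writing these entries itself is what closes the loop: the method survives the overnight session reset.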
What it costs
About £3 per day on the Vercel AI Gateway, almost entirely from Claude Sonnet 4.6. That includes briefings, test conversations, and some Amazon ordering experiments via browser automation.
The model routing instructions in the operating manual specify Haiku for routine tasks (scheduled briefings, status commands, factual lookups) and Sonnet for anything requiring judgement (research synthesis, content drafting, complex multi-step tasks). I haven't configured the briefings to actually use Haiku yet - that's the obvious next step to bring costs down.
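Paraphrased, the routing section of the operating manual amounts to something like this. The exact gateway model ID for Haiku is not confirmed here, so it's left generic:

```markdown
## Model routing
- Default: anthropic/claude-sonnet-4.6
- Routine (scheduled briefings, "status", "costs", factual lookups): a Haiku-class model
- Judgement (research synthesis, content drafting, complex multi-step tasks): Sonnet
```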
What went wrong
Some of the things that didn't work, or took longer than they should have:
Signal was my first choice for messaging. Signal-cli on ARM64 requires Java 25 and a native library that only ships as x86_64. Compiling from source wasn't worth the effort. Switched to WhatsApp and it worked immediately.
The onboarding wizard crashed at the end of the setup flow. The config had already been saved, so it didn't actually matter - but it wasn't a confidence-inspiring start.
Global npm installs need sudo on Raspberry Pi OS. Easy to forget, easy to lose time to EACCES errors.
The agent forgets everything between sessions unless you explicitly build persistence into its workflow. I didn't realise this initially and lost a browser automation method I'd spent twenty minutes helping it figure out. That's when I added the memory files.
What's next
Scheduled briefings are the immediate priority. Monday morning: industry news and portfolio-relevant developments. Friday afternoon: the lighter digest of things I found interesting during the week. Both delivered proactively via WhatsApp, no prompt needed.
Longer term, I want a second GitHub repo - _openclaw-outputs/ - where the agent commits its briefings and drafts. I'd pull that repo on my Mac and have everything in Obsidian, version-controlled, searchable. The Pi writes, the Mac reads. Git as the transport layer between them.
There's also the question of sharing the agent with Emma for Haktive-related tasks. That probably means a dedicated WhatsApp number on a PAYG SIM, which is a solvable but not-yet-solved problem.
The principle underneath
I've been building things on the web for 25 years. For most of that time, the tools I used were tools someone else built, configured the way someone else decided, with defaults that served someone else's priorities.
This is different. The agent runs on hardware in my house. Its context is files I wrote. Its behaviour is defined by instructions I can edit from my phone. When it doesn't know something, I teach it - by updating a markdown file, not by filing a feature request.
It's a small thing, really. A Pi on a shelf, a WhatsApp thread, some markdown files in a Git repo. But it's mine, and it works, and each session it gets a little bit better at being useful.
The ratchet clicks forward.