[CODE]

Google Sheets as a State Machine for Automation Workflows

Instead of a database, I used Google Sheets as the source of truth for an ads automation system. Clients can see and edit the state directly — no admin panel needed.

6 min read
google-sheets automation python architecture

When I started building bnb-ads-manager — a keyword automation system for B&B owners running Google Ads — I had the usual instinct: spin up a Postgres table, write a small admin panel, connect it all together. The whole thing.

But then I stopped and asked: who’s actually going to look at this admin panel?

The clients are family-run B&B owners in Taiwan. They don’t have IT staff. They’re not developers. They run their business from a phone and a laptop, and the software they actually use every day is Google Sheets. So I made a bet: what if the spreadsheet is the database?

This wasn’t just a convenience decision. It was a question about what “operational” means for a client who can’t call IT when something looks wrong. If the state of the system lives in a Postgres table, I’m the only one who can read it. If it lives in a spreadsheet, the client can open it right now and see exactly what’s happening — and if they disagree with a recommendation, they can change it themselves.

The Keyword Inventory Sheet

The core of the system is a sheet called Keyword_Inventory. Each row is a keyword, and each keyword has a bunch of columns that track its current state:

ColumnDescription
CriterionIDGoogle Ads criterion ID for the keyword
KeywordThe search term
AdGroupIDThe ad group this keyword belongs to
StatusActive or Removed
Date_AddedWhen the keyword was added (YYYY-MM-DD)
Date_RemovedWhen it was removed; empty if still active
ReasonFree-text explanation for the last state change

The Reason column deserves attention. Every time a keyword gets removed, Python writes back a plain-English explanation: “30 days, 3 clicks, 0 conversions” or “overlaps with higher-performing synonym.” The B&B owner can read that on their phone without asking me anything. That traceability is what makes the system feel like something they own, not something they’re trusting blindly.

The Date_Added column is where the state machine logic actually lives. After a keyword is added, Python checks whether it has been in the account for at least 14 days before passing it to Claude for evaluation. New keywords need time to accumulate data before a judgment call makes sense — sending a keyword to Claude after three days of impressions produces noise, not signal. The protection window filters that out.

from datetime import date

def is_protected(row: dict) -> bool:
    date_added = row.get("Date_Added", "")
    if not date_added:
        return False
    try:
        added = date.fromisoformat(date_added)
        return (date.today() - added).days <= 14
    except ValueError:
        return False

Simple. No SQL, no ORM, no migration files. Just Python reading a date string from a cell and doing arithmetic.

Reading and Writing via Sheets API v4

The workflow is:

  1. Python reads the whole sheet via Sheets API v4
  2. Filters out protected rows
  3. Passes the rest to Claude with performance data
  4. Claude returns decisions
  5. Python writes the decisions back to the sheet

Reading is straightforward — one call to spreadsheets.values.get, then I zip the header row with each data row into a list of dicts. Writing back is a batch update so I’m not hammering the API row by row.

The one thing that bit me early was rate limits. The Sheets API allows 100 requests per 100 seconds per user. When I was testing with naive sequential writes, I’d occasionally hit a 429. The fix was exponential backoff with jitter:

import time
import random

def sheets_write_with_retry(service, spreadsheet_id, range_, values, max_retries=5):
    for attempt in range(max_retries):
        try:
            service.spreadsheets().values().update(
                spreadsheetId=spreadsheet_id,
                range=range_,
                valueInputOption="RAW",
                body={"values": values},
            ).execute()
            return
        except HttpError as e:
            if e.resp.status == 429 and attempt < max_retries - 1:
                wait = (2 ** attempt) + random.uniform(0, 1)
                time.sleep(wait)
            else:
                raise

Not glamorous, but it works reliably.

The Race Condition Problem (and a Boring Solution)

The scariest part of using a spreadsheet as a state machine is concurrent writes. If two runs happen at the same time, you could get interleaved writes that corrupt the state. Two processes both read row 5 as “active”, both decide to pause it, both write back — the second write might overwrite the first with slightly different data.

I thought about file locks, lease columns, even a lightweight Redis flag. In the end I did something much simpler: the job only runs via workflow_dispatch in GitHub Actions. No scheduled crons that could overlap. If I want it to run, I trigger it manually. If I eventually add a schedule, I’ll run it with concurrency: cancel-in-progress in the workflow YAML.

It’s a constraint, not a solution. But it matches the actual usage pattern — a B&B owner with a handful of campaigns doesn’t need this running every 5 minutes. Weekly is enough, and weekly means race conditions aren’t a real risk yet.

Why This Works for the Client

The thing I keep coming back to is the audit trail. Every removal gets written back as a row with Status: Removed, Date_Removed, and a plain-English Reason. The client can open the sheet on their phone and see: “Removed — 30 days, 3 clicks, 0 conversions.” They don’t need to ask me. They don’t need a dashboard.

And if they disagree with a decision? They can flip Status back to Active themselves and clear Date_Removed. The system will treat it as an active keyword on the next run. No API calls, no support tickets. The client has direct control without needing any technical knowledge beyond knowing what a spreadsheet cell is.

This matters more than it sounds. Most small business automation tools put a wall between the client and the data — you get a dashboard that shows you summaries, but you can’t touch the underlying state. Here the client is a first-class actor in the system. Their edits are respected on the next run, no questions asked.

The tradeoff is that I’m trusting users not to accidentally delete columns or paste bad data. That’s a real risk. I handle it defensively in Python by validating every row before processing, and skipping rows with missing required fields. It’s not bulletproof, but for a tool used by a handful of clients who care about their own ad spend, it’s been fine.