Lab 1 — Scheduling with launchd & cron; Durable State and Idempotency in SQLite
Time: ~3.5 hrs · Difficulty: Core · Builds on: Month 6 (agent loop, cost math) and Month 5 (structured Python)
Objective
Make an agent run on a clock without you, and make it survive a crash without redoing or duplicating work. You will schedule a small agent with macOS launchd (the canonical local scheduler) and write the equivalent cron line, then give it a SQLite-backed job queue with idempotency keys so that a re-run after a crash never double-processes a unit of work. By the end you can kill the agent mid-task and watch it resume cleanly on its next scheduled wake. This is the durable, scheduled foundation every later lab deploys on top of.
Setup
mkdir -p ~/agentic/month-11 && cd ~/agentic/month-11
uv init --python 3.12 . 2>/dev/null; uv add --quiet anthropic 2>/dev/null || true
mkdir -p logs inputs
# Free model layer (used as the agent's "work"):
brew install ollama 2>/dev/null; ollama pull qwen2.5:3b
You will reuse the agent loop from Month 6. For this lab the “work” can be trivial (summarize one input file) — the point is the scheduling and durability, not the model. Everything runs on Ollama for $0.
Background
Recall first (from memory): In Month 6, when your agent crashed mid-run, what happened to its progress — and where, if anywhere, was that progress stored? Hold that answer; this lab is the fix for it.
An unattended agent is defined by what happens when it is interrupted — and it will be interrupted (crash, reboot, launchd restart, network blip). Two properties make interruption a non-event: idempotency (running twice = running once) and durable state (progress committed to disk, not held in memory). SQLite is the free default store: one file, ACID transactions, ships with macOS and Python. See README §3–§4 for the concepts; this lab makes them concrete.
The crash-and-resume flow you will build and then prove:
flowchart TD
A["Claim pending job from SQLite"] --> B["Do the work"]
B --> C{"Crash before commit?"}
C -->|Yes| D["Job stays 'pending' in DB"]
D --> A
C -->|No| E["Commit status='done'"]
E --> F["Next claim skips it"]
Notice: state lives in the database, not in memory. A crash before the commit just leaves the job ‘pending’, so the next run re-claims it and it is recorded as done exactly once. The work itself may be repeated after such a crash, which is why it must be safe to redo.
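The commit boundary in the flow above can be seen in isolation. This is a minimal sketch, not the lab's store.py: it uses a throwaway database file and a made-up key (job-1), and it simulates the crash by closing the connection without committing.

```python
import os
import sqlite3
import tempfile

# Hypothetical throwaway database file, so this never touches agent.db.
path = os.path.join(tempfile.mkdtemp(), "demo.db")

db = sqlite3.connect(path)
db.execute("CREATE TABLE jobs(key TEXT PRIMARY KEY, status TEXT NOT NULL DEFAULT 'pending')")
db.execute("INSERT INTO jobs(key) VALUES('job-1')")
db.commit()  # the pending job is now durable on disk

# "Crash before commit": the status change is made but never committed.
db.execute("UPDATE jobs SET status='done' WHERE key='job-1'")
db.close()  # closing without commit discards the open transaction

db = sqlite3.connect(path)
after_crash = db.execute("SELECT status FROM jobs WHERE key='job-1'").fetchone()[0]

# Normal path: the same update followed by a commit survives a reopen.
db.execute("UPDATE jobs SET status='done' WHERE key='job-1'")
db.commit()
db.close()

db = sqlite3.connect(path)
after_commit = db.execute("SELECT status FROM jobs WHERE key='job-1'").fetchone()[0]
db.close()

print(after_crash, after_commit)  # pending done
```

The uncommitted update disappears with the connection, which is the same thing that happens to an agent killed between doing the work and committing it.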
Steps
1. Model the work as a durable SQLite queue
The genuinely new skill here is idempotent durable state — a queue where enqueuing the same work twice is a no-op and progress survives a crash. We build it in three stages: study a complete worked example, fill in a faded one, then extend it independently.
Stage 1 — Worked example (I do)
Create store.py exactly as below and run it. The database is the single source of truth — the agent holds no progress in memory. Read every annotated line; you are not inventing anything yet, just running and understanding.
# store.py: durable job queue with idempotency keys
import hashlib
import sqlite3


def connect(path="agent.db"):
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS jobs(
        key      TEXT PRIMARY KEY,              -- idempotency key: dedup happens here
        payload  TEXT NOT NULL,
        status   TEXT NOT NULL DEFAULT 'pending',
        attempts INTEGER NOT NULL DEFAULT 0,
        result   TEXT,
        updated  REAL DEFAULT (strftime('%s','now')))""")
    db.commit()
    return db


def enqueue(db, payload: str) -> str:
    key = hashlib.sha256(payload.encode()).hexdigest()[:16]  # stable key from content
    # INSERT OR IGNORE: enqueuing the same work twice is a harmless no-op (dedup)
    db.execute("INSERT OR IGNORE INTO jobs(key,payload) VALUES(?,?)", (key, payload))
    db.commit()
    return key


def claim_pending(db):
    return db.execute(
        "SELECT key,payload FROM jobs WHERE status='pending' ORDER BY updated LIMIT 1"
    ).fetchone()


def mark_done(db, key: str, result: str):
    db.execute("UPDATE jobs SET status='done', result=?, updated=strftime('%s','now') "
               "WHERE key=?", (result, key))
    db.commit()
Checkpoint: uv run python -c "import store; db=store.connect(); print(store.enqueue(db,'hello')); print(store.enqueue(db,'hello'))" prints the same key twice — the second enqueue is a silent no-op. Run sqlite3 agent.db "SELECT count(*) FROM jobs" and confirm there is exactly one row, not two. That is deduplication.
If not: two different keys usually means you re-derived the key from something non-stable (a timestamp or id()), not the payload content — the key must be sha256(payload). Two rows means you wrote INSERT instead of INSERT OR IGNORE. Delete agent.db and re-run after fixing.
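Why the key has to come from the payload alone can be checked in a few lines. This is a small sketch under assumptions: content_key mirrors the derivation in enqueue, while unstable_key is a hypothetical anti-pattern in which a random value stands in for a timestamp or id().

```python
import hashlib
import uuid


def content_key(payload: str) -> str:
    # Same derivation as enqueue() in store.py: first 16 hex chars of sha256.
    return hashlib.sha256(payload.encode()).hexdigest()[:16]


def unstable_key(payload: str) -> str:
    # Hypothetical anti-pattern: mixing in anything that changes per call
    # (here a random UUID, standing in for a timestamp or id()).
    salted = f"{payload}{uuid.uuid4().hex}"
    return hashlib.sha256(salted.encode()).hexdigest()[:16]


same = content_key("hello") == content_key("hello")
differs = unstable_key("hello") != unstable_key("hello")
print(same, differs)
```

With the unstable key, INSERT OR IGNORE has nothing to collide with, so every enqueue of the same payload would add a new row.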
Stage 2 — Faded practice (we do)
Add one function to store.py yourself: mark_failed(db, key), which increments attempts and leaves status='pending' so the job will be retried (you will use this in Lab 3). The skeleton and expected behavior:
def mark_failed(db, key: str):
    # TODO: increment the attempts counter for this key by 1
    # TODO: keep status = 'pending' so claim_pending picks it up again
    # TODO: update the 'updated' timestamp, then commit
    ...
Checkpoint: enqueue one job, call mark_failed on its key twice, then sqlite3 agent.db "SELECT attempts,status FROM jobs" shows attempts=2 and status='pending'.
If not: if attempts stayed 0, you forgot the db.commit(). If it is stuck at 1 after two calls, you wrote = 1 instead of = attempts + 1 (use SET attempts = attempts + 1 so SQLite does the increment). If the row vanished, you used DELETE; only update, never delete, in this function.
Stage 3 — Independent (you do)
With no skeleton, add a counts(db) function that returns a dict of {status: count} across the jobs table (e.g., {"pending": 3, "done": 1}). You will reuse this as the seed of Lab 2’s status dashboard. Definition of done: uv run python -c "import store; print(store.counts(store.connect()))" prints a dict whose totals match SELECT count(*) FROM jobs.
2. Write the agent that drains the queue idempotently
Create agent.py. It claims one pending job, does the work (a tiny Ollama call), and commits done — and it logs each step so the absent operator can follow along.
# agent.py: claims and processes one job; safe to run repeatedly
import json
import subprocess
import time

import store


def do_work(payload: str) -> str:
    # The "agent": swap in your Month 6 loop. Here: a one-shot local summary.
    out = subprocess.run(
        ["ollama", "run", "qwen2.5:3b", f"Summarize in one line: {payload}"],
        capture_output=True, text=True, timeout=120,
        check=True)  # raise on a non-zero exit so a failed call is never recorded as done
    return out.stdout.strip()


def main():
    db = store.connect()
    job = store.claim_pending(db)
    if not job:
        log(event="idle", msg="no pending jobs")
        return
    key, payload = job
    log(event="claim", key=key)
    result = do_work(payload)         # if we crash here, status is still 'pending'
    store.mark_done(db, key, result)  # committed atomically -> durable
    log(event="done", key=key, result=result[:80])


def log(**ev):
    ev["ts"] = time.time()
    print(json.dumps(ev), flush=True)  # JSONL to stdout -> launchd captures it


if __name__ == "__main__":
    main()
Seed a few jobs and run it once:
uv run python -c "import store; db=store.connect(); [store.enqueue(db,x) for x in ['the cat sat','the dog ran','rain fell']]"
uv run python agent.py
Checkpoint: The first run prints a claim then a done JSON line for one job. Run uv run python agent.py two more times and confirm it processes the other two jobs, then prints {"event": "idle", ...} — every job is processed exactly once, no duplicates.
If not: if the same job is processed twice, mark_done is not committing (check the db.commit() inside it) so claim_pending keeps returning it. If you get an Ollama error rather than a done line, see the “Ollama call hangs or errors” item in Troubleshooting.
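The "no duplicates" check can also be done mechanically on the JSONL the agent prints. This is a rough sketch: done_counts is a hypothetical helper, and the sample lines only imitate the shape that log() emits, with made-up keys.

```python
import json
from collections import Counter


def done_counts(lines):
    """Count 'done' events per job key in an iterable of JSONL log lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if event.get("event") == "done":
            counts[event["key"]] += 1
    return counts


# Hypothetical sample in the shape agent.py's log() emits (keys are made up).
sample = [
    '{"event": "claim", "key": "aaa", "ts": 1.0}',
    '{"event": "done", "key": "aaa", "result": "x", "ts": 2.0}',
    '{"event": "claim", "key": "bbb", "ts": 3.0}',
    '{"event": "done", "key": "bbb", "result": "y", "ts": 4.0}',
    '{"event": "idle", "msg": "no pending jobs", "ts": 5.0}',
]
counts = done_counts(sample)
duplicates = [key for key, n in counts.items() if n > 1]
print(dict(counts), duplicates)  # {'aaa': 1, 'bbb': 1} []
```

To run it against real output you could pass an open file instead of the sample list, for example open("logs/out.log") once Step 4 has produced that file.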
3. Prove durability: crash mid-task and resume
Add a crash you can trigger, to prove progress survives. Temporarily edit do_work to raise SystemExit("simulated crash") after the model call but before mark_done, then run the agent.
uv run python agent.py # crashes "mid-task"
sqlite3 agent.db "SELECT key,status FROM jobs WHERE status='pending' LIMIT 1"
Checkpoint: The job the agent was working on is still pending in SQLite: the crash did not lose it and did not mark it done. Remove the simulated crash; run the agent again and confirm it re-claims that same job and completes it. The job was recorded as done exactly once despite the crash, because state lived in the database, not in memory. (The model call itself ran twice, once before the crash and once on the re-run, which is harmless here because repeating it has no side effects.)
If not: if the job shows done after the “crash,” you put the raise after mark_done — move it to before the commit so the failure happens mid-task. If no job is pending, you may have already drained the queue; re-seed jobs from Step 2 first.
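The same crash-and-resume experiment can be rehearsed without Ollama or editing do_work. This sketch is an approximation under stated assumptions: it uses an in-memory database with a reduced schema, a hypothetical flaky_work that fails only on its first call, and a caught exception in place of a real process crash.

```python
import sqlite3


def connect():
    # In-memory stand-in for store.connect(); same idea, smaller schema.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE jobs(key TEXT PRIMARY KEY, payload TEXT NOT NULL, "
               "status TEXT NOT NULL DEFAULT 'pending', result TEXT)")
    db.execute("INSERT INTO jobs(key, payload) VALUES('k1', 'the cat sat')")
    db.commit()
    return db


calls = {"n": 0}


def flaky_work(payload: str) -> str:
    # Hypothetical stand-in for do_work(): fails on the first call only.
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("simulated crash")
    return payload.upper()


def run_once(db):
    job = db.execute(
        "SELECT key, payload FROM jobs WHERE status='pending' LIMIT 1").fetchone()
    if job is None:
        return "idle"
    key, payload = job
    result = flaky_work(payload)  # a crash here leaves status='pending'
    db.execute("UPDATE jobs SET status='done', result=? WHERE key=?", (result, key))
    db.commit()
    return "done"


db = connect()
outcomes = []
for _ in range(3):
    try:
        outcomes.append(run_once(db))
    except RuntimeError:
        outcomes.append("crashed")

done_rows = db.execute("SELECT count(*) FROM jobs WHERE status='done'").fetchone()[0]
print(outcomes, calls["n"], done_rows)  # ['crashed', 'done', 'idle'] 2 1
```

The work function ran twice but only one done row exists, which is the distinction that matters: the record is written once, and the work must be safe to repeat.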
4. Schedule it with launchd
The launchd scheduling lifecycle you are about to drive:
stateDiagram-v2
[*] --> Unloaded
Unloaded --> Loaded: launchctl load
Loaded --> Running: schedule tick or start
Running --> Loaded: agent.py exits
Loaded --> Unloaded: launchctl unload
Unloaded --> [*]
Notice: load registers the agent and unload deregisters it; between ticks it sits in Loaded, not Running — that “dead between ticks” state is exactly what makes a scheduled agent safer than a continuous loop.
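Create ~/Library/LaunchAgents/com.you.month11.plist. Use absolute paths: launchd starts jobs with a minimal environment whose PATH does not include Homebrew or ~/.local/bin, so a bare uv will not resolve. Find your paths first: which uv and pwd. The same limit applies to the ollama command that agent.py launches, so either call it by the absolute path from which ollama or set PATH for the job with an EnvironmentVariables dict in the plist.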
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>             <string>com.you.month11</string>
  <key>ProgramArguments</key>
  <array>
    <string>/opt/homebrew/bin/uv</string>   <!-- output of `which uv` -->
    <string>run</string>
    <string>agent.py</string>
  </array>
  <key>WorkingDirectory</key>  <string>/Users/you/agentic/month-11</string>
  <key>StartInterval</key>     <integer>300</integer>   <!-- every 5 min, for testing -->
  <key>StandardOutPath</key>   <string>/Users/you/agentic/month-11/logs/out.log</string>
  <key>StandardErrorPath</key> <string>/Users/you/agentic/month-11/logs/err.log</string>
</dict>
</plist>
Load and verify:
launchctl load ~/Library/LaunchAgents/com.you.month11.plist
launchctl list | grep com.you.month11 # confirm it is registered
launchctl start com.you.month11 # force a run now instead of waiting
sleep 3; cat logs/out.log # see the JSONL the run emitted
Checkpoint: launchctl list | grep com.you.month11 shows your label, and logs/out.log contains the claim/done (or idle) JSON lines from the scheduled run. The agent now runs every 5 minutes without you. For a real digest you would switch StartInterval to a StartCalendarInterval dict (Hour/Minute) — see README §3.
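If not: empty logs almost always mean a path problem (launchd's minimal PATH will not find uv or ollama by bare name) or a missing logs/ directory. Check logs/err.log, confirm which uv matches the plist, and that mkdir -p logs ran. If out.log shows a claim line but no done line yet, the model call may still be running; wait a little longer than the 3-second sleep and cat the log again. See the “launchd didn’t run my agent” item in Troubleshooting.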
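Because most failures in this step are mistyped paths, one option is to generate the plist from the paths your shell actually reports. This is a sketch, assuming you run it from the project directory: build_plist is a hypothetical helper, the label and keys are copied from the plist in this step, and plistlib is the standard-library module for this file format.

```python
import plistlib
import shutil
from pathlib import Path


def build_plist(workdir: Path, uv_path: str, interval: int = 300) -> bytes:
    """Return launchd plist XML with the same keys as the hand-written file."""
    job = {
        "Label": "com.you.month11",
        "ProgramArguments": [uv_path, "run", "agent.py"],
        "WorkingDirectory": str(workdir),
        "StartInterval": interval,
        "StandardOutPath": str(workdir / "logs" / "out.log"),
        "StandardErrorPath": str(workdir / "logs" / "err.log"),
    }
    return plistlib.dumps(job)  # XML format by default


workdir = Path.cwd().resolve()
# Fall back to the path shown in the lab text if uv is not on this shell's PATH.
uv_path = shutil.which("uv") or "/opt/homebrew/bin/uv"
xml = build_plist(workdir, uv_path)
print(xml.decode())
```

The sketch only prints the XML; writing it to ~/Library/LaunchAgents/ and loading it is left as a manual step so nothing is registered with launchd by accident.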
5. Write the cron equivalent (portability for Lab 3’s cloud host)
launchd is macOS-only; the Linux cloud host in Lab 3 uses cron. Capture the equivalent now so you are not relearning scheduling later. The crontab line for “every 5 minutes” is:
*/5 * * * * cd /home/you/agentic/month-11 && /home/you/.local/bin/uv run agent.py >> logs/out.log 2>&1
Add it to a CRON.md note in your project. (On macOS you can also test it with crontab -e, but launchd is the canonical local choice.)
Checkpoint: You can explain the five cron fields (minute, hour, day-of-month, month, day-of-week) and why the launchd StartCalendarInterval with Hour/Minute is the macOS equivalent of a daily cron line. You have both written down.
If not: if the five fields blur together, write out 0 6 * * * and read it left-to-right: “minute 0, hour 6, every day-of-month, every month, every day-of-week” = 6:00 a.m. daily. That single example anchors the rest.
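Reading the five fields left to right can be turned into a small matcher. This is a deliberately simplified sketch: cron_matches and field_matches are hypothetical helpers that understand only *, */n, and a single integer, and they AND all five fields together. Real cron also accepts lists, ranges, and names, and treats day-of-month and day-of-week specially when both are restricted.

```python
from datetime import datetime

FIELDS = ("minute", "hour", "day-of-month", "month", "day-of-week")
STARTS = (0, 0, 1, 1, 0)  # first legal value of each field, used by "*/n"


def field_matches(field: str, value: int, start: int) -> bool:
    # Supports only "*", "*/n", and a single integer; real cron accepts more.
    if field == "*":
        return True
    if field.startswith("*/"):
        return (value - start) % int(field[2:]) == 0
    return int(field) == value


def cron_matches(expr: str, when: datetime) -> bool:
    parts = expr.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    values = (
        when.minute,
        when.hour,
        when.day,
        when.month,
        (when.weekday() + 1) % 7,  # cron counts Sunday as 0; Python counts Monday as 0
    )
    return all(field_matches(f, v, s) for f, v, s in zip(parts, values, STARTS))


print(dict(zip(FIELDS, "0 6 * * *".split())))
print(cron_matches("*/5 * * * *", datetime(2024, 1, 1, 12, 10)))  # True
print(cron_matches("0 6 * * *", datetime(2024, 1, 1, 6, 0)))      # True
```

The first print lines up each field name with its value, which is the same left-to-right reading described above.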
6. Stop it cleanly
launchctl unload ~/Library/LaunchAgents/com.you.month11.plist
launchctl list | grep com.you.month11 # now returns nothing
Checkpoint: The label no longer appears in launchctl list — the scheduled agent is fully stopped and deregistered. (This unload is also your out-of-band kill switch, which Lab 2 builds on.)
If not: “Could not find specified service” means it was already unloaded — harmless. If the label still shows, you unloaded a different path than you loaded; pass the exact same plist path to unload that you passed to load.
Definition of Done
- store.py implements a SQLite job queue where enqueue deduplicates on an idempotency key (INSERT OR IGNORE), verifiable by double-enqueueing and seeing one row.
- agent.py claims one pending job, does the work, and commits done, and processing the queue twice never duplicates work.
- You have demonstrated durability: a simulated crash mid-task leaves the job pending, and a re-run completes it exactly once.
- A launchd plist runs uv run agent.py on a schedule, is loaded, appears in launchctl list, and writes JSONL to logs/out.log; you can load/start/unload it.
- A CRON.md records the equivalent crontab line for Lab 3’s Linux host.
- Self-verify: sqlite3 agent.db "SELECT status,count(*) FROM jobs GROUP BY status" shows all seeded jobs as done, and launchctl list | grep com.you.month11 (while loaded) returns your label.
Self-explain: in one sentence, why does a crash mid-task not cause this agent to lose or double-process work?
Stretch Goals
- Calendar schedule. Switch the plist to StartCalendarInterval to run at a real time (e.g., 06:00 daily) and confirm with the next-day run in logs/out.log.
- caffeinate wrapper. Wrap the program in caffeinate -i (via a tiny shell launcher the plist calls) so the Mac does not idle-sleep while the agent is running; confirm a run completes while the screen is off. Note that caffeinate only holds the machine awake for the life of the wrapped command; it does not wake a Mac that is already asleep.
- KeepAlive continuous mode. Make a second plist with <key>KeepAlive</key><true/> pointing at a while True version of the agent, and confirm launchd restarts it when you kill the process. (Foreshadows §2’s scheduled-vs-continuous choice.)
- Queue introspection. Add a --status flag to agent.py that prints counts by status as JSON, the start of the dashboard you build in Lab 2.
Troubleshooting
- “launchd didn’t run my agent.” Almost always a path problem: launchd's PATH is minimal and excludes Homebrew, so every binary and file must be an absolute path. Set WorkingDirectory, use the full output of which uv, and check logs/err.log for the real error.
- launchctl load says “service already loaded.” You loaded it before. launchctl unload first, then load again. Editing a plist requires an unload/load cycle to take effect.
- Nothing in the logs. Confirm StandardOutPath/StandardErrorPath point to a directory that exists (mkdir -p logs) and is writable; launchd will not create missing parent directories.
- database is locked. Two processes wrote SQLite at once. For this lab keep to one writer; if you hit it because a manual run collided with a scheduled one, set a longer StartInterval or add db.execute("PRAGMA busy_timeout=5000") after connecting.
- Ollama call hangs or errors. Confirm ollama serve is running (the ollama app or brew services start ollama) and the model is pulled (ollama list). The timeout=120 in do_work prevents a hung call from blocking forever.
- The agent didn’t run while the Mac slept. Expected: a sleeping Mac runs nothing. Use caffeinate (Stretch 2) to keep it awake during a run or, for true 24/7, the cloud substrate in Lab 3.
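For the database is locked item, the busy timeout can be set once at connect time and read back to confirm it took effect. This is a minimal sketch: this connect is a hypothetical variant of store.connect() that defaults to an in-memory database so it does not touch agent.db.

```python
import sqlite3


def connect(path: str = ":memory:", busy_ms: int = 5000) -> sqlite3.Connection:
    # Hypothetical variant of store.connect(): wait up to busy_ms milliseconds
    # for a lock to clear instead of failing immediately.
    db = sqlite3.connect(path)
    db.execute(f"PRAGMA busy_timeout={int(busy_ms)}")
    return db


db = connect()
timeout_ms = db.execute("PRAGMA busy_timeout").fetchone()[0]
print(timeout_ms)  # 5000
```

A timeout only makes a second writer wait; it does not make two agents safe to run against the same queue, so keeping to one writer remains the simpler rule for this lab.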