How to Build a Closed-Loop AI Agent System with OpenClaw

QUICK INFO


Difficulty	Intermediate
Time Required	1-2 weeks (core loop: ~1 week)
Prerequisites	A VPS with OpenClaw installed and running; working knowledge of TypeScript and Next.js API routes; familiarity with Supabase (tables, RLS, client library); basic cron and shell access on your VPS
Tools Needed	OpenClaw (latest, on a VPS); a Supabase project (free tier works initially); Next.js on Vercel (Hobby plan minimum); a configured LLM provider (Anthropic, OpenAI, etc.)

What You'll Learn:

Architect a proposal-to-execution loop where agents operate without manual intervention
Fix the three most common pitfalls that make multi-agent systems stall silently
Implement cap gates, triggers, and a reaction matrix for inter-agent coordination
Add self-healing via stale step recovery so the system survives crashes

This guide walks you through the gap between "AI agents that can talk" and "AI agents that run things end-to-end." It is aimed at developers who already have OpenClaw on a VPS, a Next.js frontend on Vercel, and Supabase as their database, and who now need those pieces to form an autonomous loop. If you do not have that stack running yet, set it up first; this is not a getting-started guide for any of those tools individually.

Getting Started

The core problem is deceptively simple. OpenClaw gives your agents cron jobs, tool use, roundtable discussions, and scheduled tasks. Your agents can produce outputs: drafted tweets, research reports, content proposals. But nothing in that default setup turns output into execution, and nothing tells the system "done" after execution completes. Between "agents can produce output" and "agents run things" sits a full closed loop you have to build yourself.

The loop looks like this: an agent proposes an idea, that proposal gets checked against approval rules, a mission with executable steps gets created, a worker claims and runs those steps, an event gets emitted, and triggers or reactions fire new proposals based on what just happened. Then the cycle repeats. Each stage feeds the next, and if any link breaks, the system either stalls or spirals.

You need eight Supabase tables at minimum: ops_mission_proposals for proposals, ops_missions for approved missions, ops_mission_steps for executable steps, ops_agent_events for the event stream, ops_policy for configuration stored as JSON, ops_trigger_rules for trigger definitions, ops_agent_reactions for the reaction queue, and ops_action_runs for execution logs. Create these before writing any application code. Supabase is the single source of truth; everything reads from and writes to these tables.
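As a rough orientation, two of these tables might be typed as follows on the application side. Only the table names and the columns status, mission_id, reserved_at, and last_error appear in this guide; every other column, the status values other than pending, queued, running, and failed, and the helper names canTransition and claimStep are assumptions made for illustration.

```typescript
// Hypothetical row shapes. Columns and status values not named in the guide
// are assumptions, not a schema that OpenClaw or Supabase prescribes.
type ProposalStatus = 'pending' | 'accepted' | 'rejected';
type StepStatus = 'queued' | 'running' | 'succeeded' | 'failed';

interface MissionProposalRow {   // ops_mission_proposals
  id: string;
  agent_id: string;              // assumed column
  title: string;                 // assumed column
  step_kinds: string[];          // assumed column
  status: ProposalStatus;
}

interface MissionStepRow {       // ops_mission_steps
  id: string;
  mission_id: string;
  kind: string;                  // e.g. 'write_content', 'post_tweet', 'deploy'
  status: StepStatus;
  reserved_at: string | null;    // assumed to be set when a worker claims the step
  last_error: string | null;
}

// Legal status moves for a step; anything else points to a bug or a race.
const STEP_TRANSITIONS: Record<StepStatus, StepStatus[]> = {
  queued: ['running'],
  running: ['succeeded', 'failed'],
  succeeded: [],
  failed: [],
};

function canTransition(from: StepStatus, to: StepStatus): boolean {
  return STEP_TRANSITIONS[from].indexOf(to) !== -1;
}

// Returns the claimed copy of a step, or null if the step is not claimable.
function claimStep(step: MissionStepRow, nowIso: string): MissionStepRow | null {
  if (!canTransition(step.status, 'running')) return null;
  return { ...step, status: 'running', reserved_at: nowIso };
}
```

Writing the transitions down as data makes it easier to spot the failure modes described later, such as a step that never leaves running.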

The Three Pitfalls That Stall Your Loop

Once you have the tables and a basic proposal flow, things will appear to work. Agents propose, proposals get approved, missions get created. Then you notice the system is "spinning in place," doing work but not completing anything meaningful. These three problems are what you will hit.

Two Executors Fighting Over the Same Work

If your VPS has OpenClaw workers claiming tasks from ops_mission_steps, and your Vercel heartbeat cron is also running a mission-worker process, both can end up grabbing the same step. The result is race conditions, conflicting status updates, and occasional silent data corruption.

The fix is straightforward: pick one executor. The VPS handles all step execution. Vercel runs only the lightweight control plane: evaluating triggers, processing the reaction queue, promoting insights, and cleaning up stuck tasks. Remove any runMissionWorker call from your heartbeat route.

// Heartbeat now does only 4 things
const triggerResult = await evaluateTriggers(sb, 4_000);
const reactionResult = await processReactionQueue(sb, 3_000);
const learningResult = await promoteInsights(sb);
const staleResult = await recoverStaleSteps(sb);

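This separation also means you do not need Vercel Pro for cron. A single crontab line on your VPS hitting the heartbeat endpoint every 5 minutes works fine. Note that cron does not load your shell profile, so KEY has to be defined in the crontab itself (or otherwise made available to the cron environment):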

*/5 * * * * curl -s -H "Authorization: Bearer $KEY" https://yoursite.com/api/ops/heartbeat

Proposals Created But Never Executed

This one was confusing to debug. Triggers would correctly detect a condition (say, a tweet going viral) and insert a row into ops_mission_proposals. But the proposal would sit at pending forever, never becoming a mission, never generating steps.

The problem: triggers were inserting proposals directly into the table, bypassing the approval flow. The normal path is insert proposal, evaluate auto-approve rules, and if approved, create the mission with its steps. Triggers skipped steps two and three.

The fix is a single shared function, something like createProposalAndMaybeAutoApprove, that every proposal source must call. API endpoints, triggers, reactions: all of them go through this one function. It handles daily limits, cap gates (more on those next), proposal insertion, event emission, auto-approval evaluation, and mission creation.

// proposal-service.ts
export async function createProposalAndMaybeAutoApprove(
  sb: SupabaseClient,
  input: ProposalServiceInput,
): Promise<ProposalServiceResult> {
  // 1. Check daily limit
  // 2. Check cap gates
  // 3. Insert proposal
  // 4. Emit event
  // 5. Evaluate auto-approve
  // 6. If approved: create mission + steps
  // 7. Return result
}

Triggers then just return a proposal template. The evaluator feeds it into the shared service. Any future logic (rate limits, blocklists, new caps) changes in one file.
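One way to keep that single file testable is to separate the decision from the database calls. The sketch below covers only the ordering of the checks (daily limit, cap gates, auto-approve) as a pure function. The names decideProposal, ProposalContext, and Decision are hypothetical, and the rule that auto-approval requires every step kind to be on an allow list is an assumption based on the allowed_step_kinds policy mentioned in Troubleshooting.

```typescript
// Hypothetical sketch of the decision logic inside the shared proposal service.
// None of these names come from OpenClaw or Supabase.
type GateResult = { ok: true } | { ok: false; reason: string };

interface ProposalContext {
  proposalsToday: number;      // proposals already created today
  dailyLimit: number;          // read from policy
  gateResults: GateResult[];   // one result per cap gate that applies
  autoApproveEnabled: boolean; // read from policy
  allowedStepKinds: string[];  // step kinds eligible for auto-approval
}

type Decision =
  | { action: 'reject'; reason: string }
  | { action: 'hold_pending' }    // insert the proposal, wait for manual approval
  | { action: 'create_mission' }; // insert, auto-approve, create mission + steps

function decideProposal(stepKinds: string[], ctx: ProposalContext): Decision {
  // 1. Daily limit
  if (ctx.proposalsToday >= ctx.dailyLimit) {
    return { action: 'reject', reason: 'daily proposal limit reached' };
  }
  // 2. Cap gates: the first failing gate rejects the proposal
  for (const gate of ctx.gateResults) {
    if (gate.ok === false) return { action: 'reject', reason: gate.reason };
  }
  // 3. Auto-approve only if enabled and every step kind is on the allow list
  const allAllowed = stepKinds.every((k) => ctx.allowedStepKinds.indexOf(k) !== -1);
  if (ctx.autoApproveEnabled && allAllowed) return { action: 'create_mission' };
  return { action: 'hold_pending' };
}
```

The surrounding service would then perform the inserts and event emission that match the returned action, which keeps every proposal source on the same path.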

Queue Buildup When Quotas Are Full

The sneakiest one. No errors in the logs. Everything looks clean. But your ops_mission_steps table has hundreds of queued steps piling up, and nothing is processing them.

What happens: your tweet quota is full, but proposals are still being approved and generating missions and steps. The VPS worker sees the quota is full, skips the step without claiming it, and does not mark it as failed either. Next cycle, another batch arrives.

The fix is cap gates, which reject at the proposal entry point. If a proposal would generate steps that cannot be executed (because a quota is full, a feature is disabled, or a policy blocks it), reject the proposal before it creates any queued steps.

async function checkPostTweetGate(sb: SupabaseClient) {
  const autopost = await getOpsPolicyJson(sb, 'x_autopost', {});
  if (autopost.enabled === false)
    return { ok: false, reason: 'x_autopost disabled' };

  const quota = await getOpsPolicyJson(sb, 'x_daily_quota', {});
  const limit = Number(quota.limit ?? 10);
  const { count } = await sb
    .from('ops_tweet_drafts')
    .select('id', { count: 'exact', head: true })
    .eq('status', 'posted')
    .gte('posted_at', startOfTodayUtcIso());

  if ((count ?? 0) >= limit)
    return { ok: false, reason: `Daily tweet quota (${count}/${limit})` };
  return { ok: true };
}

Each step kind (write_content, post_tweet, deploy) gets its own gate function. Rejected proposals are recorded for auditing, not silently dropped. The key principle: reject at the gate, not in the queue.
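To illustrate the "one gate per step kind" idea, here is a minimal registry sketch. It evaluates gates against a snapshot of values loaded beforehand so that it runs without a database; in the real service each gate would query Supabase as checkPostTweetGate does. The names GateSnapshot, STEP_KIND_GATES, and checkCapGates, the deploy toggle, and the choice to treat step kinds without a registered gate as unrestricted are all assumptions.

```typescript
// Hypothetical gate registry keyed by step kind.
type GateResult = { ok: true } | { ok: false; reason: string };

// Assumed snapshot of the policy values and counters the gates need.
interface GateSnapshot {
  autopostEnabled: boolean;
  tweetsPostedToday: number;
  tweetDailyLimit: number;
  deployEnabled: boolean;
}

type Gate = (s: GateSnapshot) => GateResult;

const STEP_KIND_GATES: { [kind: string]: Gate } = {
  post_tweet: (s) => {
    if (!s.autopostEnabled) return { ok: false, reason: 'x_autopost disabled' };
    if (s.tweetsPostedToday >= s.tweetDailyLimit) {
      return { ok: false, reason: `Daily tweet quota (${s.tweetsPostedToday}/${s.tweetDailyLimit})` };
    }
    return { ok: true };
  },
  deploy: (s) => (s.deployEnabled ? { ok: true } : { ok: false, reason: 'deploy disabled' }),
};

// Runs the gate for every step kind in a proposal; the first failure wins.
function checkCapGates(stepKinds: string[], s: GateSnapshot): GateResult {
  for (const kind of stepKinds) {
    if (!Object.prototype.hasOwnProperty.call(STEP_KIND_GATES, kind)) continue; // no gate registered
    const result = STEP_KIND_GATES[kind](s);
    if (result.ok === false) return result;
  }
  return { ok: true };
}
```

A proposal that mixes write_content and post_tweet steps is rejected as a whole when the tweet quota is full, which is what keeps orphaned steps out of the queue.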

Triggers and the Reaction Matrix

With the three pitfalls fixed, the loop runs clean. But it only does what you explicitly schedule. To make it responsive, you need triggers (system reacts to conditions) and reactions (agents respond to each other).

Triggers are condition-action rules stored in your database. A trigger checks for something specific, like tweet engagement exceeding 5% or a mission failing, and returns a proposal template that goes through the standard proposal service. Four built-in triggers cover most cases: analyzing viral tweets, diagnosing mission failures, reviewing newly published content, and promoting mature insights to permanent memory. Each trigger needs a cooldown period. Without it, one viral tweet fires an analysis proposal on every 5-minute heartbeat cycle.
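The cooldown check itself can be very small. The sketch below assumes the cooldown is expressed in minutes and that the time a trigger last fired is stored as an ISO timestamp; neither detail is specified here, and isCoolingDown is a hypothetical helper name.

```typescript
// Hypothetical cooldown check. Unit (minutes) and storage format (ISO string)
// are assumptions.
function isCoolingDown(
  lastFiredAtIso: string | null,
  cooldownMinutes: number,
  now: Date,
): boolean {
  if (lastFiredAtIso === null) return false; // never fired, so nothing to wait for
  const elapsedMs = now.getTime() - new Date(lastFiredAtIso).getTime();
  return elapsedMs < cooldownMinutes * 60_000;
}
```

The trigger evaluator would call this before running a trigger's condition, skip the trigger while it returns true, and record the new fire time whenever the trigger produces a proposal.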

The reaction matrix is more interesting, and I should clarify that it is less "matrix" and more "pattern-matching config." It lives as JSON in the ops_policy table and defines probabilistic inter-agent responses.

{
  "patterns": [
    { "source": "twitter-alt", "tags": ["tweet","posted"],
      "target": "growth", "type": "analyze",
      "probability": 0.3, "cooldown": 120 },
    { "source": "*", "tags": ["mission:failed"],
      "target": "brain", "type": "diagnose",
      "probability": 1.0, "cooldown": 60 }
  ]
}

When twitter-alt posts a tweet, there is a 30% chance that growth analyzes its performance. When any mission fails, brain always runs a diagnosis (probability 1.0). The probability is intentional, not a bug. Full determinism makes the system feel mechanical; some randomness makes it feel like a team where people sometimes respond and sometimes do not. Whether that matters to you depends on whether your agents are public-facing or purely backend infrastructure. For backend work, you probably want 1.0 everywhere.
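A minimal sketch of how such patterns could be matched against an event is shown below. It assumes that a pattern matches when its source is "*" or equals the event source and when every tag in the pattern is present on the event; the matching rule, the AgentEvent shape, and the matchReactions name are assumptions, and cooldown handling is left out. The random source is injectable so the behavior can be tested deterministically.

```typescript
// Hypothetical matcher for the reaction matrix JSON shown above.
interface ReactionPattern {
  source: string;
  tags: string[];
  target: string;
  type: string;
  probability: number;
  cooldown: number;
}

interface AgentEvent {
  source: string;
  tags: string[];
}

function matchReactions(
  event: AgentEvent,
  patterns: ReactionPattern[],
  rng: () => number = Math.random, // returns a value in [0, 1)
): ReactionPattern[] {
  return patterns.filter((p) => {
    const sourceMatches = p.source === '*' || p.source === event.source;
    const tagsMatch = p.tags.every((t) => event.tags.indexOf(t) !== -1);
    // The probability roll happens only for patterns that otherwise match.
    return sourceMatches && tagsMatch && rng() < p.probability;
  });
}
```

Because Math.random never returns 1, a probability of 1.0 always fires and a probability of 0 never does.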

Self-Healing

VPS restarts, API timeouts, and network blips all leave steps stuck in running status with no process actually handling them. The heartbeat includes a recovery function that marks any step stuck for over 30 minutes as failed, then checks whether the parent mission should be finalized.

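const STALE_THRESHOLD_MS = 30 * 60 * 1000;

// ISO timestamp for "30 minutes ago"; steps reserved before this are stale
const staleThreshold = new Date(Date.now() - STALE_THRESHOLD_MS).toISOString();

const { data: stale } = await sb
  .from('ops_mission_steps')
  .select('id, mission_id')
  .eq('status', 'running')
  .lt('reserved_at', staleThreshold);

// data can be null (for example, when the query errors), so default to an empty list
for (const step of stale ?? []) {
  await sb.from('ops_mission_steps').update({
    status: 'failed',
    last_error: 'Stale: no progress for 30 minutes',
  })
    .eq('id', step.id)
    .eq('status', 'running'); // do not overwrite a step that finished in the meantime
  await maybeFinalizeMissionIfDone(sb, step.mission_id);
}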

maybeFinalizeMissionIfDone checks every step in the mission. If any step failed, the mission fails; if all steps completed, the mission succeeds. Without this check, you get a subtle bug where one step succeeds and the mission gets marked as a success while other steps are still hanging.
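The rule can be sketched as a pure function over the step statuses. This version waits until no step is queued or running before it finalizes anything; failing the mission as soon as the first step fails would be an equally valid reading of the rule. The succeeded status value and the deriveMissionOutcome name are assumptions.

```typescript
// Hypothetical core of maybeFinalizeMissionIfDone, without the database calls.
type StepStatus = 'queued' | 'running' | 'succeeded' | 'failed';
type MissionOutcome = 'succeeded' | 'failed' | null; // null means "not done yet"

function deriveMissionOutcome(stepStatuses: StepStatus[]): MissionOutcome {
  if (stepStatuses.length === 0) return null; // nothing to finalize
  const stillActive = stepStatuses.some((s) => s === 'queued' || s === 'running');
  if (stillActive) return null;               // do not finalize while work is pending
  return stepStatuses.some((s) => s === 'failed') ? 'failed' : 'succeeded';
}
```

The early return for active steps is what prevents a mission from being marked successful after only one of its steps has finished.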

Policy-Driven Configuration

Do not hardcode limits. Every behavioral toggle goes in the ops_policy table as a JSON document: auto-approve rules, daily quotas, which step kinds are allowed to execute, and whether Vercel should run workers (set that one to false). You can then adjust any policy without redeploying code. This was one of those decisions that felt like overkill during initial development but paid off within the first week of operation. Being able to disable tweet posting by changing one JSON value while debugging a formatting issue is worth the abstraction cost.
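As an illustration of reading a policy with safe defaults, the sketch below resolves a key from rows that have already been loaded and shallow-merges the stored JSON over a fallback. The PolicyRow shape (key and value columns) and the readPolicy name are assumptions; this is not the getOpsPolicyJson helper used earlier, only a guess at the kind of logic such a helper might wrap.

```typescript
// Hypothetical policy lookup over already-loaded ops_policy rows.
interface PolicyRow {
  key: string;    // assumed column, e.g. 'x_daily_quota'
  value: unknown; // assumed column holding the JSON document
}

function readPolicy<T extends object>(rows: PolicyRow[], key: string, fallback: T): T {
  for (const row of rows) {
    if (row.key === key && typeof row.value === 'object' && row.value !== null) {
      // Shallow merge: fields missing from the stored document keep their defaults.
      return { ...fallback, ...(row.value as Partial<T>) } as T;
    }
  }
  return fallback; // key missing or value malformed
}
```

With this shape, changing a quota or flipping a toggle is a single row update, and a missing or malformed row falls back to the defaults instead of crashing the heartbeat.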

Troubleshooting

Symptom: Proposals show pending status and never progress.
Fix: Check that whatever created the proposal is calling the shared proposal service, not inserting directly into the table. Also verify that an auto-approve policy exists in ops_policy and that the step kinds in the proposal match its allowed_step_kinds array.

Symptom: Steps pile up in queued status, worker logs show no errors.
Fix: The worker is likely skipping steps because a quota or policy check fails silently. Add cap gates to reject proposals before they generate steps. Check whether the relevant policy (like x_daily_quota) is set correctly.

Symptom: Duplicate step executions or conflicting status updates.
Fix: You have two executors claiming work. Verify that Vercel's heartbeat route does not include runMissionWorker. Only the VPS should claim and run steps.

Symptom: Triggers fire but produce no visible result.
Fix: Check cooldown periods. If a trigger fired recently and the cooldown has not elapsed, subsequent detections are silently skipped. Also check that the trigger's proposal template includes valid step kinds that pass cap gates.

What Comes Next

The full loop (propose, approve, execute, emit event, trigger reaction) takes roughly a week to wire up once you have the underlying infrastructure. From there, the next challenge is inter-agent collaboration: roundtable voting, memory consolidation, and making multiple LLM instances act like a team rather than six isolated processes. That is a different problem entirely, and one where I am still iterating.