Architecture for converting autoresearch experiment outputs into client-specialized fine-tuned LLMs by ingesting account state from GTM, Meta, and Google Ads.
```
╔══════════════════════════════════════════════════════════════════════════╗
║                       CLIENT ACCOUNT DATA SOURCES                        ║
╠══════════════╦═══════════════════╦══════════════╦════════════════════════╣
║  GTM Export  ║  Google Ads MCP   ║   Meta MCP   ║      claude-mem        ║
║  Container   ║  Campaigns/Conv   ║  Pixel/CAPI  ║   SQLite + Chroma      ║
╚══════════════╩═════════╤═════════╩══════════════╩════════════════════════╝
                         │
                         ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                         ACCOUNT STATE COLLECTOR                         │
│    Normalizes all client account data into structured context object    │
│     client_id / gtm_schema / ad_structure / pixel_events / history      │
└──────────────────────────────────┬──────────────────────────────────────┘
                                   │
                    ┌──────────────┴──────────────┐
                    ▼                             ▼
┌───────────────────────────┐    ┌──────────────────────────────────────────┐
│     AUTORESEARCH LOOP     │    │      EXISTING claude-mem HISTORY         │
│  50-100 experiments/run   │    │   Past fixes, decisions, patterns        │
│  Each: problem→solution   │    │   Chroma vector store                    │
│  Score: 0.0 → 1.0         │    └──────────────────┬───────────────────────┘
└──────────────┬────────────┘                       │
               │                                    │
               ▼                                    ▼
┌──────────────────────────────────────────────────────────────────────────┐
│                       SCORING FILTER + FORMATTER                         │
│  score ≥ 0.75 → keep │ Dedup │ Inject account_state into system prompt   │
│  Output: training.jsonl {"system":..., "user":..., "assistant":...}      │
└──────────────────────────────┬───────────────────────────────────────────┘
                               │
              ┌────────────────┴────────────────┐
              ▼                                 ▼
┌─────────────────────────┐    ┌─────────────────────────────────────────────┐
│  OPENAI FINE-TUNE API   │    │       OLLAMA LOCAL (NoClaw :11434)          │
│  gpt-4o-mini target     │    │   Llama 3.1 8B / Mistral → M3 Ultra         │
│  Fast MVP / cloud       │    │   Private / no data egress                  │
└─────────────┬───────────┘    └──────────────────┬──────────────────────────┘
              └────────────────┬──────────────────┘
                               ▼
┌──────────────────────────────────────────────────────────────────────────┐
│                      CLIENT BRAIN AGENT (OpenClaw)                       │
│    Fine-tuned model routed per client_id inside OpenClaw gateway         │
│    No 200k context dump needed — patterns baked into weights             │
│    New experiments → re-score → auto retrain trigger → model v++         │
└──────────────────────────────────────────────────────────────────────────┘
              ▲
              │  flywheel: more client work → better model
              └──────────────────────────────────────────
```
**Phase 1: Experiment Logger.** Modify the autoresearch loop to persist structured experiment records with score, context, and client_id. This is the foundation; everything else depends on clean experiment output.
- Deliverable: `ExperimentRecord` schema (client_id, run_id, problem, solution, score, timestamp, account_snapshot).

**Phase 2: Account State Collector.** Pull and normalize client account data from GTM, Google Ads MCP, and Pipeboard Meta MCP into a structured `AccountState` object stored per client. This becomes the system prompt context.
- Deliverable: `clients/{client_id}/account_state.json`, versioned.

**Phase 3: Scoring Filter + Formatter.** Filter high-scoring experiments, inject account state into system prompts, and export clean JSONL training files per client. This is the core transformation step.
- Threshold: score ≥ 0.75 (configurable per client).
- Deliverable: `training/{client_id}/v{N}.jsonl` in OpenAI fine-tune format.

**Phase 4: Fine-Tune Runner.** Two-track runner: the OpenAI API for fast cloud fine-tunes, Ollama/MLX for private local fine-tunes on the M3 Ultra. Triggered automatically when a new JSONL version is ready.
- Local: `ollama create client-{id}-v{N}` on the M3 Ultra.
- Deliverable: `models/{client_id}.json` registry tracking versions, eval scores, and the active flag.

**Phase 5: Client Brain Router.** Route requests through the OpenClaw gateway to the correct client-specialized model. Each client_id maps to an active model version; no context dump is required at inference.
- Routing: `x-client-id` header → look up the active model in the registry.

**Phase 6: Flywheel.** Close the loop. New autoresearch runs automatically trigger re-scoring, JSONL delta append, and a retrain when enough new high-quality examples accumulate. The model gets smarter with every client engagement.
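The Phase 1–3 data path can be sketched in TypeScript. Everything below is illustrative: the type names (`ExperimentRecord`, `AccountState`) follow the deliverables above, but `toTrainingLines`, the dedup rule, and the system-prompt template are assumptions, not the repo's actual code.

```typescript
// Sketch of the Phase 1 schema plus the Phase 3 core transform.
// Field lists mirror the deliverables above; everything else is assumed.

interface AccountState {
  client_id: string;
  gtm_schema: string;    // container + dataLayer summary
  ad_structure: string;  // campaigns / conversion actions summary
  pixel_events: string;  // Meta pixel / CAPI summary
  history: string;       // claude-mem highlights
}

interface ExperimentRecord {
  client_id: string;
  run_id: string;
  problem: string;
  solution: string;
  score: number;         // 0.0–1.0
  timestamp: string;
  account_snapshot: AccountState;
  sources_used: string[];
}

const SCORE_THRESHOLD = 0.75;

// Filter by score, dedupe on problem text, inject account state into the
// system prompt, and emit one JSONL line per surviving experiment.
function toTrainingLines(records: ExperimentRecord[], state: AccountState): string[] {
  const seen = new Set<string>();
  return records
    .filter((r) => r.score >= SCORE_THRESHOLD)
    .filter((r) => {
      const key = r.problem.trim().toLowerCase(); // cheap dedup key
      if (seen.has(key)) return false;
      seen.add(key);
      return true;
    })
    .map((r) =>
      JSON.stringify({
        messages: [
          {
            role: "system",
            content: `You are a tracking expert for ${state.client_id}.\n${state.gtm_schema}\n${state.ad_structure}`,
          },
          { role: "user", content: r.problem },
          { role: "assistant", content: r.solution },
        ],
        metadata: {
          client_id: r.client_id,
          score: r.score,
          run_id: r.run_id,
          sources: r.sources_used,
        },
      }),
    );
}
```

The real formatter would pull more of `AccountState` into the system prompt (pixel events, known-issue history); the template here is deliberately minimal.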
Example training record for client `hre` (pretty-printed for readability; in the actual JSONL file each record is a single line):

```json
{
  "messages": [
    {
      "role": "system",
      "content": "You are a tracking expert for HRE.\nGTM: GTM-XXXXXXX | sGTM: container-XXXXXXX\nConversion actions: All Purchases (tag:AW-xxx/yyy), Lead Form Submit\nKnown issue history: PMAX $0 value → fix: sGTM cart items mapping\ndataLayer schema: {event, ecommerce.value, ecommerce.items[]}\nMeta Pixel: HRE-pixel-id | CAPI: configured via Stape"
    },
    {
      "role": "user",
      "content": "PMAX campaign shows conversions but $0 revenue value"
    },
    {
      "role": "assistant",
      "content": "Root cause: sGTM purchase tag is reading value from the top-level ecommerce object, but HRE pushes revenue inside items[0].price * quantity. Fix: update the sGTM variable to map the items array and sum the value. Tag: 'sGTM - GA4 Purchase' → Variable: {{DLV - ecommerce.items}} → recompute value."
    }
  ],
  "metadata": {
    "client_id": "hre",
    "score": 0.91,
    "run_id": "exp-2026-04-07-047",
    "sources": ["gtm", "google_ads"]
  }
}
```

The `metadata` block is internal bookkeeping; strip it before upload if the fine-tune endpoint rejects unrecognized top-level keys.
Every client engagement generates experiments. Every experiment above threshold extends the training set. Every retrain makes the model more accurate for that client's specific patterns. At ~500 high-quality examples per client, the model knows their full tracking architecture better than any human consultant who hasn't lived in their accounts for years. This is the retention mechanic — the model is an asset that only you and the client share.
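The retrain trigger behind that flywheel can be sketched as a simple counter check. `shouldRetrain`, the `Scored` shape, and the batch size `RETRAIN_MIN_NEW = 25` are hypothetical names and values, not from the repo:

```typescript
// Hypothetical sketch: retrain only once enough new high-quality
// examples have accumulated since the last fine-tune.
const SCORE_THRESHOLD = 0.75;
const RETRAIN_MIN_NEW = 25; // assumed batch size; tune per client

interface Scored {
  score: number;
}

function shouldRetrain(newRecords: Scored[], minNew: number = RETRAIN_MIN_NEW): boolean {
  // Count only records that clear the training-set threshold.
  const highQuality = newRecords.filter((r) => r.score >= SCORE_THRESHOLD).length;
  return highQuality >= minNew;
}
```

A production version would also debounce (no retrain while a fine-tune job is running) and bump the model version in `models/{client_id}.json` on success.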
Launch Claude Code (permission checks skipped), then paste the prompt below:

```shell
claude --dangerously-skip-permissions
```
# Client LLM Fine-Tune Pipeline — Phase 1 Bootstrap
## Context
Repo: github.com/Organized-AI/gtm-autoresearch
Infra: M3 Ultra (Ollama :11434 NoClaw), M4 Mini (OpenClaw :18789), claude-mem (SQLite + Chroma :37777)
Clients: bioptimizers, rtt, teleios
## Phase 1 Task: Experiment Logger Instrumentation
Read AGENT-HANDOFF/ and PLANNING/ first. Then:
1. Create packages/experiment-logger/
- ExperimentRecord TypeScript schema:
client_id, run_id, problem, solution, score (float 0-1),
timestamp, account_snapshot (JSON), sources_used string[]
- SQLite writer extending claude-mem DB at :37777
- Score normalizer: validate 0.0–1.0 range
- Client_id tagger on each record
2. Instrument autoresearch loop to call ExperimentLogger.save()
on each scored experiment output
3. Add CLI: `pnpm experiment-logger export --client bioptimizers`
outputs all records to stdout as JSONL
4. Write tests for schema validation and SQLite round-trip
Use Organized Codebase structure. Update CLAUDE.md with
experiment-logger package docs. Do not build Phase 2 yet.
## Env vars needed

```shell
CLAUDE_MEM_DB_PATH=/path/to/claude-mem.sqlite
EXPERIMENT_LOG_DIR=./data/experiments
SCORE_THRESHOLD=0.75
```
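A minimal loader for these three variables might look like the following sketch; `loadConfig`, the fallback defaults, and the range check are assumptions, not part of the task spec above:

```typescript
// Sketch: read the env vars above with validation. The 0.75 default
// mirrors SCORE_THRESHOLD; the path/dir fallbacks are placeholders.
interface LoggerConfig {
  dbPath: string;
  logDir: string;
  scoreThreshold: number;
}

function loadConfig(env: Record<string, string | undefined>): LoggerConfig {
  const threshold = Number(env.SCORE_THRESHOLD ?? "0.75");
  if (Number.isNaN(threshold) || threshold < 0 || threshold > 1) {
    throw new Error(`SCORE_THRESHOLD must be in [0, 1], got: ${env.SCORE_THRESHOLD}`);
  }
  return {
    dbPath: env.CLAUDE_MEM_DB_PATH ?? "./claude-mem.sqlite",
    logDir: env.EXPERIMENT_LOG_DIR ?? "./data/experiments",
    scoreThreshold: threshold,
  };
}
```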