Greg's AI Workflow
Model stack · Skills · Subagents · Build pipelines · Safety rails
Reasoning, strategy, architecture, plans
Primary executor · Reviews all outputs · Integrates · Deploys
Research, blog posts, FAQs, comparisons, case studies
Off — daily billing creep. Use Gemini Flash (free) for research/writing + ollama_deepseek_pro for frontier reasoning. Tools pro_reason, deep_research, frontier in cheatsheet DO-NOT-CALL list.
Llama 4 MoE, fast multimodal — Groq marks this Preview, "may be discontinued at short notice." Use opportunistically.
OpenAI OSS — the only production-tier cross-architecture reviewer on Groq. Use as second pass in the verification chain.
Workhorse fast chat — speed-tier fallback when Ollama Pro is down. 1K req/min, 500K req/day.
Default first-pass code reviewer. 1M context, thinking off or at 3 levels (low / medium / high). Streaming via ollama-mcp 0b88571 (CF 524 fix, 2026-05-12).
Off — uncapped cost creep. Verifier slot moves to ollama_deepseek_pro (free, 1.6T frontier) + deepseek_code_review (direct API, $10 cap). Cheatsheet DO-NOT-CALL: opencode_gpt55_pro, opencode_gpt55, opencode_gpt53_codex_spark, opencode_kimi_thinking.
Deep reasoning, thinking, agentic tasks — comparable to GPT-5
SWE-Bench 72.2% — Mistral's best coder, multi-file editing
Agentic coding: sustains runs of hundreds of rounds, SWE-Bench Pro 58.4%
Agentic coding + tool use — alternative architecture for cross-arch second-opinion review (different family from Qwen / DeepSeek).
1.6T / 49B active. Pre-deploy verification path, served by a different inference stack than Ollama Cloud. $0.55 / $2.19 per M tokens.
Bulk HTML, multimodal, vision, agent swarm — moved from free to Pro
Agentic reasoning, content — moved from free to Pro
Code tasks, second opinions, large code reviews — cloud-proxied
Reasoning, writing, analysis, vision, thinking — zero cost
Complex reasoning, large documents, vision — biggest model
Text-to-video & image-to-video — Pro (1080p) + Lite (720p, faster)
Render before judging. DOM + accessibility tree + screenshot, not curl. Default for any UX, design audit, redesign, or rendering task per CLAUDE.md.
Live SERP, keyword volume, backlinks, on-page Lighthouse, business listings, AI visibility (LLM mentions). 80+ tools — the canonical SEO data source.
Multi-server file ops. SSH + RAM mounts under one virtual filesystem — cross-server diff, read, exec without juggling tabs. SSH writes default-deny.
Google Stitch — UI generation from prompts. Design systems, screen variants, project scaffolding. Convert output to semantic CSS before shipping.
Mistral code-specialized models — review, explain, generate. Complements Ollama Pro + OpenCode coverage.
fal.ai image generation + edit endpoints — fallback when cc-nano-banana (Gemini) is unavailable or for specific fal-only models.
Privacy-friendly analytics MCP. Org-level Bearer key — reads sites, sessions, events, funnels, goals, outbound, live visitors. FPS live (siteId 10025). 17 tools.
+ gemini-research (Flash only), groq-fast, ollama-pro, opencode (free models only), xiaomi-mimo, glm-free, deepseek — model-routing MCPs already covered above. Plus Anthropic-managed: Gmail, Google Calendar, Google Drive, HubSpot, Notion, Playwright (via plugins).
Screenshot, brief, or reference file
Claude Code skills invoked by Opus
Specialized agents spawned per task
Fast code gen + design research
Code review then ship
Fort Lauderdale POC — SEO audit never ran. Schedule seo-audit + seo-geo + seo-local once content is final. (Flagged 2026-05-12.)
Every deployed page runs: taste → humanizer → frontend-design → impeccable → SEO suite. SEO skills are not optional, not batch-only — per-page on every ship.
blog-factcheck verifies every claim against cited sources via WebFetch. BLOCKS ship on NOT-FOUND, unverified entity/date/quote, rubric < 90, or P0 fabricated stat. Pair with blog-factcheck-fix — Ollama Pro + subagents rewrite, Claude orchestrates (10-25× cheaper than Claude rewriting).
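The blocking rules above can be written as one pure function. This is a minimal sketch, not the skill's actual code: the `ClaimResult` fields and the status, kind, and severity labels are hypothetical names chosen for illustration. Only the four blocking conditions come from the text.

```python
from dataclasses import dataclass


@dataclass
class ClaimResult:
    # All field names and label values are hypothetical, for illustration only.
    text: str
    status: str              # "VERIFIED", "NOT-FOUND", or "UNVERIFIED"
    kind: str = "general"    # "entity", "date", "quote", "stat", or "general"
    severity: str = "P2"     # "P0" on a stat marks it as fabricated


def ship_blockers(claims, rubric_score):
    """Return the reasons a post must not ship; an empty list means it may ship."""
    reasons = []
    for claim in claims:
        if claim.status == "NOT-FOUND":
            reasons.append(f"NOT-FOUND: {claim.text}")
        elif claim.status == "UNVERIFIED" and claim.kind in {"entity", "date", "quote"}:
            reasons.append(f"unverified {claim.kind}: {claim.text}")
        if claim.kind == "stat" and claim.severity == "P0":
            reasons.append(f"P0 fabricated stat: {claim.text}")
    if rubric_score < 90:
        reasons.append(f"rubric {rubric_score} < 90")
    return reasons
```

Returning the list of reasons rather than a bare boolean gives the rewrite step (blog-factcheck-fix) a concrete work list.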
For Alpine-heavy templates (phone.html, contacts/detail.html), open as affected user with DevTools open BEFORE commit. Server-side smoke tests miss silent x-text binding failures. Locked 2026-05-25 after phone-page incident.
DeepSeek V4 Pro 1.6T is now first-pass code review. CF 524 fix shipped 2026-05-12 (ollama-mcp 0b88571): streaming + AbortSignal.any + 270s wallclock budget + jittered overload retry + think:false default.
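The retry part of that fix can be illustrated in isolation. `AbortSignal.any` is a JavaScript API and the real change lives in ollama-mcp. The Python sketch below shows only the general pattern of a wallclock budget combined with jittered retry on overload, and is not a port of that code. The 270 s budget is from the text. The `Overloaded` exception, the delay constants, and the injectable `clock`, `sleep`, and `rng` parameters are assumptions made for the example.

```python
import random
import time


class Overloaded(Exception):
    """Hypothetical error raised when the upstream reports overload."""


def call_with_budget(fn, budget_s=270.0, base_delay_s=1.0, max_delay_s=30.0,
                     clock=time.monotonic, sleep=time.sleep, rng=random.random):
    """Call fn(remaining_seconds), retrying on Overloaded until the budget is spent."""
    deadline = clock() + budget_s
    attempt = 0
    while True:
        try:
            # fn receives the time left so it can set its own request timeout.
            return fn(deadline - clock())
        except Overloaded:
            attempt += 1
            # Exponential cap with full jitter: sleep a random slice of the cap.
            cap = min(max_delay_s, base_delay_s * 2 ** attempt)
            delay = rng() * cap
            if clock() + delay >= deadline:
                raise TimeoutError("wallclock budget exhausted") from None
            sleep(delay)
```

Jitter spreads retries from concurrent callers so they do not all hit an overloaded backend at the same instant, and the single deadline keeps the total wait bounded however many retries occur.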
Multi-server file ops — SSH + RAM mounts under one virtual filesystem for cross-server diff/read/exec. SSH writes default-deny.
Render before judging. DOM + a11y + screenshot, not curl. Default for any UX, render, audit, or redesign task.
Remove AI-writing tells. Detects/fixes inflated symbolism, em-dash overuse, rule-of-three, AI vocab. Run on every Gemini/GPT-drafted copy before publish.
Distinct production-grade UI. impeccable flags AI-slop tells (side-stripe borders, gradient text, glassmorphism, hero-metric clichés). Pair for design audit + polish.
Senior UI/UX engineer. Editorial typography, gapless bento grids, strict GSAP scroll triggers, massive section spacing. Default for any web dev (per CLAUDE.md taste-skill rule).
seo-audit · seo-technical · seo-content · seo-geo · seo-google · seo-local · seo-maps · seo-schema · seo-sitemap · seo-cluster · seo-drift · seo-firecrawl · seo-rotation · seo-sxo · seo-image-gen · seo-programmatic · seo-competitor-pages · seo-ecommerce · seo-plan · seo-dataforseo · seo-page · seo-backlinks
Required for ALL image generation. Nano Banana (Gemini CLI) for blog images, thumbnails, icons, diagrams, illustrations, photos.
Query + manage Google NotebookLM via CLI. Master notebook has 38 sources. Vault-to-Master sync after every session.
Mandatory after any frontend edit — screenshot via Playwright, Read the PNG before calling work done. Per CLAUDE.md rule #4.
Run after ANY code edit, deployment, or fix. Requires running verification commands and confirming output before claiming completion.
Write handoff doc so fresh agent continues in clean context window. Use for scope-creep split-outs, ~120k dumb-zone compression, planner→prototype→planner round-trip, or cross-CLI pass (Codex, Copilot, OpenCode). Triggers: "handoff", "spawn agent for", "fresh session".
66 DESIGN.md files from Stripe, Figma, Apple, PlayStation, WIRED — VoltAgent/awesome-design-md
Codex CLI second opinion — gpt-4.1-mini, invoke when stuck >2 attempts or need architecture review
Semantic search across all 40+ projects via RAG-Anything + OpenAI embeddings
Weekly audit: broken symlinks, GitHub freshness, usage stats, duplicate detection
fastapi · python · laravel · php · sql · deployment · devops · security · code-reviewer + more — VoltAgent
Session lifecycle — /start reads Memory+Obsidian+NotebookLM, /wrap commits+pushes+updates Obsidian
305 skills installed (~/.claude/skills/). Includes humanizer, frontend-design, impeccable, taste-skill, full 22-skill SEO suite, cc-nano-banana, notebooklm, visual-review, verify-my-work. Plus plugin marketplaces (caveman, obra-superpowers, interface-design, codex, memsearch).
ByteDance text-to-video via fal.ai — Pro (1080p) + Lite (720p). Live now.
251 audit rules across 20 categories — @seomator engine. CLI: seomator audit <url>
47 AI citability criteria — llms.txt, AI bot rules, passage-level optimization for ChatGPT/Perplexity/Gemini
CORE-EEAT (80-item) + CITE (40-item) frameworks. 12 commands: audit, optimize, schema, keywords, alerts
Evaluated for purchase ($162/yr). 1T params (42B active MoE), 1M ctx, 1000+ tool call coherence. Declined — ollama_deepseek_pro on Ollama Pro covers the frontier slot (1.6T/49B active, 1M ctx, 3 thinking modes) at zero marginal cost. Reassess after 30-day delta check.
Text-to-speech with emotion control, voice cloning from 30s audio sample, and voice design from text description. 24kHz, multilingual. Free during open beta.
Full-modal base model: native text, image, audio, video understanding. 1M context, 131K max output. Pro-level agentic at half the cost ($0.40/M input).
~80 free AI models via OpenAI-compatible API. MiniMax M2.7, DeepSeek V4, GLM 5.1, Kimi K2.5, Sarvam-M, GPT-OSS 120B + nvidia_ask (any model)
Semantic vector search across 135 curated memory files (was 101). ONNX bge-m3 embeddings (local, free). Auto-captures sessions via hooks.
reorganize-memory.sh (audit: orphans, empties, oversized files) + sync-memsearch.sh (copy curated memories to MemSearch, re-index)
4 behavioral principles from Andrej Karpathy: Think Before Coding, Simplicity First, Surgical Changes, Goal-Driven Execution
Semantic across vault + master notebook. 38-source master. CLI authenticated (Mac + Fedora). Vault → project notebook → Master after every session.
~/Documents/greg-claude/ — Projects, Servers, Sessions, NotebookLM. Per-project file pattern. Check FIRST before asking.
Interactive reporting dashboard for CrawlHound — grade history, scan trends, site-wide metrics visualization
Downloadable PDF audit reports for CrawlHound, MrBotsworth, and gjapp — branded, shareable
Automated build + test + deploy pipeline on git.myseodesk.com — lint, SEO checks, rsync deploy on push
Subagent-heavy sessions. Top spawns: backend-developer, frontend-developer, code-reviewer — all have free MCP twins.
Usage at >150k context. Long mixed-topic sessions, no /clear or /compact between unrelated tasks.
Skill invocation share. The 100+ installed skills (visual-review, redesign, audits) barely fire — same work runs through paid subagents instead.
Fires on every Agent tool call. Drift list (6 named agents): warns ≤65% context used, blocks >65%. code-reviewer + general-purpose in ALWAYS_BLOCK. Whitelist (Explore/Plan/gsd/audit/seo/feature-dev) silent. Override: [paid-subagent-OK] token logs + allows.
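The gate's decision logic might look roughly like the sketch below. The 65% threshold, the ALWAYS_BLOCK pair, the whitelist names, and the override token come from the text, and the six drift agents are taken from the routing table in this document. The evaluation order, the prefix matching for the whitelist, and the silent pass for unlisted agents are assumptions.

```python
DRIFT_AGENTS = {
    "backend-developer", "frontend-developer", "fastapi-developer",
    "code-reviewer", "security-auditor", "database-administrator",
}
ALWAYS_BLOCK = {"code-reviewer", "general-purpose"}
# Assumption: whitelist entries are matched as name prefixes.
WHITELIST_PREFIXES = ("Explore", "Plan", "gsd", "audit", "seo", "feature-dev")
OVERRIDE_TOKEN = "[paid-subagent-OK]"
BLOCK_ABOVE = 0.65  # fraction of the context window already used


def gate(agent, prompt, context_used):
    """Return (decision, message); decision is 'allow', 'warn', or 'block'."""
    if OVERRIDE_TOKEN in prompt:
        return ("allow", "override token present; a real hook would log this")
    if agent.startswith(WHITELIST_PREFIXES):
        return ("allow", "")
    if agent in ALWAYS_BLOCK:
        return ("block", f"{agent} is always blocked; use the free MCP route")
    if agent in DRIFT_AGENTS:
        if context_used > BLOCK_ABOVE:
            return ("block", f"{agent} blocked at {context_used:.0%} context")
        return ("warn", f"{agent} has a free MCP twin; consider it first")
    # Assumption: agents on no list pass silently.
    return ("allow", "")
```

For example, `gate("backend-developer", "refactor the auth module", 0.40)` yields a warning, while the same call at 0.70 context is blocked.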
Three triggers: session start, topic-switch keywords, post-/compact detection (transcript msg-count drop >50%). Injects ~250 tokens of routing rules from ~/.claude/rules/delegation-cheatsheet.md.
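A rough sketch of how the three triggers could be told apart. The 50% message-count drop is from the text. The keyword list is assumed to match the phrases used by the topic-switch nudge, and the function name and return labels are hypothetical.

```python
# Assumption: same phrases the topic-switch nudge listens for.
TOPIC_SWITCH_KEYWORDS = ("okay new task", "switching to", "moving on")
COMPACT_DROP_THRESHOLD = 0.50  # message count must fall by more than 50%


def injection_trigger(session_start, prompt, prev_msg_count, msg_count):
    """Return which trigger fired, or None if the rules should not be injected."""
    if session_start:
        return "session-start"
    lowered = prompt.lower()
    if any(keyword in lowered for keyword in TOPIC_SWITCH_KEYWORDS):
        return "topic-switch"
    if prev_msg_count > 0:
        drop = (prev_msg_count - msg_count) / prev_msg_count
        if drop > COMPACT_DROP_THRESHOLD:
            return "post-compact"
    return None
```

A transcript that shrinks from 120 messages to 40 is a 67% drop and counts as a compaction. A fall from 100 to exactly 50 does not, because the threshold is strict.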
Detects topic-switch keywords (“okay new task”, “switching to”, “moving on”) and appends a /clear reminder. 10-message debounce so it doesn’t fire on every consecutive switch.
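The debounce can be sketched as a small piece of state keyed on message index. The three phrases and the 10-message window are from the text. The class name, the reminder wording, and the reading of the debounce as "at least 10 messages since the last nudge" are assumptions.

```python
class TopicSwitchNudge:
    KEYWORDS = ("okay new task", "switching to", "moving on")

    def __init__(self, debounce_messages=10):
        self.debounce_messages = debounce_messages
        self.last_fired_at = None  # message index of the most recent nudge

    def check(self, message_index, prompt):
        """Return a reminder string, or None when nothing should be appended."""
        lowered = prompt.lower()
        if not any(keyword in lowered for keyword in self.KEYWORDS):
            return None
        recently_fired = (
            self.last_fired_at is not None
            and message_index - self.last_fired_at < self.debounce_messages
        )
        if recently_fired:
            return None
        self.last_fired_at = message_index
        return "Topic switch detected. Consider running /clear before continuing."
```

A hook that runs as a fresh process on each prompt would need to persist `last_fired_at` somewhere, such as a small state file. That detail is left out here.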
Highest-leverage hook. 26 keyword triggers in ~/.claude/rules/skill-triggers.json (regex, priority-ranked). On match, injects an EXTREMELY_IMPORTANT system-reminder naming the skill — same mechanism as using-superpowers.
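Priority-ranked regex matching of this kind might look like the sketch below. The actual schema of skill-triggers.json is not shown in this document, so the `pattern`, `skill`, and `priority` keys, the sample entries, the rule that the highest number wins, and the reminder wording are all assumptions.

```python
import json
import re

# Hypothetical sample in an assumed schema; the real file holds 26 triggers.
SAMPLE_TRIGGERS = r'''
[
  {"pattern": "\\bscreenshot\\b", "skill": "visual-review", "priority": 10},
  {"pattern": "\\bseo audit\\b", "skill": "seo-audit", "priority": 20},
  {"pattern": "\\b(humanize|ai tells)\\b", "skill": "humanizer", "priority": 15}
]
'''


def match_skill(prompt, triggers):
    """Return the skill of the highest-priority matching trigger, or None."""
    hits = [t for t in triggers if re.search(t["pattern"], prompt, re.IGNORECASE)]
    if not hits:
        return None
    return max(hits, key=lambda t: t["priority"])["skill"]


def build_reminder(skill):
    """Wrap the skill name in a reminder block (wording is illustrative)."""
    return (
        "<EXTREMELY_IMPORTANT>"
        f"Invoke the {skill} skill before responding."
        "</EXTREMELY_IMPORTANT>"
    )


triggers = json.loads(SAMPLE_TRIGGERS)
```

With these sample entries, a prompt that mentions both an SEO audit and a screenshot resolves to `seo-audit`, because that trigger carries the higher priority.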
Enforces the “render before judging” rule. When a prompt looks like UX / design / audit / redesign work, injects a reminder to use mcp__chrome-devtools__new_page + take_snapshot + take_screenshot from the main session — not curl, not WebFetch, not subagent.
| drift agent | free mcp route | when in-agent is justified |
|---|---|---|
| backend-developer | ollama_devstral, ollama_deepseek_pro | 5+ files coordinated, or hard arch |
| frontend-developer | ollama_kimi (HTML), ollama_code (review) | multi-framework full-stack work |
| fastapi-developer | ollama_deepseek_pro, ollama_devstral | complex async patterns + tests |
| code-reviewer | HARD-BLOCKED. Chain: ollama_deepseek_pro (DEFAULT) → ollama_code → fast_gpt_oss | [paid-subagent-OK] override only |
| security-auditor | /security-auditor skill + /api-keys skill | whole-system audit (5+ services) |
| database-administrator | /sql-pro skill + mysql/postgres MCP | HA/replication infra, not just queries |
Apply when work is non-trivial: >50 LOC change, security-sensitive, or first-time deploy. Review prompts are short, so paid and free-tier rate limits go further than on bulk gen.
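The trigger conditions and the reviewer chain can be combined in a few lines. The three conditions and the tool order are taken from this document. Running every pass in sequence and collecting findings per reviewer is an assumed reading of "chain", and the `reviewers` mapping of tool name to callable is a hypothetical stand-in for the real MCP calls.

```python
VERIFICATION_CHAIN = ("ollama_deepseek_pro", "ollama_code", "fast_gpt_oss")


def is_non_trivial(loc_changed, security_sensitive=False, first_deploy=False):
    """True when the change is large, security-sensitive, or a first deploy."""
    return loc_changed > 50 or security_sensitive or first_deploy


def run_chain(diff, reviewers, chain=VERIFICATION_CHAIN):
    """Run each reviewer in order and return findings keyed by tool name.

    reviewers maps a tool name to a callable taking the diff text and
    returning a list of finding strings (hypothetical interface).
    """
    findings = {}
    for name in chain:
        findings[name] = reviewers[name](diff)
    return findings
```

A caller would check `is_non_trivial(...)` first and skip the chain for small, low-risk edits.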
| Plan Type | Lead Model | ~Tokens | ~Cost | Notes |
|---|---|---|---|---|
| Quick Ask | Ollama Pro (Nemotron/Chat) | 1–3K | $0 (Pro) | ollama_nemotron / ollama_chat — Tier 1 |
| Bug Fix | Sonnet 4.6 | 5–15K | ~$0.05–0.15 | Read file → diagnose → patch → verify |
| Single Feature | Opus 4.7 + Sonnet | 20–60K | ~$0.50–2 | Plan → subagent execute → codex:review |
| Static Page Build | Opus + Kimi K2.6 | 30–80K | ~$0.50–1.50 | DESIGN.md → /stitch → Playwright check → deploy |
| Astro SSR Feature | Opus + GPT 5.3 Spark | 50–120K | ~$1–3 | /astro-ssr skill + full pre-deploy gate |
| Laravel API | Opus + GPT 5.5 Pro | 60–150K | ~$1.50–4 | Models + migrations + tests + PHPStan + review |
| Full Site (5–10 pages) | Opus + mix | 150–400K | ~$4–12 | Full 6-stage pipeline: PLAN→DESIGN→BUILD→SEO→CRO→QA |
| SEO Audit | Gemini 3.5 Flash | 10–30K | ~$0.10–0.30 | Schema + compliance + content; mostly Gemini |
- → Use Ollama Pro (Tier 1) for drafts, analysis, code review
- → Use Kimi K2.6 (Pro) for bulk HTML generation
- → Use Gemini 3.5 Flash for research/content
- → Groq/NVIDIA = fallback only (Tier 3)
- → /compact regularly in long sessions
- → Each file read adds ~1–5K tokens
- → Large HTML files: 10–30K tokens each
- → Long conversations drift — use /clear or new session
- → Subagents get fresh context — use them for big files
- → CLAUDE.md loaded every session (~5K tokens)
- → Stuck after 2 attempts → /codex:rescue
- → Complex architecture → Opus (not Sonnet)
- → Critical prod deploy → deepseek_code_review ($10 cap)
- → Cross-arch verify → Codex CLI gpt-5.4-mini (free quota)
- → Free model fails tool calls → switch to Sonnet
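To see how far the $10 cap on deepseek_code_review stretches, the $0.55 / $2.19 per M token rate quoted for the direct API can be turned into a per-review estimate. This sketch assumes the first figure is the input price and the second the output price. The 40K-input, 4K-output review size is an illustrative guess, not a measured value.

```python
INPUT_USD_PER_M = 0.55    # assumption: first quoted figure is the input price
OUTPUT_USD_PER_M = 2.19   # assumption: second quoted figure is the output price
SPEND_CAP_USD = 10.00


def review_cost(input_tokens, output_tokens):
    """Estimated USD cost of one review call."""
    return (input_tokens / 1e6 * INPUT_USD_PER_M
            + output_tokens / 1e6 * OUTPUT_USD_PER_M)


def reviews_under_cap(input_tokens, output_tokens, cap=SPEND_CAP_USD):
    """How many reviews of this size fit under the spend cap."""
    return int(cap // review_cost(input_tokens, output_tokens))


cost = review_cost(40_000, 4_000)          # about $0.031 per review
count = reviews_under_cap(40_000, 4_000)   # 325 reviews of that size
```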
| Role | Model | MCP Server | Use Case | Cost |
|---|---|---|---|---|
| Planner | Opus 4.7 | native | Reasoning, architecture, strategy | paid |
| Lead | Sonnet 4.6 / 4.7 | native | Executes tasks, reviews, deploys | paid |
| Frontier reviewer | DeepSeek V4 Pro 1.6T | ollama-pro | Default first-pass code review (post-streaming-fix 2026-05-12). 1M ctx, off/low/med/high thinking modes. | PRO |
| Research | Gemini 3.5 Flash | gemini-research | Research, comparisons, blog posts, FAQs | paid |
| Reasoning | Gemini 3.1 Pro | gemini-research | Disabled 2026-05-18 — use ollama_deepseek_pro (free) | DISABLED |
| Frontier | Gemini 3.1 Pro | gemini-research | Disabled 2026-05-18 — use ollama_deepseek_pro (free, 1.6T) or deepseek_code_review ($10 cap) | DISABLED |
| Fast 8B | Llama 3.1 8B Instant | groq-fast | Sub-second 8B — gates, autocomplete, micro-tasks | FREE |
| Fast 70B | Llama 3.3 70B Versatile | fast_code | Workhorse fast chat — speed-tier fallback | FREE |
| Preview | Llama 4 Scout 17B/16E | fast_llama4 | Llama 4 MoE — preview tier per Groq, opportunistic only | FREE |
| Cross-arch | GPT OSS 120B | fast_gpt_oss | Verification chain second pass — only production-tier cross-arch on Groq | FREE |
| STT | Whisper Large v3 Turbo | groq-fast | Speech-to-text — 400 req/min, 4M sec/day | FREE |
| Backend | GPT 5.5 Pro | opencode | Hardest backend, algorithms, APIs | paid |
| Code | GPT 5.3 Codex Spark | opencode | Fast code generation, everyday tasks | paid |
| Reason | Kimi K2 Thinking | opencode | Complex logic, multi-step planning | paid |
| Code | Trinity Large Preview | opencode | General coding and reasoning | paid |
| HTML | Kimi K2.6 | ollama_kimi | Bulk HTML, multimodal, agent swarm | PRO |
| Content | Nemotron 3 Super 120B | ollama_nemotron | Agentic reasoning, content | PRO |
| Ollama Pro | Qwen3-Coder 480B | ollama_code | Code tasks, large code reviews | PRO |
| Ollama Pro | Qwen3.5 397B | ollama_chat | Reasoning, writing, analysis, vision, thinking | PRO |
| Ollama Pro | Mistral Large 3 675B | ollama_mistral | Complex reasoning, large documents, vision | PRO |
| Ollama Pro | DeepSeek V4 Flash 284B / 13B active | ollama_deepseek | Deep reasoning, thinking, agentic tasks | PRO |
| Ollama Pro | Devstral 2 123B | ollama_devstral | SWE coding, multi-file editing (72.2% SWE-Bench) | PRO |
| Ollama Pro | GLM 5.1 | ollama_glm | Agentic coding, sustained over 100s of rounds | PRO |
| MiMo | MiMo-V2.5-Pro | mimo | Frontier reasoning, 1T/42B MoE, 1M ctx, 1000+ tool calls | $1 / $3 per M |
| MiMo | MiMo-V2.5 | mimo | Full-modal (text+image+audio+video), 1M ctx | $0.40 / $2 per M |
| MiMo | MiMo-V2.5-TTS | mimo | Text-to-speech, voice clone, voice design | FREE |
| NVIDIA · archived | MiniMax M2.7 | nvidia-nim | Large MoE reasoning + code (exclusive to NVIDIA) | FREE |
| NVIDIA · archived | DeepSeek V4 Flash | nvidia-nim | 284B MoE reasoning (Tier 3 fallback) | FREE |
| NVIDIA · archived | GLM 5.1 | nvidia-nim | Agentic coding (Tier 3 fallback) | FREE |
| NVIDIA · archived | Kimi K2.5 | nvidia-nim | Moonshot reasoning (Tier 3 fallback) | FREE |
| NVIDIA · archived | Sarvam-M | nvidia-nim | Indian multilingual (exclusive to NVIDIA) | FREE |
| NVIDIA · archived | GPT-OSS 120B | nvidia-nim | OpenAI OSS (Tier 3 fallback) | FREE |
+ 5 hooks (agent-spawn-gate, delegation-table-injector, topic-switch-nudge, skill-trigger-injector, chrome-devtools-reminder) · 237 memory files in MemSearch · 38-source NotebookLM master · Obsidian vault per-project files