Greg's AI Workflow
Model stack · Skills · Subagents · Build pipelines · Safety rails
Reasoning, strategy, architecture, plans
Primary executor · Reviews all outputs · Integrates · Deploys
Research, blog posts, FAQs, comparisons, case studies
Deep reasoning, complex analysis, architecture decisions
Most capable model — hardest problems, deepest analysis
Llama 4 MoE, fast multimodal — Groq marks this Preview, "may be discontinued at short notice." Use opportunistically.
OpenAI OSS — the only production-tier cross-architecture reviewer on Groq. Use as second pass in the verification chain.
Workhorse fast chat — speed-tier fallback when Ollama Pro is down. 1K req/min, 500K req/day.
Default first-pass code reviewer. 1M context, thinking off or at 3 levels (low / medium / high). Streaming via ollama-mcp 0b88571 (CF 524 fix, 2026-05-12).
Hardest backend problems, algorithms, APIs
Fast code generation, everyday backend tasks
Complex logic, multi-step reasoning
General coding and reasoning
Deep reasoning, thinking, agentic tasks — comparable to GPT-5
SWE-Bench 72.2% — Mistral's best coder, multi-file editing
Agentic coding — sustains over 100s of rounds, SWE-Bench Pro 58.4%
Agentic coding + tool use — alternative architecture for cross-arch second-opinion review (different family from Qwen / DeepSeek).
671B MoE, 160K ctx. Use V4 Flash / Pro instead; V3.2 kept as compatibility fallback.
Bulk HTML, multimodal, vision, agent swarm — moved from free to Pro
Agentic reasoning, content — moved from free to Pro
Code tasks, second opinions, large code reviews — cloud-proxied
Reasoning, writing, analysis, vision, thinking — zero cost
Complex reasoning, large documents, vision — biggest model
Text-to-video & image-to-video — Pro (1080p) + Lite (720p, faster)
Render before judging. DOM + accessibility tree + screenshot, not curl. Default for any UX, design audit, redesign, or rendering task per CLAUDE.md.
Live SERP, keyword volume, backlinks, on-page Lighthouse, business listings, AI visibility (LLM mentions). 80+ tools — the canonical SEO data source.
Multi-server file ops. SSH + RAM mounts under one virtual filesystem — cross-server diff, read, exec without juggling tabs. SSH writes default-deny.
Google Stitch — UI generation from prompts. Design systems, screen variants, project scaffolding. Convert output to semantic CSS before shipping.
Mistral code-specialized models — review, explain, generate. Complements Ollama Pro + OpenCode coverage.
fal.ai image generation + edit endpoints — fallback when cc-nano-banana (Gemini) is unavailable or for specific fal-only models.
+ gemini-research, groq-fast, ollama-pro, opencode, xiaomi-mimo, glm-free — the 6 model-routing MCPs already covered above. Plus Anthropic-managed: Gmail, Google Calendar, Google Drive, HubSpot, Notion, Playwright (via plugins).
Screenshot, brief, or reference file
Claude Code skills invoked by Opus
Specialized agents spawned per task
Fast code gen + design research
Code review then ship
Fort Lauderdale POC — SEO audit never ran. Schedule seo-audit + seo-geo + seo-local once content is final. (Flagged 2026-05-12.)
DeepSeek V4 Pro 1.6T is now first-pass code review. CF 524 fix shipped 2026-05-12 (ollama-mcp 0b88571): streaming + AbortSignal.any + 270s wallclock budget + jittered overload retry + think:false default.
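The wallclock budget and jittered overload retry named in the fix can be illustrated roughly as follows. This is a Python analogue written for illustration, not the ollama-mcp code (which, given `AbortSignal.any`, is presumably JavaScript). The `Overloaded` exception, the function name, and the backoff constants are assumptions; only the 270s budget comes from the note above.

```python
import random
import time


class Overloaded(Exception):
    """Hypothetical error standing in for an upstream 'overloaded' response."""


def call_with_budget(fn, budget_s=270.0, base_delay_s=1.0, max_delay_s=20.0,
                     clock=time.monotonic, sleep=time.sleep, rng=random.random):
    """Retry fn() on Overloaded until it succeeds or the wallclock budget is spent."""
    deadline = clock() + budget_s
    attempt = 0
    while True:
        try:
            return fn()
        except Overloaded:
            attempt += 1
            # Full jitter: wait a random fraction of an exponentially growing cap.
            cap = min(max_delay_s, base_delay_s * 2 ** (attempt - 1))
            delay = rng() * cap
            # Give up rather than sleep past the overall deadline.
            if clock() + delay >= deadline:
                raise TimeoutError(
                    f"budget of {budget_s}s exhausted after {attempt} attempts"
                )
            sleep(delay)
```

Checking the deadline before sleeping keeps the total time under the budget, which matters when the caller sits behind a proxy with its own hard timeout.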
Multi-server file ops — SSH + RAM mounts under one virtual filesystem for cross-server diff/read/exec. SSH writes default-deny.
Render before judging. DOM + a11y + screenshot, not curl. Default for any UX, render, audit, or redesign task.
Remove AI-writing tells. Detects/fixes inflated symbolism, em-dash overuse, rule-of-three, AI vocab. Run on every Gemini/GPT-drafted copy before publish.
Distinct production-grade UI. impeccable flags AI-slop tells (side-stripe borders, gradient text, glassmorphism, hero-metric clichés). Pair for design audit + polish.
Senior UI/UX engineer. Editorial typography, gapless bento grids, strict GSAP scroll triggers, massive section spacing. Default for any web dev (per CLAUDE.md taste-skill rule).
seo-audit · seo-technical · seo-content · seo-geo · seo-google · seo-local · seo-maps · seo-schema · seo-sitemap · seo-cluster · seo-drift · seo-firecrawl · seo-rotation · seo-sxo · seo-image-gen · seo-programmatic · seo-competitor-pages · seo-ecommerce · seo-plan · seo-dataforseo · seo-page · seo-backlinks
Required for ALL image generation. Nano Banana (Gemini CLI) for blog images, thumbnails, icons, diagrams, illustrations, photos.
Query + manage Google NotebookLM via CLI. Master notebook has 38 sources. Vault-to-Master sync after every session.
Mandatory after any frontend edit — screenshot via Playwright, Read the PNG before calling work done. Per CLAUDE.md rule #4.
Run after ANY code edit, deployment, or fix. Requires running verification commands and confirming output before claiming completion.
66 DESIGN.md files from Stripe, Figma, Apple, PlayStation, WIRED — VoltAgent/awesome-design-md
Codex CLI second opinion — gpt-4.1-mini, invoke when stuck >2 attempts or need architecture review
Semantic search across all 40+ projects via RAG-Anything + OpenAI embeddings
Weekly audit: broken symlinks, GitHub freshness, usage stats, duplicate detection
fastapi · python · laravel · php · sql · deployment · devops · security · code-reviewer + more — VoltAgent
Session lifecycle — /start reads Memory+Obsidian+NotebookLM, /wrap commits+pushes+updates Obsidian
305 skills installed (~/.claude/skills/). Includes humanizer, frontend-design, impeccable, taste-skill, full 22-skill SEO suite, cc-nano-banana, notebooklm, visual-review, verify-my-work. Plus plugin marketplaces (caveman, obra-superpowers, interface-design, codex, memsearch).
ByteDance text-to-video via fal.ai — Pro (1080p) + Lite (720p). Live now.
251 audit rules across 20 categories — @seomator engine. CLI: seomator audit <url>
47 AI citability criteria — llms.txt, AI bot rules, passage-level optimization for ChatGPT/Perplexity/Gemini
CORE-EEAT (80-item) + CITE (40-item) frameworks. 12 commands: audit, optimize, schema, keywords, alerts
Evaluated for purchase ($162/yr). 1T params (42B active MoE), 1M ctx, 1000+ tool call coherence. Declined — ollama_deepseek_pro on Ollama Pro covers the frontier slot (1.6T/49B active, 1M ctx, 3 thinking modes) at zero marginal cost. Reassess after 30-day delta check.
Text-to-speech with emotion control, voice cloning from 30s audio sample, and voice design from text description. 24kHz, multilingual. Free during open beta.
Full-modal base model: native text, image, audio, video understanding. 1M context, 131K max output. Pro-level agentic at half the cost ($0.40/M input).
~80 free AI models via OpenAI-compatible API. MiniMax M2.7, DeepSeek V3.2, GLM 5.1, Kimi K2.5, Sarvam-M, GPT-OSS 120B + nvidia_ask (any model)
Semantic vector search across 135 curated memory files (was 101). ONNX bge-m3 embeddings (local, free). Auto-captures sessions via hooks.
reorganize-memory.sh (audit: orphans, empties, oversized files) + sync-memsearch.sh (copy curated memories to MemSearch, re-index)
4 behavioral principles from Andrej Karpathy: Think Before Coding, Simplicity First, Surgical Changes, Goal-Driven Execution
Semantic across vault + master notebook. 38-source master. CLI authenticated (Mac + Fedora). Vault → project notebook → Master after every session.
~/Documents/greg-claude/ — Projects, Servers, Sessions, NotebookLM. Per-project file pattern. Check FIRST before asking.
Interactive reporting dashboard for CrawlHound — grade history, scan trends, site-wide metrics visualization
Downloadable PDF audit reports for CrawlHound, MrBotsworth, and gjapp — branded, shareable
Automated build + test + deploy pipeline on git.myseodesk.com — lint, SEO checks, rsync deploy on push
Subagent-heavy sessions. Top spawns: backend-developer, frontend-developer, code-reviewer — all have free MCP twins.
Usage at >150k context. Long mixed-topic sessions, no /clear or /compact between unrelated tasks.
Skill invocation share. The 100+ installed skills (visual-review, redesign, audits) barely fire — same work runs through paid subagents instead.
Fires on every Agent tool call. Drift list (6 named agents): warns ≤65% context used, blocks >65%. code-reviewer + general-purpose in ALWAYS_BLOCK. Whitelist (Explore/Plan/gsd/audit/seo/feature-dev) silent. Override: [paid-subagent-OK] token logs + allows.
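A minimal sketch of the gate's decision order, assuming the six drift agents are the ones in the delegation table, that the whitelist matches by name prefix, and that the override token is checked first. The function name, return values, and evaluation order are illustrative guesses, not the hook's actual implementation.

```python
# Assumed to be the six "drift" agents from the delegation table.
DRIFT_AGENTS = {
    "backend-developer", "frontend-developer", "fastapi-developer",
    "code-reviewer", "security-auditor", "database-administrator",
}
ALWAYS_BLOCK = {"code-reviewer", "general-purpose"}
# Assumption: whitelist entries match as case-insensitive name prefixes.
WHITELIST_PREFIXES = ("explore", "plan", "gsd", "audit", "seo", "feature-dev")
OVERRIDE_TOKEN = "[paid-subagent-OK]"
CONTEXT_LIMIT_PCT = 65


def gate(agent: str, context_pct: float, prompt: str) -> str:
    """Return 'allow', 'allow-logged', 'warn', or 'block' for an Agent call."""
    if OVERRIDE_TOKEN in prompt:
        return "allow-logged"          # explicit override: log and let through
    if agent.lower().startswith(WHITELIST_PREFIXES):
        return "allow"                 # whitelisted agents pass silently
    if agent in ALWAYS_BLOCK:
        return "block"                 # blocked regardless of context usage
    if agent in DRIFT_AGENTS:
        return "warn" if context_pct <= CONTEXT_LIMIT_PCT else "block"
    return "allow"


print(gate("backend-developer", 40, "add an endpoint"))   # warn
print(gate("code-reviewer", 40, "[paid-subagent-OK] review this"))  # allow-logged
```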
Three triggers: session start, topic-switch keywords, post-/compact detection (transcript msg-count drop >50%). Injects ~250 tokens of routing rules from ~/.claude/rules/delegation-cheatsheet.md.
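The three triggers could be checked along these lines. The function names and the phrase list are assumptions for illustration (the phrases are borrowed from the topic-switch hook's description); only the ">50% message-count drop" rule is taken from the text.

```python
from typing import Optional

# Assumed phrase list; the real keyword set may differ.
TOPIC_SWITCH_PHRASES = ("okay new task", "switching to", "moving on")


def looks_compacted(prev_count: int, curr_count: int, threshold: float = 0.5) -> bool:
    """True when the transcript lost more than `threshold` of its messages."""
    if prev_count <= 0:
        return False
    drop = (prev_count - curr_count) / prev_count
    return drop > threshold


def injection_reason(is_session_start: bool, prompt: str,
                     prev_count: int, curr_count: int) -> Optional[str]:
    """Return which trigger fired, or None if the routing rules stay out."""
    if is_session_start:
        return "session-start"
    if any(phrase in prompt.lower() for phrase in TOPIC_SWITCH_PHRASES):
        return "topic-switch"
    if looks_compacted(prev_count, curr_count):
        return "post-compact"
    return None
```

For example, a transcript that shrinks from 100 messages to 40 is a 60% drop and counts as a compaction; a drop to exactly 50 does not.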
Detects topic-switch keywords (“okay new task”, “switching to”, “moving on”) and appends a /clear reminder. 10-message debounce so it doesn’t fire on every consecutive switch.
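One way to read the 10-message debounce is "after a nudge, stay quiet until 10 more messages have gone by". The sketch below assumes that reading; the class name and reminder text are hypothetical.

```python
from typing import Optional

PHRASES = ("okay new task", "switching to", "moving on")
REMINDER = "Topic switch detected: consider running /clear first."  # hypothetical wording


class TopicSwitchNudge:
    def __init__(self, debounce: int = 10):
        self.debounce = debounce
        self.since_last: Optional[int] = None  # None means no nudge sent yet

    def on_message(self, text: str) -> Optional[str]:
        """Return the reminder if this message should trigger a nudge."""
        if self.since_last is not None:
            self.since_last += 1
        if not any(phrase in text.lower() for phrase in PHRASES):
            return None
        # Suppress repeat nudges inside the debounce window.
        if self.since_last is not None and self.since_last < self.debounce:
            return None
        self.since_last = 0
        return REMINDER
```

Counting messages rather than seconds keeps the behavior independent of how fast the session is moving.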
Highest-leverage hook. 26 keyword triggers in ~/.claude/rules/skill-triggers.json (regex, priority-ranked). On match, injects an EXTREMELY_IMPORTANT system-reminder naming the skill — same mechanism as using-superpowers.
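A rough sketch of priority-ranked regex matching. The schema of skill-triggers.json is not shown in this document, so the entry fields, the sample patterns, the "lower number wins" ordering, and the reminder wording are all assumptions; only the skill names come from the installed-skills list.

```python
import re
from typing import Optional

# Hypothetical entries; the real file has 26 triggers in an unknown schema.
TRIGGERS = [
    {"pattern": r"\bseo audit\b", "skill": "seo-audit", "priority": 10},
    {"pattern": r"\b(screenshot|visual check)\b", "skill": "visual-review", "priority": 20},
    {"pattern": r"\b(humanize|ai[- ]sounding)\b", "skill": "humanizer", "priority": 30},
]


def match_skill(prompt: str, triggers=TRIGGERS) -> Optional[str]:
    """Return the skill of the best-ranked matching trigger, if any."""
    hits = [t for t in triggers if re.search(t["pattern"], prompt, re.IGNORECASE)]
    if not hits:
        return None
    # Assumption: a lower priority number means a higher rank.
    return min(hits, key=lambda t: t["priority"])["skill"]


def reminder_for(prompt: str) -> Optional[str]:
    """Build the injected reminder text (wording is illustrative)."""
    skill = match_skill(prompt)
    if skill is None:
        return None
    return f"EXTREMELY_IMPORTANT: use the {skill} skill for this request."


print(match_skill("Run an SEO audit and grab a screenshot"))  # seo-audit
```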
Enforces the “render before judging” rule. When a prompt looks like UX / design / audit / redesign work, injects a reminder to use mcp__chrome-devtools__new_page + take_snapshot + take_screenshot from the main session — not curl, not WebFetch, not subagent.
| drift agent | free mcp route | when in-agent is justified |
|---|---|---|
| backend-developer | ollama_devstral, opencode_gpt54_pro | 5+ files coordinated, or hard arch |
| frontend-developer | ollama_kimi (HTML), ollama_code (review) | multi-framework full-stack work |
| fastapi-developer | opencode_gpt54_pro, ollama_deepseek_pro | complex async patterns + tests |
| code-reviewer | HARD-BLOCKED. Chain: ollama_deepseek_pro (DEFAULT) → ollama_code → fast_gpt_oss | [paid-subagent-OK] override only |
| security-auditor | /security-auditor skill + /api-keys skill | whole-system audit (5+ services) |
| database-administrator | /sql-pro skill + mysql/postgres MCP | HA/replication infra, not just queries |
Apply when work is non-trivial: >50 LOC change, security-sensitive, or first-time deploy. Review prompts are short, so Pro and free-tier rate limits go further than they do on bulk generation.
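The rule above and the reviewer order from the code-reviewer row can be expressed as a small sketch. The function names and the `reviewers` mapping are hypothetical; it also assumes each reviewer in the chain runs as a separate pass rather than as a fallback for the previous one.

```python
from typing import Callable, Dict, List

# Reviewer order taken from the delegation table's code-reviewer row.
CHAIN = ["ollama_deepseek_pro", "ollama_code", "fast_gpt_oss"]


def needs_chain(loc_changed: int, security_sensitive: bool = False,
                first_deploy: bool = False) -> bool:
    """Apply the non-trivial-work rule: >50 LOC, security-sensitive, or first deploy."""
    return loc_changed > 50 or security_sensitive or first_deploy


def run_chain(diff: str,
              reviewers: Dict[str, Callable[[str], List[str]]]) -> Dict[str, List[str]]:
    """Run each available reviewer in chain order and collect its findings."""
    findings: Dict[str, List[str]] = {}
    for name in CHAIN:
        review = reviewers.get(name)
        if review is None:
            continue  # reviewer not configured; skip to the next pass
        findings[name] = review(diff)
    return findings
```

A 30-line change to an auth handler would still go through the chain, because the security flag alone is enough to trigger it.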
| Plan Type | Lead Model | ~Tokens | ~Cost | Notes |
|---|---|---|---|---|
| Quick Ask | Ollama Pro (Nemotron/Chat) | 1–3K | $0 (Pro) | ollama_nemotron / ollama_chat — Tier 1 |
| Bug Fix | Sonnet 4.6 | 5–15K | ~$0.05–0.15 | Read file → diagnose → patch → verify |
| Single Feature | Opus 4.7 + Sonnet | 20–60K | ~$0.50–2 | Plan → subagent execute → codex:review |
| Static Page Build | Opus + Kimi K2.6 | 30–80K | ~$0.50–1.50 | DESIGN.md → /stitch → Playwright check → deploy |
| Astro SSR Feature | Opus + GPT 5.3 Spark | 50–120K | ~$1–3 | /astro-ssr skill + full pre-deploy gate |
| Laravel API | Opus + GPT 5.4 Pro | 60–150K | ~$1.50–4 | Models + migrations + tests + PHPStan + review |
| Full Site (5–10 pages) | Opus + mix | 150–400K | ~$4–12 | Full 6-stage pipeline: PLAN→DESIGN→BUILD→SEO→CRO→QA |
| SEO Audit | Gemini 3.1 Flash | 10–30K | ~$0.10–0.30 | Schema + compliance + content; mostly Gemini |
- → Use Ollama Pro (Tier 1) for drafts, analysis, code review
- → Use Kimi K2.6 (Pro) for bulk HTML generation
- → Use Gemini 3.1 Flash for research/content
- → Groq/NVIDIA = fallback only (Tier 3)
- → /compact regularly in long sessions
- → Each file read adds ~1–5K tokens
- → Large HTML files: 10–30K tokens each
- → Long conversations drift — use /clear or new session
- → Subagents get fresh context — use them for big files
- → CLAUDE.md loaded every session (~5K tokens)
- → Stuck after 2 attempts → /codex:rescue
- → Complex architecture → Opus (not Sonnet)
- → Critical prod deploy → Gemini 3.1 Pro review
- → Adversarial review finds high-severity (HI) bugs → GPT 5.4 Pro fix
- → Free model fails tool calls → switch to Sonnet
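The context figures in the notes above (about 5K for CLAUDE.md, 1–5K per file read, 10–30K per large HTML file) allow a quick back-of-envelope range. The function below is only an illustrative calculator built on those rough numbers, and the 150K comparison point is borrowed from the usage finding earlier in this page.

```python
from typing import Tuple

CLAUDE_MD_TOKENS = 5_000          # loaded every session
FILE_READ_RANGE = (1_000, 5_000)  # per ordinary file read
LARGE_HTML_RANGE = (10_000, 30_000)
HEAVY_CONTEXT = 150_000           # sessions above this were flagged as costly


def estimate_context(file_reads: int = 0, large_html_files: int = 0) -> Tuple[int, int]:
    """Return a (low, high) token estimate for the session's loaded context."""
    low = CLAUDE_MD_TOKENS + file_reads * FILE_READ_RANGE[0] \
        + large_html_files * LARGE_HTML_RANGE[0]
    high = CLAUDE_MD_TOKENS + file_reads * FILE_READ_RANGE[1] \
        + large_html_files * LARGE_HTML_RANGE[1]
    return low, high


low, high = estimate_context(file_reads=10, large_html_files=3)
print(low, high)  # 45000 145000
```

Ten file reads plus three large HTML files can already approach the 150K mark at the high end, which is the point at which /compact or a subagent with fresh context starts to pay off.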
| Role | Model | MCP Server | Use Case | Cost |
|---|---|---|---|---|
| Planner | Opus 4.7 | native | Reasoning, architecture, strategy | paid |
| Lead | Sonnet 4.6 / 4.7 | native | Executes tasks, reviews, deploys | paid |
| Frontier reviewer | DeepSeek V4 Pro 1.6T | ollama-pro | Default first-pass code review (post-streaming-fix 2026-05-12). 1M ctx, off/low/med/high thinking modes. | PRO |
| Research | Gemini 3.1 Flash | gemini-research | Research, comparisons, blog posts, FAQs | paid |
| Reasoning | Gemini 3.1 Pro | gemini-research | Deep analysis, architecture, complex reasoning | paid |
| Frontier | Gemini 3.1 Pro | gemini-research | Hardest problems, frontier intelligence | paid |
| Fast 8B | Llama 3.1 8B Instant | groq-fast | Sub-second 8B — gates, autocomplete, micro-tasks | FREE |
| Fast 70B | Llama 3.3 70B Versatile | fast_code | Workhorse fast chat — speed-tier fallback | FREE |
| Preview | Llama 4 Scout 17B/16E | fast_llama4 | Llama 4 MoE — preview tier per Groq, opportunistic only | FREE |
| Cross-arch | GPT OSS 120B | fast_gpt_oss | Verification chain second pass — only production-tier cross-arch on Groq | FREE |
| STT | Whisper Large v3 Turbo | groq-fast | Speech-to-text — 400 req/min, 4M sec/day | FREE |
| Backend | GPT 5.4 Pro | opencode | Hardest backend, algorithms, APIs | paid |
| Code | GPT 5.3 Codex Spark | opencode | Fast code generation, everyday tasks | paid |
| Reason | Kimi K2 Thinking | opencode | Complex logic, multi-step planning | paid |
| Code | Trinity Large Preview | opencode | General coding and reasoning | paid |
| HTML | Kimi K2.6 | ollama_kimi | Bulk HTML, multimodal, agent swarm | PRO |
| Content | Nemotron 3 Super 120B | ollama_nemotron | Agentic reasoning, content | PRO |
| Ollama Pro | Qwen3-Coder 480B | ollama_code | Code tasks, large code reviews | PRO |
| Ollama Pro | Qwen3.5 397B | ollama_chat | Reasoning, writing, analysis, vision, thinking | PRO |
| Ollama Pro | Mistral Large 3 675B | ollama_mistral | Complex reasoning, large documents, vision | PRO |
| Ollama Pro | DeepSeek V4 Flash 158B | ollama_deepseek | Deep reasoning, thinking, agentic tasks | PRO |
| Ollama Pro | Devstral 2 123B | ollama_devstral | SWE coding, multi-file editing (72.2% SWE-Bench) | PRO |
| Ollama Pro | GLM 5.1 | ollama_glm | Agentic coding, sustained over 100s of rounds | PRO |
| MiMo | MiMo-V2.5-Pro | mimo | Frontier reasoning, 1T/42B MoE, 1M ctx, 1000+ tool calls | $1 in / $3 out per M |
| MiMo | MiMo-V2.5 | mimo | Full-modal (text+image+audio+video), 1M ctx | $0.40 in / $2 out per M |
| MiMo | MiMo-V2.5-TTS | mimo | Text-to-speech, voice clone, voice design | FREE |
| NVIDIA · archived | MiniMax M2.7 | nvidia-nim | Large MoE reasoning + code (exclusive to NVIDIA) | FREE |
| NVIDIA · archived | DeepSeek V3.2 | nvidia-nim | 671B MoE reasoning (Tier 3 fallback) | FREE |
| NVIDIA · archived | GLM 5.1 | nvidia-nim | Agentic coding (Tier 3 fallback) | FREE |
| NVIDIA · archived | Kimi K2.5 | nvidia-nim | Moonshot reasoning (Tier 3 fallback) | FREE |
| NVIDIA · archived | Sarvam-M | nvidia-nim | Indian multilingual (exclusive to NVIDIA) | FREE |
| NVIDIA · archived | GPT-OSS 120B | nvidia-nim | OpenAI OSS (Tier 3 fallback) | FREE |
+ 5 hooks (agent-spawn-gate, delegation-table-injector, topic-switch-nudge, skill-trigger-injector, chrome-devtools-reminder) · 135 memory files in MemSearch · 38-source NotebookLM master · Obsidian vault per-project files