System State + Fix Sequence
Auth Crisis — Current State
| Provider | Status | Error | Fix |
|---|---|---|---|
| xai-oauth-oauth-3 | EXPIRED | Refresh token revoked: invalid_grant | Re-run hermes auth xai PKCE flow |
| openai-codex-oauth-1 | EXHAUSTED | Usage limit reached — resets 2026-07-21 | Wait OR add second ChatGPT Plus account |
| GOOGLE_API_KEY | FREE TIER EXHAUSTED | Free tier 429 — 250 req/day limit hit | Enable billing on GCP project, regen key |
| OPENAI_API_KEY | PERMISSION ERROR | 401: missing api.responses.write scope | Regenerate with correct permissions |
| copilot (gh) | OK | No errors | Available as fallback — limited model selection |
| nous | EMPTY POOL | No credentials in pool | Current active_provider fallback — stepfun free model only |
active_provider is set to nous, whose credential pool is empty, so all profiles fall back to stepfun/step-3.7-flash:free via Nous. That is why sessions feel degraded: the primary models on all profiles are currently unreachable and the system is running on a free-tier fallback.
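The status table can be read as data to make the point concrete. This is a minimal sketch: the dictionary and function names are illustrative assumptions, not Hermes internals, and the statuses are copied from the table above.

```python
# Provider statuses copied from the table above. The data structure and
# function name are illustrative assumptions, not Hermes internals.
PROVIDER_STATUS = {
    "xai-oauth-oauth-3": "EXPIRED",
    "openai-codex-oauth-1": "EXHAUSTED",
    "GOOGLE_API_KEY": "FREE TIER EXHAUSTED",
    "OPENAI_API_KEY": "PERMISSION ERROR",
    "copilot (gh)": "OK",
    "nous": "EMPTY POOL",
}


def usable_providers(status_by_provider):
    """Return the providers whose status is exactly "OK", in table order."""
    return [name for name, status in status_by_provider.items() if status == "OK"]


if __name__ == "__main__":
    print(usable_providers(PROVIDER_STATUS))
```

Only the copilot entry passes the filter, which matches the picture above: every primary provider is unreachable until the fixes below are applied.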
What Is Actually Working
- Telegram gateway — connected, token valid, home channel active
- xAI image generation — plugin enabled, xai image_gen working
- Gemini auxiliary models — vision, compression, curator, kanban_decomposer all on Gemini API key and functional (these use a separate quota from the main generate_content endpoint)
- Copilot auth — gh token active, available as a fallback model source
- HONCHO_API_KEY — wired in .env, available as memory alternative to Hindsight
- Kanban dispatch config: dispatch_in_gateway is true, infrastructure exists
- All SOUL.md files: every profile has a real, functional SOUL with good content
- All MEMORY.md files — consistent project context across all profiles (stickman visual style, 6-beat arc, Vorra Story Engine)
- content/bin/tirith — binary exists, just not on PATH
- Composio MCP URL — endpoint is correct, npm package is the only blocker
Provider Map — Current vs Target
| Profile | Current Model | Current Status | Target Model | Target Provider |
|---|---|---|---|---|
| hermes-admin | grok-build-0.1 | xAI token expired | grok-4 | xai-oauth (after re-auth) |
| coder | gpt-5.5 | Codex exhausted until Jul 21 | gpt-5.5 | openai-codex (when reset) / copilot interim |
| server-ops | gpt-5.5 | Codex exhausted until Jul 21 | gpt-5.5 | openai-codex (when reset) / copilot interim |
| researcher | grok-build-0.1 | xAI token expired | gemini-2.5-pro | gemini (after billing) |
| vault | grok-build-0.1 | xAI token expired | gemini-2.5-flash | gemini (after billing) |
| content | grok-build-0.1 | xAI token expired | grok-4 | xai-oauth (rename to content-story) |
| comms-gemini | gemini-3.1-flash-lite | Free tier exhausted | gemini-2.5-flash | gemini (after billing) |
Re-authenticate xAI Token
The xai-oauth-oauth-3 refresh token was revoked on June 22. This kills four profiles at once. Re-authenticating restores the majority of the fleet before any other fix.
# Re-auth xAI PKCE flow
hermes auth xai
# Verify after auth
hermes auth status
# Confirm token is active in pool
python3 -c "import json; d=json.load(open('$HOME/.hermes/auth.json')); [print(t.get('label'), t.get('last_status')) for t in d['credential_pool'].get('xai-oauth',[])]"
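The pool check can also be written as a short script, which is easier to reuse for other providers. This sketch assumes only what the one-liner already assumes about auth.json (a credential_pool mapping of provider name to a list of entries with label and last_status); the function name is hypothetical.

```python
import json
from pathlib import Path


def pool_status(auth, provider="xai-oauth"):
    """Return (label, last_status) pairs for one provider's credential pool."""
    entries = auth.get("credential_pool", {}).get(provider, [])
    return [(entry.get("label"), entry.get("last_status")) for entry in entries]


if __name__ == "__main__":
    auth_path = Path.home() / ".hermes" / "auth.json"
    if auth_path.exists():
        for label, status in pool_status(json.loads(auth_path.read_text())):
            print(label, status)
    else:
        print(f"{auth_path} not found")
```

An empty result for a provider means its pool has no entries, which is the nous situation described earlier.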
Smoke test: hermes run -p hermes-admin "ping". If it responds without falling back to stepfun, the token is good.
Skill Bloat: Whitelist Every Profile
Right now every profile — including hermes-admin and server-ops — loads apple/imessage, gaming/pokemon-player, mlops/inference/vllm, touchdesigner-mcp, baoyu-comic, research paper templates, and dozens more skills it will never use. Each loaded skill adds tokens to the system prompt on every session.
The fix is adding a skills.include block to each profile's config.yaml. Only listed skills load into the system prompt. Everything else in the skills directory is ignored at runtime.
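The effect of a whitelist can be sketched in a few lines. This is an illustration of the idea only: how Hermes actually resolves skills.include is not shown in this document, so exact-name matching and the function name are assumptions.

```python
def filter_skills(available, include):
    """Keep only the available skills that appear in the include list."""
    allowed = set(include)
    return [skill for skill in available if skill in allowed]


# Skill names taken from this document; the "available" list is a small sample.
AVAILABLE = [
    "apple/imessage",
    "devops/kanban-worker",
    "gaming/pokemon-player",
    "software-development/plan",
]
ADMIN_INCLUDE = [
    "devops/kanban-orchestrator",
    "devops/kanban-worker",
    "autonomous-ai-agents/hermes-agent",
    "software-development/plan",
    "software-development/spike",
]

if __name__ == "__main__":
    print(filter_skills(AVAILABLE, ADMIN_INCLUDE))
```

Under these assumptions, apple/imessage and gaming/pokemon-player never reach the system prompt, and listed skills that are not installed are simply absent.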
Target whitelist per profile
# ~/.hermes/profiles/hermes-admin/config.yaml (add this block)
skills:
  include:
    - devops/kanban-orchestrator
    - devops/kanban-worker
    - autonomous-ai-agents/hermes-agent
    - software-development/plan
    - software-development/spike

# ~/.hermes/profiles/coder/config.yaml
skills:
  include:
    - devops/kanban-worker
    - autonomous-ai-agents/codex
    - autonomous-ai-agents/hermes-agent
    - github/github-pr-workflow
    - github/github-issues
    - github/codebase-inspection
    - software-development/systematic-debugging
    - software-development/test-driven-development
    - software-development/plan
    - software-development/spike
    - software-development/subagent-driven-development
    - software-development/hermes-agent-skill-authoring

# ~/.hermes/profiles/server-ops/config.yaml
skills:
  include:
    - devops/kanban-worker
    - devops/exposing-local-demos
    - software-development/hermes-s6-container-supervision
    - software-development/debugging-hermes-tui-commands
    - software-development/systematic-debugging
    - mcp/native-mcp
    - autonomous-ai-agents/hermes-agent

# ~/.hermes/profiles/researcher/config.yaml
skills:
  include:
    - devops/kanban-worker
    - research/arxiv
    - research/blogwatcher
    - research/llm-wiki
    - notebooklm
    - youtube-channel-research
    - youtube-story-method-research
    - social-media/xurl

# ~/.hermes/profiles/vault/config.yaml
skills:
  include:
    - devops/kanban-worker
    - note-taking/obsidian
    - productivity/notion
    - productivity/nano-pdf
    - media/youtube-content
    - autonomous-ai-agents/hermes-agent

# ~/.hermes/profiles/content/config.yaml
skills:
  include:
    - devops/kanban-worker
    - dark-story-video-prompts
    - youtube-story-method-research
    - youtube-channel-research
    - notebooklm
    - creative/creative-ideation
    - creative/ascii-video
    - media/youtube-content
    - creative/manim-video

# ~/.hermes/profiles/comms-gemini/config.yaml
skills:
  include:
    - devops/kanban-worker
    - email/himalaya
    - productivity/google-workspace
    - research/blogwatcher
    - social-media/xurl
    - note-taking/obsidian
Place the skills.include block at the same indent level as model: and agent: (top level of config.yaml). No per-profile restart is needed; the whitelist takes effect on the next session start.
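Because a wrongly nested block would be easy to miss, a quick placement check may help. The sketch below is a deliberately naive line scan rather than a real YAML parse, and the function name is hypothetical; it only confirms that a key starts at column 0.

```python
def has_top_level_key(yaml_text, key):
    """True if `key:` starts a line at column 0, i.e. is a top-level YAML key."""
    prefix = key + ":"
    return any(line.startswith(prefix) for line in yaml_text.splitlines())


GOOD = "model:\n  default: grok-4\nskills:\n  include:\n    - devops/kanban-worker\n"
NESTED = "model:\n  default: grok-4\n  skills:\n    include: []\n"

if __name__ == "__main__":
    print(has_top_level_key(GOOD, "skills"), has_top_level_key(NESTED, "skills"))
```

A block indented under model: fails the check, which is the mistake the indentation note is warning about.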
Enable Gemini Billing
The GOOGLE_API_KEY is on the free tier — 250 requests/day, and it is already exhausted. The auxiliary models (vision, compression, curator, kanban_decomposer) currently work because they use different endpoints, but will start failing under real load. Enabling billing is the unlock for moving researcher and vault to Gemini.
# 1. Go to https://aistudio.google.com/apikey
# 2. Enable billing on your GCP project
# 3. Generate new key under billing-enabled project
# 4. Update .env on Hermes machine
nano /home/hermes/.hermes/.env
# Find: GOOGLE_API_KEY=AIzaSyB_Xpx...
# Replace with new key
# 5. Verify
hermes run -p comms-gemini "test"
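If editing .env by hand feels error-prone, step 4 can be scripted. This is a sketch under the assumption that .env holds plain KEY=value lines; the function name is hypothetical, and it works on text so the result can be inspected before anything is written back.

```python
def set_env_value(env_text, key, value):
    """Return env_text with `key=...` replaced, or appended if the key is missing."""
    lines = env_text.splitlines()
    prefix = key + "="
    replaced = False
    for index, line in enumerate(lines):
        if line.startswith(prefix):
            lines[index] = prefix + value
            replaced = True
    if not replaced:
        lines.append(prefix + value)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    before = "HONCHO_API_KEY=abc\nGOOGLE_API_KEY=old-key\n"
    print(set_env_value(before, "GOOGLE_API_KEY", "new-key"), end="")
```

Other keys such as HONCHO_API_KEY pass through unchanged. Back up the real .env before overwriting it.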
Fix Composio MCP
The root config has the right endpoint: https://connect.composio.dev/mcp. The npm package @composio/composio-mcp-server is getting a 404 because Composio restructured their npm packages. The server-ops profile owns this fix.
# Option A: Try the current package name (may have been renamed)
npm list -g @composio/mcp 2>/dev/null
npm install -g @composio/mcp 2>/dev/null
# Option B: Switch to HTTP transport directly (no npm needed)
# Edit ~/.hermes/config.yaml mcp_servers.composio section:
# The URL https://connect.composio.dev/mcp already works as HTTP MCP
# Change transport from npm to http if hermes supports it
# Option C: Check Composio docs for current install method
# https://docs.composio.dev/mcp
# After fix, verify in mcp-stderr.log
tail -50 /home/hermes/.hermes/profiles/hermes-admin/logs/mcp-stderr.log
Fix Tirith Binary Path
# The binary exists here:
ls -la /home/hermes/.hermes/profiles/content/bin/tirith
# Option A: Symlink to system PATH
sudo ln -s /home/hermes/.hermes/profiles/content/bin/tirith /usr/local/bin/tirith
# Option B: Set absolute path in root config.yaml
# Find the security section and update tirith_path:
# security:
#   tirith_enabled: true
#   tirith_path: /home/hermes/.hermes/profiles/content/bin/tirith
# Verify
tirith --version
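Either option can be verified the same way: look on PATH first, then fall back to the known absolute location. A minimal sketch using the standard library; the helper name is hypothetical and the fallback path is the one given above.

```python
import os
import shutil


def resolve_binary(name, fallback_path):
    """Return the PATH hit if any, else fallback_path if it is an
    executable file, else None."""
    found = shutil.which(name)
    if found:
        return found
    if os.path.isfile(fallback_path) and os.access(fallback_path, os.X_OK):
        return fallback_path
    return None


if __name__ == "__main__":
    print(resolve_binary("tirith", "/home/hermes/.hermes/profiles/content/bin/tirith"))
```

A result under /usr/local/bin means Option A took effect. The profile path means the binary is still only reachable by absolute path, and None means it is missing or not executable.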
Build Out hermes-admin
The hermes-admin SOUL.md is actually solid — it has the right orchestrator-only language, the truth hierarchy, the routing rules. What it is missing is the memories/ directory (it is the only profile without one) and the references folder with channel-map, kanban-lanes, and approval-workflows.
# Create missing memories directory
mkdir -p /home/hermes/.hermes/profiles/hermes-admin/memories
touch /home/hermes/.hermes/profiles/hermes-admin/memories/MEMORY.md
touch /home/hermes/.hermes/profiles/hermes-admin/memories/USER.md
# Create references directory for operational docs
mkdir -p /home/hermes/.hermes/profiles/hermes-admin/references
# Copy MEMORY.md content from another profile as baseline
cp /home/hermes/.hermes/profiles/vault/memories/MEMORY.md \
  /home/hermes/.hermes/profiles/hermes-admin/memories/MEMORY.md
# Add channel routing rules to SOUL.md (append to existing)
cat >> /home/hermes/.hermes/profiles/hermes-admin/SOUL.md << 'EOF'

## Channel Routing
| Discord Channel | Routes To | Task Type |
|----------------|-----------|-----------|
| #ops | hermes-admin | routing, daily brief, health |
| #research | researcher | competitor intel, sources |
| #vault | vault | memory, GBrain queries |
| #story | content | scripts, 6-beat arcs |
| #server | server-ops | infra alerts, cron |
| #code | coder | code changes, scripts |
| #review | hermes-admin | Approve/Tweak/Decline |

Override: prefix any message with @[profile-name] to bypass channel default.

## Kanban Rules
- hermes-admin is the sole entity that moves tasks from backlog to ready
- Every worker ends sessions with kanban_complete or kanban_block
- Only content outputs enter the review lane
- Tasks in review expire after 24h
EOF
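The routing table and override rule amount to a lookup with one exception, sketched below. Two things are assumptions rather than documented behavior: the override is read as a literal @profile-name at the start of the message (the brackets are treated as a placeholder), and unknown channels default to hermes-admin.

```python
import re

# Channel defaults copied from the routing table above.
CHANNEL_ROUTES = {
    "#ops": "hermes-admin",
    "#research": "researcher",
    "#vault": "vault",
    "#story": "content",
    "#server": "server-ops",
    "#code": "coder",
    "#review": "hermes-admin",
}

# Assumption: the override is "@profile-name" at the very start of the message.
OVERRIDE = re.compile(r"^@([a-z0-9-]+)\b")


def route(channel, message, known_profiles=None):
    """Pick the target profile: a valid @override wins, else the channel default."""
    known = set(known_profiles or CHANNEL_ROUTES.values())
    match = OVERRIDE.match(message.strip())
    if match and match.group(1) in known:
        return match.group(1)
    # Assumption: unmapped channels land on the orchestrator.
    return CHANNEL_ROUTES.get(channel, "hermes-admin")


if __name__ == "__main__":
    print(route("#research", "@coder fix the scraper script"))
```

An override naming an unknown profile is ignored and the channel default applies, so a typo does not drop the message.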
Migrate researcher + vault to Gemini
# researcher/config.yaml, change model section to:
model:
  default: gemini-2.5-pro
  provider: gemini
  base_url: https://generativelanguage.googleapis.com/v1beta
providers: {}
fallback_providers:
  - provider: gemini
    model: gemini-2.5-flash

# vault/config.yaml, change model section to:
model:
  default: gemini-2.5-flash
  provider: gemini
  base_url: https://generativelanguage.googleapis.com/v1beta
providers: {}
fallback_providers:
  - provider: gemini
    model: gemini-2.5-flash-lite

# comms-gemini/config.yaml, upgrade model:
model:
  default: gemini-2.5-flash
  provider: gemini
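To see what each migrated profile would try and in what order, the config can be flattened into a chain. This sketch assumes fallback_providers is a top-level list that is tried in order after model.default; that ordering is an assumption, since the document does not spell out the fallback behavior, and the function name is hypothetical.

```python
def model_chain(config):
    """Return (provider, model) pairs in the order they would be tried."""
    primary = config["model"]
    chain = [(primary["provider"], primary["default"])]
    for entry in config.get("fallback_providers", []):
        chain.append((entry["provider"], entry["model"]))
    return chain


# The researcher target config from above, as a plain dict.
RESEARCHER = {
    "model": {"default": "gemini-2.5-pro", "provider": "gemini"},
    "fallback_providers": [{"provider": "gemini", "model": "gemini-2.5-flash"}],
}

if __name__ == "__main__":
    for provider, model in model_chain(RESEARCHER):
        print(provider, model)
```

Both entries sit on the same gemini provider, so the fallback covers a model-level problem but not an exhausted key. That is one reason billing comes first.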
Split content into content-story + content-visual
# 1. Copy content profile to content-visual
cp -r /home/hermes/.hermes/profiles/content \
  /home/hermes/.hermes/profiles/content-visual
# 2. Rename original content to content-story
mv /home/hermes/.hermes/profiles/content \
  /home/hermes/.hermes/profiles/content-story
# 3. Update content-story SOUL.md: scope to scripts/story only
# 4. Write content-visual SOUL.md: scope to stickman/thumbnails/scene prompts
# 5. Apply separate skills.include to each (story gets notebooklm; visual does not)
# 6. Test one full 6-beat story generation cycle before retiring old content profile
SOUL.md Analysis — Keep vs Fix vs Rewrite
default SOUL.md
Best SOUL on the system. Has the right orchestrator framing, truth hierarchy, pushback rules, anti-sycophancy. The "in the same room with the door closed" voice is correct. Profanity guideline is exactly right. High-agency default is correct.
One addition needed: add the channel routing table and kanban lane rules (same as the hermes-admin fix above). The orchestration machinery is described but the Discord channel map is missing.
hermes-admin SOUL.md
Correct orchestrator-only framing. Explicitly says "You do NOT execute specialist work directly." Truth hierarchy is there. Routing rules exist. Missing the channel map and kanban lanes; add them per Fix 6 (Build Out hermes-admin).
Note: default and hermes-admin SOUL.md are nearly identical. That is fine: hermes-admin is the gateway profile, default is the CLI fallback. They should have the same voice and rules.
coder SOUL.md
Terse log-style voice is correct. The "show diffs, reference file:line numbers" ops discipline is exactly right. Anti-sycophancy rules are explicit. Self-improvement loop is there. The Codex-specific tooling reference is good.
One clarification needed: the SOUL already says that Claude Code is NOT installed; make that note more prominent, and state that opencode/Cursor CLI are the execution paths.
server-ops SOUL.md
Scope section is clean: owns the machine, runtime, config, "ugly glue." The "verify with direct health check before claiming fixed" rule is exactly right. Reversible-first ops discipline is correct.
Add: explicit ownership of Composio MCP repair, tirith path fix, and Gemini billing validation as standing maintenance tasks.
researcher SOUL.md
Evidence-first, source-citation discipline is correct. Confidence labeling (high/medium/low) is a good pattern. The "never say experts say without naming them" rule is exactly right. Anti-sycophancy is explicit.
Add one line: "Hand structured findings to vault for GBrain ingest. Do not write to GBrain directly." This enforces the ownership boundary.
vault SOUL.md
GBrain-first operations are correctly specified. The inbox-first triage before promoting to concepts/projects is correct. Typed pages with mcp_gbrain_put_page is the right pattern. Dream cycle ownership is good.
Add: explicit statement that researcher and all other profiles route ingest requests through vault. Vault is the sole GBrain writer.
content SOUL.md
Good voice and 6-beat framework. The second-person immersive POV spec is correct. The restriction on glorification is correct. The image gen guidance is reasonable.
Problem: this profile is doing both scriptwriting AND visual work. When split into content-story and content-visual, the story SOUL keeps the script/POV/6-beat content; the visual SOUL needs to be written fresh around the stickman system, thumbnail specs, and micro-clip packaging.
comms-gemini SOUL.md
The person-lookup-first protocol is correct and should be kept. The extensibility rule is good. The Gemini-takes-final-say-on-its-own-SOUL pattern is clever but creates drift risk.
Problem: the SOUL is too long and mixes identity rules with operating procedures. Trim to the 50-line core vibe described in the SOUL itself. The community-pattern-research section should move to MEMORY.md.
Target Profile Config Summary
| Profile | Model | Provider | max_turns | Key Changes |
|---|---|---|---|---|
| hermes-admin | grok-4 | xai-oauth | 60 | Add memories/, add references/, add skills.include |
| coder | gpt-5.5 | openai-codex | 150 | Add skills.include, interim copilot fallback |
| server-ops | gpt-5.5 | openai-codex | 90 | Add skills.include, interim copilot fallback |
| researcher | gemini-2.5-pro | gemini | 100 | Switch provider after billing, add skills.include |
| vault | gemini-2.5-flash | gemini | 80 | Switch provider after billing, add skills.include |
| content-story | grok-4 | xai-oauth | 150 | Rename from content, trim skills, new SOUL |
| content-visual | grok-4 | xai-oauth | 100 | New profile, new SOUL, visual-only skills |
| comms-gemini | gemini-2.5-flash | gemini | 90 | Upgrade model after billing, trim SOUL |
New SOUL Files
# Soul: content-visual

You are Content Visual, Dylan's stickman system and visual packaging engine. Your job is scene prompts, thumbnails, micro-clip visual specs, and image generation. You report to hermes-admin. You do not write scripts.

## Character System
Primary character: stickman with large round off-white/cream head, solid black oval eyes, plain ribbed beanie (NO text/logos/patches), thin black line limbs, minimal torso, oversized hoodie shape + flat plaid flannel. Thick clean outlines. Muted dark palette: charcoal, burgundy, teal, off-white, faded gray. This is non-negotiable. Never drift to detailed human designs.

## What you own
- Stickman frame consistency across episodes
- Scene prompt packs (3-5 prompts per beat)
- Thumbnail concept + variant generation
- Micro-clip visual density planning
- Image generation via image_gen tool
- Ensuring visual continuity with prior episode style

## What you do NOT own
- Scripts, hooks, or story structure (route to content-story)
- Research or source gathering (route to researcher)
- Vault writes or GBrain ingest (route to vault)
- Publishing or uploading (requires Dylan approval)

## Operations
Read the approved script brief before generating visuals. Every scene prompt must reference the character system above explicitly. Keep sensitive content implied or symbolic, never graphic. Score each thumbnail concept on click-worthiness before delivering. Deliver prompts as complete, copy-pasteable blocks.

## Voice
Terse. Visual-first. Reference specific beats and frame specs. "Scene 3 prompt: [exact prompt]. Thumbnail variant A: [exact prompt]." No narration. No filler. Show the deliverable.

## Restrictions
Never generate real person likenesses. Never glorify violence, drug use, or self-destruction in images. Never publish or upload without Dylan's approval. If content could be misread as instructional for harm, use symbolic framing.
# Additions to append to content-story/SOUL.md after rename
# (The existing content SOUL.md is good; just add this section)

## Split Profile Rules
You are content-story. You own scripts, hooks, and story structure only. Visual generation belongs to content-visual. Do not generate scene images. When a script is approved, hand the brief to content-visual for visual packaging.

Handoff format to content-visual:
- Approved 6-beat summary
- Key scene per beat (1 line each)
- Thumbnail concept (1 line)
- Palette note if the story has a specific visual tone

## NotebookLM Usage
NotebookLM is for video/research lane only. Do not use it as a general knowledge dump. Use it when: building a research notebook for a new story arc, synthesizing multiple scam source docs, or preparing a video corpus.
# comms-gemini SOUL.md (trimmed version)
# Replace the existing file with this

# Soul: comms-gemini

You are Comms, Dylan's outreach, email, and channel support specialist. You run on Gemini. You report to hermes-admin. You are a precision lane, not the brain.

## Voice
Blunt, direct, private. Lead with the draft or answer. No corporate sludge. No "Great question." No "Certainly." Evidence first. Synthesize, don't dump. Match Dylan's tone in drafts written for him.

## Core Rules
- Person lookup first: before any draft or action on a person, query GBrain, check vault/entities/people/[slug].md. Use full relationship context.
- Research before acting: web, firecrawl, X tools for background. Cite sources.
- New contacts go to vault + GBrain immediately.
- For new channel integrations (Telegram, Discord, WhatsApp, etc): research Hermes way, enable skill/playbook, update SOUL, document in vault, test.

## What you own
- Email drafts and outreach sequences
- Person context lookup and relationship notes
- Channel extension (adding new messaging platforms)
- Research-backed communication strategy

## What you do NOT own
- Content scripts (content-story)
- GBrain ingest (vault)
- System config or infra (server-ops)
- Final send without Dylan approval

## Restrictions
Never send, post, or message real people without explicit approval. Never publish or share anything without approval. Never spend money or change credentials.
Skills Whitelist — Quick Apply Script
#!/bin/bash
# Run as: bash apply-skill-whitelists.sh
# Appends skills.include to each profile config.yaml
# Safe to run multiple times (checks before appending)
PROFILES=/home/hermes/.hermes/profiles

patch_skills() {
  local profile=$1
  local config="$PROFILES/$profile/config.yaml"
  shift
  local skills=("$@")
  # Skip profiles without a config.yaml instead of creating a stray file
  if [ ! -f "$config" ]; then
    echo "MISSING $profile: $config not found"
    return
  fi
  if grep -q "^skills:" "$config" 2>/dev/null; then
    echo "SKIP $profile: skills block already exists"
    return
  fi
  # Indentation matters: include is nested under skills, items under include
  echo "" >> "$config"
  echo "skills:" >> "$config"
  echo "  include:" >> "$config"
  for skill in "${skills[@]}"; do
    echo "    - $skill" >> "$config"
  done
  echo "PATCHED $profile"
}

patch_skills hermes-admin \
  "devops/kanban-orchestrator" \
  "devops/kanban-worker" \
  "autonomous-ai-agents/hermes-agent" \
  "software-development/plan" \
  "software-development/spike"

patch_skills coder \
  "devops/kanban-worker" \
  "autonomous-ai-agents/codex" \
  "autonomous-ai-agents/hermes-agent" \
  "github/github-pr-workflow" \
  "github/github-issues" \
  "github/codebase-inspection" \
  "software-development/systematic-debugging" \
  "software-development/test-driven-development" \
  "software-development/plan" \
  "software-development/spike" \
  "software-development/subagent-driven-development" \
  "software-development/hermes-agent-skill-authoring"

patch_skills server-ops \
  "devops/kanban-worker" \
  "devops/exposing-local-demos" \
  "software-development/hermes-s6-container-supervision" \
  "software-development/debugging-hermes-tui-commands" \
  "software-development/systematic-debugging" \
  "mcp/native-mcp" \
  "autonomous-ai-agents/hermes-agent"

patch_skills researcher \
  "devops/kanban-worker" \
  "research/arxiv" \
  "research/blogwatcher" \
  "research/llm-wiki" \
  "notebooklm" \
  "youtube-channel-research" \
  "youtube-story-method-research" \
  "social-media/xurl"

patch_skills vault \
  "devops/kanban-worker" \
  "note-taking/obsidian" \
  "productivity/notion" \
  "productivity/nano-pdf" \
  "media/youtube-content" \
  "autonomous-ai-agents/hermes-agent"

patch_skills content \
  "devops/kanban-worker" \
  "dark-story-video-prompts" \
  "youtube-story-method-research" \
  "youtube-channel-research" \
  "notebooklm" \
  "creative/creative-ideation" \
  "creative/ascii-video" \
  "media/youtube-content" \
  "creative/manim-video"

patch_skills comms-gemini \
  "devops/kanban-worker" \
  "email/himalaya" \
  "productivity/google-workspace" \
  "research/blogwatcher" \
  "social-media/xurl" \
  "note-taking/obsidian"

echo ""
echo "Done. Restart Hermes gateway for changes to take effect."
echo "hermes gateway restart"
To apply: run nano ~/apply-skill-whitelists.sh, paste the script above, save, then run chmod +x ~/apply-skill-whitelists.sh && bash ~/apply-skill-whitelists.sh
Fix Sequence
- Re-authenticate xAI token: Single command. Restores 4 profiles immediately. Do this first.
- Whitelist skills on every profile: Run the patch script. Highest-leverage change on the system. No per-profile restart is needed; restart the gateway once at the end.
- Enable Gemini billing: 5 minutes at aistudio.google.com/apikey. Unlocks the researcher and vault migrations and removes auxiliary model quota risk.
- Fix Composio MCP and Tirith binary path: Both are single-step fixes owned by server-ops. Composio: try package rename or HTTP transport. Tirith: symlink binary.
- Build out hermes-admin: mkdir + cp + append to SOUL.md. 5 minutes.
- Migrate researcher + vault to Gemini: Edit two config.yaml files. Auth already works from auxiliary models.
- Split content into content-story + content-visual: cp -r + mv + two SOUL.md files. Run one full story+visual cycle to verify before retiring old content profile.
The foundation is good. SOULs are strong. The memory content is consistent across profiles. The vault SOUL and the default DISPATCH SOUL are both high quality. The problems are auth expiry, skill bloat, and missing infrastructure scaffolding: all mechanical, all fixable. After Fix 1 (xAI re-auth) and Fix 2 (skill whitelists), the system runs noticeably better immediately.
Verify xAI token concurrent session behavior before running content-story and content-visual simultaneously. Test both profiles at the same time and confirm neither drops to stepfun fallback. If they do, you need a second xAI token or sequential scheduling for the two content profiles.
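The concurrency check can be scripted so both profiles really do start at the same time. This is a sketch with two assumptions: a fallback shows up as the string "stepfun" somewhere in the command output (a heuristic, not documented behavior), and the real commands would be hermes run -p content-story "ping" and hermes run -p content-visual "ping". The demo below uses a harmless Python subprocess so it runs anywhere.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_all(commands, timeout=120):
    """Run every command at the same time; return {name: stdout + stderr}."""
    def run_one(item):
        name, argv = item
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
        return name, proc.stdout + proc.stderr

    with ThreadPoolExecutor(max_workers=max(1, len(commands))) as pool:
        return dict(pool.map(run_one, commands.items()))


def fell_back(outputs, marker="stepfun"):
    """Names whose output mentions the fallback marker (assumed heuristic)."""
    return sorted(name for name, text in outputs.items() if marker in text.lower())


if __name__ == "__main__":
    # Stand-in commands. For the real test, replace each argv with
    # ["hermes", "run", "-p", "<profile>", "ping"].
    demo = {
        "content-story": [sys.executable, "-c", "print('pong')"],
        "content-visual": [sys.executable, "-c", "print('pong')"],
    }
    print(fell_back(run_all(demo)) or "no fallback detected")
```

If either profile name comes back from fell_back during the real run, treat that as the signal to add a second xAI token or schedule the two content profiles sequentially.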