This article explains how AI Agents work and the multi-agent collaboration mechanism through the real-world case of OpenCode/Oh My OpenCode.
1. Plain English: What is an AI Agent?
Traditional AI vs AI Agent
Traditional AI (e.g., ChatGPT):

- You ask: "Help me analyze this project's code quality"
- AI says: "Please send me the code"
- You copy and paste the code
- AI gives you the analysis results
AI Agent:

- You say: "Help me analyze this project's code quality"
- The Agent goes and:
  1. Reads project files
  2. Runs code inspection tools
  3. Checks test coverage
  4. Generates a report
- Gives you the result directly
Core difference: Traditional AI is "customer service" — you ask, it answers. An Agent is an "assistant" — you state your needs and it gets things done on its own.
Three Key Capabilities of an Agent
- Tool usage: Can read files, run commands, search the web, call APIs
- Memory: Remembers what was done before, avoids redundant work
- Planning: Knows how to break large tasks into smaller steps and complete them one by one
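The three capabilities above can be sketched as a minimal agent loop: the model picks a tool, the tool runs, and the result is fed back into the context until the model declares it is done. This is an illustrative sketch, not OpenCode's actual implementation; `callModel` and the tool names are hypothetical stand-ins.

```typescript
// Minimal agent loop. All names here (callModel, tools) are
// illustrative, not real OpenCode APIs.

type ToolCall = { tool: string; args: string } | { done: string };

// Hypothetical tool registry: each tool takes a string and returns one.
const tools: Record<string, (args: string) => string> = {
  read_file: (path) => `contents of ${path}`,
  run_command: (cmd) => `output of ${cmd}`,
};

// Stand-in for a real LLM call; a real agent would call a model API here.
function callModel(history: string[]): ToolCall {
  return history.length < 2
    ? { tool: "read_file", args: "src/login.ts" }
    : { done: "analysis complete" };
}

function runAgent(task: string): string {
  const memory: string[] = [task]; // memory: the growing context
  for (let step = 0; step < 10; step++) { // planning bound on the loop
    const next = callModel(memory);
    if ("done" in next) return next.done;
    const result = tools[next.tool](next.args); // tool usage
    memory.push(`${next.tool}(${next.args}) -> ${result}`);
  }
  return "step limit reached";
}
```

The key structural point is the feedback loop: each tool result lands back in `memory`, which is what separates an agent from a single question-answer exchange.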
2. Why Do We Need Multiple Agents?
The Problem with Single Agent: Context Competition
Imagine asking one person to do all of the following simultaneously:

- Architecture design (requires macro-level thinking)
- Writing code (requires attention to detail)
- Debugging bugs (requires state tracking)
- Writing documentation (requires clear expression)
This person will suffer from:

- Cognitive overload: planning and execution details compete for attention in the same brain
- Confusion: after fixing Bug A, they reintroduce it while working on Bug B
- Memory loss: after several rounds of debugging, they forget the original planning goal
This is not a capability issue, it's an information architecture problem — mixing things that shouldn't be mixed together.
Core Value of Multi-Agent: Information Domain Isolation
The leverage of multi-agent comes from information domain isolation, not from imitating corporate organizational structures.
Key insight (from axiom T03):

- Isolation is not for division of labor, but to let each Agent make better decisions in a clean information environment
- The Planner focuses on global decisions without being overwhelmed by execution details
- The Executor focuses on low-level implementation without being distracted by planning discussions
Analogy:

- Single Agent = one person acting as both boss and employee, thinking about strategy and details simultaneously
- Multi-Agent = the boss focuses on strategy, employees focus on execution, coordinated through shared documents
3. OpenCode's Agent Team
Team Structure
OpenCode has a "Project Manager" (Sisyphus) managing 11 "specialists".
Project Manager: Sisyphus
Why this name: In Greek mythology, Sisyphus pushes a boulder up a mountain every day, only for it to roll back down, and he starts again the next day. This symbolizes the AI handling repetitive tasks every day, never stopping.
Core responsibilities:

- Receive your requirements and understand intent
- Decide which specialist Agents are needed
- Assign tasks to the specialist Agents
- Aggregate results and verify quality
Decision logic: How does Sisyphus know who to dispatch?
User Request
↓
Analyze Task Type
↓
┌─────────────────────────────────────┐
│ Need to search code? │
│ → Explore │
│ │
│ Need external information? │
│ → Librarian │
│ │
│ Need architecture advice? │
│ → Oracle │
│ │
│ Need to write code? │
│ → Select Category by complexity │
└─────────────────────────────────────┘
↓
Evaluate if tasks can run in parallel
↓
Dispatch Agents (possibly multiple at once)
↓
Wait for results → Aggregate → Verify
Practical example: You say "Fix the login bug"
Sisyphus's thinking process:

1. This is a fix task, not a new feature
2. It involves authentication and may span multiple files
3. Need to understand the existing code first → dispatch Explore
4. May need to check common issues → dispatch Librarian
5. The two tasks are independent, so they can run in parallel
6. After the results come back, decide on the fix plan
7. The fix is a simple config change → use the quick category
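The dispatch decision above can be sketched as a simple routing function. The keyword matching and complexity heuristic here are deliberately simplified illustrations, not Sisyphus's real logic.

```typescript
// Sketch of Sisyphus-style routing: map a task description to the
// specialist(s) to dispatch. Keywords are simplified assumptions.

type AgentName = "Explore" | "Librarian" | "Oracle" | "quick" | "ultrabrain";

function route(task: string): AgentName[] {
  const t = task.toLowerCase();
  const dispatch: AgentName[] = [];
  if (/bug|fix|where|find/.test(t)) dispatch.push("Explore"); // search local code
  if (/best practice|docs|library/.test(t)) dispatch.push("Librarian"); // external info
  if (/architecture|design/.test(t)) dispatch.push("Oracle"); // advice only
  // Fall through to an execution category by a rough complexity guess.
  if (dispatch.length === 0) dispatch.push(t.length > 80 ? "ultrabrain" : "quick");
  return dispatch;
}
```

Note that a single request can yield several agents at once, which is what makes the later parallel-dispatch step possible.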
11 Specialist Agents
According to Oh My OpenCode's official documentation, they are divided into 4 categories:
1. Communication & Coordination
- Metis (Pre-Analyst)
  - Responsibility: identify hidden pitfalls before a task begins
  - When to use: requirements are vague or have multiple possible interpretations
  - Recommended model: Claude Opus 4.6 (requires deep reasoning)
  - Example: you say "Add user authentication", and Metis asks:
    - Is a database migration needed?
    - JWT or Session?
    - Is there an existing authentication pattern in the code?
- Momus (Quality Reviewer)
  - Responsibility: check whether work plans are feasible
  - When to use: complex tasks; have it review after the plan is made
  - Recommended model: Claude Opus 4.6 (requires critical thinking)
  - Example: Sisyphus made a 5-step plan, and Momus checks:
    - Are the steps complete?
    - Are any dependencies missing?
    - Are the verification criteria clear?
2. Exploration & Research
- Explore (Codebase Search Expert)
  - Responsibility: find files and code patterns in your project
  - When to use: unfamiliar with the codebase, need to locate relevant files
  - Recommended model: MiniMax-M2.1 (lightweight and fast; used frequently)
  - Example: find all authentication-related code
  - Cannot do: search external sources (it only searches the local project)
- Librarian (External Resource Retrieval Expert)
  - Responsibility: search the web for documentation, GitHub examples, and best practices
  - When to use: need external knowledge (official docs, open-source examples)
  - Recommended model: MiniMax-M2.5 (medium model; balances performance and cost)
  - Example: look up JWT authentication security best practices
  - Cannot do: search local code (it only searches external sources)
3. Advisory
- Oracle (Architecture Consultant)
  - Responsibility: gives advice but does not modify code (read-only)
  - When to use: need architecture advice, debugging complex issues, or after 3 consecutive failures
  - Recommended model: Claude Opus 4.6 (strongest reasoning capability)
  - Example: design a distributed lock scheme, analyze performance bottlenecks
  - Characteristic: high cost but high quality; used only at critical moments
4. Execution (Categorized by Task Type)
These are not specific Agent names, but execution modes automatically selected based on task type:
| Category | Name | Use Case | Recommended Model | Why |
|---|---|---|---|---|
| visual-engineering | Visual Engineer | Frontend, UI, styles, animations | MiniMax-M2.5 | Strong visual understanding |
| ultrabrain | Ultra Brain | Complex logic, architecture design | Claude Opus 4.6 | Strongest reasoning |
| deep | Deep Thinker | Tasks requiring deep analysis | MiniMax-M2.7 | Balances performance and cost |
| artistry | Artist | Creativity, brainstorming | Gemini 3 Pro | Unconventional thinking |
| quick | Quick Hand | Simple edits, typos | MiniMax-M2.1 | Fast and cheap |
| writing | Writer | Documentation, reports | MiniMax-M2.5 | Optimized for text generation |
Key design: Same task framework, automatically switches to the most suitable model based on type.
Why this configuration?
- Oracle uses the strongest model: Architecture advice requires the strongest reasoning, high cost but used only at critical moments
- Librarian uses a medium model: Searching external resources requires intent understanding, used frequently
- Explore uses a lightweight model: Searching local code only needs pattern matching, used very frequently
- Ultrabrain uses the strongest model: Complex logic requires deep reasoning, high task quality requirements
- Quick uses a lightweight model: Simple edits prioritize speed, low cost
Design principle: Choose the most cost-effective model based on task difficulty and frequency.
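The category table above is, in effect, a lookup from category to model. The model names come from the table; the mapping structure and the cheap default are an illustrative sketch, not OpenCode's configuration format.

```typescript
// Category-to-model lookup: same task framework, different model per
// category. Falling back to the cheapest model is an assumed default.

const categoryModels: Record<string, string> = {
  "visual-engineering": "MiniMax-M2.5",
  ultrabrain: "Claude Opus 4.6",
  deep: "MiniMax-M2.7",
  artistry: "Gemini 3 Pro",
  quick: "MiniMax-M2.1",
  writing: "MiniMax-M2.5",
};

function modelFor(category: string): string {
  return categoryModels[category] ?? categoryModels.quick; // cheap default
}
```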
4. Real Case: How Do Agents Collaborate?
Case 1: Fix Login Bug
Your requirement: "There's a bug in the login feature, fix it"
Workflow:
Step 1: Dispatch Explorer Agents in Parallel
┌─────────────────────────────────────────────────┐
│ Sisyphus dispatches two Agents simultaneously │
│ │
│ ┌─────────────────┐ ┌──────────────────┐ │
│ │ Explore │ │ Librarian │ │
│ │ Searches local │ │ Searches external│ │
│ │ code │ │ resources │ │
│ └─────────────────┘ └──────────────────┘ │
│ ↓ ↓ │
│ Find auth-related Check JWT common issues │
│ code │
└─────────────────────────────────────────────────┘
Step 2: Wait for Results
┌─────────────────────────────────────────────────┐
│ Explore reports: │
│ Found login.ts, auth.ts, token.ts │
│ │
│ Librarian reports: │
│ Common issue is token expiration time │
│ configuration error │
└─────────────────────────────────────────────────┘
Step 3: Dispatch Execution Agent to Fix
┌─────────────────────────────────────────────────┐
│ Sisyphus decides: │
│ This is a simple config modification │
│ → Dispatch Quick to execute │
│ → Fix expiration time in token.ts │
└─────────────────────────────────────────────────┘
Step 4: Verify
┌─────────────────────────────────────────────────┐
│ - Run code inspection tools │
│ - Confirm no new errors │
│ - Report completion │
└─────────────────────────────────────────────────┘
Key point: Explore and Librarian truly run simultaneously, not one after another.
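"Truly run simultaneously" maps naturally onto `Promise.all`: both sub-tasks start before either is awaited. The `dispatch` function is a hypothetical stand-in for sending a task to a sub-agent.

```typescript
// Parallel dispatch sketch: independent research tasks are started
// together rather than awaited one by one. dispatch() is illustrative.

async function dispatch(agent: string, task: string): Promise<string> {
  // A real implementation would send the task to the sub-agent here.
  return `${agent}: result for "${task}"`;
}

async function fixLoginBug(): Promise<string[]> {
  // Step 1: both agents start at the same time.
  const [local, external] = await Promise.all([
    dispatch("Explore", "find auth-related code"),
    dispatch("Librarian", "check common JWT issues"),
  ]);
  // Step 2: aggregate once both have reported.
  return [local, external];
}
```

Awaiting each `dispatch` call separately would serialize the work; wrapping them in `Promise.all` is what makes the two searches overlap in time.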
Case 2: Write a Technical Research Report
Your requirement: "Research best practices for React Server Components"
Sisyphus's strategy:
Dispatch 3 Librarians, each responsible for a different angle, but with 30-50% overlap (for cross-validation):
Parallel Research (3 Librarians running simultaneously)
┌─────────────────────────────────────────────────┐
│ Agent 1: Official docs + Community discussions │
│ Agent 2: Community discussions + Production │ ← Overlap: Community discussions
│ cases │
│ Agent 3: Production cases + Comparative │ ← Overlap: Production cases
│ analysis │
└─────────────────────────────────────────────────┘
↓
Cross-validate overlapping areas
↓
┌─────────────────────────────────────────────────┐
│ If Agent 2 and Agent 3 agree on "Production │
│ cases" information │
│ → High credibility │
│ │
│ If they disagree │
│ → Sisyphus further verifies │
└─────────────────────────────────────────────────┘
↓
Aggregate and generate comprehensive report
Why overlap?
- Agent 2 and Agent 3 both look at "production cases"
- If they find consistent information → High credibility
- If inconsistent → Sisyphus further verifies
Result: 3 Agents run simultaneously, 3x faster than one Agent running serially, and the information is more comprehensive.
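The cross-validation step can be sketched as a comparison over the overlapping angles: agreement raises credibility, disagreement flags the item for further checking. The `Finding` shape is a hypothetical simplification of what agents actually report.

```typescript
// Cross-validation sketch over overlapping research angles.

type Finding = { angle: string; claim: string };

function crossValidate(
  a: Finding[],
  b: Finding[],
): { confirmed: string[]; disputed: string[] } {
  const confirmed: string[] = []; // both agents agree -> high credibility
  const disputed: string[] = [];  // agents disagree -> verify further
  for (const fa of a) {
    const fb = b.find((f) => f.angle === fa.angle);
    if (!fb) continue; // no overlap on this angle
    (fb.claim === fa.claim ? confirmed : disputed).push(fa.angle);
  }
  return { confirmed, disputed };
}
```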
5. Key Design Mechanisms
1. Task Routing: Category System
Sisyphus automatically selects the most suitable model based on task type:
Task Type Determination
↓
┌─────────────────────────────────────────┐
│ Frontend, UI, styles? │
│ → visual-engineering │
│ → Use MiniMax-M2.5 (strong visual │
│ understanding) │
│ │
│ Complex logic, architecture design? │
│ → ultrabrain │
│ → Use Claude Opus 4.6 (strongest │
│ reasoning) │
│ │
│ Simple edits, typos? │
│ → quick │
│ → Use MiniMax-M2.1 (fast and cheap) │
└─────────────────────────────────────────┘
Benefit: Same framework, automatically switches to the most suitable model based on task.
2. Session Reuse: Avoid Redundant Work
If an Agent fails the first time, you can continue the conversation without starting over:
First Attempt
↓
Agent executes task
↓
Returns session_id (e.g., "ses_abc123")
↓
Failed?
↓
Continue the same session
↓
Agent remembers:
- Which files were read
- Which approaches were tried
- Which problems were encountered
↓
Saves 70% of redundant work
Value: Agent retains complete context, no need to re-explore.
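A minimal sketch of session reuse: retrying with the same `session_id` resumes the accumulated history instead of starting from scratch. The session-id format and the `run` function are illustrative assumptions, not OpenCode's API.

```typescript
// Session reuse sketch: a retry in the same session keeps prior context
// (files read, approaches tried). Names here are hypothetical.

interface Session { id: string; history: string[] }

const sessions = new Map<string, Session>();

function run(task: string, sessionId?: string): { sessionId: string; history: string[] } {
  // Resume the old session if an id is given, otherwise start fresh.
  const session = sessionId
    ? sessions.get(sessionId)!
    : { id: `ses_${sessions.size + 1}`, history: [] };
  session.history.push(task); // prior attempts stay in the history
  sessions.set(session.id, session);
  return { sessionId: session.id, history: session.history };
}
```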
3. The 6 Elements of Delegation Prompt
When Sisyphus dispatches a task, it must clearly state 6 things:
- TASK: What specifically to do
- EXPECTED OUTCOME: What counts as success
- REQUIRED TOOLS: What tools can be used
- MUST DO: Things that must be done
- MUST NOT DO: Things that are prohibited
- CONTEXT: Relevant files, existing patterns, constraints
Why so strict? (from axiom A08)
Prompt quality is the decisive factor in whether the AI correctly understands intent. A vague prompt forces the Agent to guess your intent in a huge search space, with a high probability of failure.
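One way to enforce the six elements is to make every field required in a typed structure, so a dispatcher cannot omit any of them. The field names mirror the list above; the rendering format is an illustrative sketch.

```typescript
// The six delegation elements as a typed structure: the compiler
// rejects any delegation that leaves one out.

interface Delegation {
  task: string;            // TASK
  expectedOutcome: string; // EXPECTED OUTCOME
  requiredTools: string[]; // REQUIRED TOOLS
  mustDo: string[];        // MUST DO
  mustNotDo: string[];     // MUST NOT DO
  context: string;         // CONTEXT
}

function renderPrompt(d: Delegation): string {
  return [
    `TASK: ${d.task}`,
    `EXPECTED OUTCOME: ${d.expectedOutcome}`,
    `REQUIRED TOOLS: ${d.requiredTools.join(", ")}`,
    `MUST DO: ${d.mustDo.join("; ")}`,
    `MUST NOT DO: ${d.mustNotDo.join("; ")}`,
    `CONTEXT: ${d.context}`,
  ].join("\n");
}
```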
4. Failure Recovery: The 3-Strike Rule
If an Agent fails 3 times consecutively:
3 failures
↓
Immediately stop all edits
↓
Rollback to the last working version
↓
Consult Oracle (architecture consultant)
↓
Oracle can't solve it either?
↓
Ask the user
Why: it prevents Agents from burning time and money on endless trial and error.
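The escalation path above fits in a few lines of control flow. `attempt`, `rollback`, and `consultOracle` are hypothetical stand-ins for the real operations.

```typescript
// 3-strike escalation sketch: stop editing after three consecutive
// failures, roll back, then escalate. Callbacks are illustrative.

function solveWithEscalation(
  attempt: () => boolean,
  rollback: () => void,
  consultOracle: () => boolean,
): string {
  for (let failures = 0; failures < 3; ) {
    if (attempt()) return "solved";
    failures++; // strike recorded
  }
  rollback(); // back to the last working version
  if (consultOracle()) return "solved with Oracle's advice";
  return "escalated to user"; // Oracle could not solve it either
}
```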
6. Comparison with Other Frameworks
OpenCode vs Traditional Frameworks
| Comparison | OpenCode | LangChain | AutoGPT |
|---|---|---|---|
| Architecture | Multi-agent division of labor | Single agent + tools | Single agent + loop |
| Parallel capability | Native support | Need to write yourself | Not supported |
| Model selection | Auto-switch based on task | Fixed one model | Fixed one model |
| Specialization | 11 specialist Agents | General-purpose Agent | General-purpose Agent |
Core difference (from axiom T03):
OpenCode's multi-agent value comes from information domain isolation:

- Traditional frameworks = one generalist; planning and execution compete in the same context
- OpenCode = a professional team; each Agent makes decisions in a clean information environment
Analogy: - Traditional frameworks = One generalist - OpenCode = A professional team
For simple tasks, a generalist may be faster (no coordination cost). For complex tasks, a professional team is significantly stronger.
Cost and Performance
Based on community data:

- Request count: Oh My OpenCode makes roughly 3.5x as many requests as the regular version (96 vs 27)
- Time: about 10 minutes more (55 vs 45 minutes)
- Success rate: slightly lower, by 4 percentage points (69% vs 73%)
However:

- Oh My OpenCode handles more complex tasks
- It includes more verification and quality checks
- It provides more detailed intermediate results
Selection advice:

- Simple tasks (e.g., fixing a typo) → use the regular version
- Complex tasks (e.g., a multi-module refactoring) → use Oh My OpenCode
- Cost-sensitive → control the degree of parallelism
7. Summary
Core Points
What is an AI Agent:

- It doesn't just answer questions; it completes tasks on its own
- It can use tools, has memory, and can plan

OpenCode's innovation:

- Multi-agent division of labor with information domain isolation
- Automatic model selection based on task type
- True parallel execution
- Session reuse to avoid redundant work

Key principles:

- Delegate whenever possible
- Parallelize where you can; don't serialize
- Every operation is verified
- Stop immediately after 3 failures
Applicable Scenarios
OpenCode excels at:

- Complex multi-module tasks
- Tasks requiring deep research
- Exploring unfamiliar codebases

It is not good at:

- Simple single-file edits
- Highly serial tasks
- Extreme cost control
References
- Oh My OpenCode GitHub
- Official Documentation
- Agent Architecture Deep Dive - Rost Glukhov, 2026-03
Closing note: This article is based on practical experience using OpenCode/Oh My OpenCode, combined with guidance from the axiom system (T03 Context Isolation, A08 Prompt Quality, M05 Simplicity). The system is still rapidly iterating, and details may change with version updates.