Implementation Plan: Nine Meta-Learning Loops Integration
Overview
Integrate Vox's 9 meta-learning loops framework into Mission Control's autonomous agent system to enable closed-loop operations and progression toward full agent autonomy.
Current State Analysis
Strengths:
- Alice/Bob/Charlie workflow established
- API-centric CLI pattern prevents duplication
- Gantt Board provides task orchestration
- Research → Document → Task pipeline works
Gaps:
- No cap gates to prevent agent overreach
- No reaction matrix for standardized responses
- No proposal service for agent coordination
- No self-healing/stale task detection
- No autonomy progression tracking
Proposed Implementation
Phase 1: Safety Mechanisms (Week 1-2)
1.1 Cap Gates System
Location: /lib/agents/cap-gates.ts
- Max review cycles: 3 before human escalation
- Max token spend per task: 100k tokens
- Max execution time: 2 hours per agent session
- Forbidden operations: Require explicit approval
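The limits above could be enforced by a single check that runs before each agent action. The following is a minimal sketch, not the project's implementation: `CapGateConfig`, `TaskUsage`, `GateResult`, and `checkCapGates` are hypothetical names, the forbidden-operation entry is a placeholder, and halting on a token or time overrun is an assumption. Only the numeric limits come from the plan.

```typescript
// Hypothetical types for illustration; only the numeric limits come from the plan.
interface CapGateConfig {
  maxReviewCycles: number;
  maxTokensPerTask: number;
  maxSessionMs: number;
  forbiddenOperations: readonly string[];
}

interface TaskUsage {
  reviewCycles: number;
  tokensSpent: number;
  sessionElapsedMs: number;
  requestedOperation?: string;
}

type GateResult =
  | { allowed: true }
  | { allowed: false; action: "escalate" | "halt" | "require-approval"; reason: string };

const DEFAULT_CAPS: CapGateConfig = {
  maxReviewCycles: 3,
  maxTokensPerTask: 100000,
  maxSessionMs: 2 * 60 * 60 * 1000,
  // Placeholder name: the plan does not list the forbidden operations.
  forbiddenOperations: ["example-forbidden-op"],
};

function checkCapGates(usage: TaskUsage, caps: CapGateConfig = DEFAULT_CAPS): GateResult {
  const op = usage.requestedOperation;
  if (op !== undefined && caps.forbiddenOperations.indexOf(op) !== -1) {
    return { allowed: false, action: "require-approval", reason: `operation "${op}" needs explicit approval` };
  }
  // Assumption: reaching the third review cycle triggers human escalation.
  if (usage.reviewCycles >= caps.maxReviewCycles) {
    return { allowed: false, action: "escalate", reason: "review cycle limit reached" };
  }
  // Assumption: budget and time overruns halt the agent.
  if (usage.tokensSpent > caps.maxTokensPerTask) {
    return { allowed: false, action: "halt", reason: "token budget exceeded" };
  }
  if (usage.sessionElapsedMs > caps.maxSessionMs) {
    return { allowed: false, action: "halt", reason: "session time limit exceeded" };
  }
  return { allowed: true };
}
```

Returning a tagged result rather than throwing lets the caller log the reason in observation mode without blocking the agent.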
1.2 Reaction Matrix
Location: /lib/agents/reaction-matrix.ts
Standardized responses for:
- API failures → Retry with backoff
- Syntax errors → Check SKILL.md first
- Test failures → Run debug skill
- Research complete → Handoff to Bob
- Implementation stuck → Escalate to human
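One way to keep these responses standardized is a lookup table keyed by event type. This is a sketch under assumptions: the event names, the `Reaction` shape, the agent ID `"bob"`, the skill name `"debug"`, and the retry parameters are all hypothetical. Only the event-to-response pairing comes from the list above.

```typescript
// Hypothetical event and reaction shapes; the pairing mirrors the list above.
type AgentEvent =
  | "api-failure"
  | "syntax-error"
  | "test-failure"
  | "research-complete"
  | "implementation-stuck";

type Reaction =
  | { kind: "retry-with-backoff"; maxAttempts: number; baseDelayMs: number }
  | { kind: "consult-doc"; doc: string }
  | { kind: "run-skill"; skill: string }
  | { kind: "handoff"; to: string }
  | { kind: "escalate"; to: "human" };

const REACTION_MATRIX: Record<AgentEvent, Reaction> = {
  // Retry parameters are assumed values, not taken from the plan.
  "api-failure": { kind: "retry-with-backoff", maxAttempts: 5, baseDelayMs: 1000 },
  "syntax-error": { kind: "consult-doc", doc: "SKILL.md" },
  "test-failure": { kind: "run-skill", skill: "debug" },
  "research-complete": { kind: "handoff", to: "bob" },
  "implementation-stuck": { kind: "escalate", to: "human" },
};

function reactTo(event: AgentEvent): Reaction {
  return REACTION_MATRIX[event];
}

// Exponential backoff: the delay doubles with each attempt (0-based), up to a cap.
function backoffDelayMs(attempt: number, baseDelayMs: number, capMs: number = 60000): number {
  return Math.min(capMs, baseDelayMs * Math.pow(2, attempt));
}
```

Because the table is typed as `Record<AgentEvent, Reaction>`, adding a new event without a reaction is a compile-time error.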
Phase 2: Coordination Layer (Week 3-4)
2.1 Proposal Service
Location: /lib/agents/proposals/
- Agent submits a proposal: "I want to do X"
- Check the proposal against cap gates
- Validate it against the current sprint
- Auto-approve if it is within bounds
- Require human approval if it exceeds limits
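The approval flow above could reduce to one pure function that collects every reason a proposal falls outside bounds. This is a sketch only: `ProposalRequest`, `ApprovalBounds`, and `evaluateProposal` are hypothetical names, and treating an out-of-sprint task as needing human approval is an assumption.

```typescript
// Hypothetical request and bounds shapes for illustration.
interface ProposalRequest {
  agentId: string;
  taskId: string;
  estimatedTokens: number;
  estimatedMinutes: number;
}

interface ApprovalBounds {
  maxTokens: number;
  maxMinutes: number;
  sprintTaskIds: readonly string[];
}

type Decision =
  | { status: "auto-approved" }
  | { status: "needs-human-approval"; reasons: string[] };

function evaluateProposal(p: ProposalRequest, bounds: ApprovalBounds): Decision {
  const reasons: string[] = [];
  if (p.estimatedTokens > bounds.maxTokens) {
    reasons.push("estimated tokens exceed cap");
  }
  if (p.estimatedMinutes > bounds.maxMinutes) {
    reasons.push("estimated time exceeds cap");
  }
  // Assumption: work outside the current sprint always needs a human decision.
  if (bounds.sprintTaskIds.indexOf(p.taskId) === -1) {
    reasons.push("task is not in the current sprint");
  }
  return reasons.length === 0
    ? { status: "auto-approved" }
    : { status: "needs-human-approval", reasons };
}
```

Collecting all reasons, rather than returning on the first failure, gives the human approver the full picture in one pass.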
2.2 Proposal Protocol
{
  "proposalId": "uuid",
  "agentId": "alice-researcher",
  "type": "research|implement|test",
  "estimatedCost": "tokens",
  "estimatedTime": "minutes",
  "requiresApproval": true|false,
  "rationale": "string",
  "expectedOutput": "string"
}
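The template above describes field types rather than literal values, so a typed counterpart with a runtime check may be useful at the service boundary. The sketch below assumes `estimatedCost` is a token count and `estimatedTime` a number of minutes, both numeric; `Proposal`, `PROPOSAL_TYPES`, and `isProposal` are hypothetical names.

```typescript
// Assumed reading of the template: cost in tokens and time in minutes, both numeric.
const PROPOSAL_TYPES = ["research", "implement", "test"] as const;
type ProposalType = (typeof PROPOSAL_TYPES)[number];

interface Proposal {
  proposalId: string; // UUID
  agentId: string;
  type: ProposalType;
  estimatedCost: number; // tokens
  estimatedTime: number; // minutes
  requiresApproval: boolean;
  rationale: string;
  expectedOutput: string;
}

// Runtime guard for untrusted input such as a parsed request body.
function isProposal(value: unknown): value is Proposal {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const v = value as Record<string, unknown>;
  const t = v["type"];
  return (
    typeof v["proposalId"] === "string" &&
    typeof v["agentId"] === "string" &&
    typeof t === "string" &&
    (PROPOSAL_TYPES as readonly string[]).indexOf(t) !== -1 &&
    typeof v["estimatedCost"] === "number" &&
    typeof v["estimatedTime"] === "number" &&
    typeof v["requiresApproval"] === "boolean" &&
    typeof v["rationale"] === "string" &&
    typeof v["expectedOutput"] === "string"
  );
}
```

The guard checks shape only; it does not verify that `proposalId` is a well-formed UUID.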
Phase 3: Self-Healing (Week 5-6)
3.1 Stale Task Detection
Location: /lib/agents/health-check.ts
- Run via cron every 30 minutes
- Check tasks that have had status "in-progress" for more than 30 minutes
- Query agent status via sessions_list
- If agent stalled: Respawn or escalate
- Update task with diagnostic comment
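The detection step could be a pure function over a task snapshot and the list of active agents. This is a sketch under assumptions: the `TrackedTask` fields are hypothetical, and `activeAgentIds` stands in for whatever `sessions_list` returns, whose real shape is not shown here.

```typescript
// Hypothetical task shape; timestamps are epoch milliseconds.
interface TrackedTask {
  id: string;
  status: string;
  agentId: string;
  statusChangedAtMs: number;
}

interface StaleFinding {
  taskId: string;
  agentId: string;
  minutesInProgress: number;
  agentActive: boolean;
  comment: string;
}

const STALE_AFTER_MS = 30 * 60 * 1000;

// activeAgentIds is assumed to be derived from the sessions_list output.
function findStaleTasks(
  tasks: readonly TrackedTask[],
  activeAgentIds: readonly string[],
  nowMs: number,
  staleAfterMs: number = STALE_AFTER_MS,
): StaleFinding[] {
  const findings: StaleFinding[] = [];
  for (const task of tasks) {
    const elapsedMs = nowMs - task.statusChangedAtMs;
    if (task.status !== "in-progress" || elapsedMs <= staleAfterMs) {
      continue;
    }
    const agentActive = activeAgentIds.indexOf(task.agentId) !== -1;
    const minutesInProgress = Math.floor(elapsedMs / 60000);
    findings.push({
      taskId: task.id,
      agentId: task.agentId,
      minutesInProgress,
      agentActive,
      // Text for the diagnostic comment attached to the task.
      comment: `Health check: in-progress for ${minutesInProgress} min; agent ${task.agentId} is ${agentActive ? "still active" : "not running"}.`,
    });
  }
  return findings;
}
```

Passing `nowMs` in explicitly keeps the function deterministic, which makes the threshold easy to test.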
3.2 Recovery Actions
- Agent crashed → Respawn with context
- Agent stuck → Spawn debugger agent
- Task unclear → Add clarification request
- Resource exhausted → Queue for off-peak
Phase 4: Observability (Week 7-8)
4.1 Agent Dashboard (Mission Control Phase 8)
- Real-time agent status
- Token usage per agent
- Success/failure rates
- Time-to-completion metrics
- Autonomy level progression
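Several of these dashboard figures can be derived from a flat log of completed task runs. The sketch below is illustrative only: the `TaskRun` record and `summarizeByAgent` function are hypothetical, and it assumes each run records its token usage, outcome, and duration.

```typescript
// Hypothetical run record; one entry per completed task attempt.
interface TaskRun {
  agentId: string;
  succeeded: boolean;
  tokensUsed: number;
  durationMs: number;
}

interface AgentMetrics {
  runs: number;
  totalTokens: number;
  successRate: number; // fraction between 0 and 1
  meanDurationMs: number;
}

interface Totals {
  runs: number;
  successes: number;
  tokens: number;
  durationMs: number;
}

function summarizeByAgent(runs: readonly TaskRun[]): { [agentId: string]: AgentMetrics } {
  const totals: { [agentId: string]: Totals | undefined } = {};
  for (const run of runs) {
    let t = totals[run.agentId];
    if (t === undefined) {
      t = { runs: 0, successes: 0, tokens: 0, durationMs: 0 };
      totals[run.agentId] = t;
    }
    t.runs += 1;
    t.successes += run.succeeded ? 1 : 0;
    t.tokens += run.tokensUsed;
    t.durationMs += run.durationMs;
  }
  const result: { [agentId: string]: AgentMetrics } = {};
  for (const agentId of Object.keys(totals)) {
    const t = totals[agentId];
    if (t === undefined) {
      continue;
    }
    result[agentId] = {
      runs: t.runs,
      totalTokens: t.tokens,
      successRate: t.successes / t.runs,
      meanDurationMs: t.durationMs / t.runs,
    };
  }
  return result;
}
```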
4.2 Learning Metrics
- Which patterns succeed most often
- Common failure modes
- Optimal task sizes
- Best agent combinations
Integration Points
With Existing Systems
| System | Integration Point | Change Required |
|---|---|---|
| Gantt Board | Task status API | Add stale detection trigger |
| Mission Control | Documents API | Link research → plans |
| Agent Workflow | Spawn protocol | Add cap gate checks |
| Session Logs | Query API | Health check queries |
File Changes
NEW: /lib/agents/cap-gates.ts
NEW: /lib/agents/reaction-matrix.ts
NEW: /lib/agents/proposal-service.ts
NEW: /lib/agents/health-check.ts
NEW: /lib/agents/dashboard.ts
MODIFY: /agents/TEAM-REGISTRY.md
MODIFY: Skill files for Alice/Bob/Charlie (add cap checks)
Risks and Mitigation
| Risk | Impact | Mitigation |
|---|---|---|
| Cap gates too restrictive | Agents can't work | Start permissive, tighten based on data |
| Proposal overhead | Slower execution | Auto-approve 90% of cases |
| False stale detection | Interrupted work | Require 3 checks before action |
| Dashboard complexity | Delayed Phase 8 | Build incrementally |
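The "require 3 checks before action" mitigation can be read as a consecutive-strike counter per task. The sketch below assumes the three checks must be consecutive and that a single healthy check resets the count; `createStaleStrikeTracker` is a hypothetical helper.

```typescript
// Hypothetical helper: counts consecutive stale observations per task.
function createStaleStrikeTracker(requiredStrikes: number = 3) {
  const strikes: { [taskId: string]: number | undefined } = {};
  return {
    // Returns true once the task has looked stale on `requiredStrikes` consecutive checks.
    record(taskId: string, looksStale: boolean): boolean {
      if (!looksStale) {
        // Assumption: any healthy check clears the history for that task.
        delete strikes[taskId];
        return false;
      }
      const count = (strikes[taskId] ?? 0) + 1;
      strikes[taskId] = count;
      return count >= requiredStrikes;
    },
  };
}
```

The tracker keeps returning `true` on later stale checks until a healthy check resets it, so the caller decides whether to act once or repeatedly.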
Success Criteria
- Zero runaway agent incidents
- 95% auto-approval rate for proposals
- <5 min stale detection latency
- 50% reduction in required human interventions
- Complete audit trail of agent decisions
Timeline
- Week 1-2: Cap gates + reactions
- Week 3-4: Proposal service
- Week 5-6: Self-healing
- Week 7-8: Dashboard
Dependencies
- Requires current agent workflow to be stable
- Gantt Board API token access
- Session log query capability
- Session list/monitoring tools
Rollout Strategy
1. Deploy cap gates (observation mode)
2. Enable reaction matrix
3. Launch proposal service (with manual approval)
4. Enable auto-approval after 1 week
5. Add stale detection
6. Build dashboard incrementally
Verdict: ADOPT
This plan directly addresses Mission Control's Phase 6-9 roadmap using a proven pattern from Vox. Start with Phase 1 safety mechanisms before enabling more autonomy.