Three agents only
Generator, Evaluator, Combiner. Every benchmark gain has come from this trio — not from specialized roles. The simpler the colony, the easier it is to understand why it wins.
Five rules that keep Colony ant-like. The next insights will come from benchmark data, not from more design.
Novelty pulls outward. Evidence pulls toward reality. Constraint Fit pulls toward the question. Three dimensions create useful tension; eight become impossible to reason about.
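To make the three-way tension concrete, here is a minimal sketch of one possible way to hold the three dimensions on a note. The class name NoteScore, the 0 to 1 range, and the geometric-mean aggregate are all assumptions for illustration; the source does not say how Colony combines the dimensions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NoteScore:
    """Hypothetical container for the three scoring dimensions (each in 0..1)."""
    novelty: float         # pulls outward
    evidence: float        # pulls toward reality
    constraint_fit: float  # pulls toward the question

    def __post_init__(self):
        for name in ("novelty", "evidence", "constraint_fit"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1], got {value}")

    def combined(self) -> float:
        # Assumption: a geometric mean, so a note that collapses on any one
        # dimension cannot be rescued by the other two.
        return (self.novelty * self.evidence * self.constraint_fit) ** (1 / 3)


wild = NoteScore(novelty=0.9, evidence=0.1, constraint_fit=0.5)
balanced = NoteScore(novelty=0.7, evidence=0.7, constraint_fit=0.7)
print(balanced.combined() > wild.combined())
```

With only three fields, it stays easy to see which pull caused a note to win or lose, which is the point of keeping the dimension count small.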
No agent sees the full notebook. Each one gets a sampled local view: strong trails, random trails, recent trails. Universal visibility kills emergence — agents stop being ants.
Every local view reserves 70% of its attention for the strongest trails, 20% for the middle, and 10% for weak or random trails, plus a guaranteed recency slot. Real colonies never stop exploring.
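The sampling rule above can be sketched roughly as follows. This is an illustration under stated assumptions, not Colony's actual sampler: the Note fields, the slots parameter, and the way the middle and weak bands are cut (the remainder after the strongest trails, split in half) are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    note_id: int
    strength: float  # trail strength (hypothetical field)
    tick: int        # tick at which the note was written (hypothetical field)


def sample_local_view(notes, slots=10, rng=None):
    """Return a partial view: ~70% strongest, ~20% middle, ~10% weak/random,
    plus one guaranteed slot for the most recent note."""
    rng = rng or random.Random()
    if not notes:
        return []

    # Guaranteed recency slot: the newest note is always visible.
    newest = max(notes, key=lambda n: n.tick)
    ranked = sorted((n for n in notes if n is not newest),
                    key=lambda n: n.strength, reverse=True)

    n_strong = slots * 7 // 10
    n_mid = slots * 2 // 10
    n_weak = slots - n_strong - n_mid

    strong = ranked[:n_strong]
    remainder = ranked[n_strong:]
    half = len(remainder) // 2
    middle_band, weak_band = remainder[:half], remainder[half:]

    # Middle and weak picks are random, so exploration never stops.
    middle = rng.sample(middle_band, min(n_mid, len(middle_band)))
    weak = rng.sample(weak_band, min(n_weak, len(weak_band)))
    return strong + middle + weak + [newest]


notes = [Note(note_id=i, strength=((i * 37) % 101) / 100, tick=i) for i in range(40)]
view = sample_local_view(notes, rng=random.Random(0))
print(len(view), "of", len(notes), "notes visible")
```

Because each agent calls the sampler independently, two agents in the same tick will usually see different middle and weak trails, so no agent ever holds the full notebook.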
The strongest answers are hybrids: Idea A + Idea F + Idea M → new organizing principle. If one operation drives most colony wins, it's Combine — not generation, not scoring.
Every iteration that drifts toward a conventional multi-agent framework gets pulled back here. The temptations to refuse:
Today the synthesizer picks the best note and writes the answer. Eventually we want it to trace the strongest lineage — Note 4 → Note 11 → Note 27 → Note 44 — and synthesize from the winning trail, not just the winning snapshot. That is more ant-like, but it is a real change to the final step. Post-benchmark candidate, not v1.
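A lineage trace of this kind could look like the sketch below. It assumes each note records the ids of the notes it was derived from and that, at every step back, the highest-scoring parent is followed. The parents and scores mappings, the function name, and all values are hypothetical illustrations, not Colony data.

```python
def trace_strongest_lineage(winner_id, parents, scores):
    """Walk back from the winning note, following the highest-scoring parent
    at each step, and return the trail ordered from root to winner."""
    trail = [winner_id]
    seen = {winner_id}
    current = winner_id
    while True:
        # Skip already-visited notes so a malformed graph cannot loop forever.
        candidates = [p for p in parents.get(current, []) if p not in seen]
        if not candidates:
            break
        current = max(candidates, key=lambda p: scores[p])
        trail.append(current)
        seen.add(current)
    trail.reverse()
    return trail


# Illustrative graph: note 44 combines notes 27 and 19, and so on.
parents = {44: [27, 19], 27: [11, 8], 11: [4], 19: [4]}
scores = {4: 0.41, 8: 0.35, 11: 0.58, 19: 0.52, 27: 0.73, 44: 0.88}

print(trace_strongest_lineage(44, parents, scores))  # [4, 11, 27, 44]
```

A synthesizer built this way would receive the whole trail as context rather than only note 44, which is the change to the final step described above.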
The colony is exposed as a single MCP tool so any agent (Claude Desktop, Cursor, other apps) can call it when it needs deeper reasoning. No notes API, no run browser, no streaming — one tool, one answer.
POST /api/mcp
Authorization: Bearer <MCP_SECRET>

tool: run_colony({
  problem: string,
  mode?: "framework" | "alternatives" | "creative" | "research", // default "framework"
  max_ticks?: number // 1–12, default 6
})
→ {
  run_id, answer, framework,
  top_notes[], alternatives[],
  score, stats { ticks, llm_calls, latency_ms }
}

Internals stay internal. Ants don't expose their pheromone molecules; they expose "here's the food."
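For callers that are not MCP clients, a request to the run_colony tool might be assembled as in the sketch below. The source gives only the path, the bearer header, and the tool signature, so the wire format here is an assumption: it uses the JSON-RPC 2.0 tools/call envelope from the MCP specification, and a real server may also require an initialize handshake first. The helper name and the host colony.example are hypothetical. The request is built but not sent.

```python
import json
import urllib.request

VALID_MODES = {"framework", "alternatives", "creative", "research"}


def build_run_colony_request(base_url, secret, problem, mode="framework", max_ticks=6):
    """Build (but do not send) a POST /api/mcp request for the run_colony tool."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    if not 1 <= max_ticks <= 12:
        raise ValueError("max_ticks must be between 1 and 12")

    # Assumption: MCP-style JSON-RPC 2.0 envelope for a tool call.
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {
            "name": "run_colony",
            "arguments": {"problem": problem, "mode": mode, "max_ticks": max_ticks},
        },
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/mcp",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {secret}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_run_colony_request("https://colony.example", "MCP_SECRET_VALUE",
                               "How should we structure onboarding?")
print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would send it. Validating mode and max_ticks on the client side keeps obviously invalid calls from spending a round trip.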
Can useful intelligence emerge from many simple agents interacting through a shared trail system?