Leaderboard · WAB
Top 100 · audited agentic workspaces · 12 pillars · L0–L4
| # | Workspace | Type | Grade | Score | Clusters A·B·C·D | ELO | Mature pillars | Weakest pillar | Stack | Audited | Evidence |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Madani Workspace • B2B services portfolio · iter-39 | ref | A | 87.08 | 87 A · 95 B · 89 C · 73 D | 1900 | — | — | Claude Code · Python · n8n · launchd · auto-promote-engine | 2026-05-25 | audit ↗ |
| 2 | Hermes Agent · NousResearch • skill-curator + RL self-evolution | ext | C | 50.83 | 30 A · 63 B · 53 C · 60 D | 1650 | — | — | Python · agent/curator.py · skill_manage · GRPO | 2026-05-24 | audit ↗ |
| 3 | OpenClaw • agentic platform · plugin ecosystem | ext | D | 47.50 | 23 A · 57 B · 58 C · 50 D | 1580 | — | — | TypeScript · Node.js · plugin system | 2026-05-24 | audit ↗ |
| 4 | OpenAI Agents SDK · Python • agent SDK library | ext | D | 40.83 | 23 A · 42 B · 49 C · 50 D | 1450 | — | — | Python · agents framework | 2026-05-20 | audit ↗ |
| 5 | Cline · IDE Agent • VS Code agentic IDE | ext | D | 32.50 | 13 A · 47 B · 35 C · 35 D | 1480 | — | — | TypeScript · VS Code extension | 2026-05-24 | audit ↗ |
| 6 | Anthropic Cookbook • code-sample repository | ext | F | 27.50 | 7 A · 33 B · 35 C · 35 D | 1380 | — | — | Python · Jupyter · Claude Agent SDK | 2026-05-20 | audit ↗ |
Showing 6 of 6
Legend
✓ verified · audit verified by the benchmark maintainers.
• self-reported · audit run by the submitter · server-side re-audit on the v0.5 roadmap.
Mature pillars · number of pillars at the highest maturity level (L4 Optimizing) out of 12. E.g. 9/12 = 9 pillars at L4.
Weakest pillar · the pillar with the lowest maturity · where the workspace has the largest gap to close.
Clusters A·B·C·D · averages of the 4 clusters (Cognition, Action, Trust, Operations).
ELO · derived from the composite (1200 + composite × 8). Same composite → same ELO.
Score · composite 0-100 · equally weighted mean of the 12 pillars.
Levels L0-L4 · L0 absent · L1 ad hoc · L2 documented · L3 automated · L4 optimizing (auto-improve).
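The legend's definitions can be sketched in a few lines of Python. This is a minimal illustration, not the benchmark's actual code: the function names and the `pillar_N` keys are hypothetical, pillar scores are assumed to already be on a 0-100 scale (the mapping from L0-L4 levels to scores is not specified here), and the ELO values listed in the table need not match this formula exactly.

```python
from statistics import fmean

def composite_score(pillar_scores):
    """Equally weighted mean of the 12 pillar scores (assumed to be 0-100)."""
    if len(pillar_scores) != 12:
        raise ValueError("expected exactly 12 pillar scores")
    return fmean(pillar_scores)

def elo_from_composite(composite):
    """ELO as stated in the legend: 1200 + composite * 8."""
    return 1200 + composite * 8

def mature_pillars(levels):
    """Count the pillars at the highest maturity level (L4)."""
    return sum(1 for level in levels.values() if level == 4)

def weakest_pillar(levels):
    """Name of the pillar with the lowest level (first one wins on ties)."""
    return min(levels, key=levels.get)

# Hypothetical workspace: nine pillars at L4, two at L3, one at L1.
levels = {f"pillar_{i}": 4 for i in range(1, 10)}
levels.update({"pillar_10": 3, "pillar_11": 3, "pillar_12": 1})

print(f"{mature_pillars(levels)}/12 mature, weakest: {weakest_pillar(levels)}")
print(elo_from_composite(47.50))
```

Because the ELO here is a pure function of the composite, two workspaces with the same composite always get the same ELO, as the legend states.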
Composite = weighted mean of the 4 clusters · Bradley-Terry ELO · ~70% deterministic audit · IRR 1.0 verified. Reference entries are verified in the benchmark repo. Community submissions are live in Vercel KV · CI re-audit on the v0.5 roadmap.
