Agent Performance Analysis
Multi-dimensional evaluation across 2,847 test scenarios
Agents Tested: 127
Test Scenarios: 2,847
Data Points: 361K
Avg Success: 73.4%
Top Quartile: 89.2%
Std Deviation: ±12.8
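The Data Points figure is consistent with one data point per agent per scenario. The page does not state how data points are counted, so that reading is an assumption; the sketch below only checks the arithmetic, and the variable names are illustrative.

```python
# Assumption (not stated on the page): each agent is evaluated
# once on each test scenario, giving one data point per pair.
agents_tested = 127
test_scenarios = 2847

data_points = agents_tested * test_scenarios

# Truncate to thousands, matching the "361K" style used in the summary.
summary = f"{data_points:,} data points (~{data_points // 1000}K)"
print(summary)  # 361,569 data points (~361K)
```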
Task Completion vs Reasoning Depth
[Chart: x-axis Reasoning Steps (avg per task); y-axis Success Rate (%); legend: High Performers, Mid-High, Average, Below Avg, Low Performers]
Top Performing Agents (This Week)
1. ReasonerV3-XL: 94.7%
2. PlannerPro-2: 93.2%
3. CodeAgent-T5: 91.8%
4. ToolMaster-v4: 88.4%
5. MultiStep-Agent: 87.1%
Cluster Distribution
High Performers: 23 (18.1%)
Mid-High: 31 (24.4%)
Average: 38 (29.9%)
Below Avg: 22 (17.3%)
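The cluster percentages can be reproduced from the counts, assuming each share is taken over the 127 agents tested. The sketch below recomputes them and reports how many agents the listed clusters leave unaccounted for; the helper name `share` is hypothetical, and the page does not say which cluster the remainder belongs to.

```python
# Assumption: percentages are computed over all 127 agents tested.
total_agents = 127
cluster_counts = {
    "High Performers": 23,
    "Mid-High": 31,
    "Average": 38,
    "Below Avg": 22,
}

def share(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)

shares = {name: share(n, total_agents) for name, n in cluster_counts.items()}

# Agents covered by the listed clusters, and the remainder not shown.
listed = sum(cluster_counts.values())
unlisted = total_agents - listed

for name, pct in shares.items():
    print(f"{name}: {cluster_counts[name]} ({pct}%)")
print(f"Listed: {listed} of {total_agents} ({share(listed, total_agents)}%)")
print(f"Not listed: {unlisted} ({share(unlisted, total_agents)}%)")
```

The recomputed shares match the figures shown (18.1%, 24.4%, 29.9%, 17.3%), which supports the assumption about the denominator.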
Success by Task Type
Code Gen: 82%
Reasoning: 76%
Tool Use: 71%
Multi-Step: 64%