Is Claude Dumb Today?
Daily HumanEval-CC40 benchmark for Claude Code's default model
Score
—
Model
—
Cost
—
Runtime
—
Score History (last 30 days)
Per-Task Results
Task | Function | Result | Attempts | Turns | Cost | Error