Why this repo matters
No release captured; 8 developer signals, 1 package/install signal

Measure AI agent smartness with a 14-dimension eval framework, confidence intervals, trend tracking, and anti-gaming probes (0 stars, 0 forks, Python, 6 AI signals, 5 developer signals).
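
The description mentions confidence intervals on evaluation scores but does not say how the repo computes them. As a rough illustration only, the sketch below uses a percentile bootstrap over per-task scores for a single dimension. The function name `bootstrap_ci`, the dimension name `planning`, and the score values are all hypothetical and are not taken from the repo.

```python
import random
import statistics


def bootstrap_ci(scores, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `scores`.

    Hypothetical helper for illustration; not the repo's API.
    Returns (mean, lower_bound, upper_bound).
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(scores)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(scores, k=n)) for _ in range(n_resamples)
    )
    # Pick the alpha/2 and 1 - alpha/2 percentiles of the resampled means.
    lo_idx = max(0, round((alpha / 2) * n_resamples))
    hi_idx = min(n_resamples - 1, round((1 - alpha / 2) * n_resamples) - 1)
    return statistics.fmean(scores), means[lo_idx], means[hi_idx]


# Made-up per-task scores in [0, 1] for one evaluation dimension.
planning_scores = [0.62, 0.71, 0.55, 0.80, 0.67, 0.74, 0.59, 0.69]
mean, low, high = bootstrap_ci(planning_scores)
print(f"planning: {mean:.3f} (95% CI {low:.3f} to {high:.3f})")
```

A bootstrap is used here because it makes no assumption about how the scores are distributed; with only a handful of tasks per dimension the interval will be wide, which is the kind of uncertainty a confidence interval is meant to expose.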