AgentWatch — Real-Time AI Drift Detective
December 5, 2025
Limited-Time Free
Developer Tools, Observability, SaaS, AI
Original Context
The author built aistupidlevel.info, an open-source automated benchmark that runs 140+ coding, debugging, and optimization tasks against multiple LLMs roughly every 20 minutes. It measures drift in correctness, refusals, latency, and stability, and the author found noticeable performance swings across models.
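To make the idea of drift detection concrete, here is a minimal sketch of how one run's score might be compared against a rolling baseline of recent runs. It is not the benchmark's actual method: the `DriftMonitor` class, the z-score rule, and the `window`, `z_threshold`, and `min_spread` parameters are all hypothetical choices made for illustration.

```python
from collections import deque
from statistics import mean, stdev


class DriftMonitor:
    """Hypothetical helper: flags a score that deviates from a rolling baseline."""

    def __init__(self, window: int = 12, z_threshold: float = 3.0,
                 min_spread: float = 0.01):
        # All three parameters are illustrative assumptions, not source values.
        self.z_threshold = z_threshold
        self.min_spread = min_spread
        self.history = deque(maxlen=window)  # keeps only the most recent runs

    def observe(self, score: float) -> bool:
        """Record one run's score; return True if it looks like drift."""
        drifted = False
        if len(self.history) >= 3:  # need a few runs before judging
            baseline = mean(self.history)
            # Floor the spread so a very stable history does not divide by ~0.
            spread = max(stdev(self.history), self.min_spread)
            drifted = abs(score - baseline) / spread > self.z_threshold
        self.history.append(score)
        return drifted


if __name__ == "__main__":
    monitor = DriftMonitor(window=5)
    for s in [0.90, 0.91, 0.89, 0.90, 0.60]:
        print(s, monitor.observe(s))
```

The same pattern could be applied per model and per metric (correctness, refusal rate, latency), with one monitor instance for each pair.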