Human Benchmark Testing

A new AI benchmark tests whether chatbots protect human well-being

AI chatbots have been linked to serious mental health harms in heavy users, but there have been few standards for measuring whether they safeguard human well-being or just maximize for engagement. A ...

ZDNet

With AI models clobbering every benchmark, it's time for human evaluation

Artificial intelligence has traditionally advanced through automatic accuracy tests in tasks meant to approximate human knowledge. Carefully crafted benchmark tests such as The General Language ...

JD Supra

The Artificial Intelligence Benchmark: The Most Important Clause You’ve Never Used (Part 1)

You might have noticed, particularly if you watched the Super Bowl this year, that AI is… everywhere. AI is now embedded in nearly everything we use. From customer support chatbots and ...

Morningstar

KushoAI Unveils APIEval-20 to Benchmark AI Agents in API Testing

SAN FRANCISCO, April 8, 2026 /PRNewswire/ -- KushoAI, an AI-native platform for API testing and software reliability, has introduced APIEval-20, an open benchmark designed to evaluate how effectively ...
