Isabelle Bousquette / Wall Street Journal:
How Anthropic, OpenAI, and Google are testing AI models by having them play Pokémon Blue on Twitch to track a model's ability to reason and make decisions — Nintendo's original Pokémon games are becoming a popular and strangely effective way to test and benchmark new artificial-intelligence models.