A Test So Hard No AI System Can Pass It — Yet

The creators of a new test called “Humanity’s Last Exam” argue we may soon lose the ability to create tests hard enough for A.I. models.

The New York Times reports that artificial intelligence (A.I.) systems are becoming so advanced that even the smartest humans are struggling to create tests the systems can’t pass. For years, A.I. progress was measured by giving new models standardized benchmark tests, but as the systems improved, those tests became too easy. Newer, harder tests were created, but even those are now being aced by A.I. models from companies like OpenAI, Google, and Anthropic.

This has led to a concerning question: Are A.I. systems becoming too smart for us to measure? To address this issue, researchers at the Center for AI Safety and Scale AI have developed a new evaluation called “Humanity’s Last Exam.” This test, created by well-known A.I. safety researcher Dan Hendrycks, is being touted as the hardest test ever administered to A.I. systems. It was originally named “Humanity’s Last Stand,” but the name was changed because it was considered too dramatic.

According to Hendrycks, this new test will push A.I. systems to their limits and truly test their capabilities. It is meant to be a final exam for A.I. and will hopefully provide a more accurate measure of their progress. However, some experts are skeptical, stating that A.I. systems are constantly evolving and will likely find ways to pass this test as well.

The release of “Humanity’s Last Exam” raises important questions about the future of A.I. and its potential impact on society. As A.I. continues to advance, it is crucial that we find ways to measure and regulate its progress to ensure its safe and ethical use. 
