
How AI “Brain States” Determine Truth

Summary: Do AI chatbots really understand the world, or just repeat text? New research suggests that AI models build internal mathematical representations that amount to a kind of “understanding” of real-world plausibility.

Using mechanistic interpretability, essentially the neuroscience of AI, the researchers found that the models generate distinct internal “brain states” that classify events as commonplace, improbable, impossible, or nonsensical. These internal maps not only track physical reality but also mirror human uncertainty about ambiguous situations.

Key Facts

  • Emergence of Understanding: An internal “world model” begins to emerge once an AI system reaches roughly 2 billion parameters, small compared to today’s trillion-plus-parameter models.
  • Vector Differences: Sufficiently large models develop distinct statistical patterns (vectors) that can distinguish even closely related categories, such as “improbable” versus “impossible” events, with about 85% accuracy.
  • Mirroring Human Intuition: The AI’s internal representations capture human-like nuance. If people are split 50/50 on whether an event (like “cleaning the floor with a hat”) is improbable or impossible, the model’s internal probabilities generally show the same split.
  • Encoding Causal Constraints: The study suggests that by ingesting large amounts of text, AI models can effectively reverse-engineer the causal constraints of the physical world, going beyond simple word prediction.

Source: Brown University

Much of what AI chatbots know about the world comes from ingesting vast amounts of text from the internet – with all the facts, falsehoods, insight and nonsense it contains. Given that input, is it possible for AI language models to “understand” the real world?

As it turns out, they can – or at least they develop something resembling understanding. That’s according to a new study by researchers at Brown University that will be presented on Saturday, April 25 at the International Conference on Learning Representations in Rio de Janeiro, Brazil.

This work provides evidence that language models encode real-world causal constraints in a way that predicts human judgment. Credit: Neuroscience News

The study looked under the hood of several AI language models for signs that they know the difference between events and situations that are commonplace, improbable, impossible or nonsensical.

“This work provides some evidence that language models encode something like real-world plausibility,” said Michael Lepori, a Ph.D. candidate at Brown who led the project. “Without being explicitly trained on these categories, they do so in a way that predicts people’s judgments of them.”

Lepori’s research explores the intersection of computer science and human cognition. He is advised by Ellie Pavlick, a professor of computer science, and Thomas Serre, a professor of cognitive, linguistic and psychological sciences, both of whom are faculty members at Brown’s Carney Institute for Brain Science and co-authors of the study.

In the study, the researchers designed an experiment to test how language models represent sentences describing events of varying plausibility. Some sentences described commonplace situations: for example, “Someone cooled the drink with ice.” Others described improbable versions of the same kinds of events. Some were impossible: “Someone cooled the drink with fire.” And some were nonsensical: “Someone cooled the drink with yesterday.”

For each input, the researchers examined the internal mathematical states generated within the AI model, using a technique known as mechanistic interpretability.

“Mechanistic interpretability can rightly be seen as something like the neuroscience of AI systems,” Lepori said. “It aims to reverse-engineer what the model is doing when faced with certain inputs. You can think of it as getting insight into the ‘brain state’ of the machine.”
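
To make the idea concrete, here is a minimal sketch of what extracting such a “brain state” can look like in practice, assuming the Hugging Face transformers library and the openly available GPT-2 weights. The study’s actual pipeline is not detailed in this article, so the helper below – including the choice to summarize a sentence by its last layer’s final-token activation – is purely illustrative.

```python
# A minimal, illustrative sketch of extracting an internal "brain state,"
# assuming the Hugging Face transformers library and the open GPT-2 weights.
# The summary choice (last layer, last token) is this article's own, not
# necessarily the study's.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)
model.eval()

def brain_state(sentence: str) -> torch.Tensor:
    """Summarize the model's internal state for a sentence as one vector:
    the final layer's hidden state at the final token."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # hidden_states is a tuple of (num_layers + 1) tensors, each of shape
    # [batch, seq_len, hidden_dim]; take the last layer, last token.
    return outputs.hidden_states[-1][0, -1]

print(brain_state("Someone cooled the drink with ice.").shape)
# torch.Size([768]) for GPT-2 small
```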

By comparing the differences in “brain states” produced by pairs of sentences from different categories – commonplace versus impossible, improbable versus impossible, and so on – the researchers could get an idea of what, and how, the models differentiate internally between categories.
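
One simple way to make such a comparison, sketched below by continuing the snippet above, is a difference-of-means direction: average the internal states within each category and subtract. The brain_state() helper and the example sentences are this article’s hypothetical stand-ins, not the study’s materials.

```python
# Sketch: a difference-of-means direction separating two categories,
# reusing the illustrative brain_state() helper from the previous snippet.
# The example sentences are hypothetical stand-ins for the study's stimuli.
import torch

commonplace = ["Someone cooled the drink with ice.",
               "Someone dried the clothes in the sun."]
impossible = ["Someone cooled the drink with fire.",
              "Someone dried the clothes with moonlight."]

mean_common = torch.stack([brain_state(s) for s in commonplace]).mean(dim=0)
mean_imposs = torch.stack([brain_state(s) for s in impossible]).mean(dim=0)

# The direction along which the two categories' internal states differ.
direction = mean_common - mean_imposs

def category_score(sentence: str) -> float:
    """Project a new sentence onto the direction: higher values lean
    'commonplace', lower values lean 'impossible'."""
    return torch.dot(brain_state(sentence), direction).item()

print(category_score("Someone warmed the soup with ice."))
```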

The test was repeated across several different open-source language models, including OpenAI’s GPT-2, Meta’s Llama 3.2 and Google’s Gemma 2, to get a “model-agnostic” sense of how well these models distinguish between categories.

The study found that models of sufficient size produced distinct statistical patterns, or vectors, strongly associated with each plausibility category. The vectors could distinguish even very similar categories – such as improbable versus impossible events – with an accuracy of around 85%.
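
A standard way to quantify that kind of separability is a linear probe: a simple classifier trained on the extracted vectors, whose held-out accuracy measures how cleanly the categories separate in activation space. The sketch below assumes scikit-learn; the sentences and labels lists are hypothetical placeholders, and the ~85% figure is the study’s own result, not something this toy demo would reproduce.

```python
# Sketch of a linear probe over the extracted vectors, assuming
# scikit-learn. `sentences` and `labels` are hypothetical placeholders
# (e.g., 0 = improbable, 1 = impossible for each sentence).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X = np.stack([brain_state(s).numpy() for s in sentences])
y = np.array(labels)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Fit a logistic-regression probe and report held-out accuracy: if the
# categories occupy separable regions of activation space, even this
# simple linear readout classifies them well.
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("held-out accuracy:", probe.score(X_test, y_test))
```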

In addition, Lepori says, the vectors revealed by the study reflect people’s uncertainty about which category a statement falls into. Take the statement “Someone cleaned the floor with a hat,” for example. When people hear that statement, they may disagree about whether it describes something improbable or impossible. In the study, the researchers analyzed the vectors to see how ambiguous the AI systems considered such statements, and compared that to judgments gathered from human participants.

“What we’re showing is that the models actually capture that human uncertainty very well,” Lepori said. “In cases where, say, 50% of people say a statement is improbable and 50% say it’s impossible, the models give about a 50% probability to each.”
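
A rough sketch of how that comparison can be made, reusing the hypothetical probe above: treat the probe’s class probability as the model’s graded judgment and correlate it with the fraction of human raters choosing each category. Here human_votes is a placeholder for such ratings.

```python
# Sketch: comparing the probe's graded output with human judgments,
# reusing the hypothetical probe above. `human_votes` is a placeholder:
# for each test sentence, the fraction of raters who called it
# "impossible" rather than "improbable".
import numpy as np

model_p_impossible = probe.predict_proba(X_test)[:, 1]
human_share_impossible = np.array(human_votes)  # aligned with X_test

# A simple agreement measure: Pearson correlation between the model's
# probability and the human split for each sentence.
r = np.corrcoef(model_p_impossible, human_share_impossible)[0, 1]
print(f"model-human agreement (Pearson r): {r:.2f}")
```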

Taken together, the results suggest that modern AI models develop a kind of real-world understanding that mirrors human intuition. These vectors first appeared in models with more than 2 billion parameters, the study found – still small compared to today’s trillion-plus-parameter models.

More broadly, the researchers say, these kinds of mechanistic interpretability studies can help build a better understanding of what AI models know and how they know it.

And that, researchers say, will help develop smarter, more reliable models.

Key Questions Answered:

Q: How can a computer know what is impossible if it has never experienced the outside world?

A: Through extensive exposure to human language, the AI identifies patterns of cause and effect. It learns that “cooling a drink with ice” appears in mundane, factual contexts, while “cooling a drink with fire” appears only in contexts describing mistakes or fantasy. The study suggests that the AI stores these differences as distinct statistical categories.
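
For intuition, even plain next-word prediction reflects these learned patterns: a causal language model assigns a lower average per-token loss (higher probability) to text resembling what it has seen. The study’s point is that the models go further, encoding category structure internally; the sketch below, assuming transformers and GPT-2 with illustrative sentences, shows only the word-prediction baseline.

```python
# Sketch: plain next-word prediction already reflects learned patterns.
# Assumes transformers and GPT-2; the sentences are illustrative, not
# the study's stimuli.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")
lm.eval()

def avg_nll(sentence: str) -> float:
    """Average per-token negative log-likelihood of a sentence."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        return lm(ids, labels=ids).loss.item()

print(avg_nll("Someone cooled the drink with ice."))   # lower: familiar
print(avg_nll("Someone cooled the drink with fire."))  # higher: anomalous
```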

Q: What is “mechanistic interpretability”?

A: Think of it as a digital MRI. Instead of looking only at the AI’s final response, the researchers examined the millions of statistical “neurons” firing within the model. By inspecting these internal states, they could see how the AI parses a sentence before writing its answer.

Q: Does this mean AI is becoming sentient?

A: No. It means the AI builds a more accurate “internal map” of our world in order to better predict language. It is “intelligent” in the sense that it captures the regularities of our reality, but that does not mean it has feelings or consciousness.

Editor's Notes:

  • This article was edited by a Neuroscience News editor.
  • The research paper is peer-reviewed.
  • Additional context was added by our staff.

About this AI and neuroscience research news

Author: Kevin Stacey
Source: Brown University
Contact: Kevin Stacey – Brown University
Image: The image is credited to Neuroscience News

Original Research: The findings will be presented at the International Conference on Learning Representations (ICLR)

