Water Cooler Small Talk, Ep. 9: What “Thinking” and “Reasoning” Really Mean in AI and LLMs

Water cooler small talk: a special kind of small talk, typically observed around the water cooler in office spaces. There, employees share all kinds of corporate gossip, myths, legends, scientific misconceptions, mysterious anecdotes, or outright lies. Anything goes. In my Water Cooler Small Talk posts, I discuss strange and often pseudo-scientific claims that I, my friends, or other acquaintances have overheard in the office and that left us speechless.
So, here's a water cooler idea for today's post:
I was quite disappointed using ChatGPT the other day to review the Q3 results. This is not artificial intelligence – it's just a search and summarization tool, not real artificial intelligence.
🤷♀️
Usually, when we talk about AI, we think of some kind of superior intelligence straight out of a '90s sci-fi movie. It's easy to get carried away and picture a cinematic character like the Terminator's Skynet or some other dystopian AI. The imagery commonly used to illustrate AI topics – robots, androids, and intergalactic portals ready to transport us into the future – only fuels this misinterpretation of AI further.
From left to right: 1) Photo by Julien Tromeur on Unsplash, 2) Photo by Luke Jones on Unsplash, 3) Photo by Xu Haiwei on Unsplash
Still, for better or worse, AI systems work in a fundamentally different way – at least for now. There is currently no omnipresent superintelligence waiting to solve all of humanity's intractable problems. That's why it's important to understand what today's AI models actually are and what they can (and can't) do. Only then can we manage our expectations and make the best use of this powerful new technology.
🍨 Data Details is a newsletter for learning about, building with, and thinking about AI and data. If you are interested in these topics, subscribe here.
Inductive vs. deductive reasoning
To get our heads around what AI in its current state is and isn't, and what it can and can't do, we first have to understand the difference between inductive and deductive reasoning.
Psychologist Daniel Kahneman dedicated his life to studying how our minds work – how our thoughts lead to conclusions and decisions, which in turn shape our actions and behaviors – work that ultimately earned him the Nobel Prize. His research is well summarized for the general reader in Thinking, Fast and Slow, where he describes two modes of human thought:
- System 1: fast, intuitive, and automatic; it requires little to no conscious effort.
- System 2: slow, deliberate, and analytical; it requires conscious effort.
From an evolutionary point of view, we tend to default to System 1 because it saves time and energy – a kind of living on autopilot, without consciously thinking through most things. However, System 1's efficiency often comes at the cost of accuracy, leading to errors.
Inductive reasoning aligns closely with Kahneman's System 1. It moves from specific observations to general conclusions. This type of thinking is pattern-based and, therefore, stochastic. In other words, its conclusions always carry some degree of uncertainty, even if we don't see it clearly.
For example:
Pattern: The sun has risen every day of my life.
Conclusion: Therefore, the sun will rise tomorrow.
As you can imagine, this type of reasoning is prone to error because it always starts from limited data. In other words, the sun will probably rise tomorrow, because it has risen every day of my life so far, but this is not guaranteed.
To reach this conclusion, we also silently assume that 'every day will follow the same pattern we have observed so far', which may or may not be true. In other words, we implicitly assume that patterns observed in a small sample will hold everywhere.
Such silent assumptions, made in order to reach a conclusion, are exactly what makes inductive thinking yield plausible yet uncertain results. Much like fitting a function through a few data points, the underlying relationship we infer may be correct, but it is not certain, and being wrong is always a possibility. We build a mental model of what we observe – and we just hope it's a good one.
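To make the analogy concrete, here is a minimal sketch (my own illustration in Python with NumPy, not from the original post) of fitting a handful of data points with different curves: they all describe the sample, but they extrapolate differently.

```python
# Induction as curve fitting: several models describe the same few observations,
# yet each encodes a different "silent assumption" about the underlying pattern.
import numpy as np

# Five observed data points: our limited sample of the world.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 2.1, 3.9, 6.2, 8.1])

for degree in (1, 2, 4):
    coeffs = np.polyfit(x, y, degree)   # fit a polynomial of this degree
    at_ten = np.polyval(coeffs, 10.0)   # extrapolate beyond the observed data
    print(f"degree {degree}: prediction at x=10 -> {at_ten:.1f}")

# All three fits pass close to the observed points, but their predictions at
# x=10 diverge: the conclusion depends on which pattern we silently assumed.
```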

Put another way, different people, drawing on different experiences or operating in different situations, will reach different conclusions when using induction.
On the flip side, deductive reasoning moves from general principles to specific conclusions – it is, in effect, Kahneman's System 2. It follows the form 'if A, then definitely B'.
For example:
Premise 1: All men are mortal.
Premise 2: Socrates is a man.
Conclusion: Therefore, Socrates is mortal.
This type of thinking is not prone to systematic error, because every step of the reasoning is deterministic. There are no silent assumptions; as long as the premises are true, the conclusion is necessarily true.
Going back to the curve-fitting analogy, we can think of deduction as the reverse process: calculating data points from a given function. Since we know the function, we can compute each data point with certainty, and unlike the many curves that fit the same data points better or worse, there is exactly one correct answer. Most importantly, deductive reasoning is consistent and rigid: we can re-evaluate a given function a million times and we will always get the same result.
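Continuing the analogy, here is a quick sketch of the reverse direction (again my own illustration): once the function is known, evaluating it is deterministic, and repeating the computation changes nothing.

```python
# Deduction as evaluating a known rule: given the general principle (the
# function), every specific conclusion (data point) follows with certainty.
def rule(x: float) -> float:
    """A fully specified general principle."""
    return 2.0 * x + 1.0

points = [0.0, 1.0, 2.0, 3.0, 4.0]
first_run = [rule(x) for x in points]
second_run = [rule(x) for x in points]

assert first_run == second_run   # re-running it a million times would not change a thing
print(first_run)                 # [1.0, 3.0, 5.0, 7.0, 9.0]
```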

Of course, even when using deductive reasoning, people can make mistakes. For example, we may carry out a calculation and get the result wrong, but that is a random error. By contrast, the error of inductive reasoning is systematic: the reasoning process itself is prone to error, because it relies on silent assumptions whose truth we cannot know.
So, how do LLMs work?
It's easy, especially for folks outside tech or computer science, to think of today's AI models as an innate, all-knowing intelligence, able to provide smart answers to any question. However, this is not (yet) the case: today's AI models, as impressive and advanced as they are, are still bound by how they are built and trained.
Large Language Models (LLMs) do not 'think' or 'understand' the way a human mind does. Instead, they rely on the statistical patterns in the data they were trained on, much like Kahneman's System 1 or inductive reasoning. Simply put, they work by predicting the most probable next word for a given input.
You can think of an LLM as a very diligent student who has memorized enormous amounts of text and learned to reproduce the patterns that sound right, without understanding why they are right. Most of the time this works, because sentences that sound right also have a high probability of being factually right. This means such models can produce human-like text and speech of impressive quality, and genuinely sound very intelligent. However, producing human-like text and producing sound arguments and conclusions are not the same thing: sounding right does not guarantee being right. Even though LLMs produce content that reads like deductive reasoning, it is not. You can easily verify this from the nonsense that AI tools like ChatGPT output from time to time.
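As a toy illustration of 'predicting the most probable next word' (a hand-made distribution over a tiny vocabulary, nothing like a real LLM's billions of parameters):

```python
# Toy next-word prediction: pick the most probable continuation from a
# hand-crafted conditional distribution over a tiny vocabulary.
next_word_probs = {
    ("the", "sun", "will"): {"rise": 0.80, "set": 0.15, "explode": 0.05},
    ("sun", "will", "rise"): {"tomorrow": 0.70, "again": 0.25, "backwards": 0.05},
}

def predict_next(context):
    """Return the most probable next word given the last three words."""
    distribution = next_word_probs[tuple(context)]
    return max(distribution, key=distribution.get)

text = ["the", "sun", "will"]
for _ in range(2):
    text.append(predict_next(text[-3:]))

print(" ".join(text))  # "the sun will rise tomorrow" – it sounds right by construction
```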

It's also important to understand how LLMs come up with these next words. Conceptually, we might imagine that such models simply count the frequencies of word sequences in existing text and somehow reproduce these frequencies to generate new text. But that's not how it works. There are roughly 50,000 commonly used words in English, which leads to an astronomical number of possible word combinations. For example, even a short sentence of 10 words has about 50,000^10, roughly 10^47, possible combinations – vastly more than the number of stars in the observable universe. On the flip side, all existing English text in books and on the internet amounts to a few hundred billion words (about 10^12). Consequently, there simply isn't enough text in existence to cover every possible phrase and generate text that way.
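For the skeptical reader, the back-of-the-envelope arithmetic behind this argument (my own rounding to orders of magnitude):

```python
# Why lookup cannot work: the space of possible sentences dwarfs all text ever written.
import math

vocabulary_size = 50_000       # roughly the commonly used English words
sentence_length = 10           # even a short sentence

possible_sentences = vocabulary_size ** sentence_length
existing_words = 10 ** 12      # order of magnitude of all available English text

print(f"possible 10-word sentences: ~10^{math.log10(possible_sentences):.0f}")  # ~10^47
print(f"words in all existing text: ~10^{math.log10(existing_words):.0f}")      # ~10^12
print(f"shortfall factor:           ~10^{math.log10(possible_sentences) - math.log10(existing_words):.0f}")
```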
Instead, LLMs use statistical models built from existing text to estimate the probabilities of words and phrases, including ones that may never have appeared before. As with any model of reality, however, this is an approximation, which is what leads to AI errors and hallucinated details.
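To make 'estimating probabilities of phrases never seen before' concrete, here is a minimal sketch of a toy bigram model with add-one smoothing; real LLMs use neural networks rather than counts, but the idea of generalizing beyond what was observed is the same.

```python
# A toy statistical language model: estimate P(word | previous word) from counts,
# with add-one (Laplace) smoothing so that unseen word pairs still get probability.
from collections import Counter

corpus = "the sun will rise tomorrow and the sun will set tonight".split()
vocab = set(corpus)

unigram_counts = Counter(corpus)
bigram_counts = Counter(zip(corpus, corpus[1:]))

def bigram_prob(prev: str, word: str) -> float:
    return (bigram_counts[(prev, word)] + 1) / (unigram_counts[prev] + len(vocab))

print(bigram_prob("sun", "will"))     # seen in the corpus: relatively high
print(bigram_prob("sun", "tonight"))  # never seen together: small but non-zero
```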
What about chain of thought?
So, what about 'the model is thinking' or 'chain-of-thought (CoT) reasoning'? If LLMs don't really think the way people do, what are these fancy terms? Is it just a marketing ploy? Sort of, but not quite.
Chain-of-thought (CoT) is primarily a prompting technique that gets LLMs to answer questions in a sequential, step-by-step manner. Instead of making one big prediction that answers the user's question in a single step, with a high risk of producing a wrong answer, the model takes many smaller, surer steps. Essentially, the user 'guides' the LLM by breaking the initial question into multiple sub-questions that the LLM answers one by one. For example, the simplest form of CoT prompting is achieved by appending a phrase like 'Let's think step by step' to the prompt.
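As a rough sketch of what this looks like in practice, assuming the OpenAI Python SDK and a placeholder model name (any chat-capable model would do; adapt to your own provider):

```python
# Zero-shot chain-of-thought prompting: the same question asked plainly and with
# a "let's think step by step" cue appended. Assumes OPENAI_API_KEY is set.
from openai import OpenAI

client = OpenAI()
question = "A train travels 60 km in 45 minutes. What is its average speed in km/h?"

plain = client.chat.completions.create(
    model="gpt-4o-mini",   # placeholder: swap in whatever model you use
    messages=[{"role": "user", "content": question}],
)

cot = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": question + "\nLet's think step by step."}],
)

print(plain.choices[0].message.content)
print(cot.choices[0].message.content)
```

The only difference between the two requests is the appended cue, yet it typically nudges the model into spelling out intermediate steps before the final answer.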
Taking this concept a step further, instead of requiring the user to break down the initial query into sub-queries, models that 'think longer' can perform this process on their own. Such reasoning models break a user's query into step-by-step sub-questions, leading to better results. CoT has been one of the major developments in AI, allowing models to handle complex reasoning tasks much better. OpenAI's o1 model was the first large-scale example to demonstrate the power of CoT reasoning.

On my mind
Understanding the underlying principles that make today's AI models work is essential for having realistic expectations of what they can and cannot do, and for putting them to good use. Neural networks and AI models inherently reason in an inductive way, even if their output often sounds like deduction. Even techniques such as chain-of-thought, while producing impressive results, only partially close the gap between generating text that sounds right and reasoning that actually is right.
Did you like this post? Let's be friends! Join me at:
📰 Substack 💌 Medium 💼 LinkedIn ☕ Buy me a coffee!
What about Pialgorithms?
Want to bring the power of RAG to your organization?
Pialgorithms can do it for you 👉 Book a demo today



