AI Reasoning and Planning

Until very recently, LLMs had a very hard time with complex problems. They lost context, distorted their memory of previous steps, and so forth. This led to unreliable results (hallucinations) and, consequently, to a lack of trust in the technology.

Recent research has shown that LLMs are, in fact, quite good at reasoning and planning when the right prompts lead them to break the problem into a series of steps. This reasoning and planning greatly improves the accuracy of the LLM’s output.

Chain of Thought

One of the great breakthroughs in the field of LLMs was the discovery that appending “think step by step” to the prompt had a profound effect on the accuracy of the LLM’s output. Prompting the LLM in this way pushes the model to decompose the problem and to generate an “inner monologue” of the steps it is taking to solve the problem.

This has two significant advantages:

  • It makes the LLM’s reasoning process more transparent.
  • It allows the model to check its own work as it goes.

For example, we might ask, “If a plane crashes on the border of the north field and the south field, where will the survivors be buried?”

Without the reasoning prompt, it is entirely possible that the LLM would pick one of the fields at random. However, if we tell it to think step-by-step, it will examine the parts of the question and realize that survivors are not buried at all.
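
To make this concrete, here is a minimal sketch of the two prompts for the plane question. The helper name with_chain_of_thought and the exact wording of the suffix are my own assumptions for illustration, and no model is actually called here; you would pass the resulting string to whatever LLM client you use.

```python
# Hypothetical helper: the name and the suffix wording are illustrative assumptions.
COT_SUFFIX = "Let's think step by step."


def with_chain_of_thought(question: str, suffix: str = COT_SUFFIX) -> str:
    """Return the question with a chain-of-thought instruction appended."""
    return f"{question.strip()}\n\n{suffix}"


question = (
    "If a plane crashes on the border of the north field and the south field, "
    "where will the survivors be buried?"
)

plain_prompt = question                        # no reasoning instruction
cot_prompt = with_chain_of_thought(question)   # same question plus the CoT suffix

print(cot_prompt)
```

The only difference between the two prompts is the final instruction, which is what makes the technique so cheap to try.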

Note that today’s LLMs have a degree of chain-of-thought built into them and are unlikely to get this question wrong.

Three key aspects explain why chain-of-thought reasoning works:

  • It decomposes the problem into smaller intermediate steps.
  • It lets the model keep track of its work and remember intermediate results.
  • It typically allocates more tokens to the reasoning, so the model can “think” longer.
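
As a loose analogy in ordinary code (this is not how a model works internally), the first two points look like writing each intermediate result to a scratchpad instead of jumping straight to the answer. The word problem and the function name solve_with_scratchpad are made up for illustration.

```python
def solve_with_scratchpad(apples_start: int, eaten: int, bags_bought: int, per_bag: int):
    """Solve a small word problem while recording every intermediate result."""
    scratchpad = []

    # Step 1: a small sub-problem whose result is written down.
    remaining = apples_start - eaten
    scratchpad.append(f"Step 1: {apples_start} - {eaten} = {remaining} apples left")

    # Step 2: another independent sub-problem.
    bought = bags_bought * per_bag
    scratchpad.append(f"Step 2: {bags_bought} bags x {per_bag} = {bought} apples bought")

    # Step 3: combine the remembered intermediate results.
    total = remaining + bought
    scratchpad.append(f"Step 3: {remaining} + {bought} = {total} apples in total")

    return total, scratchpad


total, steps = solve_with_scratchpad(apples_start=10, eaten=3, bags_bought=2, per_bag=6)
for step in steps:
    print(step)
```

Each later step reads a value that an earlier step wrote down, which is roughly the role the generated reasoning text plays for the model.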

Debugging

Because we’ve asked the LLM to think step-by-step, it can tell us each step in its reasoning, and we can examine those steps to see where the LLM went off the rails. This takes a process that might otherwise be opaque and makes it transparent, greatly enhancing the debugging process.
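
Here is a minimal sketch of how you might pull the individual steps out of a response so you can inspect them one at a time. It assumes the model labels its steps as "Step 1:", "2.", and so on, which is not guaranteed, and the sample response is hand-written for illustration rather than real model output.

```python
import re

# Assumption: steps look like "Step 1: ...", "1. ..." or "1) ..." at the start of a line.
STEP_PATTERN = re.compile(r"^\s*(?:Step\s+)?(\d+)[.:)]\s*(.+)$", re.IGNORECASE)


def extract_steps(response: str) -> list[tuple[int, str]]:
    """Return (step number, step text) pairs found in a step-by-step response."""
    steps = []
    for line in response.splitlines():
        match = STEP_PATTERN.match(line)
        if match:
            steps.append((int(match.group(1)), match.group(2).strip()))
    return steps


# Hand-written sample, not actual model output.
sample_response = """\
Step 1: The plane crashed on the border between two fields.
Step 2: The question asks where the survivors will be buried.
Step 3: Survivors are alive, and living people are not buried.
Answer: The survivors are not buried anywhere.
"""

for number, text in extract_steps(sample_response):
    print(f"{number}: {text}")
```

With the steps separated like this, you can read them in order and spot the first one where the reasoning goes wrong.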

Next up: ReAct…

About Jesse Liberty

** Note ** Jesse is currently looking for a new position. You can learn more about him at https://jesseliberty.bio Thank you. Jesse Liberty has three decades of experience writing and delivering software projects and is the author of 2 dozen books and a couple dozen online courses. His latest book, Building APIs with .NET, is now available wherever you buy your books. Liberty was a Team Lead and Senior Software Engineer for various corporations, a Senior Technical Evangelist for Microsoft, a Distinguished Software Engineer for AT&T, a VP for Information Services for Citibank and a Software Architect for PBS. He is a 13 year Microsoft MVP.