diff --git a/.DS_Store b/.DS_Store index 0ab78ff5..c4574f1e 100644 Binary files a/.DS_Store and b/.DS_Store differ diff --git a/Lectures/.DS_Store b/Lectures/.DS_Store index fe476a35..980a055b 100644 Binary files a/Lectures/.DS_Store and b/Lectures/.DS_Store differ diff --git a/Lectures/S0-L20/.DS_Store b/Lectures/S0-L20/.DS_Store index b73a5977..172046b9 100644 Binary files a/Lectures/S0-L20/.DS_Store and b/Lectures/S0-L20/.DS_Store differ diff --git a/Lectures/S0-L20/images/.DS_Store b/Lectures/S0-L20/images/.DS_Store index df2dde5c..db3fd20a 100644 Binary files a/Lectures/S0-L20/images/.DS_Store and b/Lectures/S0-L20/images/.DS_Store differ diff --git a/Lectures/S0-L20/images/Reasoning/Slide10.png b/Lectures/S0-L20/images/Reasoning/Slide10.png new file mode 100644 index 00000000..13cfb383 Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/Slide10.png differ diff --git a/Lectures/S0-L20/images/Reasoning/Slide11.png b/Lectures/S0-L20/images/Reasoning/Slide11.png new file mode 100644 index 00000000..4f85d2d3 Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/Slide11.png differ diff --git a/Lectures/S0-L20/images/Reasoning/Slide12.png b/Lectures/S0-L20/images/Reasoning/Slide12.png new file mode 100644 index 00000000..8c313149 Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/Slide12.png differ diff --git a/Lectures/S0-L20/images/Reasoning/Slide13.png b/Lectures/S0-L20/images/Reasoning/Slide13.png new file mode 100644 index 00000000..9e30c13a Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/Slide13.png differ diff --git a/Lectures/S0-L20/images/Reasoning/Slide14.png b/Lectures/S0-L20/images/Reasoning/Slide14.png new file mode 100644 index 00000000..7676a95f Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/Slide14.png differ diff --git a/Lectures/S0-L20/images/Reasoning/Slide15.png b/Lectures/S0-L20/images/Reasoning/Slide15.png new file mode 100644 index 00000000..fdd38dd3 Binary files /dev/null and 
b/Lectures/S0-L20/images/Reasoning/Slide15.png differ diff --git a/Lectures/S0-L20/images/Reasoning/Slide16.png b/Lectures/S0-L20/images/Reasoning/Slide16.png new file mode 100644 index 00000000..80c5cea2 Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/Slide16.png differ diff --git a/Lectures/S0-L20/images/Reasoning/Slide17.png b/Lectures/S0-L20/images/Reasoning/Slide17.png new file mode 100644 index 00000000..d95f45b8 Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/Slide17.png differ diff --git a/Lectures/S0-L20/images/Reasoning/Slide18.png b/Lectures/S0-L20/images/Reasoning/Slide18.png new file mode 100644 index 00000000..2352a443 Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/Slide18.png differ diff --git a/Lectures/S0-L20/images/Reasoning/Slide19.png b/Lectures/S0-L20/images/Reasoning/Slide19.png new file mode 100644 index 00000000..2c01d21b Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/Slide19.png differ diff --git a/Lectures/S0-L20/images/Reasoning/Slide2.png b/Lectures/S0-L20/images/Reasoning/Slide2.png new file mode 100644 index 00000000..681fcfa0 Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/Slide2.png differ diff --git a/Lectures/S0-L20/images/Reasoning/Slide3.png b/Lectures/S0-L20/images/Reasoning/Slide3.png new file mode 100644 index 00000000..e58dd54c Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/Slide3.png differ diff --git a/Lectures/S0-L20/images/Reasoning/Slide4.png b/Lectures/S0-L20/images/Reasoning/Slide4.png new file mode 100644 index 00000000..65b30c21 Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/Slide4.png differ diff --git a/Lectures/S0-L20/images/Reasoning/Slide5.png b/Lectures/S0-L20/images/Reasoning/Slide5.png new file mode 100644 index 00000000..0034f095 Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/Slide5.png differ diff --git a/Lectures/S0-L20/images/Reasoning/Slide6.png 
b/Lectures/S0-L20/images/Reasoning/Slide6.png new file mode 100644 index 00000000..1f25056e Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/Slide6.png differ diff --git a/Lectures/S0-L20/images/Reasoning/Slide7.png b/Lectures/S0-L20/images/Reasoning/Slide7.png new file mode 100644 index 00000000..83427247 Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/Slide7.png differ diff --git a/Lectures/S0-L20/images/Reasoning/Slide8.png b/Lectures/S0-L20/images/Reasoning/Slide8.png new file mode 100644 index 00000000..19ade72d Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/Slide8.png differ diff --git a/Lectures/S0-L20/images/Reasoning/Slide9.png b/Lectures/S0-L20/images/Reasoning/Slide9.png new file mode 100644 index 00000000..c010878e Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/Slide9.png differ diff --git a/Lectures/S0-L20/images/Reasoning/img_01.png b/Lectures/S0-L20/images/Reasoning/img_01.png new file mode 100644 index 00000000..1a69559a Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/img_01.png differ diff --git a/Lectures/S0-L20/images/Reasoning/img_02.png b/Lectures/S0-L20/images/Reasoning/img_02.png new file mode 100644 index 00000000..4c2e8abe Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/img_02.png differ diff --git a/Lectures/S0-L20/images/Reasoning/img_03.png b/Lectures/S0-L20/images/Reasoning/img_03.png new file mode 100644 index 00000000..9757afc8 Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/img_03.png differ diff --git a/Lectures/S0-L20/images/Reasoning/img_04.png b/Lectures/S0-L20/images/Reasoning/img_04.png new file mode 100644 index 00000000..548d7a72 Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/img_04.png differ diff --git a/Lectures/S0-L20/images/Reasoning/img_05.png b/Lectures/S0-L20/images/Reasoning/img_05.png new file mode 100644 index 00000000..be9fa8c6 Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/img_05.png 
differ diff --git a/Lectures/S0-L20/images/Reasoning/img_06.png b/Lectures/S0-L20/images/Reasoning/img_06.png new file mode 100644 index 00000000..9f50ced1 Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/img_06.png differ diff --git a/Lectures/S0-L20/images/Reasoning/img_07.png b/Lectures/S0-L20/images/Reasoning/img_07.png new file mode 100644 index 00000000..7923c0a4 Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/img_07.png differ diff --git a/Lectures/S0-L20/images/Reasoning/img_08.png b/Lectures/S0-L20/images/Reasoning/img_08.png new file mode 100644 index 00000000..cd2d934d Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/img_08.png differ diff --git a/Lectures/S0-L20/images/Reasoning/img_09.png b/Lectures/S0-L20/images/Reasoning/img_09.png new file mode 100644 index 00000000..66ea30bf Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/img_09.png differ diff --git a/Lectures/S0-L20/images/Reasoning/img_10.png b/Lectures/S0-L20/images/Reasoning/img_10.png new file mode 100644 index 00000000..0c8dd8d9 Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/img_10.png differ diff --git a/Lectures/S0-L20/images/Reasoning/img_11.png b/Lectures/S0-L20/images/Reasoning/img_11.png new file mode 100644 index 00000000..c6cc917a Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/img_11.png differ diff --git a/Lectures/S0-L20/images/Reasoning/img_12.png b/Lectures/S0-L20/images/Reasoning/img_12.png new file mode 100644 index 00000000..b66bdfec Binary files /dev/null and b/Lectures/S0-L20/images/Reasoning/img_12.png differ diff --git a/_contents/.Rhistory b/_contents/.Rhistory new file mode 100644 index 00000000..e69de29b diff --git a/_contents/S0-L20.md b/_contents/S0-L20.md index dff757e9..3de65e6a 100755 --- a/_contents/S0-L20.md +++ b/_contents/S0-L20.md @@ -36,7 +36,7 @@ In this session, our readings cover: # Unleashing the potential of prompt engineering in Large Language Models: a comprehensive 
review ### Introduction -Models that are built on Large Language Model (LLM) as the backbone are capable of extracting meaningful information that can assist medical diagnosis or creating engaging contents. These models are also referred to as Artificial Intelligence-Generated Content (AIGC). Once the AIGC model is trained, by changing the way we compose the prompts as input to the model, the quality of the models' output can change. In this paper, we focus on techniques of engineering the prompts to achieve higher quality model output from the same AIGC model. +Models that are built on Large Language Model (LLM) as the backbone are capable of extracting meaningful information that can assist medical diagnosis or creating engaging contents. These models are also referred to as Artificial Intelligence-Generated Content (AIGC). Once the AIGC model is trained, by changing the way we compose the prompts as input to the model, the quality of the model's output can change. In this paper, we focus on techniques of engineering the prompts to achieve higher quality model output from the same AIGC model. ### Basic of Prompt Engineering @@ -46,15 +46,15 @@ One basic technique to improve the model output is to **be clear and precise** i -**Few Shot prompting** is also a common prompt engineering technique, where the model is given a few examples with answers in addition to the original question. This relies on the few shot learning ability that is an emergent in large language models, which is can be understood as a form of meta learning. +**Few Shot prompting** is also a common prompt engineering technique, where the model is given a few examples with answers in addition to the original question. This relies on the few shot learning ability that is emergent in large language models, which can be understood as a form of meta learning. -Authors of the paper also note that **adjusting the temperature and top-p** is essential for the prompt engineering. 
For code generation where standard pattern is valued, a smaller temperature and top-p is preferred, whereas in creative writing, a larger temperature and top-p may help the model produce original responses. +Authors of the paper also note that **adjusting the temperature and top-p** is essential for prompt engineering. For code generation where standard pattern is valued, a smaller temperature and top-p is preferred, whereas in creative writing, a larger temperature and top-p may help the model produce original responses. ### Advanced Prompt Engineering -Chain of Thought prompting induce the model to respond with step by step reasoning, which not only improves the quality of the output, but also shows correct intermediate steps for high stake applications such as medical reasoning. **Zero-shot chain of thought** is a simple yet effective technique, where we only need to include the phrase "Let's think step by step" to the input. **Golden chain of thought** is a technique that utilizes few-shot prompting for chain of thought prompting, by providing ground truth chain of thoughts solutions as examples to the input of the model. Golden chain of thoughts can boost the solve rate from 38% to 83% in the case of GPT-4, but the method is limited by the requirement of ground truth chain of thoughts examples. +Chain of Thought prompting induces the model to respond with step by step reasoning, which not only improves the quality of the output, but also shows correct intermediate steps for high stake applications such as medical reasoning. **Zero-shot chain of thought** is a simple yet effective technique, where we only need to include the phrase "Let's think step by step" to the input. **Golden chain of thought** is a technique that utilizes few-shot prompting for chain of thought prompting, by providing ground truth chain of thoughts solutions as examples to the input of the model. 
Golden chain of thoughts can boost the solve rate from 38% to 83% in the case of GPT-4, but the method is limited by the requirement of ground truth chain of thoughts examples. **Self-Consistency** is an extension to chain of thought prompting. After chain of thought prompting, by sampling from the language model decoder and choosing the most self-consistent response, Self-Consistency achieves better performance in rigorous reasoning tasks such as doing proofs. @@ -62,13 +62,13 @@ Chain of Thought prompting induce the model to respond with step by step reasoni -**Knowledge Generation** break down the content generation into two step generations: in the first step generation, the model is only prompted to output pertinent information (knowledge) of the original query, then the knowledge is included as prompt in the second step generation. +**Knowledge Generation** breaks down the content generation into two step generations: in the first step generation, the model is only prompted to output pertinent information (knowledge) of the original query, then the knowledge is included as prompt in the second step generation. -**Least-to-most prompting** also take a multi-step generation approach similar to knowledge generation. A given problem is decomposed into numerous sub-problems, and the model will output responses for each sub-problem. These responses will be included in the prompt to help the model answer the original problem. +**Least-to-most prompting** also takes a multi-step generation approach similar to knowledge generation. A given problem is decomposed into numerous sub-problems, and the model will output responses for each sub-problem. These responses will be included in the prompt to help the model answer the original problem. -**Tree of Thoughts reasoning** construct the steps of reasoning in a tree structure. This is particularly helpful when we need to break down a problem into steps, and further break down of each steps into more steps. 
**Graph of Thoughts** is a generalization of tree of thought structure, where each each contains the relation between each node. Graph of thoughts may be helpful for problems requiring intricate multifaceted resolutions. +**Tree of Thoughts reasoning** constructs the steps of reasoning in a tree structure. This is particularly helpful when we need to break down a problem into steps, and further break down of each steps into more steps. **Graph of Thoughts** is a generalization of tree of thought structure, where each each contains the relation between each node. Graph of thoughts may be helpful for problems requiring intricate multifaceted resolutions. @@ -76,7 +76,7 @@ Chain of Thought prompting induce the model to respond with step by step reasoni **Chain of Verification** corrects a response that may contain false information, by prompting the LLM to ask verification questions for the response. LLM may correct the false information by answering the verification questions. These answers will help LLM to generate a more accurate response for the original query. -In addition to the specific techniques mentioned above, there also exists **Plug-ins** of ChatGPT such as Prompt Enhancer that automatically enhance the prompt for the user. +In addition to the specific techniques mentioned above, there also exist **Plug-ins** of ChatGPT such as Prompt Enhancer that automatically enhance the prompt for the user. @@ -84,19 +84,11 @@ In addition to the specific techniques mentioned above, there also exists **Plug Benchmarking the prompt methods requires evaluating the quality of response from LLM, which can be performed by human or by other metrics. 
-**Subjective evaluations** requires human evaluators, which has the following pros and cons -Pros: Fluency, Accuracy, Novelty, and Relevance -Cons: Inconsistency Problem, Expensive, Time Consuming +**Subjective evaluations** requires human evaluators, which has the advantage of evaluating fluency, accuracy, novelty, and relevance, and some of its disadvantages are the inconsistency problem, expensive, and time consuming. -**Objective evaluations** relies on metrics to evaluate the response. Some examples includes - - BLEU: BiLingual Evaluation Understudy - - ROUGE: Recall-Oriented Understudy for Gisting Evaluation - - METEOR: Metric for Evaluation of Translation with Explicit ORdering - - BERTScore: BERT Model used for metric +**Objective evaluations** relies on metrics to evaluate the response. Some examples includes BLEU, which is a biLingual evaluation and BERTScore, which relies on a BERT Model for the metric. -Objective evaluations has the following pros and cons -Pros: Automatic Evaluation, Cheap, Quick -Cons: Alignment Problem +Objective evaluations has pros such as automatic evaluation, cheap, quick and cons particularly about the alignment problem. Evaluation results from InstructEval shows that in few shot settings, once the examples are specified, providing additional prompt harms the performance, while in zero shot settings, the expert written prompt improves performance. @@ -116,7 +108,7 @@ Prompt engineering can help **Assessment in teaching and learning**, where tailo - + ### Long context prompting for Claude 2.1 + https://www.anthropic.com/news/claude-2-1-prompting @@ -176,7 +168,7 @@ The authors use parallel point expanding to achieve speed-up than normal decodin For the evaluation, we can assess it from various perspectives. -- **Evaluation Process​:** +- **Evaluation Process:** - Present a question and a pair of answers to an LLM judge. 
@@ -256,7 +248,7 @@ In summary, some strong models have very high-quality answers that are hard to b - Ask the RoBERTa to **classify** if the SoT is suitable for the desired answer. -## ​SoT-R – Evaluation +## SoT-R – Evaluation Based on the provided figures, we can understand: @@ -294,3 +286,310 @@ Having thoroughly reviewed the paper, we've gained significant insights into the - **Eliciting or improving LLMs’ ability:** - Graph-of-Thoughts + + + +# Topologies of Reasoning: Demystifying Chains, Trees, and Graphs of Thoughts +## Evolving into Chains of Thought +In the exploration of reasoning and cognitive processes, the paper delves into the intricacies of how thoughts are structured, leading to the conceptualization of reasoning topologies. These topologies provide a framework for understanding the organization and flow of thoughts as individuals tackle various tasks. + +

+ + +This figure presents an evolution of reasoning topologies in language model (LLM) prompting methodologies, showing an increasing complexity in how LLMs process and generate output based on a given input. + +- **Input-Output (IO) prompting**: This is the most basic method where an LLM provides a final reply immediately after receiving the initial prompt from the user, with no intermediate steps in the reasoning process. +- **Chain of Thought (CoT)**: Introduced by Wei et al., this method improves upon IO by incorporating explicit intermediate steps of reasoning, known as "chains of thought," which lead to the final output. +- **Chain-of-Thought with Self-Consistency (CoT-SC)**: Improving upon CoT, CoT-SC introduces several independent reasoning chains originating from the same initial input. The model then selects the best outcome from these final thoughts based on a predefined scoring function. The idea is to utilize the randomness within the LLM to generate multiple possible outcomes. +- **Tree of Thoughts (ToT)**: This method further advances CoT by allowing branches at any point within the chain of thoughts. This branching allows for the exploration of different paths and options during the reasoning process. Each node in the tree represents a partial solution, and based on any given node, the thought generator can create a number of new nodes. Scores are then assigned to these new nodes either by an LLM or human evaluation. The method of extending the tree is determined by the search algorithm used, such as Breadth-First Search (BFS) or Depth-First Search (DFS). +- **Graph of Thoughts (GoT)**: GoT enables complex reasoning dependencies between generated thoughts, allowing for any thought to generate multiple child thoughts and also have multiple parent thoughts, forming an aggregation operation. 
This method incorporates both branching (where thoughts can generate multiple outcomes) and aggregation (where multiple thoughts can contribute to a single new thought). + +The progression of these topologies indicates a move from linear, single-step reasoning to complex, multi-step, and multi-path reasoning structures, improving the depth and robustness of the reasoning process within LLMs. + +### Thoughts and Reasoning Topologies + +**What is a Thought?** + +- In CoT, a thought refers to **a statement within a paragraph** that contains a **part of the reasoning process** aimed at **solving the input task**. +- In ToT, in some tasks, such as Game of 24, a thought means **an intermediate or a final solution** to the **initial question**. +- In GoT, a thought contains a **solution of the input task (or of its subtask)**. + +Therefore, the paper defines a thought as a "semantic unit of task resolution, i.e., a step in the process of solving a given task." + +**What is a Reasoning Topology?** + +The authors model thoughts as nodes and the dependencies between thoughts as edges, so a topology can be defined as a graph G = (V, E), where V is the set of thoughts and E is the set of dependencies between them. + +### Taxonomy of Reasoning Schemes + +**Topology Class** + +

+ +- This section presents three different classes of topological structures used to represent reasoning steps: Chain, Tree, and Graph. +- **Chain:** Depicted as a linear sequence of nodes connected vertically from an "Input" node at the top to an "Output" node at the bottom, suggesting a step-by-step, sequential reasoning process. +- **Tree:** Shown as a branching structure that starts with a single "Input" node which then divides into multiple pathways, eventually leading to one "Output" node. This illustrates a decision-making process that considers various paths or options before concluding. +- **Graph:** Illustrated as a network of interconnected nodes with one "Input" node and one "Output" node. Unlike the chain or tree, the graph shows multiple connections between the nodes, indicating a complex reasoning process with interdependencies and possible loops. + + + +**Topology Scope**: "Can the topology extend beyond a single prompt?" + + +- **Single-prompt** + + - Describes a structure contained within a single prompt/reply interaction. + + - The visual represents a tree topology where all reasoning nodes are part of one complete exchange, suggesting a condensed reasoning process that occurs in one step. + +- **Multi-prompt** + + - Indicates that one prompt/reply can contain multiple reasoning nodes. + + - The visual here expands the tree topology to show that individual prompts or replies may encompass multiple nodes, which implies a more extensive reasoning process involving several interactions. + +**Topology Representation** + +

+ +- The question is, "How is the topology structure represented?" indicating a focus on the manner in which the reasoning processes are visually and conceptually depicted. +- **Tree Diagram** + - A tree diagram is shown with a root node labeled "0" at the top, branching out to nodes "1," "2," and "3," which further branch out to nodes "4" through "9". This diagram is a representation of the reasoning structure, likely meant to illustrate the hierarchical and branching nature of thought processes. + +- **Implicit vs. Explicit Representation** + + - On the left, under the heading "Implicit," there is a statement suggesting a less direct method of describing the reasoning process: "The first preliminary solution should be enhanced three times. Each of these three enhanced solutions should be further augmented in two attempts." + + - On the right, under the heading "Explicit," there is a more direct and detailed explanation of the connections between the nodes: " connects to , , connects to , connects to , connects to , ." + +**Topology Derivation** + +

+ +- **Automatic, semi-automatic:** + - The left side of the slide discusses the automatic and semi-automatic construction of topology structures. It mentions that the structure can be constructed on-the-fly by the Large Language Model (LLM), either fully automatically or with partial control from the user, indicating a semi-automatic approach. The accompanying graphic shows a partial tree with some nodes filled in and others as dotted outlines, suggesting that some parts of the structure are generated by the LLM while others may be influenced or completed by the user. + +- **Manual:** + - On the right side, the slide describes a manual method of topology derivation. Here, the user statically prescribes the structure before reasoning starts, implying that the entire topology is defined in advance by the user without the dynamic involvement of an LLM. The graphic shows a complete tree structure, symbolizing a user-defined topology without any automatic generation. + +**Topology Schedule and Schedule Representation** + +

+ +- **Schedule Class** + + - The slide poses the question, "How is the topology structure explored?" indicating an interest in the methods used to navigate the reasoning topology. + + - Two common search strategies are presented: + - **DFS (Depth-First Search):** Illustrated with a partial topology where the search path moves from the root node "0" to the deepest node along a branch before backtracking, as shown by the direction of the arrows. + - **BFS (Breadth-First Search):** Also shown with a partial topology, but here the search path is horizontal, indicating that the strategy explores all nodes at the current depth before moving to the next level. + +- **Schedule Representation** + + - This section asks, "How is the schedule represented?" highlighting different ways to describe the traversal strategy. + + - Two forms of representation are given: + - **Textual description:** Provides a direct command to proceed in either "BFS manner" or "DFS manner," offering a high-level instruction on how to navigate the topology. + - **In-context examples:** Offers specific node traversal sequences such as "Traverse nodes <0>, <1>, <2>, <3>" for BFS (all children of the root before going deeper) and "Traverse nodes <0>, <1>, <4>" for DFS (down a single branch first), providing a clear, detailed path to follow within the topology. + + + +**Generative AI Pipeline** + +

+ +1. **Modalities?** + - This refers to the various types of data inputs or outputs used in AI, such as text, speech, image, and music. +2. **Pre-training?** + - Indicated by a lightning bolt symbol, referring to the initial phase of AI training where a model learns from a vast dataset before it's fine-tuned for specific tasks. +3. **Fine-tuning?** + - Depicted with a wrench, implying the process of adjusting a pre-trained model with a more targeted dataset to improve its performance on specific tasks. +4. **Tools?** + - Represented by a screwdriver and wrench, this likely refers to additional software or algorithms that can be applied in conjunction with the AI for task completion or enhancement. +5. **Retrieval?** + - Shown with a database icon, suggesting the use of retrieval systems to access pre-stored data or knowledge bases that the AI can use to inform its responses or generate content. + +### LLM Reasoning Schemes Represented With Taxonomy + +

+

+ +Focusing on the application of reasoning schemes in Large Language Models (LLMs), these pages highlight how the taxonomy of reasoning is implemented in AI systems. It covers specific methodologies within the Chain of Thought (CoT) reasoning, such as multi-step reasoning and zero-shot reasoning instructions, showcasing their impact on enhancing the problem-solving capabilities of LLMs. + +### Chain of Thought Works + +

+ +1. **Multi-Step Reasoning:** + - **Chain-of-Thought (CoT):** This is described as a single-prompt scheme utilizing few-shot examples to guide LLMs. + - **Program of Thoughts (PoT):** It refers to the use of code to generate a step-by-step functional Python program. + - **SelfAsk:** This expands each step in the reasoning chain by posing a follow-up question, which is then answered in sequence. +2. **Math Reasoning:** + - On the left, under "User Prompt," an example question is posed regarding Alexis and her spending on business clothes and shoes, followed by a systematic breakdown of the cost of items and the budget used to deduce how much she paid for the shoes. + - On the right, under "LLM Answer," a similar math problem is presented concerning Tobias earning money from chores, with the solution worked out step-by-step to determine how many driveways he shoveled. +3. **Examples:** + - The right side features two math reasoning examples to illustrate the Chain of Thought method in action. Each example is carefully broken down into individual reasoning steps, showing how an LLM might approach complex problems by dividing them into smaller, more manageable parts. + +

+ +1. **Zero-Shot Reasoning Instructions:** + - It describes an approach where LLMs are expected to perform multi-step reasoning without relying on hand-tuned, problem-specific in-context examples. + - Two types of zero-shot reasoning are mentioned: + - **Zeroshot-CoT (Chain of Thought):** A prompt to the LLM to "Let’s think step by step." + - **Zeroshot-PoT (Program of Thoughts):** A prompt to write a Python program step by step, starting with defining the variables. +2. **Creative Writing Example:** + - A user prompt is provided on the right-hand side, which outlines a task for creative writing. The user is instructed to write four short paragraphs, with each paragraph ending with a specific sentence: + 1. "It isn't difficult to do a handstand if you just stand on your hands." + 2. "It caught him off guard that space smelled of seared steak." + 3. "When she didn't like a guy who was trying to pick her up, she started using sign language." + 4. "Each person who knows you has a different perception of who you are." + +### Overview of Chain of Thought Works + +

+ +On the left side, a "User Prompt" is provided for the task of writing a coherent passage of four short paragraphs. Each paragraph must end with a pre-specified sentence: + +1. "It isn't difficult to do a handstand if you just stand on your hands." +2. "It caught him off guard that space smelled of seared steak." +3. "When she didn't like a guy who was trying to pick her up, she started using sign language." +4. "Each person who knows you has a different perception of who you are." + +The phrase "Let’s think step by step." is emphasized, suggesting the application of sequential reasoning to address the creative task. + +On the right side, the "LLM Answer" section provides a sample output from an LLM that has followed the chain of thought reasoning approach. The LLM’s responses are crafted to end each paragraph with the specified sentences, displaying a thoughtful progression that connects each statement. Each paragraph develops a context that leads to the predetermined ending, demonstrating the LLM’s ability to generate content that flows logically and coherently. + +**Planning & Task Decomposition** + +

+ +This figure contains two contrasting examples demonstrating how the Plan-and-Solve approach can be applied: + +1. **Incorrect LLM Approach:** + - The first example (top left) shows an attempt by an LLM to solve a math problem related to a dance class enrollment. The model incorrectly calculates the percentages of students enrolled in various dance classes. The process is marked by a red "X," indicating an incorrect reasoning path where the LLM does not first understand the problem or plan its solution. +2. **Correct PS Prompting Approach:** + - The second example (bottom left) applies the Plan-and-Solve approach correctly. Here, the problem is first understood, a plan is then devised, and finally, the solution is carried out step-by-step. This method is laid out in a series of steps, each addressing a part of the problem: + - **Step 1:** Calculate the total number of students enrolled in contemporary and jazz dance. + - **Step 2:** Calculate the number of students enrolled in hip-hop dance. + - **Step 3:** Calculate the percentage of students who enrolled in hip-hop dance. + +The example demonstrates a structured problem-solving technique where an initial plan is crucial for guiding the LLM through the reasoning process. It emphasizes the effectiveness of decomposing a task into manageable parts and addressing each part systematically, leading to a correct solution. + +
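The three plan steps above can be traced with a few lines of arithmetic. This is only a sketch: the notes do not state the numbers in the dance class problem, so the class size and enrollment percentages used here are assumed for illustration, as is the assumption that the jazz percentage applies to the students remaining after contemporary enrollment. The function name is hypothetical.

```python
def plan_and_solve_dance_class(total_students, contemporary_pct, jazz_pct_of_remaining):
    """Follow the three plan steps and return each intermediate result.

    All inputs are illustrative assumptions; the notes give no concrete numbers.
    """
    # Step 1: total number of students enrolled in contemporary and jazz dance.
    contemporary = total_students * contemporary_pct // 100
    remaining = total_students - contemporary
    jazz = remaining * jazz_pct_of_remaining // 100
    contemporary_and_jazz = contemporary + jazz

    # Step 2: number of students enrolled in hip-hop dance (everyone else).
    hip_hop = total_students - contemporary_and_jazz

    # Step 3: percentage of the whole class enrolled in hip-hop dance.
    hip_hop_pct = 100 * hip_hop / total_students
    return contemporary_and_jazz, hip_hop, hip_hop_pct


# Assumed example values: 20 students, 20% contemporary, 25% of the rest in jazz.
step1, step2, step3 = plan_and_solve_dance_class(20, 20, 25)
print(step1, step2, step3)
```

Returning every intermediate value mirrors the point of Plan-and-Solve: each step can be checked on its own instead of only the final percentage.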

+ +This figure shows the approach in two stages: + +1. **Stage 1: Decompose Question into Subquestions** + - The example given is a math problem involving Amy climbing and sliding down a slide, with an inquiry about how many times she can do this before the slide closes. + - The problem is decomposed into sub-questions, likely to simplify the task and make the solution process more manageable. +2. **Stage 2: Sequentially Solve Subquestions** + - Subquestion 1: "How long does each trip take?" + - The answer to Subquestion 1 is then used to tackle Subquestion 2: "How many times can she slide before it closes?" + - Each sub-question is answered using a language model that appears to provide a step-by-step explanation, building on the information from the previous steps. + + +
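The second stage, where each answer feeds the next sub-question, might be sketched as follows. This is an illustrative sketch rather than the method's actual implementation: `solve_sequentially` and `toy_answer_fn` are hypothetical names, the toy function stands in for a language model call, and the climb, slide, and closing times are assumed values that do not come from these notes.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]


def solve_sequentially(
    subquestions: List[str],
    answer_fn: Callable[[str, History], str],
) -> History:
    """Answer sub-questions in order, passing earlier Q/A pairs as context."""
    history: History = []
    for question in subquestions:
        answer = answer_fn(question, history)
        history.append((question, answer))
    return history


def toy_answer_fn(question: str, history: History) -> str:
    """Stand-in for a language model; all numbers are assumed for illustration."""
    climb_minutes, slide_minutes, minutes_until_close = 4, 1, 15
    if "each trip" in question:
        return str(climb_minutes + slide_minutes)
    # The second sub-question reuses the answer to the first one.
    trip_minutes = int(history[-1][1])
    return str(minutes_until_close // trip_minutes)


steps = solve_sequentially(
    [
        "How long does each trip take?",
        "How many times can she slide before it closes?",
    ],
    toy_answer_fn,
)
for question, answer in steps:
    print(question, "->", answer)
```

The key design point is that `history` is the only channel between steps, so the final answer is grounded in the earlier sub-answers instead of being produced in one jump.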

+ +This includes a figure (Figure 2) that provides an example of prompts used for both decomposing and reassembling (split and merge) sub-tasks within a task-solving framework. The example shows a sequence of operations starting with a complex task and breaking it down into smaller, sequential operations that eventually lead to the solution. These operations are represented by the prompts given to the language model, indicating a sequence that the model follows to achieve the task. For instance, starting with a name like "Jack Ryan," the model is prompted to split this into words, identify the first letter of each word, and finally concatenate them with spaces. + +This method showcases how complex tasks can be handled systematically by LLMs, allowing for the modular processing of information. The approach can be generalized to various tasks, as indicated by the side examples where the model performs similar operations on different inputs like "Elon Musk Tesla" and "C++," demonstrating flexibility in the model's reasoning capability. + +**Task Preprocessing:** + +

+ +- **Selection-Inference (SI):** + - Selection-Inference (SI) is designed to tackle multi-step logical reasoning problems where all essential information is already present within the input context. +- **Iterative Refinement:** + - Verification enables the reasoning frameworks to iteratively refine the generated context and intermediate results. + +- **Tool Utilization:** + - To better integrate multiple execution methods, more effective schemes devise a plan that specifies tools for handling each sub-task before executing the reasoning chain. Examples include AutoGPT, Toolformer, Chameleon, ChatCoT, PVS, and others. + + +### Reasoning With Trees + +

+ +**Motivation** +- Exploration + - Generate multiple thoughts from a given thought + - Sampling + - Task decomposition +- Voting + - Automatic selection of best outcome of generated outputs + +**K-ary Trees** +K-ary trees can represent decision processes where each node is a decision point, and the branches (up to K) represent different options or outcomes from that decision point. This is especially useful in scenarios with multiple choices at each step, allowing a comprehensive visualization of possible decision paths. + +
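A K-ary tree of thoughts can be sketched with a small node class that caps the number of children at K. The names `KaryNode`, `paths`, and `full_tree_size` are hypothetical and chosen only for this illustration; the node-count helper assumes a full tree in which every internal node has exactly K children.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class KaryNode:
    thought: str
    k: int  # maximum number of branches allowed at this decision point
    children: List["KaryNode"] = field(default_factory=list)

    def add_child(self, thought: str) -> "KaryNode":
        """Attach a new option, refusing to exceed K branches."""
        if len(self.children) >= self.k:
            raise ValueError(f"node already has {self.k} children")
        child = KaryNode(thought, self.k)
        self.children.append(child)
        return child


def paths(node: KaryNode, prefix: Tuple[str, ...] = ()) -> Iterator[Tuple[str, ...]]:
    """Yield every root-to-leaf decision path as a tuple of thoughts."""
    prefix = prefix + (node.thought,)
    if not node.children:
        yield prefix
    for child in node.children:
        yield from paths(child, prefix)


def full_tree_size(k: int, depth: int) -> int:
    """Number of nodes in a full K-ary tree with the root at depth 0."""
    return sum(k ** level for level in range(depth + 1))


root = KaryNode("start", k=2)
option_a = root.add_child("option A")
root.add_child("option B")
option_a.add_child("A then C")
all_paths = list(paths(root))
print(all_paths)
print(full_tree_size(3, 2))
```

Since every node is typically one model call, `full_tree_size` gives a rough sense of how quickly the cost grows with the branching factor and depth.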

+ +

+ +

+ +**Tree of Chains** + Tree of Chains enables a clear visualization of various inference paths and their interconnections, aiding in the systematic exploration and analysis of potential outcomes. By breaking down complex inference processes into manageable chains, it facilitates a deeper understanding and aids in the identification of the most logical or optimal conclusion from a set of premises. + +

+ +**Single-Level Tree** +In the reasoning process, Single-Level Trees help organize and visualize the different dimensions or options of a problem, making the decision-making process more structured and streamlined. Each child node can represent an independent line of thought or decision point, allowing analysts to quickly assess the pros and cons of different options without delving into more complex hierarchical structures. + +
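Combined with the voting idea from the motivation, a single-level tree can be sketched as a root that is expanded into K independent children, followed by a majority vote over their answers. This is a minimal sketch under assumptions: `single_level_vote` and `toy_sample_fn` are hypothetical names, and the toy sampler returns canned strings in place of real model samples.

```python
from collections import Counter
from typing import Callable, List


def single_level_vote(
    prompt: str,
    sample_fn: Callable[[str, int], str],
    k: int,
) -> str:
    """Expand the root into k independent children and return the majority answer."""
    candidates: List[str] = [sample_fn(prompt, index) for index in range(k)]
    winner, _count = Counter(candidates).most_common(1)[0]
    return winner


def toy_sample_fn(prompt: str, index: int) -> str:
    """Stand-in for sampling a model; the answers are made up for illustration."""
    canned = ["60", "60", "40", "60", "55"]
    return canned[index % len(canned)]


best = single_level_vote("What percentage enrolled in hip-hop?", toy_sample_fn, k=5)
print(best)
```

Because the children never depend on each other, they could be sampled in parallel; the cost grows linearly with `k`.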

+ + **Tree Performance** + - Increasing branching factor + - Higher diversity of outcomes + - Beneficial for accuracy + - Increases computational cost + - Optimal branching factor is hard to find + - Problem dependent + - More complicated problems can benefit more from decomposition into subproblems + ### Reasoning With Graphs + +

+ + **Motivation** + +- Aggregation + - Being able to combine multiple thoughts into one + - Synergy + - Produce outcome better than individual parts + - Effective composition of outcomes of tasks +- Exploration +- Flexible + - Arbitrary + +**Examples** + +

+ +

+ +

+ +

+ +
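The aggregation idea from the motivation above, where several thoughts are combined into one, is what separates a graph from a tree: a node may have more than one parent. The sketch below illustrates this with a sorting task, which is an assumed example chosen for illustration; `split` and `aggregate` are hypothetical helper names, and plain Python sorting stands in for the model's work on each branch.

```python
import heapq
from typing import List


def split(numbers: List[int], parts: int) -> List[List[int]]:
    """Decompose the task into roughly equal chunks, one per branch."""
    size = max(1, -(-len(numbers) // parts))  # ceiling division, at least 1
    return [numbers[i:i + size] for i in range(0, len(numbers), size)]


def aggregate(sorted_parts: List[List[int]]) -> List[int]:
    """Aggregation node: merge several partial results into one solution."""
    return list(heapq.merge(*sorted_parts))


data = [7, 2, 9, 4, 1, 8]
# Each branch solves its own sub-task independently.
branches = [sorted(chunk) for chunk in split(data, 3)]
# The aggregation node has all three branches as parents.
merged = aggregate(branches)
print(merged)
```

In a tree, the three branches could only be compared and one of them selected; the aggregation step lets the final result draw on all of them.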

+ + +### Chains vs. Trees vs. Graphs of Thoughts + + **Chains** + - Explicit intermediate LLM thoughts + - Step-by-step + - Usually the most cost-effective + **Trees** + - Possibility of exploring at each step + - More effective than chains + **Graphs** + - Most complex structure + - Enable aggregation of various reasoning steps into one solution + - Often improve performance compared to chains and trees + + + + + + \ No newline at end of file