[Open In Colab](https://colab.research.google.com/github/dcarpintero/generative-ai-101/blob/main/02_in_context_learning.ipynb)
With the increasing size and complexity of model architectures, [Large Language Models (LLMs) have demonstrated in-context learning (ICL) ability](https://splab.sdu.edu.cn/GPT3.pdf). This enables LLMs to perform tasks and generate responses based on the context provided in the input prompt, without requiring explicit fine-tuning or retraining. In practice, this context includes one or a few demonstration examples that guide (condition) the model in performing downstream tasks such as classification, question answering, information extraction, reasoning, and data analysis. [In 2022, researchers at Anthropic investigated the hypothesis that *'induction [attention] heads'* were the primary mechanism driving ICL](https://transformer-circuits.pub/2022/in-context-learning-and-induction-heads/index.html). These specialized units attend to earlier parts of the input to copy and complete sequences, which would allow models to adapt to patterns and generate responses aligned to the provided context.
This notebook explores the concept of ICL, demonstrating its practical application in Named Entity Recognition (NER).
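As a minimal sketch of the idea (the entity labels and demonstration sentences below are illustrative, not taken from the notebook), a few-shot NER prompt can be assembled by prepending labeled demonstrations that condition the model before the actual query:

```python
# Sketch: assembling a few-shot (in-context) prompt for Named Entity Recognition.
# The demonstrations and entity tags here are made up for illustration.

DEMONSTRATIONS = [
    ("Tim Cook unveiled the new iPhone in Cupertino.",
     "Tim Cook [PER], iPhone [PRODUCT], Cupertino [LOC]"),
    ("Ada Lovelace worked with Charles Babbage in London.",
     "Ada Lovelace [PER], Charles Babbage [PER], London [LOC]"),
]

def build_ner_prompt(text: str) -> str:
    """Condition the model with labeled examples, then append the query."""
    lines = ["Extract the named entities and their types from each sentence.", ""]
    for sentence, entities in DEMONSTRATIONS:
        lines.append(f"Sentence: {sentence}")
        lines.append(f"Entities: {entities}")
        lines.append("")
    lines.append(f"Sentence: {text}")
    lines.append("Entities:")  # the model completes this line
    return "\n".join(lines)

prompt = build_ner_prompt("Marie Curie conducted research in Paris.")
print(prompt)
```

The resulting string would be sent as the user message of a chat completion request; the model infers the labeling pattern from the two demonstrations alone, without any fine-tuning.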
[Open In Colab](https://colab.research.google.com/github/dcarpintero/generative-ai-101/blob/main/03_llm_augmentation_tool_integration.ipynb)
LLM-augmentation with tool integration involves connecting models to external tools and APIs, allowing them to perform actions beyond text generation. This approach enables LLMs to access real-time information, execute code, query databases, and interact with other systems. By interpreting user queries, LLMs can determine when to use these external resources and select the appropriate one, enabling them to provide more accurate, up-to-date, and actionable responses. For example, an LLM integrated with a weather API could offer current forecasts, while one connected to a code execution environment could run and debug code snippets. As a practical implementation, we will enhance the previous notebook and combine ICL for NER with LLM-augmentation to enrich a corpus with links to a knowledge base such as Wikipedia.
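The function-calling pattern can be sketched as follows (the tool name, schema, and dispatcher are hypothetical, not the notebook's actual code): the application advertises a tool schema to the model, and when the model emits a structured tool call, the application executes it and returns the result.

```python
import json
import urllib.parse

# Hypothetical tool schema in the style used by function-calling APIs
# (e.g. a "tools" parameter in a chat completion request); names are illustrative.
LINK_ENTITY_TOOL = {
    "name": "link_entity",
    "description": "Return a Wikipedia URL for a named entity.",
    "parameters": {
        "type": "object",
        "properties": {"entity": {"type": "string"}},
        "required": ["entity"],
    },
}

def link_entity(entity: str) -> str:
    """Build a Wikipedia link for an entity (no disambiguation attempted here)."""
    return "https://en.wikipedia.org/wiki/" + urllib.parse.quote(entity.replace(" ", "_"))

def dispatch(tool_call: dict) -> str:
    """Execute a tool call emitted by the model and return its result."""
    args = json.loads(tool_call["arguments"])
    if tool_call["name"] == "link_entity":
        return link_entity(args["entity"])
    raise ValueError(f"unknown tool: {tool_call['name']}")

# Simulated model output choosing to call the tool with JSON-encoded arguments:
call = {"name": "link_entity", "arguments": json.dumps({"entity": "Marie Curie"})}
print(dispatch(call))  # https://en.wikipedia.org/wiki/Marie_Curie
```

In the notebook, the entities recognized via ICL would be passed through a dispatcher like this one, so the model decides which spans to link while the application performs the actual URL construction.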