Skip to content

Commit 5518723

Browse files
committed
update
1 parent 3c4da27 commit 5518723

File tree

1 file changed

+2
-2
lines changed

1 file changed

+2
-2
lines changed

README.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -40,7 +40,7 @@ Tags: `[Transfomers]` `[Self-Attention]` `[BERT]` `[BertViz]`
4040

4141
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dcarpintero/generative-ai-101/blob/main/02_in_context_learning.ipynb)
4242

43-
With the increasing size and complexity of model architectures, [large language models (LLMs) have demonstrated in-context learning (ICL) ability](https://splab.sdu.edu.cn/GPT3.pdf). This enables LLMs to perform tasks and generate responses based on the context provided in the input prompt, without requiring explicit fine-tuning or retraining. In practice, this context includes one or a few demonstration examples that guide (condition) the model in performing downstream tasks such as classification, question/answering, information extraction, reasoning, and data analysis. [In 2022, researchers at Anthropic investigated the hypothesis that *'induction [attention] heads'* were the primary mechanism driving ICL](https://transformer-circuits.pub/2022/in-context-learning-and-induction-heads/index.html). These specialized units attend earlier parts of the input to copy and complete sequences, which would allow models to adapt to patterns and generate responses aligned to the provided context.
43+
With the increasing size and complexity of model architectures, [Large Language Models (LLMs) have demonstrated in-context learning (ICL) ability](https://splab.sdu.edu.cn/GPT3.pdf). This enables LLMs to perform tasks and generate responses based on the context provided in the input prompt, without requiring explicit fine-tuning or retraining. In practice, this context includes one or a few demonstration examples that guide (condition) the model in performing downstream tasks such as classification, question/answering, information extraction, reasoning, and data analysis. [In 2022, researchers at Anthropic investigated the hypothesis that *'induction [attention] heads'* were the primary mechanism driving ICL](https://transformer-circuits.pub/2022/in-context-learning-and-induction-heads/index.html). These specialized units attend earlier parts of the input to copy and complete sequences, which would allow models to adapt to patterns and generate responses aligned to the provided context.
4444

4545
This notebook explores the concept of ICL, demonstrating its practical application in Named Entity Recognition (NER).
4646

@@ -55,7 +55,7 @@ Tags: `[in-context learning]` `[named-entity-recognition]` `[function-calling]`
5555

5656
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dcarpintero/generative-ai-101/blob/main/03_llm_augmentation_tool_integration.ipynb)
5757

58-
LLM-augmentation with tool integration involves connecting Large Language Models to external tools and APIs, allowing them to perform actions beyond text generation. This approach enables LLMs to access real-time information, execute code, query databases, interact with IoT devices, and more. By interpreting user queries, LLMs can select and determine when to use these external resources, enabling them to provide more accurate, up-to-date, and actionable responses. For example, an LLM integrated with a weather API could offer current forecasts, while one connected to a code execution environment could run and debug code snippets. As a practical implementation, we will enhance the previous notebook and combine ICL and LLM-augmentation with function-calling to enrich a corpus after performing Named Entity Recognition with links to a knowledge base such as Wikipedia.
58+
LLM-augmentation with tool integration involves connecting models to external tools and APIs, allowing them to perform actions beyond text generation. This approach enables LLMs to access real-time information, execute code, query databases, and interact with other systems. By interpreting user queries, LLMs can select and determine when to use these external resources, enabling models to provide more accurate, up-to-date, and actionable responses. For example, an LLM integrated with a weather API could offer current forecasts, while one connected to a code execution environment could run and debug code snippets. As a practical implementation, we will enhance the previous notebook and combine ICL for NER with LLM-augmentation to enrich a corpus with links to a knowledge base such as Wikipedia.
5959

6060
<p align="center">
6161
<img src="./static/llm_augmentation_tool_integration.png">

0 commit comments

Comments
 (0)