Beaver-Company
diff --git a/README.md b/README.md
Lines changed: 6 additions & 3 deletions
diff --git a/docs/images/deep_learning/model_tuning/attention/attn1_1.png b/docs/images/deep_learning/model_tuning/attention/attn1_1.png
265 KB
diff --git a/docs/images/deep_learning/model_tuning/attention/attn1_2.png b/docs/images/deep_learning/model_tuning/attention/attn1_2.png
17.4 KB
diff --git a/docs/images/deep_learning/model_tuning/attention/attn2_1.png b/docs/images/deep_learning/model_tuning/attention/attn2_1.png
74.4 KB
diff --git a/docs/images/deep_learning/model_tuning/attention/attn2_2.png b/docs/images/deep_learning/model_tuning/attention/attn2_2.png
72 KB
diff --git a/docs/images/deep_learning/model_tuning/attention/attn4_1.png b/docs/images/deep_learning/model_tuning/attention/attn4_1.png
74.4 KB
diff --git a/docs/images/deep_learning/model_tuning/attention/attn4_2.png b/docs/images/deep_learning/model_tuning/attention/attn4_2.png
55.4 KB
diff --git a/docs/images/deep_learning/model_tuning/attention/attn4_3.png b/docs/images/deep_learning/model_tuning/attention/attn4_3.png
44 KB
diff --git a/transformer_courses/reading_comprehension_based_on_ernie/README.md b/transformer_courses/reading_comprehension_based_on_ernie/README.md
Lines changed: 63 additions & 0 deletions
diff --git a/transformer_courses/reading_comprehension_based_on_ernie/data_processor.py b/transformer_courses/reading_comprehension_based_on_ernie/data_processor.py
Lines changed: 103 additions & 0 deletions
@@ -297,10 +297,13 @@
 
 | Chapter | Notebook link | Python implementation | Course overview |
 | ----------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |
-| Transformers in image classification | [Notebook link](https://aistudio.baidu.com/aistudio/projectdetail/2154618) | [Python implementation](./transformer_courses/Application_of_transformer_in_image_classification) | This chapter covers two classic Transformer-based algorithms in computer vision, ViT and DeiT, and walks through how the Transformer architecture is applied to image classification. |
-| Classic pretrained language models | [Notebook link](https://aistudio.baidu.com/aistudio/projectdetail/2110336) | [Python implementation](./transformer_courses/Transformer_Machine_Translation) | This chapter covers Transformers in NLP and how they evolved, including the classic models ELMo, GPT, Transformer, and BERT, and introduces the use of the Transformer in machine translation. |
-| Slimming strategies for pretrained models: efficient architectures | [Notebook link](https://aistudio.baidu.com/aistudio/projectdetail/2138857) | [Python implementation](./transformer_courses/Transformer_Punctuation_Restoration) | This chapter covers techniques for slimming Transformer-based models in NLP, including Electra, AlBERT, and performer, and presents a code case study: Electra-based Chinese punctuation prediction for speech recognition post-processing. |
+| Classic pretrained language models | [Notebook link](https://aistudio.baidu.com/aistudio/projectdetail/2110336) | [Python implementation](./transformer_courses/Transformer_Machine_Translation) | This chapter covers Transformers in NLP and how they evolved, including the classic models ELMo, GPT, Transformer, and BERT, and introduces the use of the Transformer in machine translation. |
+| Classic pretrained language models | [Notebook link](https://aistudio.baidu.com/aistudio/projectdetail/2110336) | [Python implementation](./transformer_courses/Transformer_Machine_Translation) | This chapter covers Transformers in NLP and how they evolved, including the classic models ELMo, GPT, Transformer, and BERT, and introduces the use of the Transformer in machine translation. |
+| Improvements to pretrained models for natural language understanding | [Notebook link](https://aistudio.baidu.com/aistudio/projectdetail/2166195) | [Python implementation](./transformer_courses/reading_comprehension_based_on_ernie) | Surveys improvements to classic pretrained models across ERNIE, RoBERTa, KBERT, Tsinghua's ERNIE, and others. |
+| Improvements to pretrained models for long-sequence modeling | [Notebook link](https://aistudio.baidu.com/aistudio/projectdetail/2166197) | [Python implementation](./transformer_courses/sentiment_analysis_based_on_xlnet) | Analyzes the length limitations of BERT and the Transformer, and discusses how Transformer-xl, xlnet, longformer, and others address them. |
 | BERT distillation | [Notebook link](https://aistudio.baidu.com/aistudio/projectdetail/2177549) | [Python implementation](./transformer_courses/BERT_distillation) | This chapter covers distillation algorithms for BERT, including Patient-KD, DistilBERT, TinyBERT, and DynaBERT, and shows in code how to distill TinyBERT using DynaBERT's training strategy. |
+| Slimming strategies for pretrained models: efficient architectures | [Notebook link](https://aistudio.baidu.com/aistudio/projectdetail/2138857) | [Python implementation](./transformer_courses/Transformer_Punctuation_Restoration) | This chapter covers techniques for slimming Transformer-based models in NLP, including Electra, AlBERT, and performer, and presents a code case study: Electra-based Chinese punctuation prediction for speech recognition post-processing. |
+| Transformers in image classification | [Notebook link](https://aistudio.baidu.com/aistudio/projectdetail/2154618) | [Python implementation](./transformer_courses/Application_of_transformer_in_image_classification) | This chapter covers two classic Transformer-based algorithms in computer vision, ViT and DeiT, and walks through how the Transformer architecture is applied to image classification. |
 |                               |                                                              |                                                              |                                                              |
 
 # 5. Classic Deep Learning Case Studies (in development)
 
@@ -0,0 +1,63 @@
+# Reading Comprehension Based on ERNIE
+
+## Dependencies
+
+* python3
+* paddlepaddle-gpu==2.0.0.post101
+* paddlenlp==2.0.1
+
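+## Project Overview
+
+```
+|-data_processor.py: data processing code
+|-train.py: model training code
+|-evaluate.py: model evaluation code
+|-utilis.py: components used during model training
+```
+
+This project uses the pretrained model ERNIE for Chinese reading comprehension. The dataset is Dureader_robust.
+
+### Model Overview
+
+ERNIE is a pretrained model released by Baidu. It introduces three levels of Knowledge Masking to help the model learn language knowledge, and it outperforms BERT on multiple tasks.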
+
+
+## Model Training
+
+```shell
+export CUDA_VISIBLE_DEVICES=0
+
+python ./train.py --model_name ernie-1.0 \
+                         --epochs 1       \
+                         --learning_rate 3e-5     \
+                         --max_seq_length 512     \
+                         --batch_size 12     \
+                         --warmup_proportion 0.1 \
+                         --weight_decay 0.01 \
+                         --save_model_path ./ernie_rc.pdparams \
+                         --save_opt_path ./ernie_rc.pdopt
+```
+
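+The parameters are as follows:
+
+- `model_name` Name of the model to load.
+- `epochs` Number of training epochs.
+- `learning_rate` Learning rate.
+- `max_seq_length` Maximum sequence length; longer inputs are truncated.
+- `batch_size` Number of samples per device (GPU card) in each iteration.
+- `warmup_proportion` Proportion of the total training iterations used for warmup.
+- `weight_decay` Weight decay value.
+- `save_model_path` Path where the model is saved.
+- `save_opt_path` Path where the optimizer state is saved.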
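
To make `warmup_proportion` concrete, here is a minimal sketch of a learning-rate schedule with linear warmup followed by linear decay. It assumes this is the schedule shape used during training, which the README does not state; the function name `linear_warmup_decay_lr` and the step counts are hypothetical and are not taken from `train.py`.

```python
def linear_warmup_decay_lr(step, total_steps, base_lr=3e-5, warmup_proportion=0.1):
    """Learning rate at a 0-based `step` for linear warmup followed by linear decay.

    Assumption: the schedule ramps from 0 to `base_lr` during warmup and then
    decays linearly to 0. The real training script may use a different shape.
    """
    warmup_steps = int(total_steps * warmup_proportion)
    if step < warmup_steps:
        # Warmup phase: ramp up linearly from 0 towards base_lr.
        return base_lr * step / warmup_steps
    # Decay phase: go linearly from base_lr down to 0 at total_steps.
    remaining = max(total_steps - warmup_steps, 1)
    return base_lr * max(total_steps - step, 0) / remaining


if __name__ == "__main__":
    total_steps = 1000  # hypothetical: batches per epoch * epochs
    for step in (0, 50, 100, 550, 1000):
        print(step, linear_warmup_decay_lr(step, total_steps))
```

With `warmup_proportion` set to 0.1 and 1000 total steps, the first 100 steps are warmup; the rate peaks at `learning_rate` at step 100 and falls back to zero by the final step.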
+
+## Model Evaluation
+
+Run the evaluate.py script to evaluate the model.
+
+```shell
+export CUDA_VISIBLE_DEVICES=0
+
+python ./evaluate.py --model_path ./ernie_rc.pdparams \
+                             --max_seq_length 512     \
+                             --batch_size 12 
+```
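
Span-extraction reading comprehension is commonly scored with exact match (EM) and F1. The sketch below illustrates both at the character level, which is a frequent choice for Chinese answers. It is a simplified illustration only: `evaluate.py` is not shown here, and this is not the implementation behind `paddlenlp.metrics.squad.squad_evaluate`. The function names are hypothetical.

```python
from collections import Counter


def exact_match(prediction, reference):
    """Return 1.0 if the two strings are identical after stripping whitespace, else 0.0."""
    return float(prediction.strip() == reference.strip())


def char_f1(prediction, reference):
    """Character-level F1 between a predicted answer and a reference answer."""
    pred_chars = list(prediction.strip())
    ref_chars = list(reference.strip())
    if not pred_chars or not ref_chars:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_chars == ref_chars)
    # Multiset intersection counts each shared character at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_chars) & Counter(ref_chars)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_chars)
    recall = overlap / len(ref_chars)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("北京", "北京"))
    print(char_f1("北京市", "北京"))
```

EM rewards only answers that match the reference exactly, while F1 gives partial credit when the predicted span overlaps the reference.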
@@ -0,0 +1,103 @@
+import collections
+import time
+import json
+import paddle
+from paddlenlp.metrics.squad import squad_evaluate, compute_prediction
+
+
+def prepare_train_features(examples, tokenizer, doc_stride, max_seq_length):
+    # Tokenize our examples with truncation and maybe padding, but keep the overflows using a stride. As a result,
+    # one example may yield several features when its context is long, and each of those features has a
+    # context that overlaps slightly with the context of the previous feature.
+    contexts = [examples[i]['context'] for i in range(len(examples))]
+    questions = [examples[i]['question'] for i in range(len(examples))]
+
+    tokenized_examples = tokenizer(
+        questions,
+        contexts,
+        stride=doc_stride,
+        max_seq_len=max_seq_length)
+
+    # Let's label those examples!
+    for i, tokenized_example in enumerate(tokenized_examples):
+        # We will label impossible answers with the index of the CLS token.
+        input_ids = tokenized_example["input_ids"]
+        cls_index = input_ids.index(tokenizer.cls_token_id)
+
+        # The offset mappings will give us a map from token to character position in the original context. This will
+        # help us compute the start_positions and end_positions.
+        offsets = tokenized_example['offset_mapping']
+
+        # Grab the sequence corresponding to that example (to know what is the context and what is the question).
+        sequence_ids = tokenized_example['token_type_ids']
+
+        # One example can yield several spans; this is the index of the example containing this span of text.
+        sample_index = tokenized_example['overflow_to_sample']
+        answers = examples[sample_index]['answers']
+        answer_starts = examples[sample_index]['answer_starts']
+
+        # Start/end character index of the answer in the text.
+        start_char = answer_starts[0]
+        end_char = start_char + len(answers[0])
+
+        # Start token index of the current span in the text.
+        token_start_index = 0
+        while sequence_ids[token_start_index] != 1:
+            token_start_index += 1
+
+        # End token index of the current span in the text.
+        token_end_index = len(input_ids) - 1
+        while sequence_ids[token_end_index] != 1:
+            token_end_index -= 1
+        # Minus one more to reach actual text
+        token_end_index -= 1
+
+        # Detect if the answer is out of the span (in which case this feature is labeled with the CLS index).
+        if not (offsets[token_start_index][0] <= start_char and
+                offsets[token_end_index][1] >= end_char):
+            tokenized_examples[i]["start_positions"] = cls_index
+            tokenized_examples[i]["end_positions"] = cls_index
+        else:
+            # Otherwise move the token_start_index and token_end_index to the two ends of the answer.
+            # Note: we could go after the last offset if the answer is the last word (edge case).
+            while token_start_index < len(offsets) and offsets[
+                    token_start_index][0] <= start_char:
+                token_start_index += 1
+            tokenized_examples[i]["start_positions"] = token_start_index - 1
+            while offsets[token_end_index][1] >= end_char:
+                token_end_index -= 1
+            tokenized_examples[i]["end_positions"] = token_end_index + 1
+
+    return tokenized_examples
+
+def prepare_validation_features(examples, tokenizer, doc_stride, max_seq_length):
+    # Tokenize our examples with truncation and maybe padding, but keep the overflows using a stride. As a result,
+    # one example may yield several features when its context is long, and each of those features has a
+    # context that overlaps slightly with the context of the previous feature.
+    contexts = [examples[i]['context'] for i in range(len(examples))]
+    questions = [examples[i]['question'] for i in range(len(examples))]
+
+    tokenized_examples = tokenizer(
+        questions,
+        contexts,
+        stride=doc_stride,
+        max_seq_len=max_seq_length)
+
+    # For validation, there is no need to compute start and end positions
+    for i, tokenized_example in enumerate(tokenized_examples):
+        # Grab the sequence corresponding to that example (to know what is the context and what is the question).
+        sequence_ids = tokenized_example['token_type_ids']
+
+        # One example can yield several spans; this is the index of the example containing this span of text.
+        sample_index = tokenized_example['overflow_to_sample']
+        tokenized_examples[i]["example_id"] = examples[sample_index]['id']
+
+        # Set to None the offset_mapping that are not part of the context so it's easy to determine if a token
+        # position is part of the context or not.
+        tokenized_examples[i]["offset_mapping"] = [
+            (o if sequence_ids[k] == 1 else None)
+            for k, o in enumerate(tokenized_example["offset_mapping"])
+        ]
+
+    return tokenized_examples
+
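
The two ideas in `data_processor.py`, splitting a long context into overlapping windows and mapping a character-level answer onto token positions inside a window, can be sketched without the tokenizer. This is a toy, not the PaddleNLP implementation, under these assumptions: every character is one token, `step` is how far each window advances (no claim is made about whether the library's `stride` argument means step or overlap), token indices are relative to the window rather than to a full `[CLS] question [SEP] context [SEP]` feature, and a window that does not contain the answer returns `None` where the source uses the CLS index. The names `sliding_windows` and `label_span` are hypothetical.

```python
def sliding_windows(tokens, window_size, step):
    """Yield (start, window) pairs; windows overlap when step < window_size."""
    start = 0
    while True:
        yield start, tokens[start:start + window_size]
        if start + window_size >= len(tokens):
            break  # this window already reaches the end of the sequence
        start += step


def label_span(offsets, start_char, end_char):
    """Map a character answer [start_char, end_char) to (start_token, end_token).

    `offsets` holds one (char_start, char_end) pair per token in the window.
    Returns None when the window does not fully contain the answer.
    """
    if not offsets or offsets[0][0] > start_char or offsets[-1][1] < end_char:
        return None
    # Last token that starts at or before the answer start.
    start_token = max(i for i, (s, _) in enumerate(offsets) if s <= start_char)
    # First token that ends at or after the answer end.
    end_token = min(i for i, (_, e) in enumerate(offsets) if e >= end_char)
    return start_token, end_token


if __name__ == "__main__":
    context = "the cat sat on the mat"
    answer = "sat"
    start_char = context.index(answer)
    end_char = start_char + len(answer)
    for start, window in sliding_windows(list(context), window_size=8, step=4):
        offsets = [(start + i, start + i + 1) for i in range(len(window))]
        print(start, "".join(window), label_span(offsets, start_char, end_char))
```

Because the windows overlap, the same answer can be fully contained in more than one window, which is why each feature is labeled independently.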