GitHub - whut-zhangwx/SimpleGPT

Introduction

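A simple implementation of GPT. The files with the _old suffix are a single-CPU/GPU implementation; that code is compact, which makes it easy to read and to understand the structure of GPT. The files without the _old suffix implement multi-GPU training on top of torch.nn.parallel.DistributedDataParallel and support data-parallel training on a single machine with one GPU, a single machine with multiple GPUs, or multiple machines with multiple GPUs.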
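As a rough illustration of how one script can cover both the single-device and the data-parallel case, the sketch below reads the RANK, LOCAL_RANK and WORLD_SIZE environment variables that torchrun exports to each worker. The names `DistInfo`, `detect_dist` and `shard_indices` are hypothetical and are not taken from this repository; the training scripts here may be organized differently.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass
class DistInfo:
    is_ddp: bool      # True when launched as one of several workers
    rank: int         # global index of this process
    local_rank: int   # index of this process on its own machine (selects the GPU)
    world_size: int   # total number of processes


def detect_dist(env: Optional[Mapping[str, str]] = None) -> DistInfo:
    """Decide between a single-device run and a data-parallel run (hypothetical helper)."""
    env = os.environ if env is None else env
    rank = int(env.get("RANK", -1))
    if rank == -1:
        # No launcher variables: fall back to a plain single-process run.
        return DistInfo(False, 0, 0, 1)
    return DistInfo(True, rank, int(env["LOCAL_RANK"]), int(env["WORLD_SIZE"]))


def shard_indices(n_samples: int, rank: int, world_size: int) -> List[int]:
    """Illustrative round-robin split: each worker sees a disjoint slice of the data."""
    return list(range(rank, n_samples, world_size))


if __name__ == "__main__":
    info = detect_dist()
    print(info, shard_indices(10, info.rank, info.world_size))
```

In a data-parallel run each process would wrap its model copy in DistributedDataParallel and train on its own slice; gradients are averaged across processes during the backward pass.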

Architecture

number of parameters: 85.00M
GPT(
  (transformer): ModuleDict(
    (wte): Embedding(65, 768)
    (wpe): Embedding(256, 768)
    (drop): Dropout(p=0.0, inplace=False)
    (blocks): ModuleList(
      (0-11): 12 x Block(
        (ln_1): LayerNorm()
        (attn): CausalSelfAttention(
          (c_attn): Linear(in_features=768, out_features=2304, bias=False)
          (c_proj): Linear(in_features=768, out_features=768, bias=False)
          (attn_dropout): Dropout(p=0.0, inplace=False)
          (resid_dropout): Dropout(p=0.0, inplace=False)
        )
        (ln_2): LayerNorm()
        (mlp): MLP(
          (c_fc): Linear(in_features=768, out_features=3072, bias=False)
          (gelu): GELU(approximate='none')
          (c_proj): Linear(in_features=3072, out_features=768, bias=False)
          (dropout): Dropout(p=0.0, inplace=False)
        )
      )
    )
    (ln_f): LayerNorm()
  )
  (lm_head): Linear(in_features=768, out_features=65, bias=False)
)
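The reported 85.00M can be checked by hand from the shapes printed above. The sketch below is arithmetic only and rests on three assumptions that the printout does not state: each LayerNorm holds a weight vector and no bias, lm_head shares its weight with wte, and the position embedding wpe is left out of the reported count. Under those assumptions the total comes out to 85,003,776. The function name `count_params` is hypothetical.

```python
def count_params(vocab_size: int = 65, block_size: int = 256,
                 n_layer: int = 12, n_embd: int = 768,
                 exclude_position_embedding: bool = True) -> int:
    """Count parameters from the printed layer shapes (all Linear layers have bias=False)."""
    wte = vocab_size * n_embd                  # token embedding: 65 x 768
    wpe = block_size * n_embd                  # position embedding: 256 x 768
    attn = n_embd * 3 * n_embd                 # c_attn: 768 -> 2304
    attn += n_embd * n_embd                    # attention c_proj: 768 -> 768
    mlp = n_embd * 4 * n_embd                  # c_fc: 768 -> 3072
    mlp += 4 * n_embd * n_embd                 # MLP c_proj: 3072 -> 768
    layer_norms = 2 * n_embd                   # ln_1 and ln_2, weight only (assumption)
    block = attn + mlp + layer_norms
    total = wte + wpe + n_layer * block + n_embd   # final n_embd term is ln_f
    # lm_head is assumed to be tied to wte, so it adds no parameters.
    if exclude_position_embedding:
        total -= wpe                           # assumption about how 85.00M is reported
    return total


if __name__ == "__main__":
    n = count_params()
    print(f"number of parameters: {n / 1e6:.2f}M")
```

With the position embedding included, the same arithmetic gives 85,200,384.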

PreTrain

The model is trained at the character level on the tinyshakespeare text for 50,000 iterations. The weight files have been pushed to Hugging Face: whut-zhangwx/SimpleGPT | huggingface
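Character-level training means that every distinct character in the corpus is one token, which is consistent with the 65-row embedding table in the architecture printout. A minimal sketch of such a tokenizer follows; the helper names `build_vocab`, `encode` and `decode` are hypothetical, and the repository's own data preparation may differ.

```python
from typing import Dict, List, Tuple


def build_vocab(text: str) -> Tuple[Dict[str, int], Dict[int, str]]:
    """Map each distinct character to an integer id, in sorted order."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos


def encode(s: str, stoi: Dict[str, int]) -> List[int]:
    """Turn a string into a list of token ids, one per character."""
    return [stoi[ch] for ch in s]


def decode(ids: List[int], itos: Dict[int, str]) -> str:
    """Turn a list of token ids back into a string."""
    return "".join(itos[i] for i in ids)


if __name__ == "__main__":
    sample = "First Citizen:\nBefore we proceed any further, hear me speak."
    stoi, itos = build_vocab(sample)
    ids = encode("hear me", stoi)
    print(len(stoi), ids, repr(decode(ids, itos)))
```

On the full corpus the vocabulary would be built once from the training text and then reused for both training and generation.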

Example

input:

"First Citizen:\nBefore we proceed any further, hear me speak."

output:

start  generate
---------------
First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You are all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you know Caius Marcius is chief enemy to the people.

All:
We know't, we know't.

First Citizen:
Let us kill him, and we'll have corn at our own price.
Is't a verdict?

All:
No more talking on't; let it be done: away, away!

Second Citizen:
One word, good citizens.

First Citizen:
We are accounted poor citizens, the patricians good.
What authority surfeits on w
---------------
finish generate
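Output like the above comes from autoregressive sampling: the model reads the tokens so far, predicts a distribution over the next character, one character is sampled and appended, and the loop repeats. The sketch below shows that loop in plain Python. The `next_token_probs` callback is a hypothetical stand-in for the trained model, and the default context limit of 256 is taken from the size of the position embedding in the architecture printout; this is not the code in generate.py.

```python
import random
from typing import Callable, List, Optional, Sequence


def generate(next_token_probs: Callable[[Sequence[int]], Sequence[float]],
             prompt_ids: List[int],
             max_new_tokens: int,
             block_size: int = 256,
             rng: Optional[random.Random] = None) -> List[int]:
    """Sample max_new_tokens ids one at a time, feeding each result back in."""
    rng = rng or random.Random()
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        # The model only has block_size positions, so keep the most recent ones.
        context = ids[-block_size:]
        probs = next_token_probs(context)
        # Draw the next token id in proportion to its predicted probability.
        next_id = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        ids.append(next_id)
    return ids


if __name__ == "__main__":
    # Toy stand-in model over 3 tokens: always predicts (last id + 1) mod 3.
    def toy_model(context: Sequence[int]) -> List[float]:
        nxt = (context[-1] + 1) % 3
        return [1.0 if i == nxt else 0.0 for i in range(3)]

    print(generate(toy_model, [0], max_new_tokens=5))
```

With a real model, the prompt would first be encoded to ids and the returned ids decoded back to text.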

References

Papers

Blog

深入理解GPT (Understanding GPT in Depth) | whut-zhangwx

Folders and files

.vscode
tinyshakespeare
.gitignore
README.md
config.py
configurator.py
generate.py
generate_old.py
gpt.py
gpt_old.py
onnx_export.py
onnx_infer.py
onnx_vs_torch.py
test.py
train_with_multi_gpu.py
train_with_single_device.py
