
Commit 1662ff1
Update to quickstart
1 parent 7e917bc commit 1662ff1

File tree: 1 file changed (+2 −19 lines)

README.md

Lines changed: 2 additions & 19 deletions
@@ -35,8 +35,8 @@ Currently this project is working on progress. And the code is not verified yet.
 pip install bert-pytorch
 ```
 
+## Quickstart
 
-## Usage
 **NOTICE : Your corpus should be prepared with two sentences in one line with tab(\t) separator**
 ```
 Welcome to the \t the jungle \n
@@ -47,32 +47,16 @@ I can stay \t here all night \n
 ```shell
 bert-vocab -c data/corpus.small -o data/corpus.small.vocab
 ```
-```shell
-usage: bert-vocab [-h] -c CORPUS_PATH -o OUTPUT_PATH [-s VOCAB_SIZE]
-                  [-e ENCODING] [-m MIN_FREQ]
-```
+
 ### 2. Building BERT train dataset with your corpus
 ```shell
 bert-dataset -d data/corpus.small -v data/corpus.small.vocab -o data/dataset.small
 ```
 
-```shell
-usage: bert-dataset [-h] -v VOCAB_PATH -c CORPUS_PATH [-e ENCODING] -o
-                    OUTPUT_PATH [-w WORKERS]
-```
-
 ### 3. Train your own BERT model
 ```shell
 bert -d data/dataset.small -v data/corpus.small.vocab -o output/
 ```
-```shell
-usage: bert [-h] -d TRAIN_DATASET [-t TEST_DATASET] -v VOCAB_PATH -o
-            OUTPUT_DIR [-hs HIDDEN] [-n LAYERS] [-a ATTN_HEADS] [-s SEQ_LEN]
-            [-b BATCH_SIZE] [-e EPOCHS] [-w NUM_WORKERS]
-            [--corpus_lines CORPUS_LINES] [--lr LR]
-            [--adam_weight_decay ADAM_WEIGHT_DECAY] [--adam_beta1 ADAM_BETA1]
-            [--adam_beta2 ADAM_BETA2] [--log_freq LOG_FREQ] [-c CUDA]
-```
 
 ## Language Model Pre-training
 
@@ -119,7 +103,6 @@ not directly captured by language modeling
 2. Randomly 50% of next sentence, gonna be unrelated sentence.
 
 
-
 ## Author
 Junseong Kim, Scatter Lab ([email protected] / [email protected])
 
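The next-sentence objective quoted in the last hunk (50% true next sentence, 50% random unrelated sentence) can be sketched as below; `sample_nsp_pair` is a hypothetical helper for illustration, not the package's actual API:

```python
import random

def sample_nsp_pair(pairs, index, rng):
    """Sample one (first, second, is_next) next-sentence-prediction example.

    With probability 0.5 keep the true second sentence (is_next=1);
    otherwise substitute the second sentence of a random line (is_next=0).
    """
    first, second = pairs[index]
    if rng.random() < 0.5:
        return first, second, 1
    # Note: the random line may coincide with the true one; real
    # implementations typically resample to avoid that.
    _, random_second = pairs[rng.randrange(len(pairs))]
    return first, random_second, 0

rng = random.Random(0)
pairs = [("Welcome to the", "the jungle"), ("I can stay", "here all night")]
examples = [sample_nsp_pair(pairs, 0, rng) for _ in range(1000)]
```

Over many draws, roughly half the examples keep the true next sentence and carry label 1, giving a balanced binary classification target.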