For Linux, it should be `sudo apt-get install espeak-ng`.
-For Windows, refer to the above link.
-If you do not have sudo privilege, you could build the library by following the last section of this readme. -->
-
## Inferencing pretrained VALL-E models
### Download pretrained weights
-You need to download our pretrained weights from huggingface.
+You need to download our pretrained weights from huggingface. Our models are trained on the MLS dataset (45k hours of English, contains 10-20s speech).
We provide our pretrained VALL-E model that is trained on 45k hours MLS dataset.
@@ -111,31 +114,6 @@ You should also select a reasonable batch size at the "batch_size" entry (curren

You can change other experiment settings in the `/egs/tts/VALLE_V2/exp_ar_libritts.json` such as the learning rate, optimizer and the dataset.

-Here we choose `libritts` dataset we added and set `use_dynamic_dataset` false.
-
-Config `use_dynamic_dataset` is used to solve the problem of inconsistent sequence length and improve gpu utilization, here we set it to false for simplicity.
-
-```json
-"dataset": {
-"use_dynamic_batchsize": false,
-"name": "libritts"
-},
-```
-
-We also recommend changing "num_hidden_layers" if your GPU memory is limited.
-
-**Set smaller batch_size if you are out of memory😢😢**
-
-I used batch_size=3 to successfully run on a single card, if you'r out of memory, try smaller.
-
-```json
-"batch_size": 3,
-"max_tokens": 11000,
-"max_sentences": 64,
-"random_seed": 0
-```
-
-
### Run the command to Train AR model
(Make sure your current directory is at the Amphion root directory).
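
The hunk above refers to a "batch_size" entry in `exp_ar_libritts.json` that can be lowered when GPU memory runs out. A minimal sketch of making that change from a script rather than by hand is shown here. It assumes the config is strict JSON (no comments or trailing commas) and makes no assumption about where "batch_size" is nested, since the diff only shows a fragment. The helper names `set_key` and `lower_batch_size` are hypothetical and are not part of Amphion.

```python
import json
from pathlib import Path


def set_key(node, key, value):
    """Set `key` to `value` wherever it already exists; return the number of updates."""
    updated = 0
    if isinstance(node, dict):
        for name in node:
            if name == key:
                node[name] = value
                updated += 1
            else:
                # Recurse into nested objects and arrays.
                updated += set_key(node[name], key, value)
    elif isinstance(node, list):
        for item in node:
            updated += set_key(item, key, value)
    return updated


def lower_batch_size(config_path, batch_size):
    """Rewrite an experiment config in place with a new batch size."""
    path = Path(config_path)
    # Assumption: the file parses as strict JSON.
    config = json.loads(path.read_text(encoding="utf-8"))
    if set_key(config, "batch_size", batch_size) == 0:
        raise KeyError(f"no 'batch_size' entry found in {path}")
    path.write_text(json.dumps(config, indent=4) + "\n", encoding="utf-8")
    return config
```

From the Amphion root directory this might be called as `lower_batch_size("egs/tts/VALLE_V2/exp_ar_libritts.json", 2)`. If the file turns out to contain comments, the standard `json` module will reject it and the value is easier to edit by hand.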