[assistance] Confirmation on Data Format and Structure for Fine-Tuning #141

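### Checklist

- [X] I have read the README.md and dependencies.md files
- [X] I have confirmed that no previous issue or discussion covers this bug
- [X] I have confirmed that the problem occurs in the latest code or a stable release
- [X] I have confirmed that the problem is unrelated to the API
- [X] I have confirmed that the problem is unrelated to the WebUI
- [X] I have confirmed that the problem is unrelated to Finetune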

### Your issue


Hi,

I am planning to fine-tune ChatTTS using my own dataset, and I would like to confirm a few details regarding the data format and requirements.

### 1. Data Structure and .list File Format

Based on the documentation and examples, I have organized my data as follows:

#### File Structure
```
datasets/
└── data_speaker_a/
    ├── speaker_a/
    │   ├── 1.wav
    │   ├── 2.wav
    │   └── ... (more audio files)
    └── speaker_a.list
```

#### .list File Format
Each line in the `.list` file is formatted as `filepath|speaker|lang|text`, where:
- `filepath`: Relative path to the audio file (relative to the directory containing the `.list` file).
- `speaker`: Name of the speaker.
- `lang`: Language code (e.g., `ZH` for Chinese, `EN` for English).
- `text`: Transcription of the audio content.

Example:
```
speaker_a/1.wav|John|ZH|你好
speaker_a/2.wav|John|EN|Hello
```
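To make the question concrete, here is a minimal sketch of how I am validating the `.list` file before training. The names `ListEntry`, `parse_list_line`, and `load_list` are my own hypothetical helpers, not part of ChatTTS, and the rule that paths resolve relative to the `.list` file's directory is the assumption I am asking you to confirm.

```python
from pathlib import Path
from typing import List, NamedTuple


class ListEntry(NamedTuple):
    """One line of the .list file (hypothetical helper type)."""
    filepath: str
    speaker: str
    lang: str
    text: str


def parse_list_line(line: str) -> ListEntry:
    """Split a `filepath|speaker|lang|text` line into its four fields."""
    # maxsplit=3 keeps any "|" that appears inside the transcription itself.
    parts = line.rstrip("\r\n").split("|", 3)
    if len(parts) != 4:
        raise ValueError(f"expected 4 '|'-separated fields, got {len(parts)}: {line!r}")
    filepath, speaker, lang, text = (p.strip() for p in parts)
    if not (filepath and speaker and lang and text):
        raise ValueError(f"empty field in line: {line!r}")
    return ListEntry(filepath, speaker, lang, text)


def load_list(list_path: str) -> List[ListEntry]:
    """Parse every non-blank line and check that each audio file exists."""
    # Assumption: audio paths are relative to the directory holding the .list file.
    base = Path(list_path).parent
    entries = []
    with open(list_path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            if not line.strip():
                continue  # skip blank lines
            entry = parse_list_line(line)
            if not (base / entry.filepath).is_file():
                raise FileNotFoundError(f"line {lineno}: missing audio file {entry.filepath}")
            entries.append(entry)
    return entries
```

With the layout above, I would call `load_list("datasets/data_speaker_a/speaker_a.list")` and expect one entry per audio file.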

Could you please confirm if this structure and format are correct?

### 2. Audio Data Specifications

I plan to train on 100 audio files, each approximately 10 seconds long (roughly 17 minutes of audio in total), sampled at 24000 Hz.
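For reference, this is a rough sketch of how I am checking each file against those numbers, using only Python's standard `wave` module (which reads uncompressed PCM WAV files). The helper names `wav_info` and `check_wav` and the 15-second upper bound are my own assumptions for illustration, not documented ChatTTS requirements.

```python
import wave
from typing import List, Tuple


def wav_info(path: str) -> Tuple[int, int, float]:
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(str(path), "rb") as w:
        rate = w.getframerate()
        channels = w.getnchannels()
        duration = w.getnframes() / rate
    return rate, channels, duration


def check_wav(path: str, expected_rate: int = 24000, max_seconds: float = 15.0) -> List[str]:
    """Return a list of human-readable problems; an empty list means the file passes."""
    # expected_rate matches my dataset; max_seconds is an arbitrary placeholder limit.
    rate, _channels, duration = wav_info(path)
    problems = []
    if rate != expected_rate:
        problems.append(f"{path}: sample rate {rate} Hz, expected {expected_rate} Hz")
    if duration > max_seconds:
        problems.append(f"{path}: duration {duration:.2f} s exceeds {max_seconds} s")
    return problems
```

If a different sampling rate or clip length is required, I can adjust `expected_rate` and `max_seconds` and resample accordingly.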

Is this a suitable setup for fine-tuning the model? Are there any specific recommendations or requirements?

Thank you for your assistance!
