Here is the use case context: My LLM is responding to user queries, and both query and response are tracked for human validation. The human feedback is given as a scalar value between 0 and 1. That makes up the dataset for fine-tuning the model.
So the question here is:
What is the acceptable dataset format? Will the format below work for fine-tuning? Also, please shed some light on how flexible the dataset structure is: if I need to add an extra key/value pair to the JSON for my domain or context, is that supported? If yes, which Python file or configuration do I need to edit to add the new field?
```json
[
  {
    "query": "What are the benefits of regular exercise?",
    "response": "Regular exercise boosts physical health, improves mental health, and enhances overall well-being. It helps in weight management and reduces the risk of chronic diseases.",
    "feedback": 0.9
  },
  {
    "query": "Explain the theory of relativity in simple terms.",
    "response": "The theory of relativity states that the laws of physics are the same for all non-accelerating observers, and that the speed of light is constant no matter how fast you are moving. It includes both special and general relativity.",
    "feedback": 0.8
  }
]
```
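To make the proposed structure concrete, here is a minimal sketch of how I would sanity-check such a file before training. It is my own helper, not part of the library: the name `validate_records` is hypothetical, and the key names (`query`, `response`, `feedback`) are just the ones from my example above. Whether the library accepts these keys is exactly what I am asking about.

```python
import json

# Keys from the example above and the types I would expect for each.
# These names are my own assumption, not a format the library documents.
REQUIRED_KEYS = {"query": str, "response": str, "feedback": (int, float)}


def validate_records(records):
    """Return a list of (index, message) problems; an empty list means all records pass."""
    problems = []
    for i, rec in enumerate(records):
        for key, expected in REQUIRED_KEYS.items():
            if key not in rec:
                problems.append((i, f"missing key: {key}"))
            # bool is a subclass of int in Python, so reject it explicitly.
            elif isinstance(rec[key], bool) or not isinstance(rec[key], expected):
                problems.append((i, f"wrong type for {key}"))
        fb = rec.get("feedback")
        is_number = isinstance(fb, (int, float)) and not isinstance(fb, bool)
        if is_number and not 0.0 <= fb <= 1.0:
            problems.append((i, "feedback outside [0, 1]"))
    return problems


sample = json.loads("""
[
  {"query": "What are the benefits of regular exercise?",
   "response": "Regular exercise boosts physical health.",
   "feedback": 0.9},
  {"query": "Explain the theory of relativity in simple terms.",
   "response": "The laws of physics are the same for all non-accelerating observers.",
   "feedback": 0.8,
   "domain": "physics"}
]
""")

print(validate_records(sample))
```

Extra keys such as `domain` are deliberately ignored by this check, which is the kind of flexibility I am hoping the training pipeline also allows.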
Thanks,
Tharma
First off, thank you for the awesome library!!
I want to fine-tune Qwen with RLHF.