4 files changed: +9 -6 lines changed
@@ -45,7 +45,7 @@ Then ensures that the kvrocks database of Vulnerability-Lookup is running.
 Creation of datasets:
 
 ``` bash
-$ vulntrain-create-dataset --nb-rows 10000 --upload --repo-id CIRCL/vulnerability-dataset-10k
+$ vulntrain-dataset-generation --sources cvelistv5 --nb-rows 10000 --upload --repo-id CIRCL/vulnerability-dataset-10k
 Generating train split: 9999 examples [00:00, 177710.74 examples/s]
 DatasetDict({
     train: Dataset({
@@ -73,7 +73,7 @@ For now we are using distilbert-base-uncased (AutoModelForMaskedLM) or gpt2 (Aut
 The goal is to generate text.
 
 ``` bash
-$ vulntrain-train-dataset --base-model gpt2 --model-name CIRCL/vulnerability
+$ vulntrain-train-description-generation --base-model gpt2 --dataset-id CIRCL/vulnerability --repo-id CIRCL/vulnerability-description-generation-gpt2
 Using CUDA (Nvidia GPU).
 [codecarbon WARNING @ 13:28:13] Multiple instances of codecarbon are allowed to run at the same time.
 [codecarbon INFO @ 13:28:13] [setup] RAM Tracking...
@@ -151,8 +151,11 @@ def main():
     print(dataset_dict)
 
     if args.upload:
-        # dataset_dict.push_to_hub(args.repo_id, commit_message=args.commit_message, token=hf_token)
-        dataset_dict.push_to_hub(args.repo_id)
+        if args.commit_message:
+            # dataset_dict.push_to_hub(args.repo_id, commit_message=args.commit_message, token=hf_token)
+            dataset_dict.push_to_hub(args.repo_id, commit_message=args.commit_message)
+        else:
+            dataset_dict.push_to_hub(args.repo_id)
 
 
 if __name__ == "__main__":
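
Review note on the hunk above: the two upload branches differ only in whether `commit_message` is forwarded. A minimal sketch of one way to collapse them into a single call, by building the keyword arguments conditionally. `build_push_kwargs` is a hypothetical helper introduced here for illustration; it is not part of the project, and the commit itself keeps the explicit `if`/`else`.

```python
from typing import Any, Dict, Optional


def build_push_kwargs(commit_message: Optional[str] = None) -> Dict[str, Any]:
    """Return keyword arguments for push_to_hub, omitting unset options.

    Hypothetical helper for illustration only; not part of the project.
    """
    kwargs: Dict[str, Any] = {}
    if commit_message:
        # Forward the message only when a non-empty one was supplied,
        # mirroring the `if args.commit_message:` check in the diff.
        kwargs["commit_message"] = commit_message
    return kwargs


# Usage sketch (dataset_dict and args come from the surrounding main()):
# dataset_dict.push_to_hub(args.repo_id, **build_push_kwargs(args.commit_message))
```

With this shape, adding another optional argument later would mean one more conditional entry in the dictionary rather than another branch around the upload call.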
@@ -183,7 +183,7 @@ def main():
     parser.add_argument(
         "--model-save-dir",
         dest="model_save_dir",
-        required=True,
+        default="results",
         help="The path to a directory where the tokenizer and the model will be saved.",
     )
 
@@ -130,7 +130,7 @@ def main():
     parser.add_argument(
         "--model-save-dir",
         dest="model_save_dir",
-        required=True,
+        default="results",
         help="The path to a directory where the tokenizer and the model will be saved.",
     )
 
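
Review note on the `--model-save-dir` change: replacing `required=True` with `default="results"` means the scripts no longer fail when the flag is omitted and instead fall back to a `results` directory. A small standalone sketch of that behaviour, using only the argument definition shown in the diff; `build_parser` is a hypothetical wrapper for illustration, and the real scripts define further arguments that are not reproduced here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser containing only the argument touched by this change."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--model-save-dir",
        dest="model_save_dir",
        default="results",  # previously required=True
        help="The path to a directory where the tokenizer and the model will be saved.",
    )
    return parser


# Flag omitted: the default is used.
print(build_parser().parse_args([]).model_save_dir)  # results

# Flag given: the explicit value wins.
print(build_parser().parse_args(["--model-save-dir", "out"]).model_save_dir)  # out
```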