CASE

The models are not public because of the double-blind review process. In the final version, this repository will link to the models on Hugging Face.

BERTPreTraining.py expects the data path to point to a directory containing the documents in .parquet format. The code can be changed to suit your requirements.
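
As a rough illustration of the directory-of-parquet layout, the sketch below collects the .parquet files in a directory and concatenates them. It is not taken from BERTPreTraining.py: the helper names `list_parquet_files` and `load_documents` are hypothetical, and it assumes pandas with a parquet engine such as pyarrow is installed.

```python
from pathlib import Path


def list_parquet_files(data_dir):
    """Return the .parquet files directly inside data_dir, sorted by name."""
    directory = Path(data_dir)
    if not directory.is_dir():
        raise NotADirectoryError(f"{data_dir} is not a directory")
    return sorted(p for p in directory.iterdir() if p.suffix == ".parquet")


def load_documents(data_dir):
    """Concatenate every .parquet file in data_dir into one DataFrame."""
    # Imported lazily so the file listing works without pandas installed.
    # Assumption: pandas plus a parquet engine (e.g. pyarrow) is available.
    import pandas as pd

    files = list_parquet_files(data_dir)
    if not files:
        raise FileNotFoundError(f"no .parquet files found in {data_dir}")
    return pd.concat((pd.read_parquet(f) for f in files), ignore_index=True)
```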

All other scripts expect the data path to point to a single CSV file containing the respective data.

In BERTFineTuning.py, the source column is 'post', as set in the custom dataset. The target column contains binary values (0 or 1) and is named after the disorder it indicates.
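
Before fine-tuning, it may help to check that a CSV file matches the layout described above. The following is a minimal sketch, not part of BERTFineTuning.py: the function name `check_finetuning_csv` is hypothetical, and it assumes the labels are stored as the literal strings 0 and 1.

```python
import csv


def check_finetuning_csv(csv_path, target_column, source_column="post"):
    """Validate the expected columns and labels; return the number of rows."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        columns = reader.fieldnames or []
        for required in (source_column, target_column):
            if required not in columns:
                raise ValueError(f"missing column: {required!r}")
        n_rows = 0
        # Data rows start on line 2 because line 1 is the header.
        for line_number, row in enumerate(reader, start=2):
            label = (row[target_column] or "").strip()
            # Assumption: labels are written as "0" or "1", not "0.0"/"1.0".
            if label not in {"0", "1"}:
                raise ValueError(f"line {line_number}: non-binary label {label!r}")
            n_rows += 1
    return n_rows
```

For example, `check_finetuning_csv("data.csv", "depression")` would validate a file whose target column is named 'depression'. Both the file name and the column name here are placeholders.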

In GemmaTraining.py, set the pre_training variable to False for fine-tuning. For pre-training, the source column in the dataset is 'TEXT'. For fine-tuning, the source column is 'Post' and the target column is 'Generated Diagnosis Summary'. This script was adapted from https://colab.research.google.com/github/adithya-s-k/LLM-Alchemy-Chamber/blob/main/LLMs/Gemma/finetune-gemma.ipynb
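
The relationship between the pre_training flag and the dataset columns can be summarised in a few lines. This is only an illustrative sketch: the function `dataset_columns` is hypothetical and does not appear in GemmaTraining.py. Only the flag name and the column names come from the description above.

```python
def dataset_columns(pre_training):
    """Return (source_column, target_column) for the chosen mode."""
    if pre_training:
        # Pre-training reads raw text only, so there is no target column.
        return "TEXT", None
    # Fine-tuning maps a post to its diagnosis summary.
    return "Post", "Generated Diagnosis Summary"
```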

In GemmaValidation.py, set the data path to the file generated by GemmaTraining.py after fine-tuning is done. The data file should contain a 'Predicted Diagnosis' column, which is generated by the model, and a 'Generated Diagnosis Summary' column, which is the reference summary obtained from the GPT-3.5 annotations. To calculate the BART Score, clone the BARTScore repository from https://github.com/neulab/BARTScore and download its checkpoint.
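
The sketch below shows one way the two columns could be read and scored. It is not the code in GemmaValidation.py: `load_pairs` and `mean_score` are hypothetical helpers, and the scorer is passed in as an argument so that any object with a `score(srcs, tgts, batch_size=...)` method returning a list of floats can be used. Which column serves as source and which as target is also an assumption here.

```python
import csv

PREDICTION_COLUMN = "Predicted Diagnosis"
REFERENCE_COLUMN = "Generated Diagnosis Summary"


def load_pairs(csv_path):
    """Read the predicted and reference summaries from the validation file."""
    predictions, references = [], []
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        missing = {PREDICTION_COLUMN, REFERENCE_COLUMN} - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"missing columns: {sorted(missing)}")
        for row in reader:
            predictions.append(row[PREDICTION_COLUMN])
            references.append(row[REFERENCE_COLUMN])
    return predictions, references


def mean_score(scorer, predictions, references, batch_size=4):
    """Average the per-pair scores returned by the scorer."""
    if not predictions:
        raise ValueError("no rows to score")
    # Assumption: predictions are the source and references the target.
    scores = scorer.score(predictions, references, batch_size=batch_size)
    return sum(scores) / len(scores)


# Possible usage with the cloned BARTScore repository on the import path.
# The CSV file name is a placeholder, and the checkpoint path depends on
# where the downloaded file was saved:
# from bart_score import BARTScorer
# scorer = BARTScorer(device="cuda:0", checkpoint="facebook/bart-large-cnn")
# scorer.load(path="bart.pth")
# print(mean_score(scorer, *load_pairs("validation_output.csv")))
```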

The Python environment can be constructed from the requirements.txt file.
