This repository contains the datasets and code for the CS577 project titled "Prompt-Enhanced Medical Question Summarization".

The folders are organized in the following way:

meQSum_Dataset: Contains the original meQSum dataset, split into train, validation, and test sets.
ner_tagged_dataset: Contains the NER-augmented meQSum dataset.
cooccur_dataset: Contains the meQSum dataset augmented with co-occurrence tags.
notebooks: Contains the fine-tuning, zero-shot generation, and zero-shot co-occurrence notebooks.
scripts: Contains scripts for creating the NER-tagged dataset, generating the list of all unique NER tags, and preparing the datasets for the Hugging Face library.
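
To make the NER augmentation step concrete, here is a minimal sketch of how a question could be extended with entity tags and how the set of unique tags could be collected. It is not the code in the scripts folder: the function names `augment_with_ner` and `unique_tags`, the `(text, tag)` pair format, the `Entities:` suffix layout, and the example tags are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

# Hypothetical representation: each entity is a (surface_text, tag) pair.
Entity = Tuple[str, str]


def augment_with_ner(question: str, entities: Iterable[Entity]) -> str:
    """Append the entities and their tags to the question as a suffix."""
    tagged = "; ".join(f"{text} [{tag}]" for text, tag in entities)
    # Leave the question unchanged when no entities were found.
    return f"{question} Entities: {tagged}" if tagged else question


def unique_tags(records: Iterable[Iterable[Entity]]) -> List[str]:
    """Collect the sorted set of tags used across all records."""
    return sorted({tag for entities in records for _, tag in entities})


# Illustrative data only; not taken from the meQSum dataset.
example_entities = [("metformin", "DRUG"), ("type 2 diabetes", "DISEASE")]
augmented = augment_with_ner(
    "Can I take metformin for type 2 diabetes?", example_entities
)
print(augmented)
print(unique_tags([example_entities, [("aspirin", "DRUG")]]))
```

The actual tag set and input layout used in this project are defined by the scripts and the files in ner_tagged_dataset, so treat the format above as a placeholder.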

All fine-tuning of the FLAN-T5-base model was done with Hugging Face, and logs were generated with wandb. Training was run on Google Colab Pro using an L4 GPU with 22.5 GB of VRAM in a High-RAM environment. The modules required to run the notebooks are listed in the notebooks themselves. To run training or zero-shot inference, use the notebooks in the notebooks folder.
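
For readers who want a quick idea of what zero-shot inference looks like outside the notebooks, the following is a rough sketch using the Hugging Face `transformers` library. The prompt wording, the helper names `build_prompt` and `summarize`, the generation settings, and the hub ID `google/flan-t5-base` are assumptions for illustration; the notebooks remain the reference for the prompts and settings actually used.

```python
def build_prompt(question: str) -> str:
    """Wrap a consumer health question in a hypothetical summarization instruction."""
    return f"Summarize the following medical question: {question.strip()}"


def summarize(question: str, model_name: str = "google/flan-t5-base") -> str:
    """Generate a zero-shot summary; downloads the model on first use."""
    # Imported lazily so the prompt helper works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    # Tokenize the prompt and truncate inputs that exceed the model's limit.
    inputs = tokenizer(build_prompt(question), return_tensors="pt", truncation=True)
    # max_new_tokens=64 is an assumed budget for a one-sentence summary.
    output_ids = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Illustrative input only; not taken from the meQSum dataset.
prompt = build_prompt("  I have had headaches for a week. What could cause them? ")
print(prompt)
```

Calling `summarize(...)` requires `transformers` and `torch` to be installed and will download the model weights, so it is left out of the example run above.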

The project report, prompt_enhanced_medical_question_summarization_report.pdf, describes our experiments with different prompt strategies and fine-tuning in detail.

AJAkil/cs577_proj
