Rational speech act model based multi document summarization and highlights selection.

icannos/glimpse-mds


This is the repository of GLIMPSE: Pragmatically Informative Multi-Document Summarization for Scholarly Reviews. Paper | Code

Installation

  • We use Python 3.10 and CUDA 12.1. On a cluster with environment modules, load them with:
module load miniconda/3
module load cuda12
  • First, create a virtual environment using:
conda create -n glimpse python=3.10
  • Second, activate the environment and install pytorch:
conda activate glimpse 
conda install pytorch==2.1.1 pytorch-cuda=12.1 -c pytorch -c nvidia
  • Finally, install the remaining required packages from the requirements file:
pip install -r requirements
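After installation, a quick sanity check can confirm that the interpreter and PyTorch build match the versions listed above. The following is a minimal sketch, not part of the repository: `check_environment` is a hypothetical helper name, and it only reports what it finds rather than enforcing anything.

```python
import importlib.util
import sys


def check_environment(expected_python=(3, 10)):
    """Report Python, PyTorch and CUDA versions (hypothetical helper, not in the repo)."""
    report = {
        "python_version": sys.version_info[:2],
        "python_ok": sys.version_info[:2] == expected_python,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch_version": None,
        "cuda_available": None,
        "cuda_version": None,
    }
    if report["torch_installed"]:
        # Import lazily so the check still runs when PyTorch is missing.
        import torch

        report["torch_version"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
        # CUDA version PyTorch was built against (None for CPU-only builds).
        report["cuda_version"] = torch.version.cuda
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

Run inside the activated `glimpse` environment, the report should show Python 3.10 and, on a GPU node, a PyTorch build with CUDA 12.1.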

Data Loading

Step 1: Start by processing the input files from data.

python glimpse/data_loading/data_processing.py 

Generating Summaries and Computing RSA Scores

Step 2: Generate candidate summaries and compute RSA scores for each candidate.

  • For extractive candidates, use the following command:
sbatch scripts/extractive.sh Path_of_Your_Processed_Dataset_Step1.csv
  • For abstractive candidates, use one of the following commands:
    • To pad the last batch when it is incomplete, add the --add-padding argument:
    sbatch scripts/abstractive.sh Path_of_Your_Processed_Dataset_Step1.csv --add-padding
    • To drop the last incomplete batch instead, run the script without the argument:
    sbatch scripts/abstractive.sh Path_of_Your_Processed_Dataset_Step1.csv
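The difference between padding and dropping the last incomplete batch can be illustrated with a small sketch. It is an illustration only: `make_batches` is a hypothetical function, and the assumption that padding repeats the final item is ours, since this README does not state how `--add-padding` fills the batch.

```python
def make_batches(items, batch_size, add_padding=False):
    """Split items into fixed-size batches (hypothetical helper, not the repo's code).

    With add_padding=True the last incomplete batch is completed by repeating
    its final item (an assumed padding strategy). Otherwise it is dropped.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batches = [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
    if batches and len(batches[-1]) < batch_size:
        if add_padding:
            last = batches[-1]
            batches[-1] = last + [last[-1]] * (batch_size - len(last))
        else:
            batches.pop()
    return batches


reviews = ["r1", "r2", "r3", "r4", "r5"]
padded = make_batches(reviews, batch_size=2, add_padding=True)
dropped = make_batches(reviews, batch_size=2, add_padding=False)
```

With five inputs and a batch size of two, padding keeps all five inputs in three full batches, while dropping keeps only the first four.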

rsasumm/ provides a Python package implementing RSA incremental decoding and RSA reranking of candidates. mds/ provides the experiment scripts and analysis for the multi-document summarization task.
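To give an intuition for RSA reranking, here is a minimal sketch of the textbook Rational Speech Act recursion applied to a matrix of candidate likelihoods. It is not the rsasumm API: `rsa_rerank`, its parameters, the uniform prior over documents, and the toy numbers are all assumptions made for illustration.

```python
import math


def _log_softmax(values):
    """Normalize log-scores so their exponentials sum to 1."""
    m = max(values)
    log_z = m + math.log(sum(math.exp(v - m) for v in values))
    return [v - log_z for v in values]


def rsa_rerank(log_lik, rationality=1.0, iterations=1):
    """Generic RSA reranking sketch (hypothetical, not the rsasumm implementation).

    log_lik[d][c] is the log-probability of candidate c given document d,
    e.g. as scored by a summarization model (the literal speaker).
    Returns the pragmatic speaker log-probabilities and the best candidate per document.
    """
    n_docs, n_cands = len(log_lik), len(log_lik[0])
    # Literal speaker: distribution over candidates for each document.
    speaker = [_log_softmax(row) for row in log_lik]
    for _ in range(iterations):
        # Listener: for each candidate, a distribution over documents
        # (uniform prior over documents assumed).
        columns = [
            _log_softmax([speaker[d][c] for d in range(n_docs)])
            for c in range(n_cands)
        ]
        listener = [[columns[c][d] for c in range(n_cands)] for d in range(n_docs)]
        # Pragmatic speaker: prefers candidates that identify their document.
        speaker = [_log_softmax([rationality * x for x in row]) for row in listener]
    best = [max(range(n_cands), key=lambda c: speaker[d][c]) for d in range(n_docs)]
    return speaker, best


# Toy example: candidate 0 is generic, candidates 1 and 2 are document-specific.
toy = [
    [math.log(0.5), math.log(0.4), math.log(0.1)],
    [math.log(0.5), math.log(0.1), math.log(0.4)],
]
scores, best = rsa_rerank(toy)
```

In the toy example the literal speaker would choose the generic candidate 0 for both documents, whereas the pragmatic speaker selects the candidate that best distinguishes each document from the other.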

Citation

If you use this code, please cite the following paper:

      title={GLIMPSE: Pragmatically Informative Multi-Document Summarization for Scholarly Reviews}, 
      author={Maxime Darrin and Ines Arous and Pablo Piantanida and Jackie CK Cheung},
      year={2024},
      eprint={2406.07359},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2406.07359}, 
}
