Official Repository for "Hyper-CL: Conditioning Sentence Representations with Hypernetworks" [Paper (arXiv)]
In this section, we describe how to train a Hyper-CL model using our code. The code is based on C-STS.
Download the C-STS dataset and place it under data/ (see the C-STS repository for details).
The requirements are the same as for C-STS; install them by running:
pip install -r requirements.txt
We provide example training scripts for finetuning and evaluating the models in the paper. Go to C-STS/ and execute the following command:
bash run_sts.sh
In addition to the arguments of C-STS, we explain the extra arguments below:

- `--objective`: Training objective; to train Hyper-CL, use `triplet_cl_mse`.
- `--cl_temp`: Temperature for the contrastive loss.
- `--cl_in_batch_neg`: Add an in-batch negative loss to the main loss.
- `--hypernet_scaler`: Sets the value of K for the low-rank variants of Hyper-CL (i.e., hyper64-cl and hyper85-cl) by specifying the divisor of the embedding size. For instance, in the base model, K=64 for hyper64-cl means the embedding size 768 is divided by 12, so `hypernet_scaler` is set to 12.
- `--hypernet_dual`: Dual encoding that uses two separate encoders: one for sentences 1 and 2, and one for the condition.
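The relationship between K and `--hypernet_scaler` can be sketched as follows; the helper name is hypothetical, for illustration only, and is not part of the repository's API:

```python
def hypernet_scaler_for_rank(embedding_size: int, k: int) -> int:
    """Return the --hypernet_scaler value (the divisor of the embedding
    size) that yields a low-rank dimension of K.
    Hypothetical helper, not the repository's API."""
    if embedding_size % k != 0:
        raise ValueError("K must evenly divide the embedding size")
    return embedding_size // k

# Base model (embedding size 768) with hyper64-cl (K=64):
print(hypernet_scaler_for_rank(768, 64))  # prints 12
```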
We use the following hyperparameters for training Hyper-CL:

| Emb. Model | Learning rate (lr) | Weight decay (wd) | Temperature (temp) |
|---|---|---|---|
| DiffCSE_base+hyper-cl | 3e-5 | 0.1 | 1.5 |
| DiffCSE_base+hyper64-cl | 1e-5 | 0.0 | 1.5 |
| SimCSE_base+hyper-cl | 3e-5 | 0.1 | 1.9 |
| SimCSE_base+hyper64-cl | 2e-5 | 0.1 | 1.7 |
| SimCSE_large+hyper-cl | 2e-5 | 0.1 | 1.5 |
| SimCSE_large+hyper85-cl | 1e-5 | 0.1 | 1.9 |
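The temperature above scales the similarity logits in the contrastive loss. A generic sketch of a temperature-scaled contrastive term (an illustration only, not the repository's exact `triplet_cl_mse` objective):

```python
import numpy as np

def contrastive_loss(anchor, positive, negatives, temp):
    """Cross-entropy over temperature-scaled cosine similarities, with the
    positive pair as the target. Generic illustration, not the exact
    triplet_cl_mse objective used in the paper."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    sims = np.array([cos(anchor, positive)] + [cos(anchor, n) for n in negatives])
    logits = sims / temp
    # log-softmax of the positive (index 0), negated
    return float(np.log(np.exp(logits).sum()) - logits[0])

a = np.array([1.0, 0.0])
p = np.array([1.0, 0.1])
n = np.array([0.0, 1.0])
# A higher temperature flattens the distribution; with the positive already
# ranked first, the loss increases:
print(contrastive_loss(a, p, [n], temp=1.5) > contrastive_loss(a, p, [n], temp=1.0))  # prints True
```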
We provide example training scripts for finetuning and evaluating the models in the paper. This code is based on SimKCG. Go to sim-kcg/ and execute the following commands:
bash scripts/preprocess.sh WN18RR
bash scripts/train_wn.sh
We explain the arguments below:

- `--pretrained-model`: Backbone model checkpoint (`bert-base-uncased` or `bert-large-uncased`)
- `--encoding_type`: Encoding type (`bi_encoder` or `tri_encoder`)
- `--triencoder_head`: Tri-encoder head (`concat`, `hadamard`, or `hypernet`)

Refer to `config.py` for the other arguments.
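As a rough sketch of how the three `--triencoder_head` options could combine two entity encodings with a condition encoding; the semantics here are assumptions for illustration, and the real head implementations live in the repository's model code:

```python
import numpy as np

def triencoder_head(h1, h2, cond, head):
    """Illustrative combination of two encodings h1, h2 given a condition
    encoding `cond`. Assumed semantics, not the repository's implementation."""
    if head == "concat":
        return np.concatenate([h1, h2])
    if head == "hadamard":
        return h1 * h2  # elementwise product
    if head == "hypernet":
        # Stand-in hypernetwork: derive a projection matrix from the
        # condition embedding (a learned module in practice).
        W = np.outer(cond, cond) / cond.size
        return W @ (h1 * h2)
    raise ValueError(f"unknown head: {head}")

h1, h2, c = np.ones(4), np.arange(4.0), np.full(4, 0.5)
print(triencoder_head(h1, h2, c, "concat").shape)  # prints (8,)
```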
To evaluate a trained checkpoint, run:
bash scripts/eval.sh ./checkpoint/WN18RR/model_best.mdl WN18RR
Please cite our paper if you use Hyper-CL in your work:
@article{yoo2024hyper,
  title={Hyper-CL: Conditioning Sentence Representations with Hypernetworks},
  author={Yoo, Young Hyun and Cha, Jii and Kim, Changhyeon and Kim, Taeuk},
  journal={arXiv preprint arXiv:2403.09490},
  year={2024}
}