🤖 Exploring the Impact of Different Pooling Methods on XLM-RoBERTa

Hey there! This project explores how the choice of pooling layer affects what XLM-RoBERTa learns about language. Two techniques are compared: LSTM pooling, which runs a recurrent layer over the token representations, and weighted pooling, which learns a weight for each encoder layer and averages them (a sketch of both appears in the Models section below). Both methods are evaluated on a three-class sentiment analysis task.

📍 Project structure

.
├── config
│   ├── main.yaml                   # Main configuration file
│   ├── model                       # Configurations for training the model
│   │   ├── model1.yaml             # First variation of training parameters
│   │   └── model2.yaml             # Second variation of training parameters
│   └── process                     # Configurations for processing data
│       ├── process1.yaml           # First variation of data-processing parameters
│       └── process2.yaml           # Second variation of data-processing parameters
├── docs                            # Project documentation
├── dvc.yaml                        # DVC pipeline
├── .flake8                         # Configuration for flake8, a Python linter
├── .gitignore                      # Files that should not be committed to Git
├── Makefile                        # Useful commands to set up the environment
├── pyproject.toml                  # Dependencies for Poetry
├── README.md                       # Project description
└── src                             # Source code
    ├── __init__.py                 # Make src a Python module
    ├── data_batcher.py             # Batch the dataset
    ├── data_loader.py              # Load and process data before training
    ├── evaluate.py                 # Evaluation during training
    ├── inference.py                # Inference script
    ├── main.py                     # Trainer class
    ├── model.py                    # Model architecture
    ├── pretrainedModel.py          # Download/load the pretrained model
    ├── train_utils.py              # Train and evaluate the model
    ├── train.py                    # Parse parameters and launch training
    ├── train.sh                    # Shell script that launches train.py
    ├── utils.py                    # Training utilities
    └── visualize.py                # Visualize pooling weights for each epoch

⚙️ Architecture

🧩 Features

| Feature | Description |
|---------|-------------|
| ⚙️ Architecture | Python 3.10 environment, using the Hugging Face ecosystem for model training |
| 🔩 Code Quality | The codebase follows best practices with automated testing |
| 📄 Documentation | Medium Article |
| 🧩 Modularity | Modular, with abstract-factory-style modules for data loading, model creation, training, testing, and single-example inference |
| 🧪 Testing | src/inference.py |
| 📦 Dependencies | Key dependencies include Python, Hugging Face, and cuML |

🚀 Getting Started

🤖 Usage

From source
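
The repo ships a pyproject.toml for Poetry and a Makefile with environment-setup commands. Assuming Poetry is installed, a minimal from-source setup would look like:

$ git clone https://github.com/rsceth/Language-Model-Pooling-Exploration.git
$ cd Language-Model-Pooling-Exploration
$ poetry install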

🤖 Models

Model Architecture Detail

The model architectures are defined in src/model.py.
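
As a rough illustration of the two pooling ideas being compared, a minimal sketch follows; the class names, shapes, and wiring here are assumptions, not the repo's actual API:

```python
import torch
import torch.nn as nn


class LSTMPooler(nn.Module):
    """Run an LSTM over the final-layer token embeddings, keep the last hidden state."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.lstm = nn.LSTM(hidden_size, hidden_size, batch_first=True)

    def forward(self, token_embeddings):
        # token_embeddings: (batch, seq_len, hidden) from the encoder's last layer
        _, (last_hidden, _) = self.lstm(token_embeddings)
        return last_hidden.squeeze(0)  # (batch, hidden)


class WeightedLayerPooler(nn.Module):
    """Learn one weight per encoder layer and take a softmax-weighted average."""

    def __init__(self, num_layers: int):
        super().__init__()
        self.layer_weights = nn.Parameter(torch.ones(num_layers))

    def forward(self, all_hidden_states):
        # all_hidden_states: (num_layers, batch, seq_len, hidden),
        # e.g. stacked from the hidden_states output of XLM-RoBERTa
        weights = torch.softmax(self.layer_weights, dim=0)
        pooled = (weights[:, None, None, None] * all_hidden_states).sum(dim=0)
        return pooled[:, 0]  # representation of the first (<s>) token
```

A small classification head on top of either pooled vector then predicts the three sentiment classes; per the project structure above, src/visualize.py plots the pooling weights for each epoch.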

🚀 Train

$ bash src/train.sh
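
The config layout (a main.yaml plus model and process groups) matches the pattern used by Hydra-style configuration, so the two parameter variants can presumably be switched with overrides along these lines; this is an assumption, not a documented interface:

$ python src/train.py model=model2 process=process2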

🧪 Tests

$ python src/inference.py
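
src/inference.py is the entry point for single-example testing (see the Modularity note above). For orientation, this is roughly how a single sentence reaches the poolers with the Hugging Face API; the checkpoint loading and wiring of the real script may differ:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Base encoder only; the project's trained classification head is not loaded here.
tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
encoder = AutoModel.from_pretrained("xlm-roberta-base", output_hidden_states=True)

inputs = tokenizer("What a pleasant surprise!", return_tensors="pt")
with torch.no_grad():
    outputs = encoder(**inputs)

# outputs.hidden_states: one (batch, seq_len, hidden) tensor per layer --
# stacked, this is what a layer-weighted pooler like the sketch above consumes.
stacked = torch.stack(outputs.hidden_states)  # (num_layers, batch, seq_len, hidden)
print(stacked.shape)
```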

🤝 Contributing

📄 License

This project is released under the terms described in the LICENSE file.
