
Detoxifier

Through this project, we develop a pipeline to detoxify an input sentence without using parallel detoxified data to train our model.

We identify the toxic word, mask it, and run our fine-tuned BERT to generate the most appropriate non-toxic word as a replacement.
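
A minimal sketch of this flow, with the three stages passed in as callables (the names are illustrative, not the actual module API):

```python
# Minimal sketch of the pipeline's shape; the three stage functions are
# assumptions standing in for the project's notebooks, not its real API.
def detoxify(sentence, mask_toxic_words, generate_alternatives, choose_best):
    masked = mask_toxic_words(sentence)               # 1. identify and mask the toxic word(s)
    candidates = generate_alternatives(masked)        # 2. fine-tuned BERT proposes replacements
    return choose_best(sentence, masked, candidates)  # 3. rank by similarity, fluency, non-toxicity
```

Each stage corresponds to one of the sections below.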

Approach Used

Identification of the toxic word

  • To mask the toxic word, we use the following approaches:
    • Bag of words: we simply mask the words present in a pre-existing list of toxic and non-toxic words. We obtained this list by running NgramSalienceCalculator on the previous lists. (code present here)
    • Linear regression: a coefficient is generated for each word by fitting a linear model on the toxic and non-toxic corpora; words with high toxicity coefficients are masked.
    • Using RoBERTa, a toxic-sentence classifier: for a toxic sentence of n words, we create n variants by masking a different word in each one. If a variant falls below a toxicity threshold, we return it; otherwise, we mask another token in that variant. This process repeats until the sentence is below the threshold (see the sketch after this list).
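
A greedy sketch of this classifier-guided masking step, assuming `toxicity` is any callable that maps a sentence to a score in [0, 1] (a RoBERTa classifier in the project); the threshold and helper names are illustrative:

```python
# Sketch of classifier-guided masking, not the project's exact code.
# `toxicity` is an assumed callable mapping a sentence to a score in [0, 1].
MASK = "[MASK]"

def mask_toxic_words(sentence, toxicity, threshold=0.5, max_masks=3):
    masked = sentence.split()
    for _ in range(max_masks):
        if toxicity(" ".join(masked)) < threshold:
            break
        # Build one variant per remaining word and keep the variant that
        # lowers the classifier score the most.
        best_idx, best_score = None, float("inf")
        for i, tok in enumerate(masked):
            if tok == MASK:
                continue
            variant = masked[:i] + [MASK] + masked[i + 1:]
            score = toxicity(" ".join(variant))
            if score < best_score:
                best_idx, best_score = i, score
        if best_idx is None:
            break
        masked[best_idx] = MASK
    return " ".join(masked)
```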

Generating Alternatives

  • To generate substitutes for the masked tokens, we use a BERT model trained with the masked language modelling (MLM) objective (a brief example follows).
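
For illustration, the same step with the stock bert-base-uncased checkpoint standing in for our fine-tuned model; the Hugging Face fill-mask pipeline returns the top-k candidate tokens with their scores:

```python
# Illustrative only: the project uses its own fine-tuned BERT; here the stock
# bert-base-uncased checkpoint stands in for it.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

candidates = fill_mask("you are such a [MASK] person", top_k=10)
for c in candidates:
    print(c["token_str"], round(c["score"], 4))
```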

Fine-Tuning BERT

  • A list of masked positive and negative sentences is used to fine-tune this model. During training, we pass the masked positive sentences with flag 0 and the negative ones with flag 1 into the model, with the masked token as the target.

  • When generating tokens for our task, we always pass flag 0, since we want the output to be non-toxic.

  • We use the following approaches to mask tokens for fine-tuning (a sketch follows the list):

    • Random approach: one word is masked at random in each sentence. This approach was suggested in the paper Conditional BERT Contextual Augmentation.
    • Targeted approach: mask the words present in a pre-existing list of toxic and non-toxic words.
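
A sketch of how the fine-tuning examples could be assembled under these two schemes; the function and field names are illustrative rather than the project's actual preprocessing code:

```python
# Illustrative sketch of building (masked sentence, flag, target) examples.
# flag 0 = non-toxic (positive) sentences, flag 1 = toxic (negative) sentences.
import random

MASK = "[MASK]"

def random_mask(tokens):
    """Random approach: mask one word chosen uniformly at random."""
    i = random.randrange(len(tokens))
    return tokens[:i] + [MASK] + tokens[i + 1:], tokens[i]

def targeted_mask(tokens, vocab):
    """Targeted approach: mask a word found in the pre-existing word lists."""
    for i, tok in enumerate(tokens):
        if tok.lower() in vocab:
            return tokens[:i] + [MASK] + tokens[i + 1:], tokens[i]
    return None, None

def build_examples(sentences, flag, vocab=None):
    examples = []
    for sent in sentences:
        tokens = sent.split()
        if not tokens:
            continue
        masked, target = targeted_mask(tokens, vocab) if vocab else random_mask(tokens)
        if target is not None:
            examples.append({"text": " ".join(masked), "flag": flag, "target": target})
    return examples
```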

Choosing the correct alternative

  • The fine-tuned BERT gives us 10 possible alternatives for the originally masked token.

  • We evaluate each alternative by the product of its similarity to the original masked token, its fluency, and its non-toxicity.

  • Similarity is determined with cosine similarity, fluency with GPT2LMHeadModel, and non-toxicity with RoBERTa (a ranking sketch follows the example below).

  • We also provide multi-token alternatives by replacing the single mask with a double mask and comparing the output with that obtained from a single mask. An example is shown below:

Example 3
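
The ranking described above could look like the sketch below; `similarity` and `non_toxicity` are assumed callables (cosine similarity over embeddings and a RoBERTa classifier in the project), and fluency is taken as the inverse GPT-2 perplexity:

```python
# Sketch of candidate ranking: score = similarity * fluency * non-toxicity.
# `similarity(original, candidate)` and `non_toxicity(sentence)` are assumed
# callables; only the fluency part is spelled out, using GPT-2 perplexity.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

gpt2_tok = GPT2TokenizerFast.from_pretrained("gpt2")
gpt2 = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def fluency(sentence):
    """Inverse perplexity under GPT-2: higher means more fluent."""
    ids = gpt2_tok(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = gpt2(ids, labels=ids).loss   # mean token cross-entropy
    return float(torch.exp(-loss))          # exp(-loss) = 1 / perplexity

def rank_candidates(original_token, masked_sentence, candidates,
                    similarity, non_toxicity):
    """Return the candidate replacements sorted best-first."""
    scored = []
    for cand in candidates:
        sentence = masked_sentence.replace("[MASK]", cand, 1)
        score = similarity(original_token, cand) * fluency(sentence) * non_toxicity(sentence)
        scored.append((score, cand))
    return [c for _, c in sorted(scored, reverse=True)]
```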

Repository Structure

The script folder contains the notebooks for the baseline models, fine-tuning BERT, and the final detoxification pipeline.

The data folder contains:

  • jigsaw dataset: can be downloaded from here. This data is used to evaluate the accuracy of our model.
  • training data: the list of positive and negative sentences used to fine-tune BERT.
  • vocab: the lists of positive and negative words used for masking.

How to run?

  • Run pip install -r requirements.txt to install the required dependencies.
  • To detoxify a given sentence, refer to this.


Models

The fine-tuned BERT models can be found here.

Qualitative Analysis

Some examples of the output obtained:

Example 1

Example 2

Example 3

Observation

The target-masked fine-tuned model showed better results than the random-masked fine-tuned model suggested in the paper Conditional BERT Contextual Augmentation.

For example,

Example 1

Results

The metric used to evaluate our model was (similarity * fluency) / perplexity. The scores obtained were as follows:

  • Baseline Model (T5 Paraphraser): 4.370629371e-3
  • Random-masked BERT: 1.419647927e-2
  • Target-masked BERT: 1.744591766e-2
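
For clarity, the metric written as a one-line helper (a sketch of the stated formula only; how the three quantities are aggregated over the test set is not shown here):

```python
# Sketch of the evaluation metric: (similarity * fluency) / perplexity.
def evaluation_score(similarity, fluency, perplexity):
    return (similarity * fluency) / perplexity
```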
