This repository contains the code for the project "Machine Unlearning in Large Language Models".
The project synthesizes methods from the papers "Who’s Harry Potter? Approximate Unlearning in LLMs" and "Locating and Editing Factual Associations in GPT" to develop a framework capable of implementing two distinct unlearning approaches:
- Selective Unlearning - Fine-tunes the model toward generic alternative labels derived from a reinforced model's predictions, selectively removing knowledge of specific content (see the first sketch below).
- Rank-One Model Editing (ROME) - Applies a closed-form rank-one update to a specific MLP weight matrix to rewrite individual factual associations precisely (see the second sketch below).
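Below is a minimal sketch of the selective-unlearning idea, roughly following the generic-label construction from "Who's Harry Potter?". The model names and the reinforced checkpoint path are placeholders, and the actual scripts in `/Selective_Unlearning` may differ in detail. Tokens whose probability the reinforced model boosts are treated as target-specific and pushed toward generic alternatives:

```python
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoints: a baseline model and a "reinforced" copy that has
# been fine-tuned further on the content we want to forget.
BASELINE = "gpt2"                   # assumed base model
REINFORCED = "./reinforced-gpt2"    # assumed local reinforced checkpoint

tokenizer = AutoTokenizer.from_pretrained(BASELINE)
baseline = AutoModelForCausalLM.from_pretrained(BASELINE)
reinforced = AutoModelForCausalLM.from_pretrained(REINFORCED)

def generic_label_logits(input_ids, alpha=1.0):
    """Build 'generic' target logits: down-weight tokens whose probability
    rises in the reinforced model, since those are tied to the target content."""
    with torch.no_grad():
        base_logits = baseline(input_ids).logits
        reinf_logits = reinforced(input_ids).logits
    return base_logits - alpha * F.relu(reinf_logits - base_logits)

def unlearning_loss(model, input_ids):
    """Cross-entropy toward the adjusted (generic) next-token targets."""
    generic_targets = generic_label_logits(input_ids).argmax(dim=-1)
    logits = model(input_ids).logits
    # Shift so position t predicts token t+1, as in causal LM training.
    return F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),
        generic_targets[:, 1:].reshape(-1),
    )

# Example: loss on a passage about the content to forget; in practice this
# loss would drive a short fine-tuning run of the baseline model.
batch = tokenizer(["Example passage about the content to forget."],
                  return_tensors="pt")
loss = unlearning_loss(baseline, batch["input_ids"])
```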
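And a sketch of the closed-form rank-one update at the heart of ROME, using random placeholder tensors. Deriving the key vector `k*` (from subject representations) and the value vector `v*` (via a small optimization against the target fact) is omitted here; the full procedure lives in `/ROME`:

```python
import torch

def rank_one_edit(W, k_star, v_star, C):
    """ROME-style rank-one update of a linear layer weight.

    W:      (d_out, d_in) MLP projection weight being edited
    k_star: (d_in,)  key vector representing the subject at this layer
    v_star: (d_out,) value vector that makes the model emit the new fact
    C:      (d_in, d_in) uncentered covariance of keys over broad text
    """
    C_inv_k = torch.linalg.solve(C, k_star)        # C^{-1} k*
    residual = v_star - W @ k_star                 # what the edit must add
    update = torch.outer(residual, C_inv_k) / (k_star @ C_inv_k)
    return W + update

# Tiny shape/consistency check with random tensors (not a real edit).
d_in, d_out = 16, 8
W = torch.randn(d_out, d_in)
k = torch.randn(d_in)
v = torch.randn(d_out)
M = torch.eye(d_in) + 0.1 * torch.randn(d_in, d_in)
C = M @ M.T                                        # positive-definite covariance
W_new = rank_one_edit(W, k, v, C)
# After the edit, the key k maps (approximately) to the new value v.
print(torch.allclose(W_new @ k, v, atol=1e-4))
```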
- `/Selective_Unlearning`: Contains scripts and notebooks implementing the selective unlearning process.
- `/ROME`: Includes the implementation of Rank-One Model Editing (ROME) for precise factual modifications in models.
- Mannal Kamble - [email protected]
- Karthvik Sarvade - [email protected]
- Thanks to Professor Gustavo Sandoval for his guidance and mentorship throughout this project.
- Inspired by the methodologies detailed in recent unlearning research papers.