🎛️ Regression Datasets

📋 Table of Contents

Installation
Usage
Datasets
License
Contact
Acknowledgments

This repository collects regression datasets from the vision, audio, and text domains. Each dataset is wrapped in a class that follows the PyTorch Dataset structure, so it can be downloaded and loaded automatically. All datasets carry permissive licenses that allow their use for research purposes.

1. Installation

Install the regsets package with pip:

python -m pip install regsets

Alternatively, download a single dataset file (e.g., utkface.py) and add it to your project to load that dataset locally.

2. Usage

The examples below show how to load datasets with the regsets package.

📸 Vision Datasets

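from regsets.vision import UTKFace

# Load the training split, downloading the data into ./data
utkface_trainset = UTKFace(root="./data", split="train", download=True)

# Each sample is an (image, label) pair
for image, label in utkface_trainset:
    ...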

🎧 Audio Datasets

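from regsets.audio import VCC2018

# Load the training split, downloading the data into ./data
vcc2018_trainset = VCC2018(root="./data", split="train", download=True)

# Each sample is an (audio, sample_rate, label) triple
for audio, sample_rate, label in vcc2018_trainset:
    ...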

📝 Text Datasets

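from regsets.text import Amazon_Review

# Load the training split, downloading the data into ./data
amazon_review_trainset = Amazon_Review(root="./data", split="train", download=True)

# Each sample pairs a tuple of three texts with a label
for texts, label in amazon_review_trainset:
    # Going by the names: the original text and two augmented versions
    (ori, aug_0, aug_1) = texts
    ...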
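
The loops above work because each dataset object can be indexed and has a length. As a rough illustration of that pattern, here is a minimal sketch of a map-style dataset written in plain Python. ToyRegressionDataset and its made-up samples are hypothetical and are not part of regsets; the sketch only mirrors the root, split, and download arguments used in the examples and assumes nothing about how the real classes are implemented.

```python
class ToyRegressionDataset:
    """Hypothetical stand-in (not a regsets class): a map-style dataset."""

    def __init__(self, root="./data", split="train", download=False):
        # root and download only mirror the calls shown above;
        # this toy class reads and fetches nothing.
        if split not in ("train", "test"):
            raise ValueError(f"unknown split: {split!r}")
        self.root = root
        self.split = split
        # Made-up (features, target) pairs with target = 2 * x + 1.
        samples = [([float(i)], 2.0 * i + 1.0) for i in range(10)]
        self.samples = samples[:8] if split == "train" else samples[8:]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        # Raises IndexError past the end, which is what stops a for loop.
        return self.samples[index]


trainset = ToyRegressionDataset(split="train")
total = 0.0
for features, label in trainset:
    total += label
print(len(trainset), total)
```

Defining only __len__ and __getitem__ is enough for both indexing and iteration, which is why the same for loop works on every dataset in the examples.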


3. Datasets

For datasets that do not provide a predefined train-test split, I randomly sample 80% of the data for training and reserve the remaining 20% for testing. Details for each dataset are provided below.
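
The 80/20 rule described above can be sketched as follows. This is an illustration only: the function name split_indices, the seed, and the shuffling procedure are assumptions, and the actual split shipped with regsets may have been produced differently. The sample count used in the example is the UTKFace total implied by the table in this section (18,964 + 4,741).

```python
import random


def split_indices(n_samples, train_percent=80, seed=0):
    """Shuffle range(n_samples) and cut it into train and test index lists."""
    indices = list(range(n_samples))
    # A local RNG keeps the split reproducible without touching global state.
    random.Random(seed).shuffle(indices)
    # Integer arithmetic avoids float rounding at the cut point.
    n_train = n_samples * train_percent // 100
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(23_705)
print(len(train_idx), len(test_idx))  # 18964 4741
```

The resulting sizes match the UTKFace row, which is consistent with an 80/20 cut of the full dataset.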

📸 Vision Datasets

Dataset	# Training Data	# Dev Data	# Test Data	Target Range
UTKFace	18,964	-	4,741	[1, 116]

🎧 Audio Datasets

Dataset	# Training Data	# Dev Data	# Test Data	Target Range
BVCC	4,974	1,066	1,066	[1, 5]
VCC2018	16,464	-	4,116	[1, 5]

📝 Text Datasets

Dataset	# Training Data	# Dev Data	# Test Data	Target Range
Amazon Review	250,000	25,000	650,000	[0, 4]
Yelp Review	250,000	25,000	50,000	[0, 4]
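
Because the target ranges differ between datasets, it can be convenient to rescale labels to [0, 1] before training and to map predictions back afterwards. The sketch below does this with min-max scaling, using the ranges listed in the tables above. It is not part of the regsets package; TARGET_RANGES, normalize, and denormalize are hypothetical names introduced for illustration.

```python
# Target ranges copied from the tables above.
TARGET_RANGES = {
    "UTKFace": (1, 116),
    "BVCC": (1, 5),
    "VCC2018": (1, 5),
    "Amazon Review": (0, 4),
    "Yelp Review": (0, 4),
}


def normalize(label, dataset):
    """Map a label from the dataset's target range onto [0, 1]."""
    low, high = TARGET_RANGES[dataset]
    return (label - low) / (high - low)


def denormalize(value, dataset):
    """Map a value in [0, 1] back onto the dataset's target range."""
    low, high = TARGET_RANGES[dataset]
    return value * (high - low) + low


print(normalize(1, "UTKFace"), normalize(116, "UTKFace"))  # 0.0 1.0
print(normalize(3, "BVCC"))  # 0.5
```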

pm25/Regression-Datasets
