Welcome to the Data-Centric Track of the Wake Vision Challenge 2! 🎉
The goal of this track is to push the boundaries of tiny computer vision by enhancing the data quality of the Wake Vision Dataset.
🔗 Learn More: Wake Vision Challenge 2 Details
Participants are invited to:
- Enhance the provided dataset to improve person detection accuracy.
- Train the MCUNet-VWW2 model, a state-of-the-art person detection model, on the enhanced dataset.
- Assess quality improvements on the public test set.
You can modify the dataset however you like, but the model architecture must remain unchanged. 🛠️
First, install Docker on your machine:
- Sign up on Harvard Dataverse.
- On your account information page, go to the API Token tab and create a new API token.
- Substitute "your-api-token-goes-here" with your API token in the following command and run it inside the directory where you cloned this repository to download and build the Wake Vision Dataset:
```shell
sudo docker run -it --rm -v "$(pwd):/tmp" -w /tmp tensorflow/tensorflow:2.19.0 python download_and_build_wake_vision_dataset.py your-api-token-goes-here
```
💡 Note: Make sure to have at least 600 GB of free disk space.
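To avoid a failed build partway through the download, you can check your free space up front. A minimal stdlib sketch (the 600 GB threshold comes from the note above; where you run it is up to you):

```python
import shutil

# The dataset build needs roughly 600 GB free on this filesystem.
free_gb = shutil.disk_usage(".").free / 1e9
print(f"{free_gb:.0f} GB free")
if free_gb < 600:
    print("Warning: likely not enough disk space to build the Wake Vision Dataset.")
```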
Run the following command inside the directory where you cloned this repository:
```shell
sudo docker run -it --rm -v "$(pwd):/tmp" -w /tmp tensorflow/tensorflow:2.19.0 python data_centric.py
```
- This trains the MCUNet-VWW2 model on the original dataset.
- Modify the dataset to improve the model's test accuracy by correcting labels or augmenting data.
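As a starting point, an augmentation can be as simple as a random horizontal flip of each training image. The sketch below works on a plain row-major pixel list so it stays self-contained; in practice you would apply the equivalent `tf.image` ops inside your training pipeline (the exact hook point in `data_centric.py` is up to you, not prescribed by the baseline):

```python
import random

def random_hflip(image, seed=None):
    """Flip a row-major image (a list of pixel rows) left-right with
    probability 0.5 — a stdlib stand-in for tf.image's random flip."""
    rng = random.Random(seed)
    if rng.random() < 0.5:
        return [row[::-1] for row in image]
    return image

image = [[1, 2, 3],
         [4, 5, 6]]
print(random_hflip(image, seed=1))
```

Seeding makes a given draw reproducible, which helps when you want to compare training runs on the same augmented data.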
- Install the NVIDIA Container Toolkit.
- Verify your GPU drivers.
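Before launching a long training run, it is worth confirming that the container can actually see your GPU. One quick check, assuming the same TensorFlow image used below:

```shell
# Should print at least one PhysicalDevice entry if the GPU is visible
sudo docker run --gpus all --rm tensorflow/tensorflow:2.19.0-gpu \
  python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
```

An empty list (`[]`) usually means the NVIDIA Container Toolkit or the drivers are not set up correctly.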
Run the following command inside the directory where you cloned this repository:
```shell
sudo docker run --gpus all -it --rm -v "$(pwd):/tmp" -w /tmp tensorflow/tensorflow:2.19.0-gpu python data_centric.py
```
- This trains the MCUNet-VWW2 model on the original dataset.
- Modify the dataset to enhance test accuracy while keeping the model architecture unchanged.
- Focus on Data Quality: Explore label correction, data augmentation, and other preprocessing techniques.
- Stay Efficient: The dataset is large, so plan your modifications carefully.
- Collaborate: Join the community discussion on Discord to share ideas and tips!
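To make the label-correction idea concrete, one simple approach is to keep a hand-curated table of reviewed labels and override the originals before training. The filenames and the `(filename, label)` record format below are hypothetical; the Wake Vision Dataset's actual storage layout will differ:

```python
def correct_labels(records, corrections):
    """Override a record's label wherever a reviewed correction exists.

    records:     iterable of (filename, label) pairs (hypothetical format)
    corrections: dict mapping filename -> manually reviewed label
    """
    return [(name, corrections.get(name, label)) for name, label in records]

# Example corrections from a manual review pass (illustrative values)
corrections = {"img_00042.jpg": 1, "img_00107.jpg": 0}

records = [("img_00042.jpg", 0), ("img_00999.jpg", 1)]
print(correct_labels(records, corrections))
# → [('img_00042.jpg', 1), ('img_00999.jpg', 1)]
```

Keeping corrections in a separate table (rather than editing the dataset in place) makes your changes auditable and easy to revert.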
Have questions or need help? Reach out on Discord.
🌟 Happy Innovating and Good Luck! 🌟