Sperm whale bioacoustics

Sperm whale clan membership and coda type detection using inter-click intervals (ICIs)
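
The inputs throughout are inter-click intervals: the time differences between consecutive clicks of a coda. A minimal sketch of how they could be computed from detected click times (the times and array names here are illustrative, not taken from the repository):

```python
import numpy as np

# Hypothetical click onset times (in seconds) for a single coda.
click_times = np.array([0.000, 0.182, 0.371, 0.559, 0.744])

# Inter-click intervals: differences between consecutive click times.
icis = np.diff(click_times)
print(icis)  # [0.182 0.189 0.188 0.185]
```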

This repository builds on the work in Deep Machine Learning Techniques for the Detection and Classification of Sperm Whale Bioacoustics, a fascinating paper by Peter C. Bermant, Michael M. Bronstein, Robert J. Wood, Shane Gero & David F. Gruber.

The purpose of the repository is to:

  • reproduce the RNN experiments, providing executable code with accompanying explanations so that others can apply these novel techniques to their own data
  • identify whether architectural changes and modern training schedules (the 1cycle policy) can alleviate the need for the cumbersome pretraining step
  • experiment with, and report results of, using a Random Forest instead of the RNN: a model that can be trained to good effect without specialized deep learning knowledge or a GPU, is not prone to overfitting, and lends itself very well to interpretation (see the sketch after this list)
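
To illustrate the last point, a Random Forest can be fit on fixed-length ICI feature vectors with a few lines of scikit-learn and no GPU. This is only a sketch under assumed names; the random placeholder features and labels stand in for whatever representation is extracted from the coda data:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Placeholder data: each row is a coda represented by its ICIs
# (padded/truncated to a fixed length), each label is a clan id.
rng = np.random.default_rng(0)
X_train, y_train = rng.random((800, 9)), rng.integers(0, 2, 800)
X_valid, y_valid = rng.random((200, 9)), rng.integers(0, 2, 200)

# No GPU and no learning-rate schedule needed: fit and evaluate.
model = RandomForestClassifier(n_estimators=500, random_state=0)
model.fit(X_train, y_train)
print(accuracy_score(y_valid, model.predict(X_valid)))
```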

Above all, the goal of the repository is to provide an introduction to the world of marine biology and modern machine learning techniques.

Summary of results

We chose to focus on the clan detection and coda type detection tasks on the dominicana dataset. Those were the only tasks (apart from classification using a CNN) where results were reported on the validation set and where we were able to reconstruct how the data was processed (based on the paper and the accompanying repository).

Our goal was not breadth of inquiry, but rather to follow in the footsteps of the breakthrough paper and better understand the fascinating phenomenon of ICIs and their potential role in whale communication.

The notebooks are structured so that a broader audience can follow along.

Clan detection

model            accuracy
RNN (paper)      95.3%
RNN (ours)       93.7%
Random Forest    95.3%

Coda type detection

model            accuracy
RNN (paper)      99.9%
RNN (ours)       100%
Random Forest    99.6%

Our RNN and our Random Forest use the same train/validation split, so their results are the most directly comparable. The paper used a different split.
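
One way to guarantee an identical split for both models is to reuse a single seeded, stratified split; a hypothetical sketch with random placeholder data:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder coda features (ICIs) and clan labels.
rng = np.random.default_rng(0)
X = rng.random((1000, 9))
y = rng.integers(0, 2, 1000)

# One seeded, stratified split shared by both models makes their
# validation accuracies directly comparable.
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
```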

Overall, given the size of the validation set, the differences in performance are negligible: all the models perform similarly.

An interesting result is the stronger performance of our Random Forest as compared to our RNN on the clan detection task.

We found no benefit from pretraining; for RNNs we got the best results when training the entire model from scratch.
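
For reference, a bare-bones version of training a small recurrent classifier from scratch with the 1cycle policy in PyTorch; the CodaGRU module and the random placeholder data below are illustrative assumptions, not the repository's actual architecture:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder ICI sequences of length 9, two clans.
X = torch.rand(1000, 9, 1)
y = torch.randint(0, 2, (1000,))
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

class CodaGRU(nn.Module):
    def __init__(self):
        super().__init__()
        self.gru = nn.GRU(input_size=1, hidden_size=32, batch_first=True)
        self.head = nn.Linear(32, 2)

    def forward(self, x):
        _, h = self.gru(x)      # final hidden state
        return self.head(h[-1])

model = CodaGRU()
optimizer = torch.optim.Adam(model.parameters())
loss_fn = nn.CrossEntropyLoss()
epochs = 10

# 1cycle learning-rate schedule: warm up to max_lr, then anneal,
# stepped once per batch over the whole (from-scratch) training run.
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=1e-2, epochs=epochs, steps_per_epoch=len(loader)
)

for _ in range(epochs):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss_fn(model(xb), yb).backward()
        optimizer.step()
        scheduler.step()
```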

The main point we would like to highlight is how much less compute and specialized knowledge were needed to get good results with the Random Forest. On top of that, the interpretability of the model is unparalleled. Its robustness to training on unbalanced data and its legendary ability not to overfit (while remaining flexible enough to give good results) make it a great model to have in one's toolbox.
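
Interpretability here largely means asking which ICI positions drive the predictions, for example via the forest's impurity-based feature importances. A self-contained sketch with placeholder data:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Placeholder ICI features (9 intervals per coda) and clan labels.
rng = np.random.default_rng(0)
X, y = rng.random((1000, 9)), rng.integers(0, 2, 1000)

forest = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)

# Impurity-based importances: one score per ICI position,
# showing which intervals the forest relies on most.
for i, imp in enumerate(forest.feature_importances_):
    print(f"ICI {i + 1}: {imp:.3f}")
```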
