This project investigates how urban audio recordings can be classified into their respective classes using an image transfer learning model. EfficientNet-B1 serves as the baseline model, with log-mel spectrogram feature representations, and various audio data augmentation techniques are investigated to improve on the baseline results. Two notebooks are provided: one for the ESC-50/ESC-10 dataset and one for the UrbanSound8k dataset. Best results achieved:
- 94.25% on ESC-10
- 85.9% on ESC-50
- 76.5% on UrbanSound8k
- The model architecture was adapted from J. Kim's submission to the DCASE 2020 Challenge
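
The overview above pairs log-mel spectrogram features with an ImageNet-pretrained EfficientNet-B1. The snippet below is only a minimal sketch of that pipeline, not the exact code from the notebooks: it assumes librosa for feature extraction and timm for the pretrained backbone, and the hyperparameters (sample rate, FFT size, mel bands) and the file name `example.wav` are illustrative placeholders.

```python
# Minimal sketch: log-mel spectrogram features fed to a pretrained EfficientNet-B1.
# Assumes librosa for audio I/O and timm for the backbone; all settings below
# are illustrative, not the notebooks' exact configuration.
import librosa
import numpy as np
import torch
import timm

def log_mel_spectrogram(path, sr=22050, n_fft=1024, hop_length=512, n_mels=128):
    """Load a clip and convert it to a log-mel spectrogram (dB scale)."""
    y, sr = librosa.load(path, sr=sr)
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels
    )
    return librosa.power_to_db(mel, ref=np.max)

def to_model_input(log_mel):
    """Normalize to [0, 1] and repeat to 3 channels so the ImageNet-pretrained
    backbone can treat the spectrogram as an image."""
    x = (log_mel - log_mel.min()) / (log_mel.max() - log_mel.min() + 1e-8)
    x = torch.from_numpy(x).float().unsqueeze(0)   # (1, n_mels, time)
    return x.repeat(3, 1, 1).unsqueeze(0)          # (1, 3, n_mels, time)

# Pretrained backbone with a fresh classification head (50 classes for ESC-50).
model = timm.create_model("efficientnet_b1", pretrained=True, num_classes=50)
model.eval()

with torch.no_grad():
    # "example.wav" is a hypothetical path standing in for a dataset clip.
    logits = model(to_model_input(log_mel_spectrogram("example.wav")))
print(logits.shape)  # torch.Size([1, 50])
```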
If you have any feedback, please reach out to me at [email protected]
Future work:
- Experiment with a larger network
- Experiment with class-conditional data augmentation
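
The README does not list the specific augmentation techniques that were tried, so the sketch below only illustrates one common option, SpecAugment-style time/frequency masking on the log-mel spectrogram. The optional `allowed_classes` argument is a hypothetical way to gate the transform by label, in the spirit of the class-conditional idea above.

```python
# Illustrative example of an audio data augmentation: SpecAugment-style
# time/frequency masking applied to a log-mel spectrogram. This is an
# assumption about the kind of technique investigated, not the repo's code.
import numpy as np

def spec_augment(log_mel, label=None, allowed_classes=None,
                 freq_mask_width=16, time_mask_width=32, rng=None):
    """Mask one random frequency band and one random time span with the mean value."""
    rng = rng or np.random.default_rng()
    # Hypothetical class-conditional gate: only augment selected classes.
    if allowed_classes is not None and label not in allowed_classes:
        return log_mel
    out = log_mel.copy()
    n_mels, n_frames = out.shape
    f0 = rng.integers(0, max(1, n_mels - freq_mask_width))
    t0 = rng.integers(0, max(1, n_frames - time_mask_width))
    out[f0:f0 + freq_mask_width, :] = out.mean()
    out[:, t0:t0 + time_mask_width] = out.mean()
    return out
```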