Going back and forth between the time and frequency domains is an everyday task in audio processing. But can you go back in time without any phase information? The goal of this project is to compare data-driven (e.g., CNN-based) and hand-crafted (e.g., Griffin-Lim algorithm) solutions for reconstructing the audio waveform from a spectrogram (i.e., STFT magnitude with no phase information). Evaluations are performed on different types of audio, including speech, music and urban sounds.
Please refer to this Wiki as the official guide of the project.
Here you can read the final report of this project.
Here you can listen to the results of the different algorithms and take a look at additional images of the comparison.
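
To give a feel for the hand-crafted baseline mentioned above, here is a minimal NumPy sketch of the Griffin-Lim idea: start from a random phase, then alternate between going back to the time domain and re-imposing the known magnitude. It is only an illustration and not the code used in this project. The function names (`stft`, `istft`, `griffin_lim`) and all parameter values (`n_fft=256`, `hop=64`, `n_iter=50`, the 8 kHz test tone) are assumptions chosen for the example.

```python
import numpy as np


def stft(x, n_fft=256, hop=64):
    """Hann-windowed STFT. Returns an array of shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft + 1)[:-1]  # periodic Hann window
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop: i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def istft(spec, n_fft=256, hop=64):
    """Least-squares inverse STFT via windowed overlap-add."""
    window = np.hanning(n_fft + 1)[:-1]
    n_frames = spec.shape[0]
    length = n_fft + hop * (n_frames - 1)
    out = np.zeros(length)
    norm = np.zeros(length)
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    for i in range(n_frames):
        out[i * hop: i * hop + n_fft] += frames[i] * window
        norm[i * hop: i * hop + n_fft] += window ** 2
    # Avoid dividing by zero where no window energy reaches a sample.
    return out / np.maximum(norm, 1e-8)


def griffin_lim(magnitude, n_iter=50, n_fft=256, hop=64, seed=0):
    """Estimate a waveform from an STFT magnitude (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Assumption: random initial phase; other initialisations are possible.
    phase = np.exp(2j * np.pi * rng.random(magnitude.shape))
    for _ in range(n_iter):
        signal = istft(magnitude * phase, n_fft, hop)
        # Keep only the phase of the re-analysed signal; the magnitude
        # is always reset to the known spectrogram.
        phase = np.exp(1j * np.angle(stft(signal, n_fft, hop)))
    return istft(magnitude * phase, n_fft, hop)


# Usage: discard the phase of a 440 Hz tone, then try to get the tone back.
sample_rate = 8000  # assumed sample rate for the demo
t = np.arange(256 + 64 * 100) / sample_rate
tone = np.sin(2 * np.pi * 440 * t)
magnitude = np.abs(stft(tone))  # the "spectrogram": magnitude only
reconstructed = griffin_lim(magnitude)
```

The reconstruction generally differs from the original sample by sample (for instance by a sign flip or a time shift) while its spectrogram stays close to the target, which is why comparisons of this kind tend to rely on spectral metrics and listening tests rather than waveform error.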