Modelling auditory similarity judgements between musical instrument playing techniques with graph neural networks
Sound retrieval for assisted music composition depends on evaluating similarity between musical instrument sounds, which is partly influenced by playing techniques. Previous methods utilizing Euclidean nearest neighbours over acoustic features show some limitations in retrieving sounds that share equivalent timbral properties but were potentially generated using a different instrument, playing technique, pitch or dynamic. In this paper, we present a metric learning system designed to approximate human similarity judgements between extended musical playing techniques using graph neural networks. Such structures are natural candidates for solving similarity retrieval tasks, yet have seen little application in modelling perceptual music similarity. We optimize a Graph Convolutional Network (GCN) over acoustic features via a proxy metric learning loss to learn embeddings that reflect perceptual similarities. Specifically, we construct the graph's adjacency matrix from the acoustic data manifold with an example-wise adaptive k-nearest neighbourhood graph: Adaptive Neighbourhood Graph Neural Network (AN-GNN). Our approach achieves 96.4% retrieval accuracy compared to 38.5% with a Euclidean metric and 86.0% with a multilayer perceptron (MLP), while effectively considering retrievals from playing techniques distinct from that of the query example.
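The graph construction and propagation described above can be illustrated with a minimal NumPy sketch. It is not the code in this repository: it assumes a fixed `k` for every example (the adaptive, example-wise neighbourhood of AN-GNN is not reproduced), uses random features in place of acoustic features, and applies one standard GCN propagation step with symmetric normalization. The function names `knn_adjacency` and `gcn_layer` are hypothetical.

```python
import numpy as np


def knn_adjacency(features: np.ndarray, k: int) -> np.ndarray:
    """Symmetric k-nearest-neighbour adjacency (fixed k, Euclidean distance)."""
    n = features.shape[0]
    # Pairwise squared Euclidean distances, shape (n, n).
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sum(diff ** 2, axis=-1)
    np.fill_diagonal(dist, np.inf)  # never pick an example as its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]
    adj = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    adj[rows, neighbours.ravel()] = 1.0
    return np.maximum(adj, adj.T)  # make the graph undirected


def gcn_layer(adj: np.ndarray, features: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """One propagation step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])  # add self-loops
    deg_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ weight, 0.0)


# Toy data standing in for acoustic features (assumption: 6 examples, 4 dims).
rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4))
w = rng.normal(size=(4, 2))  # untrained weights, for shape illustration only
adj = knn_adjacency(x, k=2)
emb = gcn_layer(adj, x, w)
```

In a trained system the weights would be optimized with a metric learning loss so that distances between rows of `emb` reflect perceptual similarity; here they are random and only show the data flow.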
git clone https://github.com/cyrusvahidi/ipt-similarity.git
cd ipt-similarity
pip install -e .
pip install -r requirements.txt
- SOL_0.9_HQ is a proprietary dataset from IRCAM. Please contact the authors for details.
- Similarity judgements coming soon ...
python ipt_sim/train.py
python ipt_sim/train_kfold.py data=kfold model=graph model.net.blocks=2 model.net.graph_type=ang model.net.dist=euclidean data.feature=openl3 model.prune_accuracy=True
- the model.prune_accuracy option determines if the retrieval metric ignores retrievals of the same IMT class as the query
- trains with K-folds cross validation
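To make the effect of pruning concrete, here is a hedged sketch of a nearest-neighbour retrieval score that can optionally ignore items from the query's own class. It is an illustration under assumptions, not the metric implemented in this repository: the function name `precision_at_k`, the use of precision at rank k, and the separate `cluster_ids` (perceptual relevance labels) and `class_ids` (IMT class labels) arguments are all hypothetical.

```python
import numpy as np


def precision_at_k(emb, cluster_ids, class_ids, k=5, prune=False):
    """Mean fraction of the k nearest neighbours sharing the query's cluster.

    If prune is True, items with the same class as the query are excluded
    from the candidate set before ranking (hypothetical behaviour).
    """
    n = emb.shape[0]
    dist = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)  # a query never retrieves itself
    if prune:
        same_class = class_ids[:, None] == class_ids[None, :]
        dist = np.where(same_class, np.inf, dist)
    scores = []
    for i in range(n):
        order = np.argsort(dist[i])
        ranked = order[np.isfinite(dist[i][order])][:k]  # drop excluded items
        if ranked.size == 0:
            continue  # no valid candidates for this query
        scores.append(np.mean(cluster_ids[ranked] == cluster_ids[i]))
    return float(np.mean(scores))


# Toy example: two tight pairs, each pair sharing both cluster and class.
emb = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 0.0], [1.1, 0.0]])
cluster_ids = np.array([0, 0, 1, 1])
class_ids = np.array([0, 0, 1, 1])

unpruned = precision_at_k(emb, cluster_ids, class_ids, k=1, prune=False)
pruned = precision_at_k(emb, cluster_ids, class_ids, k=1, prune=True)
```

In this toy case every correct retrieval comes from the query's own class, so the pruned score drops relative to the unpruned one. That is the kind of trivial same-class hit a pruned metric is meant to discount.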
@inproceedings{vahidi2023perceptual,
title={Perceptual musical similarity metric learning with graph neural networks},
author={Vahidi, Cyrus and Singh, Shubhr and Benetos, Emmanouil and Phan, Huy and Stowell, Dan and Fazekas, Gy{\"o}rgy and Lagrange, Mathieu},
booktitle={2023 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
pages={1--5},
year={2023},
organization={IEEE}
}