Embeddings are non-deterministic even for durations < 7s #42

simonmandlik · 2024-09-21T13:52:04Z

Hi, I have a problem similar to #24, but my audio is only 6 seconds long, i.e. shorter than the 7-second limit.

MWE:

from msclap import CLAP
import torch
import subprocess

with torch.no_grad():
    clap_model = CLAP(version = "2023", use_cuda=False)

    f = "/home/simon.mandlik/test.wav"

    audio_embeddings_1 = clap_model.get_audio_embeddings([f])
    audio_embeddings_2 = clap_model.get_audio_embeddings([f])

    print(audio_embeddings_1)
    print(audio_embeddings_2)

    mse = torch.mean((audio_embeddings_1 - audio_embeddings_2)**2)
    print(mse)
    print(subprocess.check_output(['ffprobe', f, '-hide_banner']))
    print(clap_model.args)

Output:

tensor([[-1.5895, -0.9305,  0.0572,  ...,  1.6071, -0.0361,  0.6508]])
tensor([[-1.5228, -1.0532,  0.0794,  ...,  1.6698, -0.0152,  0.4471]])
tensor(0.0190)
Input #0, wav, from '/home/simon.mandlik/test.wav':
  Metadata:
    encoder         : Lavf61.1.100
  Duration: 00:00:06.00, bitrate: 1536 kb/s
  Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz, 2 channels, s16, 1536 kb/s
b''
Namespace(text_model='gpt2', text_len=77, transformer_embed_dim=768, freeze_text_encoder_weights=True, audioenc_name='HTSAT', out_emb=768, sampling_rate=44100, duration=7, fmin=50, fmax=8000, n_fft=1024, hop_size=320, mel_bins=64, window_size=1024, d_proj=1024, temperature=0.003, num_classes=527, batch_size=1024, demo=False)


simonmandlik · 2024-09-23T13:59:49Z

I found the cause. The culprit is this line. My audio is dual-channel, and this line doubles the effective "length" from 6 s to 12 s, which exceeds the configured duration=7.
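One possible workaround, sketched under assumptions rather than taken from msclap itself, is to downmix the file to mono before passing it to the model, so that the sample count matches the real duration. The helper name `downmix_to_mono` is hypothetical, it uses only the Python standard library, and it only handles 16-bit PCM WAV files like the one in the ffprobe output above.

```python
import array
import sys
import wave


def downmix_to_mono(src_path, dst_path):
    """Average the channels of a 16-bit PCM WAV file and write a mono copy.

    Hypothetical helper for illustration; not part of msclap.
    Returns the duration of the audio in seconds.
    """
    with wave.open(src_path, "rb") as src:
        n_channels = src.getnchannels()
        sample_width = src.getsampwidth()
        frame_rate = src.getframerate()
        n_frames = src.getnframes()
        raw = src.readframes(n_frames)

    if sample_width != 2:
        raise ValueError("this sketch only handles 16-bit PCM")

    samples = array.array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    # Samples are interleaved (L, R, L, R, ...): average each frame.
    mono = array.array(
        "h",
        (sum(samples[i:i + n_channels]) // n_channels
         for i in range(0, len(samples), n_channels)),
    )
    if sys.byteorder == "big":
        mono.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(sample_width)
        dst.setframerate(frame_rate)
        dst.writeframes(mono.tobytes())

    return n_frames / frame_rate
```

The mono copy can then be passed to `clap_model.get_audio_embeddings([...])` in place of the original file. Whether this fully removes the non-determinism depends on the msclap version in use, so it is worth re-running the MSE check from the MWE afterwards. Converting with `ffmpeg -i test.wav -ac 1 mono.wav` should have the same effect.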
