Understanding mel-scaled spectrogram for a simple sine wave

Question

I generate a simple sine wave with a frequency of 100 Hz and compute an FFT to check that the recovered frequency is correct.

Then I compute a melspectrogram, but I do not understand what its output means. Where do I see the frequency 100 in this output? Why is the yellow bar located in the 25th area?

    import numpy as np
    import matplotlib.pyplot as plt
    import scipy.fft
    import librosa


    def generate_sine_wave(freq, sample_rate, duration) -> tuple[np.ndarray, np.ndarray]:
        x = np.linspace(0, duration, int(sample_rate * duration), endpoint=False)
        frequencies = x * freq
        # 2pi because np.sin takes radians
        y = np.sin(2 * np.pi * frequencies)
        return x, y


    sample_rate = 1024
    freq = 100
    x, y = generate_sine_wave(freq, sample_rate, 2)

    # Time-domain plot of the sine wave
    plt.figure(figsize=(10, 4))
    plt.plot(x, y)
    plt.grid(True)

    # Magnitude spectrum, positive frequencies only
    fft = scipy.fft.fft(y)
    fft = fft[0 : len(fft) // 2]
    fft = np.abs(fft)
    # endpoint=False so that each FFT bin is plotted at its exact frequency
    xs = np.linspace(0, sample_rate / 2, len(fft), endpoint=False)
    plt.figure(figsize=(15, 4))
    plt.plot(xs, fft)
    plt.grid(True)

    # Mel spectrogram, transposed so rows are frames and columns are mel bins
    melsp = librosa.feature.melspectrogram(sr=sample_rate, y=y)
    melsp = melsp.T
    plt.matshow(melsp)
    plt.title('melspectrogram')

    melsp_max = np.max(melsp)  # renamed to avoid shadowing the built-in max()
    print('melsp.shape =', melsp.shape)
    print('melsp max =', melsp_max)

[Figures: sine wave, FFT magnitude spectrum, and mel spectrogram for freq = 100]

If I change the frequency to 200, melspectrogram gives me this:

[Figure: mel spectrogram for freq = 200]

Why is the yellow bar in the 50 area?

Answer 1

Score: 2

librosa's melspectrogram function computes a mel-scaled spectrogram. This is the same as the usual linear-scale spectrogram, but with the frequency axis resampled to a warped mel scale.

Relating a particular bin ("why 25?") to frequency in Hz is complicated but doable:

  1. melspectrogram maps frequency range [0, sr/2] to mel space. In your example, [0, 512] Hz maps to mel in the range 0 to 7.68 (= librosa.hz_to_mel(512)).
  2. The range is uniformly divided into 128 bins (by default). The i-th mel bin center corresponds approximately to librosa.mel_to_hz(i * 7.68 / 127).
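As a rough illustration of step 1, the Hz-to-mel conversion can be reproduced with the standard library alone. The sketch below hand-codes the Slaney-style mel formula, which to my knowledge is what librosa applies by default (htk=False). The functions here are local re-implementations that merely share names with librosa's, and the constant names are my own.

```python
import math

# Constants of the Slaney-style mel scale (assumed to match librosa's default).
F_SP = 200.0 / 3.0                # Hz per mel in the linear region
MIN_LOG_HZ = 1000.0               # the scale becomes logarithmic above this
MIN_LOG_MEL = MIN_LOG_HZ / F_SP   # 15 mel
LOGSTEP = math.log(6.4) / 27.0    # step size in the logarithmic region


def hz_to_mel(hz: float) -> float:
    """Convert a frequency in Hz to mel (linear below 1 kHz, log above)."""
    if hz < MIN_LOG_HZ:
        return hz / F_SP
    return MIN_LOG_MEL + math.log(hz / MIN_LOG_HZ) / LOGSTEP


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    if mel < MIN_LOG_MEL:
        return mel * F_SP
    return MIN_LOG_HZ * math.exp(LOGSTEP * (mel - MIN_LOG_MEL))


sr = 1024
max_mel = hz_to_mel(sr / 2)
print(round(max_mel, 2))  # 7.68
```

With sr = 1024 every frequency of interest lies below 1000 Hz, so only the linear branch is exercised and the mel axis is simply a rescaled Hz axis.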

Then for bins 25 and 50 in particular, we can verify that they correspond to the expected frequencies:

  • librosa.mel_to_hz(25 * 7.68 / 127) = 100.7874
  • librosa.mel_to_hz(50 * 7.68 / 127) = 201.5748
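The arithmetic above can be cross-checked with a small filter-bank sketch. As far as I can tell, librosa.filters.mel builds its triangular bands from n_mels + 2 equally spaced mel points, so band i is centered on point i + 1 of 130 rather than on i * 7.68 / 127. The code below assumes that construction, assumes all frequencies stay below 1000 Hz (where the Slaney scale is linear), and ignores filter normalization, which is a constant factor in that region. The names band_weights and peak_band are hypothetical helpers, not librosa functions.

```python
HZ_PER_MEL = 200.0 / 3.0  # linear Slaney region, valid below 1000 Hz


def band_weights(tone_hz: float, sr: int = 1024, n_mels: int = 128) -> list[float]:
    """Unnormalized triangular weight of every mel band at tone_hz."""
    max_mel = (sr / 2) / HZ_PER_MEL
    # n_mels + 2 equally spaced mel points, converted back to Hz
    edges = [k * max_mel / (n_mels + 1) * HZ_PER_MEL for k in range(n_mels + 2)]
    weights = []
    for i in range(n_mels):
        lo, center, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (tone_hz - lo) / (center - lo)
        falling = (hi - tone_hz) / (hi - center)
        weights.append(max(0.0, min(rising, falling)))
    return weights


def peak_band(tone_hz: float) -> int:
    """Zero-based index of the band that responds most strongly to the tone."""
    w = band_weights(tone_hz)
    return max(range(len(w)), key=w.__getitem__)


for tone in (100, 200):
    print(tone, "Hz -> band", peak_band(tone))
# 100 Hz -> band 24
# 200 Hz -> band 49
```

Under these assumptions the strongest bands are 24 and 49 (zero-based), within one bin of the 25 and 50 read off the plots, with the neighboring band also receiving part of the energy.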

For plotting, the melspectrogram documentation suggests displaying mel-scale spectrograms using librosa.display.specshow with the option y_axis='mel', like:

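    import numpy as np
    import matplotlib.pyplot as plt
    import librosa
    import librosa.display

    # y and sample_rate are the sine wave and sampling rate from the question
    S = librosa.feature.melspectrogram(y=y, sr=sample_rate)
    S_dB = librosa.power_to_db(S, ref=np.max)

    fig, ax = plt.subplots()
    img = librosa.display.specshow(S_dB, x_axis='time', y_axis='mel',
                                   sr=sample_rate, fmax=sample_rate / 2, ax=ax)
    fig.colorbar(img, ax=ax, format='%+2.0f dB')
    ax.set(title='Mel-frequency spectrogram')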

This plots the mel spectrogram with the y axis labeled in Hz, but correctly warped for the mel scale.

huangapple
  • Posted on June 6, 2023, 12:51:20
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76411526.html