Understanding mel-scaled spectrogram for a simple sine wave


Question


I generate a simple sine wave with a frequency of 100 Hz and compute an FFT to check that the detected frequency is correct.

Then I compute a melspectrogram, but I do not understand what its output means. Where do I see the frequency 100 in this output? Why is the yellow bar located around 25?

import numpy as np
import matplotlib.pyplot as plt
import scipy.fft
import librosa

def generate_sine_wave(freq, sample_rate, duration) -> tuple[np.ndarray, np.ndarray]:
    """Return time stamps (s) and samples of a sine wave at `freq` Hz."""
    x = np.linspace(0, duration, int(sample_rate * duration), endpoint=False)
    cycles = x * freq  # elapsed cycles at each time stamp
    # 2pi because np.sin takes radians
    y = np.sin(2 * np.pi * cycles)
    return x, y

sample_rate = 1024
freq = 100
x, y = generate_sine_wave(freq, sample_rate, 2)
plt.figure(figsize=(10, 4))
plt.plot(x, y)
plt.grid(True)

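# Magnitude spectrum of the positive-frequency half
fft = scipy.fft.fft(y)
fft = fft[0 : len(fft) // 2]
fft = np.abs(fft)
# Bin k sits at k * sample_rate / len(y) Hz, so the axis stops just below Nyquist
xs = np.linspace(0, sample_rate / 2, len(fft), endpoint=False)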
plt.figure(figsize=(15, 4))
plt.plot(xs, fft)
plt.grid(True)

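melsp = librosa.feature.melspectrogram(sr=sample_rate, y=y)
melsp = melsp.T  # transpose to (frames, mel bins) so the bins run along the x axis
plt.matshow(melsp)
plt.title('melspectrogram')
melsp_max = np.max(melsp)  # avoid shadowing the built-in max()
print('melsp.shape =', melsp.shape)
print('melsp max =', melsp_max)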

[Figures: sine wave plot, FFT magnitude plot, and melspectrogram for freq = 100]

If I change the frequency to 200, melspectrogram gives me this:

[Figure: melspectrogram for freq = 200]

Why is the yellow bar around 50?

Answer 1

Score: 2


librosa's melspectrogram function computes a mel-scaled spectrogram. This is the same as the usual linear-scale spectrogram, but with the frequency axis resampled to a warped mel scale.
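To make the "resampled frequency axis" concrete, here is a minimal NumPy sketch of the idea. It is not librosa's implementation: the function name `mel_energies_linear_region` is made up for this example, it processes a single windowed frame instead of a full STFT, and it skips the filter normalization. It also assumes sr / 2 is at most 1000 Hz, where the default (Slaney) mel scale is linear, so evenly spaced mel points are also evenly spaced in Hz.

```python
import numpy as np


def mel_energies_linear_region(y, sr, n_mels=128):
    """Apply a triangular filter bank to one windowed power spectrum.

    Illustrative only. Assumes sr / 2 <= 1000 Hz, where the default
    (Slaney) mel scale is linear in Hz.
    """
    spectrum = np.abs(np.fft.rfft(y * np.hanning(len(y)))) ** 2
    freqs = np.fft.rfftfreq(len(y), d=1.0 / sr)

    # n_mels + 2 edge points: filter i rises from edges[i] to its peak at
    # edges[i + 1] and falls back to zero at edges[i + 2].
    edges = np.linspace(0.0, sr / 2, n_mels + 2)
    energies = np.empty(n_mels)
    for i in range(n_mels):
        rising = (freqs - edges[i]) / (edges[i + 1] - edges[i])
        falling = (edges[i + 2] - freqs) / (edges[i + 2] - edges[i + 1])
        weights = np.maximum(0.0, np.minimum(rising, falling))
        energies[i] = weights @ spectrum
    return energies


sr = 1024
t = np.arange(2 * sr) / sr  # 2 seconds of samples, as in the question
for freq in (100, 200):
    y = np.sin(2 * np.pi * freq * t)
    peak = int(np.argmax(mel_energies_linear_region(y, sr)))
    print(f"{freq} Hz sine -> brightest mel bin {peak}")
```

Each mel bin is a weighted sum of neighboring FFT bins, so a pure tone lights up only the one or two mel bins whose triangles cover its frequency. For the 100 Hz and 200 Hz tones those bins fall next to 25 and 50, which is where the yellow bar appears.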

Relating a particular bin ("why 25?") to frequency in Hz is complicated but doable:

  1. melspectrogram maps the frequency range [0, sr/2] to mel space. In your example, [0, 512] Hz maps to the mel range 0 to 7.68 (= librosa.hz_to_mel(512)).
  2. That range is divided uniformly into 128 bins (the default n_mels). The center of the i-th mel bin corresponds approximately to librosa.mel_to_hz(i * 7.68 / 127).

Then for bins 25 and 50 in particular, we can verify that they correspond to the expected frequencies:

  • librosa.mel_to_hz(25 * 7.68 / 127) = 100.7874
  • librosa.mel_to_hz(50 * 7.68 / 127) = 201.5748
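The two steps above can be sketched without librosa. The code below re-implements the Slaney-style mel mapping (the librosa default, htk=False) in plain Python for illustration; the helper names `mel_bin_centers` and `nearest_bin` are made up for this example. It assumes the filter bank is built from n_mels + 2 evenly spaced mel points with the interior points as bin centers, which is how librosa.filters.mel lays out its triangles, so the exact centers differ slightly from the i * 7.68 / 127 approximation.

```python
import math

# Slaney-style mel scale, re-implemented here for illustration only.
F_SP = 200.0 / 3.0               # Hz per mel in the linear region
MIN_LOG_HZ = 1000.0              # the scale turns logarithmic above this
MIN_LOG_MEL = MIN_LOG_HZ / F_SP  # = 15 mel
LOGSTEP = math.log(6.4) / 27.0   # step size in the logarithmic region


def hz_to_mel(hz: float) -> float:
    if hz >= MIN_LOG_HZ:
        return MIN_LOG_MEL + math.log(hz / MIN_LOG_HZ) / LOGSTEP
    return hz / F_SP


def mel_to_hz(mel: float) -> float:
    if mel >= MIN_LOG_MEL:
        return MIN_LOG_HZ * math.exp(LOGSTEP * (mel - MIN_LOG_MEL))
    return mel * F_SP


def mel_bin_centers(sr: float, n_mels: int = 128) -> list[float]:
    """Center frequency (Hz) of each mel bin for the range [0, sr/2]."""
    max_mel = hz_to_mel(sr / 2)
    # n_mels + 2 evenly spaced mel points; the interior ones are the centers.
    step = max_mel / (n_mels + 1)
    return [mel_to_hz((i + 1) * step) for i in range(n_mels)]


def nearest_bin(freq: float, centers: list[float]) -> int:
    """Index of the bin whose center is closest to freq."""
    return min(range(len(centers)), key=lambda i: abs(centers[i] - freq))


centers = mel_bin_centers(sr=1024)
for freq in (100, 200):
    i = nearest_bin(freq, centers)
    print(f"{freq} Hz -> bin {i} (center {centers[i]:.2f} Hz)")
```

Because everything below 1000 Hz sits in the linear part of this scale, the bins in the question's example are spaced evenly at roughly 4 Hz each, which is why doubling the frequency from 100 to 200 roughly doubles the bin index.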

For plotting, the melspectrogram documentation suggests displaying mel-scale spectrograms using librosa.display.specshow with the option y_axis='mel', like:

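# S: output of librosa.feature.melspectrogram; sr: the sample rate used to compute it
fig, ax = plt.subplots()
S_dB = librosa.power_to_db(S, ref=np.max)
# fmax should match the fmax used for the melspectrogram (sr / 2 by default)
img = librosa.display.specshow(S_dB, x_axis='time',
                         y_axis='mel', sr=sr,
                         fmax=8000, ax=ax)
fig.colorbar(img, ax=ax, format='%+2.0f dB')
ax.set(title='Mel-frequency spectrogram')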

This plots the mel spectrogram with the y axis labeled in Hz, but correctly warped for the mel scale.

Posted by huangapple on June 6, 2023, 12:51:20
When reposting, please keep the link to this article: https://go.coder-hub.com/76411526.html