Understanding MFCC output for a simple sine wave

Question

I generate a simple sine wave with a frequency of 200 Hz and compute an FFT to check that the frequency obtained is correct.

Then I compute the MFCCs, but I do not understand what the output means.
How should the output be interpreted, and where can I see the 200 Hz frequency in it?

import numpy as np
import matplotlib.pyplot as plt
import scipy.fft
import librosa

def generate_sine_wave(freq, sample_rate, duration):
    x = np.linspace(0, duration, int(sample_rate * duration), endpoint=False)
    frequencies = x * freq
    # 2pi because np.sin takes radians
    y = np.sin(2 * np.pi * frequencies)
    return x, y

sample_rate = 1024
freq = 200
x, y = generate_sine_wave(freq, sample_rate, 1)
plt.figure(figsize=(10, 4))
plt.plot(x, y)
plt.grid(True)

fft = scipy.fft.fft(y)
fft = fft[0 : len(fft) // 2]
fft = np.abs(fft)
# Bin k of an N-point FFT corresponds to k * sample_rate / N Hz
xs = np.arange(len(fft)) * sample_rate / len(y)
plt.figure(figsize=(10, 4))
plt.plot(xs, fft)
plt.grid(True)

mfcc_feat = librosa.feature.mfcc(sr=sample_rate, y=y)
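# librosa returns an array of shape (n_mfcc, n_frames)
print('\nMFCC Parameters:')
print('   Coefficients per frame =', mfcc_feat.shape[0])
print('   Frame count            =', mfcc_feat.shape[1])
# Transpose to (n_frames, n_mfcc) so that each row of the plot is one frame
mfcc_feat = mfcc_feat.T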
plt.matshow(mfcc_feat)
plt.title('MFCC Features - librosa')

[Images: the three plots produced by the code above (waveform, FFT magnitude, MFCC matrix)]

If I change the frequency to 400 Hz, the MFCC output looks like this:

[Image: MFCC plot for the 400 Hz sine wave]

What do all the colors in these three rows mean?

Answer 1

Score: 1

Individual MFCCs are generally hard to interpret, so plotting and studying them is not very useful: it is difficult to relate a change in a given coefficient in a given time frame back to the original signal. You can find a good explanation of how MFCCs are computed here.
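To make the point about interpretability concrete, here is a rough sketch of the usual MFCC steps for a single frame (power spectrum, mel filterbank, logarithm, DCT). It is a simplified illustration, not librosa's implementation: the helper names (`mel_filterbank`, `dct_ii`, `toy_mfcc_frame`), the HTK-style mel formula, and the choice of 20 mel bands and 13 coefficients are all assumptions made for this example.

```python
import numpy as np

def hz_to_mel(f):
    # HTK-style mel formula (an assumption; other mel variants exist)
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sample_rate):
    # Triangular filters whose edges are evenly spaced on the mel scale
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_mels + 2))
    bank = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def dct_ii(x, n_out):
    # Unnormalized DCT-II, keeping the first n_out coefficients
    n = x.size
    k = np.arange(n_out)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n)[None, :] + 1) / (2 * n))
    return basis @ x

def toy_mfcc_frame(frame, sample_rate, n_mels=20, n_mfcc=13):
    windowed = frame * np.hanning(frame.size)
    power = np.abs(np.fft.rfft(windowed)) ** 2
    mel_energy = mel_filterbank(n_mels, frame.size, sample_rate) @ power
    log_mel = np.log(mel_energy + 1e-10)  # small floor avoids log(0)
    return dct_ii(log_mel, n_mfcc)

sample_rate = 1024
t = np.arange(sample_rate) / sample_rate
results = {}
for freq in (200, 400):
    frame = np.sin(2 * np.pi * freq * t)
    power = np.abs(np.fft.rfft(frame)) ** 2
    freqs = np.fft.rfftfreq(frame.size, d=1.0 / sample_rate)
    results[freq] = {
        "peak_hz": freqs[np.argmax(power)],          # readable directly from the FFT
        "mfcc": toy_mfcc_frame(frame, sample_rate),  # no single entry equals the frequency
    }
    print(freq, results[freq]["peak_hz"], np.round(results[freq]["mfcc"][:4], 1))
```

In this sketch the sine frequency can be read straight off the FFT peak, whereas in the MFCC vector it only determines which mel band holds the energy. The DCT then spreads that information across all coefficients, which is why no single value in the plot corresponds to 200 or 400.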

Meanwhile, here is a very short explanation of what you see and why:
Your signal y is split into frames, and for each frame you get 20 coefficients (the librosa default). The internal hop_length parameter defaults to 512 samples, so after padding, your 1-second sequence (1024 samples) is converted into 3 frames. Because your signal is essentially stationary in the frequency domain, the MFCCs barely change from frame to frame, so the colors change very little down each column.
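The frame count can be checked with a little arithmetic. The sketch below assumes centered framing (the signal is padded on both sides, so frame i is centered on sample i * hop_length); the helper name `count_centered_frames` is hypothetical and not part of librosa.

```python
def count_centered_frames(n_samples, hop_length=512):
    # With centered framing, frames are centered on samples
    # 0, hop_length, 2 * hop_length, ... up to n_samples
    return 1 + n_samples // hop_length

n_samples = 1024  # 1 second at sample_rate = 1024
hop_length = 512
n_frames = count_centered_frames(n_samples, hop_length)
frame_centers = [i * hop_length for i in range(n_frames)]
print(n_frames, frame_centers)  # 3 [0, 512, 1024]
```

With 20 coefficients per frame, this gives a 20 x 3 array, which becomes the 3 x 20 image in the question once it is transposed: one row per frame, one column per coefficient.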

huangapple
  • Published on June 6, 2023, 03:18:04
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76409411.html