Swift: frequency derived from vDSP.DCT output differs between iPhone and iPad

Question

I'm trying to determine the amplitude of each frequency in the sound captured by the microphone.

Just like this example: <https://developer.apple.com/documentation/accelerate/visualizing_sound_as_an_audio_spectrogram>

I capture samples from the microphone into a sample buffer, copy them to a circular buffer, and then perform a forward DCT on them, like this:

    func processData(values: [Int16]) {

        // Convert the Int16 samples to floating point.
        vDSP.convertElements(of: values,
                             to: &timeDomainBuffer)

        // Apply the Hann window in place.
        vDSP.multiply(timeDomainBuffer,
                      hanningWindow,
                      result: &timeDomainBuffer)

        // Forward DCT: time domain to frequency domain.
        forwardDCT.transform(timeDomainBuffer,
                             result: &frequencyDomainBuffer)

        // Keep only the magnitude of each coefficient.
        vDSP.absolute(frequencyDomainBuffer,
                      result: &frequencyDomainBuffer)

        // Convert magnitudes to decibels relative to sampleCount.
        vDSP.convert(amplitude: frequencyDomainBuffer,
                     toDecibels: &frequencyDomainBuffer,
                     zeroReference: Float(Microphone.sampleCount))

        // Discard the oldest frame's worth of values before appending the new frame.
        if frequencyDomainValues.count > Microphone.sampleCount {
            frequencyDomainValues.removeFirst(Microphone.sampleCount)
        }

        frequencyDomainValues.append(contentsOf: frequencyDomainBuffer)

    }

`timeDomainBuffer` is the float16 array that holds `sampleCount` samples,
while `frequencyDomainBuffer` holds the amplitude of each frequency: the array index denotes the frequency, and the value at that index is the amplitude of that frequency.

I'm trying to get the amplitude of each frequency, like this:

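    // Iterate over the indices (not the values) so each bin can be mapped to a frequency.
    for index in frequencyDomainBuffer.indices {
        let frequency = Double(index) * (AVAudioSession().sampleRate / Double(Microphone.sampleCount) / 2)
    }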

I assumed that the index into `frequencyDomainBuffer` is linear in the actual frequency, so the step per index should be the sample rate divided by twice the sample count (`sampleCount` is the length of `timeDomainBuffer`).
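
The index-to-frequency mapping described here can be checked with a minimal sketch in plain Swift. It assumes the transform is a type-II DCT, for which bin k of an N-sample frame is centered at k · sampleRate / (2N). The names `binFrequency` and `naiveDCTII` are hypothetical, and the O(N²) loop only stands in for `vDSP.DCT` so that the mapping can be verified without Accelerate.

```swift
import Foundation

/// Center frequency (Hz) of DCT-II bin `index` for a frame of `sampleCount` samples.
/// Hypothetical helper; assumes bins are spaced sampleRate / (2 * sampleCount) apart.
func binFrequency(index: Int, sampleRate: Double, sampleCount: Int) -> Double {
    return Double(index) * sampleRate / (2.0 * Double(sampleCount))
}

/// Unscaled reference DCT-II, O(N^2). Illustration only, not the vDSP implementation.
func naiveDCTII(_ x: [Double]) -> [Double] {
    let n = x.count
    return (0..<n).map { (k: Int) -> Double in
        var sum = 0.0
        for i in 0..<n {
            sum += x[i] * cos(Double.pi * (Double(i) + 0.5) * Double(k) / Double(n))
        }
        return sum
    }
}

let sampleRate = 48_000.0   // assumed rate for the demonstration
let sampleCount = 256       // assumed frame length
let targetBin = 32
let toneFrequency = binFrequency(index: targetBin, sampleRate: sampleRate, sampleCount: sampleCount)

// Synthesize a tone at half-sample offsets so it coincides with a DCT-II basis function.
let tone: [Double] = (0..<sampleCount).map { (i: Int) -> Double in
    let phase = 2.0 * Double.pi * toneFrequency * (Double(i) + 0.5) / sampleRate
    return cos(phase)
}

// Locate the strongest bin and map it back to a frequency.
let magnitudes = naiveDCTII(tone).map { abs($0) }
let peakBin = magnitudes.indices.max(by: { magnitudes[$0] < magnitudes[$1] })!
let peakFrequency = binFrequency(index: peakBin, sampleRate: sampleRate, sampleCount: sampleCount)
print("tone \(toneFrequency) Hz, peak bin \(peakBin), mapped back to \(peakFrequency) Hz")
```

The mapping only recovers the right frequency if `sampleRate` is the rate the samples were actually captured at, which is the quantity in doubt in this question.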

The result is correct when running on my iPad, but the frequencies come out about 10% higher on my iPhone.

I doubt whether `AVAudioSession().sampleRate` can be relied on on iPhone.

Of course I could add a condition such as "if iPhone", but I'd like to know why this happens and whether the result will be correct on other devices I haven't tested.

Answer 1

Score: 2

If you're seeing a consistent 10% difference, I'm betting it's actually an 8.8% difference (48/44.1 ≈ 1.088). I haven't studied your code, but I'd look for a hard-coded 44.1 kHz somewhere. The sample rate on iPhones is generally 48 kHz.
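
To illustrate how a mismatch between 44.1 kHz and 48 kHz would shift every reported frequency by the same factor, here is a small sketch. The two rates are the ones mentioned in this answer; the 1024-sample frame, the bin number, and the helper `mappedFrequency` are illustrative assumptions, not taken from the asker's code.

```swift
import Foundation

/// Frequency assigned to DCT-II bin `index` when `rate` is used in the mapping.
/// Hypothetical helper; assumes a bin spacing of rate / (2 * sampleCount).
func mappedFrequency(index: Int, rate: Double, sampleCount: Int) -> Double {
    return Double(index) * rate / (2.0 * Double(sampleCount))
}

let sampleCount = 1024      // assumed frame length
let rateA = 48_000.0        // one of the two rates named in the answer
let rateB = 44_100.0        // the other rate named in the answer
let bin = 100               // arbitrary bin chosen for the demonstration

let frequencyA = mappedFrequency(index: bin, rate: rateA, sampleCount: sampleCount)
let frequencyB = mappedFrequency(index: bin, rate: rateB, sampleCount: sampleCount)

// The relative offset does not depend on the bin: it is just the ratio of the two rates.
let ratio = frequencyA / frequencyB
print(String(format: "bin %d: %.2f Hz vs %.2f Hz, ratio %.4f", bin, frequencyA, frequencyB, ratio))
```

On a device, the rate in effect would presumably be read from the audio session or from the captured buffers' format once capture is running, rather than from a constant.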

Remember also that the bins are (as you suspect) proportional to the sampling rate, so at different sampling rates the centers of the bins are different. Depending on the number of bins you're using, this could represent a large difference (not really an "error", since the bins are ranges, but if you assume each value sits precisely at the center frequency, this could match your 10%).
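
The point that each bin is a range rather than a single frequency can be sketched as follows. This assumes DCT-II bins centered at index · sampleRate / (2 · sampleCount) and treats each bin as extending half a bin width to either side, which is a simplification; `binRange` and the 1024-sample frame are hypothetical.

```swift
import Foundation

/// Approximate frequency range covered by DCT-II bin `index`.
/// Assumption: the bin extends half a bin width on each side of its center.
func binRange(index: Int, sampleRate: Double, sampleCount: Int) -> ClosedRange<Double> {
    let width = sampleRate / (2.0 * Double(sampleCount))
    let center = Double(index) * width
    let lower = max(0.0, center - width / 2.0)
    let upper = center + width / 2.0
    return lower...upper
}

// The same bin index covers a different range at each sampling rate.
for rate in [44_100.0, 48_000.0] {
    let range = binRange(index: 10, sampleRate: rate, sampleCount: 1024)
    print("rate \(rate) Hz: bin 10 covers \(range.lowerBound) to \(range.upperBound) Hz")
}
```

With fewer bins (a shorter frame) each range gets wider, so reading a bin as exactly its center frequency becomes correspondingly less precise.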

Posted by huangapple on February 6, 2023 at 20:04:51