Using C within Python code for faster fourier transform?

Question
I have a function that computes the fourier transform of an incoming audio stream, and then identifies what musical notes are most dominant within the audio. Everything is working (though I'm sure my code may be janky in many ways). The main thing I'd like to fix is the speed of the function. I'm using a chunk size of 2^14, which makes the function quite accurate, but also way slower than I'd like. I was wondering if there is any way to implement a C fft function that could run faster, but still have it run within my Python code. I'm also open to any other suggestions on how to get this function to run faster.
Here is the current function:
```python
import struct

import numpy as np
from scipy.signal import find_peaks
# `fft` is assumed here to be pyFFTW's NumPy-compatible interface, since
# NumPy's own np.fft.fft has no `threads` argument
from pyfftw.interfaces import numpy_fft as fft


def pitch_calculations(stream, CHUNK, RATE):
    # Read mic stream and then call struct.unpack to convert from binary data back to floats
    data = stream.read(CHUNK, exception_on_overflow=False)
    dataInt = np.array(struct.unpack(str(CHUNK) + 'h', data))
    # Apply a window function (Hamming) to the input data
    windowed_data = np.hamming(CHUNK) * dataInt
    # Use a fast Fourier transform to convert mic data into frequencies
    fft_result = np.abs(fft.fft(windowed_data, threads=6)) * 2 / (11000 * CHUNK)
    freqs = np.fft.fftfreq(len(windowed_data), d=1.0 / RATE)
    # Keep only the positive-frequency half of the spectrum
    fft_result = fft_result[:len(fft_result) // 2]
    freqs = freqs[:len(freqs) // 2]
    # Find the indices of local maxima in the spectrum using scipy.signal.find_peaks
    localmax_indices = find_peaks(fft_result, height=0.04)[0]
    # Get the magnitudes of the local maxima
    strong_freqs = fft_result[localmax_indices]
    # Sort the magnitudes in descending order
    sorted_indices = np.argsort(strong_freqs)[::-1]
    # Calculate the pitches from the peaks, and how many cents sharp/flat each is
    pitches = [abs(freqs[localmax_indices[index]]) for index in sorted_indices]
    note_list = [calculate_tuning(pitch)[0] for pitch in pitches]
    cent_list = [calculate_tuning(pitch)[1] for pitch in pitches]
    note_list = order_pitches(note_list)
    note_list = remove_duplicates(note_list)
    return note_list, cent_list
```
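Since the question asks about calling a faster C FFT from Python: SciPy's `scipy.fft` module already wraps a native (pocketfft, C++) implementation and exposes a `workers` argument for multithreading, so it may be worth trying before writing any C by hand. A minimal sketch, with random samples standing in for the mic data and the thread count chosen arbitrarily:

```python
import numpy as np
from scipy import fft as sp_fft  # native pocketfft backend, callable from Python

CHUNK = 2 ** 14
RATE = 44100

# Stand-in for one chunk of microphone samples; in the real code this
# comes from stream.read + struct.unpack
windowed_data = np.hamming(CHUNK) * np.random.randn(CHUNK)

# Real-to-complex transform, parallelized over 4 threads via `workers`
spectrum = np.abs(sp_fft.rfft(windowed_data, workers=4))
freqs = sp_fft.rfftfreq(CHUNK, d=1.0 / RATE)
```

Because `rfft` returns only the non-redundant half of the spectrum, the manual positive-frequency slicing in the original function is no longer needed.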
Answer 1

Score: 2
As others have said, NumPy already performs the underlying FFT computation with optimized native code.

The real problem is that a transform size of 2^14 is a nontrivial cost for any implementation. Suggestions to speed it up:

- Reduce the audio sample rate and/or the transform size, if possible. 2^14 seems very high for determining musical note frequencies. It might be enough to downsample the audio to, say, 24 kHz (e.g. with librosa.resample) and use a transform size of 4096, for a spectrum with a frequency resolution of 5.9 Hz per bin. Then use interpolation to estimate peaks with sub-bin precision.
- Use float32 instead of float64: before the transform, cast with `windowed_data = windowed_data.astype(np.float32)`.
- Use a real-to-complex FFT with `np.fft.rfft` instead of the standard complex-to-complex transform. This can save nearly a factor of 2 in computation time.
- Process a batch of windows at a time, instead of one by one. In particular, a batch of FFTs can in some cases be computed more efficiently than each FFT individually. For instance, use `np.fft.rfft(x, axis=0)` with `x` a 2D array to compute the real-to-complex FFT of each column of `x`.
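Several of these suggestions (float32, real-to-complex transform, batched windows, sub-bin interpolation) can be sketched together. This is only an illustration: the frames are random noise standing in for audio, and `refine_peak` is a generic parabolic-interpolation helper, not part of the original code:

```python
import numpy as np

RATE = 24000   # downsampled rate suggested above
N = 4096       # smaller transform size
n_windows = 8

# A batch of frames as columns of a 2D float32 array, windowed in one shot
frames = np.random.randn(N, n_windows).astype(np.float32)
windowed = np.hamming(N).astype(np.float32)[:, None] * frames

# One real-to-complex FFT per column, done in a single batched call
spectra = np.abs(np.fft.rfft(windowed, axis=0))  # shape (N//2 + 1, n_windows)
freqs = np.fft.rfftfreq(N, d=1.0 / RATE)         # ~5.9 Hz per bin


def refine_peak(mag, k):
    """Quadratic (parabolic) interpolation around bin k for sub-bin precision."""
    if k == 0 or k == len(mag) - 1:
        return float(k)
    y0, y1, y2 = mag[k - 1], mag[k], mag[k + 1]
    denom = y0 - 2 * y1 + y2
    if denom == 0:
        return float(k)
    # Vertex of the parabola through the three points, offset in [-0.5, 0.5]
    return k + 0.5 * (y0 - y2) / denom


mag = spectra[:, 0]
k = int(np.argmax(mag))
peak_hz = refine_peak(mag, k) * RATE / N
```

With real input, `rfft` already returns only the `N//2 + 1` non-redundant bins, so the positive-half slicing in the question's code disappears as well.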