问题

以下是您要翻译的内容：

AudioInputStream stream = AudioSystem.getAudioInputStream(new File("file_a4.wav"));

我正在寻找一种方法来识别录制在.wav文件中的音乐音阶声音的频率（例如，A4 = 440赫兹）。我已经阅读了很多关于FFT的内容，但有人建议音乐音阶上的频率与FFT不匹配。

我也听说过DTFT。我应该使用什么方法来识别音频文件中的频率？

英文:

AudioInputStream stream = AudioSystem.getAudioInputStream(new File(&quot;file_a4.wav&quot;));

I am looking for a way to recognise the frequency of a musical scale sound (e.g. A4 = 440 Hz) recorded on a .wav file. I have read a lot about FFT, but it has been suggested that the frequencies on the musical scale do not match the FFT.

I have also heard about DTFT. What should I use to recognise the frequency from a sound file?

答案1

得分: 1

根据您的问题，我理解您希望识别音频文件中乐器演奏的音符。如果是这样的话，有几种算法可以实现这个目标，您也可以训练神经网络来完成这项任务。需要注意的一些重要事项包括：

任何乐器（对人声产生的音乐音调也是如此）在演奏音符时都有其独特的“色彩”。这种色彩被称为音色（https://en.wikipedia.org/wiki/Timbre），它由在您听到特定音符时在其周围出现的谐波和非谐波频率组成。这就是为什么您不能仅仅寻找FFT的峰值来检测音乐音符的原因，也是为什么钢琴在演奏相同音符时与吉他的声音不同的原因。
对音频信号的分析通常通过对信号进行窗口化并计算窗口部分的DFT来进行。然后，每个窗口将产生自己的频谱，通过分析每个单独频谱和/或分析它们的相互作用，您（或例如您的卷积神经网络）将得出结论/结果。这个对信号进行窗口化并计算DFT的过程会产生频谱图（https://en.wikipedia.org/wiki/Spectrogram#:~:text=A%20spectrogram%20is%20a%20visual,sonographs%2C%20voiceprints%2C%20or%20voicegrams.）。

在这个简短的介绍之后，以下是一些在音频文件中识别单个音符的简单算法。您可以在互联网上找到这些算法的实现，以及许多其他算法。检测和弦产生的音符要更复杂一些，但可以使用其他算法或神经网络来完成。

使用自相关分析进行音高检测的方法：https://ieeexplore.ieee.org/document/1162905
YIN算法：http://audition.ens.fr/adc/pdf/2002_JASA_YIN.pdf

英文:

What I understand from your question is that you want to recognize the musical note/s an instrument is playing in a wav file. If that is the case, there are several algorithms for doing that, and you could always train a neural network for doing that too.
Some important Things to take into account are:

Any instrument (the same would happen for the musical sounds produced by the human voice) has its own particular "color" when producing a note. This color is called the timbre (https://en.wikipedia.org/wiki/Timbre), and is composed by the harmonic and inharmonic frequencies that surround the frequency you psychoacoustically perceive when listening to that specific note. This is why you cannot just look for the peak of an FFT to detect the musical note, and it is also the reason why a piano sounds different than a guitar when playing the same note.
The analysis of an audio signal is often performed by windowing the signal and calculating the DFT of the windowed part of the signal. Each window would then produce its own spectrum, and it s from the analysis of each individual spectrum and/or the analysis of how they interact that you (or your CNN, for example) will obtain your conclusions/results. This process of windowing the signal and calculating the DFTs produces a spectogram (https://en.wikipedia.org/wiki/Spectrogram#:~:text=A%20spectrogram%20is%20a%20visual,sonographs%2C%20voiceprints%2C%20or%20voicegrams.)

After that short introduction, here are some simple algorithms for identifying single notes in a wav file. You will be able to find implementations of those algorithms on the internet, and many others. The detection of the notes produced by chords is more complex but can be done with other algorithms or neural networks.

On the use of autocorrelation analysis for pitch detection: https://ieeexplore.ieee.org/document/1162905
YIN algorithm: http://audition.ens.fr/adc/pdf/2002_JASA_YIN.pdf

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

如何在Java中从一个.WAV文件中识别声音频率

问题

答案1

可以制作一个对象的ArrayList吗？

为什么我不能在@Mapping属性中引用@Context参数？

如何阻止方法put()和take()从’无限等待’中恢复？

确定传感器是否存在并正常工作的可靠方法是什么？

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

发表评论