July 12, 2023 22:28:19

What exactly happens to large amplitude values when we load a .wav file using librosa?

Question

I want to understand what happens to large amplitude values in a .wav file when I load it using librosa.

I was inspecting the amplitude values of .wav files by viewing their waveforms with librosa, and I wanted to see how scaling those values affects the sound. So I multiplied the samples by a scaling factor. However, when I played the result using IPython.display.Audio, I could not hear any effect on the sound:

from IPython.display import Audio, display

scaled_signal = signal * 10  # signal is the original sample
# play the scaled signal
print('Play the scaled sample:')
display(Audio(data=scaled_signal, rate=sr))

So I saved the file to my PC, and there I could hear the difference: the amplitude was indeed scaled. Then I decided to reload this file using librosa. Surprisingly, when I played it again in my Jupyter notebook, I could now hear the effect of scaling:

import librosa
import soundfile

soundfile.write('scaled_signal.wav', scaled_signal, sr)
# load the scaled signal again
scaled_signal, sr = librosa.load('scaled_signal.wav', sr=sr)
print('The scaled sample loaded again')
display(Audio(data=scaled_signal, rate=sr))

However, on plotting the waveform (see below), I could see that its shape had changed. Can you help me understand what happened and why? It looks as if an upper bound was applied to the magnitude of the amplitudes.

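import matplotlib.pyplot as plt
from librosa.display import waveshow

fig, axs = plt.subplots(1, 2)
fig.set_figwidth(18)
waveshow(signal, sr=sr, ax=axs[0])         # original signal
waveshow(scaled_signal, sr=sr, ax=axs[1])  # scaled signal after reloading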

Figure: waveforms of the original signal and the scaled signal.

Answer 1

Score: 0

librosa.load() does not apply any data-dependent normalization or scaling. It only maps integer formats such as int16/int32 to floating-point values in the -1.0 to 1.0 range.
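
The flattened waveform in the question is consistent with clipping at the write step rather than with anything librosa.load() does. As far as I know, soundfile.write() stores a .wav file as 16-bit PCM unless a subtype is given, and float samples outside the representable range are clipped during that conversion. The sketch below imitates that round trip in plain Python under those assumptions; float_to_pcm16 and pcm16_to_float are hypothetical helpers written for illustration, not functions from soundfile or librosa.

```python
import math

INT16_MAX = 32767  # largest positive 16-bit PCM value


def float_to_pcm16(sample):
    """Hypothetical helper: clip a float sample to [-1.0, 1.0], then quantize to int16."""
    clipped = max(-1.0, min(1.0, sample))
    return int(round(clipped * INT16_MAX))


def pcm16_to_float(value):
    """Hypothetical helper: map an int16 value back to a float, as a loader would."""
    return value / 32768.0


# One cycle of a sine wave with peak 0.5, standing in for the original signal.
signal = [0.5 * math.sin(2 * math.pi * n / 16) for n in range(16)]
scaled_signal = [10 * s for s in signal]  # peak is now about 5.0

# Simulated save-and-reload round trip (assumed 16-bit PCM storage).
reloaded = [pcm16_to_float(float_to_pcm16(s)) for s in scaled_signal]

print("peak before saving:  ", max(abs(s) for s in scaled_signal))
print("peak after reloading:", max(abs(s) for s in reloaded))
```

Every sample whose scaled magnitude exceeded 1.0 comes back pinned just below 1.0, which is the flat-topped shape seen in the plot. If the out-of-range values need to survive the round trip, writing with subtype='FLOAT' in soundfile.write() should avoid the integer conversion.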

From the documentation for IPython.display.Audio, which you are using to play back the audio:

> If the array option is used the waveform will be normalized.
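
This is why multiplying by 10 made no audible difference in the notebook. A minimal sketch of the idea, assuming the normalization is a division by the peak absolute value (peak_normalize is a hypothetical stand-in, not IPython's actual code):

```python
def peak_normalize(samples):
    """Hypothetical helper: rescale so the largest absolute sample becomes 1.0."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # silence stays silence
    return [s / peak for s in samples]


signal = [0.0, 0.25, 0.5, 0.25, 0.0, -0.25, -0.5, -0.25]
scaled_signal = [10 * s for s in signal]

normalized_original = peak_normalize(signal)
normalized_scaled = peak_normalize(scaled_signal)

# A constant gain cancels out once both signals are divided by their own peak.
print(normalized_original == normalized_scaled)
```

After normalization the two signals are identical, so they play back at the same loudness. Recent IPython versions also accept normalize=False in Audio(...); in that mode the data is expected to stay within -1 to 1, so a signal scaled beyond that range would have to be brought back into range first.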
