How do I export audio stored in a numpy array in a lossy format like m4a?

Question

I have some text-to-speech code that gives me a numpy array for its audio output. I can export this audio array to a WAV file like so:

```python
import numpy as np
import scipy.io.wavfile

sample_rate = 48000

# Scale the audio so that its peak absolute amplitude is 1.0
audio_normalized = audio / np.max(np.abs(audio))

# Docs: https://docs.scipy.org/doc/scipy/reference/generated/scipy.io.wavfile.write.html
scipy.io.wavfile.write(output_path, sample_rate, audio_normalized)
```

But when the text is long, I get this error:

  File "/Users/evar/code/python/blackbutler/blackbutler/butler.py", line 216, in cmd_zsh
    scipy.io.wavfile.write(output_path,
  File "/Users/evar/mambaforge/lib/python3.10/site-packages/scipy/io/wavfile.py", line 812, in write
    raise ValueError("Data exceeds wave file size limit")
ValueError: Data exceeds wave file size limit

So I think I need to convert the numpy array to a small, lossy format like m4a or mp3 using Python, and then save that.



Answer 1

Score: 1

Check out pydub (the AudioSegment class and its .export() method). It can save a numpy array to mp3.
Related questions:
https://stackoverflow.com/questions/53633177/how-to-read-a-mp3-audio-file-into-a-numpy-array-save-a-numpy-array-to-mp3
https://stackoverflow.com/questions/66191480/how-to-convert-a-numpy-array-to-a-mp3-file
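
A minimal sketch of the pydub route, under a few assumptions: the audio is a mono float array like the one in the question, ffmpeg is installed (pydub calls it to encode mp3), and 16-bit samples are good enough for speech. The helper names float_to_int16 and export_mp3 and the 64k default bitrate are made up for this illustration, not part of pydub.

```python
import numpy as np


def float_to_int16(audio: np.ndarray) -> np.ndarray:
    """Peak-normalize float audio and convert it to 16-bit PCM samples."""
    audio = np.asarray(audio, dtype=np.float64)
    peak = np.max(np.abs(audio))
    if peak > 0:
        audio = audio / peak
    # Scale to the int16 range; astype truncates toward zero.
    return (audio * 32767).astype(np.int16)


def export_mp3(audio: np.ndarray, sample_rate: int, output_path: str,
               bitrate: str = "64k") -> None:
    """Write a mono float array to an mp3 file through pydub."""
    # Imported here so float_to_int16 stays usable without pydub installed.
    from pydub import AudioSegment

    pcm = float_to_int16(audio)
    segment = AudioSegment(
        data=pcm.tobytes(),
        sample_width=pcm.dtype.itemsize,  # 2 bytes per sample for int16
        frame_rate=sample_rate,
        channels=1,  # assumption: mono text-to-speech output
    )
    segment.export(output_path, format="mp3", bitrate=bitrate)
```

With the variables from the question, the call would look like export_mp3(audio, 48000, "speech.mp3"), where "speech.mp3" is a placeholder path.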

If you would prefer to stay with .wav:

  • you can divide your output into fixed-length parts and store each
    part as a separate .wav file
  • you can also resample your audio to a lower sample rate
    using librosa .resample(), or change the bit depth using Soundfile
    .write().
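
Both .wav suggestions can be sketched together. Note that this sketch swaps in scipy (scipy.signal.resample_poly and scipy.io.wavfile.write) for librosa and Soundfile, only because scipy is already used in the question. The function names, the part_0000.wav naming scheme, and the 10-minute default part length are assumptions for illustration, and the audio is assumed to be a mono float array. Writing int16 instead of floats shrinks each sample to 2 bytes, which matters because the WAV container stores its sizes in 32-bit fields.

```python
from math import gcd
from pathlib import Path

import numpy as np
from scipy.io import wavfile
from scipy.signal import resample_poly


def to_int16(audio):
    """Peak-normalize float audio and convert it to 16-bit PCM."""
    peak = np.max(np.abs(audio))
    scaled = audio / peak if peak > 0 else audio
    return (scaled * 32767).astype(np.int16)


def write_wav_parts(audio, sample_rate, out_dir,
                    seconds_per_part=600, target_rate=None):
    """Optionally resample, then write fixed-length 16-bit WAV parts.

    Returns the list of file paths that were written, in order.
    """
    audio = np.asarray(audio, dtype=np.float64)
    if target_rate is not None and target_rate != sample_rate:
        # Polyphase resampling by the reduced ratio target_rate / sample_rate
        g = gcd(target_rate, sample_rate)
        audio = resample_poly(audio, target_rate // g, sample_rate // g)
        sample_rate = target_rate

    pcm = to_int16(audio)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    samples_per_part = int(seconds_per_part * sample_rate)
    paths = []
    for index, start in enumerate(range(0, len(pcm), samples_per_part)):
        path = str(out_dir / f"part_{index:04d}.wav")
        # The last part may be shorter than the others
        wavfile.write(path, sample_rate, pcm[start:start + samples_per_part])
        paths.append(path)
    return paths
```

For the question's data, something like write_wav_parts(audio, 48000, "out", target_rate=24000) would halve the sample rate and keep every file far below the size limit.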

Posted by huangapple on 2023-02-24 02:16:31