How do I export audio stored in a numpy array in a lossy format like m4a?
Question
I have some text-to-speech code that gives me a numpy array for its audio output. I can export this audio array to a WAV file like so:
```python
import numpy as np
import scipy.io.wavfile

# `audio` (the numpy array from the TTS code) and `output_path` are defined elsewhere.
sample_rate = 48000
# Peak-normalize the audio to the range [-1, 1].
audio_normalized = audio / np.max(np.abs(audio))
# https://docs.scipy.org/doc/scipy/reference/generated/scipy.io.wavfile.write.html
scipy.io.wavfile.write(output_path, sample_rate, audio_normalized)
```
But when the text is long, I get this error:
```
  File "/Users/evar/code/python/blackbutler/blackbutler/butler.py", line 216, in cmd_zsh
    scipy.io.wavfile.write(output_path,
  File "/Users/evar/mambaforge/lib/python3.10/site-packages/scipy/io/wavfile.py", line 812, in write
    raise ValueError("Data exceeds wave file size limit")
ValueError: Data exceeds wave file size limit
```
So I think I need to convert the numpy array to a small, lossy format like `m4a` or `mp3` using Python, and then save that.
Answer 1
Score: 1
Check out pydub, specifically the `AudioSegment()` constructor and its `.export()` method. It can encode audio held in a numpy array to mp3 once the array's raw sample bytes are passed to `AudioSegment()`.

Related questions:
- https://stackoverflow.com/questions/53633177/how-to-read-a-mp3-audio-file-into-a-numpy-array-save-a-numpy-array-to-mp3
- https://stackoverflow.com/questions/66191480/how-to-convert-a-numpy-array-to-a-mp3-file
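
A minimal sketch of the pydub route, assuming `audio` is a mono float array, that pydub is installed, and that an ffmpeg binary is on the PATH (pydub relies on it for MP3 encoding). The helper names `float_to_int16` and `export_mp3` and the `128k` bitrate are illustrative choices, not something taken from the question.

```python
import numpy as np


def float_to_int16(audio):
    """Peak-normalize a float array and convert it to 16-bit PCM samples."""
    audio = np.asarray(audio, dtype=np.float64)
    peak = np.max(np.abs(audio)) if audio.size else 0.0
    if peak > 0:
        audio = audio / peak  # scale into [-1, 1], as in the question
    # "<i2" = little-endian 16-bit integers, the layout raw PCM/WAV data uses.
    return (audio * 32767).astype("<i2")


def export_mp3(audio, sample_rate, output_path):
    """Encode a mono float array as MP3 (assumes pydub and ffmpeg are available)."""
    # Imported here so the conversion helper above works without pydub installed.
    from pydub import AudioSegment

    pcm = float_to_int16(audio)
    segment = AudioSegment(
        data=pcm.tobytes(),
        sample_width=pcm.dtype.itemsize,  # 2 bytes per sample
        frame_rate=sample_rate,
        channels=1,                       # assumption: mono TTS output
    )
    segment.export(output_path, format="mp3", bitrate="128k")
```

With the question's variables this would be called as `export_mp3(audio, 48000, "speech.mp3")`. pydub hands the `format` value to ffmpeg, so other containers depend on the local ffmpeg build; `format="ipod"` is often suggested for `.m4a`, but treat that as something to verify rather than a guarantee.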
If you would rather stay with .wav:
- you can divide your output into fixed-length parts and store each part as its own .wav file
- you can also resample your audio to a lower sample rate using `librosa.resample()`, or reduce the bit depth using `soundfile.write()` (via its `subtype` argument).
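
The two WAV-based ideas (smaller samples, fixed-length parts) can be sketched with NumPy and the standard-library `wave` module. This assumes mono float audio already scaled to [-1, 1]; the function name `write_wav_parts`, the `part_seconds` parameter, and the `_partNNN.wav` naming are hypothetical choices for illustration. If the TTS code returns 64-bit floats, storing 16-bit integers alone cuts the data to a quarter.

```python
import wave
from pathlib import Path

import numpy as np


def write_wav_parts(audio, sample_rate, out_dir, stem="speech", part_seconds=600):
    """Write mono float audio as consecutive 16-bit PCM WAV files.

    Returns the list of paths written, in playback order.
    """
    frames_per_part = int(part_seconds * sample_rate)
    if frames_per_part <= 0:
        raise ValueError("part_seconds * sample_rate must be at least 1 frame")

    audio = np.asarray(audio, dtype=np.float64)
    # Clip rather than renormalize so every part keeps the same loudness.
    pcm = (np.clip(audio, -1.0, 1.0) * 32767).astype("<i2")

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    paths = []
    for index, start in enumerate(range(0, len(pcm), frames_per_part)):
        path = out_dir / f"{stem}_part{index:03d}.wav"
        with wave.open(str(path), "wb") as wav_file:
            wav_file.setnchannels(1)            # mono
            wav_file.setsampwidth(2)            # 16-bit samples
            wav_file.setframerate(sample_rate)
            wav_file.writeframes(pcm[start:start + frames_per_part].tobytes())
        paths.append(path)
    return paths
```

Calling `write_wav_parts(audio_normalized, 48000, "out")` would produce ten-minute files under `out/`. Lowering the sample rate would be a separate step applied to the array before this function is called.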