How to send an m4a file from FastAPI to OpenAI transcribe
Question
I'm trying to get an m4a file transcribed. I'm receiving this file at a FastAPI endpoint and then attempting to send it to OpenAI's transcribe but it seems like the format/shape is off. How can I turn the UploadFile into something that OpenAI will accept? The OpenAI docs for transcribe are essentially:
> The transcriptions API takes as input the audio file you want to transcribe and the desired output file format for the transcription of the audio. We currently support multiple input and output file formats.
Here's my current code:
@app.post("/transcribe")
async def transcribe_audio_file(file: UploadFile = File(...)):
    contents = await file.read()
    contents_str = contents.decode()
    buffer = io.StringIO(contents_str)
    transcript_response = openai.Audio.transcribe("whisper-1", buffer)
I've tried several variations of the above code; each one fails with the error shown in its comment:
transcript_response = openai.Audio.transcribe("whisper-1", file) # AttributeError: 'UploadFile' object has no attribute 'name'
transcript_response = openai.Audio.transcribe("whisper-1", contents) # AttributeError: 'bytes' object has no attribute 'name'
transcript_response = openai.Audio.transcribe("whisper-1", contents_str) # UnicodeDecodeError: 'utf-8' codec can't decode byte 0x86 in position 13: invalid start byte
transcript_response = openai.Audio.transcribe("whisper-1", buffer) # UnicodeDecodeError: 'utf-8' codec can't decode byte 0x86 in position 13: invalid start byte
I have something similar working in a vanilla CLI Python script that looks like this:
audio_file = open("./audio-file.m4a", "rb")
transcript_response = openai.Audio.transcribe("whisper-1", audio_file)
So I also tried using a method like that:
with open(file.filename, "rb") as audio_file:
    transcript = openai.Audio.transcribe("whisper-1", audio_file)
But that gave the error:
FileNotFoundError: [Errno 2] No such file or directory: '6ad52ad0-2fce-4d79-b4ac-e154379ceacd'
Any tips on how to debug this myself are also welcome. I'm coming from TypeScript land.
Answer 1
Score: 1
As mentioned in this question, the solution is as follows:
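import io
import openai
from fastapi import FastAPI, File, UploadFile

app = FastAPI()

@app.post("/transcribe")
async def transcribe_audio_file(file: UploadFile = File(...)):
    audio = await file.read()  # raw bytes; do not decode binary audio as text
    buffer = io.BytesIO(audio)
    buffer.name = 'audio.m4a'  # the library reads .name; keep an extension that matches the audio format
    transcript_response = openai.Audio.transcribe("whisper-1", buffer)
    return transcript_response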
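For context on why this works: the audio is binary, so it has to stay as bytes (decoding it as UTF-8 is what raises the UnicodeDecodeError), and the AttributeError messages in the question suggest the library looks for a .name on whatever file object it is given. Below is a minimal sketch of a helper that wraps the bytes and reuses the upload's extension. The helper name to_named_buffer and the ".m4a" fallback are my own assumptions for illustration, not part of the OpenAI library or FastAPI.

```python
import io
import os
from typing import Optional


def to_named_buffer(
    data: bytes, filename: Optional[str], default_ext: str = ".m4a"
) -> io.BytesIO:
    """Wrap raw upload bytes in a binary file-like object that has a .name."""
    # Reuse only the extension of the client-supplied filename. Fall back to
    # a default when there is none (assumption: the upload is m4a audio).
    ext = os.path.splitext(filename or "")[1].lower() or default_ext
    buffer = io.BytesIO(data)  # binary buffer, so no text decoding happens
    buffer.name = f"audio{ext}"
    return buffer
```

In the endpoint, this would stand in for the two buffer lines, for example `buffer = to_named_buffer(await file.read(), file.filename)`. The fallback covers uploads like the one in the question, where the filename was a bare UUID with no extension.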