How to send an m4a file from FastAPI to OpenAI transcribe

Question

I'm trying to get an m4a file transcribed. I'm receiving this file at a FastAPI endpoint and then attempting to send it to OpenAI's transcribe, but it seems like the format/shape is off. How can I turn the UploadFile into something that OpenAI will accept? The OpenAI docs for transcribe are essentially:

> The transcriptions API takes as input the audio file you want to transcribe and the desired output file format for the transcription of the audio. We currently support multiple input and output file formats.

Here's my current code:

@app.post("/transcribe")
async def transcribe_audio_file(file: UploadFile = File(...)):
    contents = await file.read()
    contents_str = contents.decode()
    buffer = io.StringIO(contents_str)

    transcript_response = openai.Audio.transcribe("whisper-1", buffer)

I've tried several variations of the above code, which raise the following errors:

    transcript_response = openai.Audio.transcribe("whisper-1", file) # AttributeError: 'UploadFile' object has no attribute 'name'
    transcript_response = openai.Audio.transcribe("whisper-1", contents) # AttributeError: 'bytes' object has no attribute 'name'
    transcript_response = openai.Audio.transcribe("whisper-1", contents_str) # UnicodeDecodeError: 'utf-8' codec can't decode byte 0x86 in position 13: invalid start byte
    transcript_response = openai.Audio.transcribe("whisper-1", buffer) # UnicodeDecodeError: 'utf-8' codec can't decode byte 0x86 in position 13: invalid start byte
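The two UnicodeDecodeError cases fail for the same underlying reason: m4a audio is binary data, not UTF-8 text, so both `contents.decode()` and `io.StringIO` are the wrong tools. A minimal stdlib sketch of the distinction (the header bytes below are illustrative, not a real m4a file; they just include the 0x86 byte from the traceback):

```python
import io

# m4a audio is binary, so decoding it as UTF-8 fails on the first byte
# that isn't valid text -- here 0x86, the byte named in the error message.
# These bytes are illustrative only, not a real m4a header.
fake_m4a = b"\x00\x00\x00\x1cftypM4A \x86"

try:
    fake_m4a.decode()
    decoded = True
except UnicodeDecodeError as exc:
    decoded = False
    print("decode failed:", exc)

# Binary payloads belong in io.BytesIO, not io.StringIO:
buffer = io.BytesIO(fake_m4a)
print(buffer.read(4))  # -> b'\x00\x00\x00\x1c'
```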

I have something similar working in a vanilla CLI Python script that looks like this:

audio_file = open("./audio-file.m4a", "rb")
transcript_response = openai.Audio.transcribe("whisper-1", audio_file)

So I also tried using a method like that:

    with open(file.filename, "rb") as audio_file:
        transcript = openai.Audio.transcribe("whisper-1", audio_file)

But that gave the error:

FileNotFoundError: [Errno 2] No such file or directory: '6ad52ad0-2fce-4d79-b4ac-e154379ceacd'

Any tips on how to debug this myself are also welcome. I'm coming from TypeScript land.


Answer 1

Score: 1

As mentioned in this question, the solution is as follows:

import io

import openai
from fastapi import FastAPI, File, UploadFile

app = FastAPI()

@app.post("/transcribe")
async def transcribe_audio_file(file: UploadFile = File(...)):
    audio = await file.read()
    buffer = io.BytesIO(audio)
    buffer.name = 'audio.m4a'  # pretty sure any string here will do
    transcript_response = openai.Audio.transcribe("whisper-1", buffer)
    return transcript_response
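As a side note, the FileNotFoundError from `open(file.filename, "rb")` happens because `UploadFile.filename` is only the client-supplied name, not a path that exists on the server. The errors in the question ('bytes'/'UploadFile' object has no attribute 'name') suggest the openai client reads a `.name` attribute from whatever file object it is given, which is why attaching a name to the BytesIO works. If you would rather hand it a real file, here is a sketch using a named temporary file; `to_named_file` is a hypothetical helper, and its result would be passed to `openai.Audio.transcribe` in place of `buffer`:

```python
import tempfile

def to_named_file(data: bytes, suffix: str = ".m4a"):
    # The openai client appears to read the file object's .name attribute
    # (the 'no attribute name' errors above hint at this), so a named
    # temporary file with a proper extension mirrors the working CLI script.
    tmp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    tmp.write(data)
    tmp.seek(0)
    return tmp

audio_file = to_named_file(b"\x00fake-audio-bytes")  # illustrative bytes only
print(audio_file.name.endswith(".m4a"))  # -> True
# then: openai.Audio.transcribe("whisper-1", audio_file)
```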

huangapple
  • Published on 2023-06-29 03:07:19
  • Please retain this link when reposting: https://go.coder-hub.com/76576054.html