Google Translate API bottleneck

Question

I'm currently working on an ETL pipeline that takes too long to run. After checking which part of the code takes the longest, I found this:

I'm using the Google Cloud Translate API to translate keywords that don't have translations in my database, but I run into a bottleneck when I try to translate a large number of keywords. Here's the code I'm using:

from google.cloud import translate_v2 as gt

gt_client = gt.Client(target_language="de")
for keywd in no_translations:
    keywd_translated[keywd] = gt_client.translate(keywd)["translatedText"]
    if keywd_translated[keywd] == "":
        keywd_translated[keywd] = keywd

The problem is that this code takes a long time to execute when there are a lot of keywords to translate (about 10 of the 13 minutes of total run time are spent in this part). Is there a way to optimize this code, or the way it uses the API, to make it faster? Any suggestions would be greatly appreciated. Thanks!

I tried converting this piece of code to use asyncio, but saw no noticeable improvement.

Answer 1

Score: 0

The API docs say you can pass multiple values per call:

from google.cloud import translate_v2 as gt

def translate(words, to_language="de"):
    client = gt.Client(target_language=to_language)
    result = {}
    for value in client.translate(words):
        original = value["input"]
        trans = value["translatedText"]
        result[original] = trans if trans != "" else original
    return result
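
If the keyword list is very large, a single call may run into per-request size limits, so it can help to send the keywords in chunks and to drop duplicates first. Below is a minimal sketch of that idea, not a definitive implementation. The helper names (`chunked`, `translate_in_batches`, `make_google_batch_translator`) and the default `batch_size=100` are assumptions made for illustration; check the API's documented limits before picking a value. The translation call is passed in as a function so the batching logic can be tried without credentials.

```python
from typing import Callable, Dict, Iterable, List


def chunked(items: List[str], size: int) -> Iterable[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def translate_in_batches(
    words: Iterable[str],
    translate_batch: Callable[[List[str]], List[dict]],
    batch_size: int = 100,  # assumed value; check the API's per-request limits
) -> Dict[str, str]:
    """Translate `words` with one call per batch instead of one call per word."""
    # Drop duplicates while keeping order, so each keyword is sent only once.
    unique_words = list(dict.fromkeys(words))
    result: Dict[str, str] = {}
    for batch in chunked(unique_words, batch_size):
        for item in translate_batch(batch):
            original = item["input"]
            translated = item["translatedText"]
            # Fall back to the original keyword when the translation is empty.
            result[original] = translated if translated != "" else original
    return result


def make_google_batch_translator(target_language: str = "de"):
    """Hypothetical helper: return a batch translator backed by the v2 client."""
    # Imported lazily so the helpers above work without the library installed.
    from google.cloud import translate_v2 as gt

    client = gt.Client(target_language=target_language)
    # Given a list of strings, client.translate returns a list of dicts.
    return client.translate
```

With the names from the question, usage would look like `keywd_translated = translate_in_batches(no_translations, make_google_batch_translator("de"))`. This turns one request per keyword into one request per batch, which is where most of the time saving should come from.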

huangapple
  • Posted on March 3, 2023, 18:57:44
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75626189.html