Google Translate API bottleneck
Question
I'm currently working on an ETL pipeline, and it takes too long to run. After checking which part of the code takes the longest, I found the following.
I'm using the Google Cloud Translate API to translate keywords that don't have translations in my database, but I run into a bottleneck when I try to translate a large number of keywords. Here's the code I'm using:
from google.cloud import translate_v2 as gt

gt_client = gt.Client(target_language="de")
for keywd in no_translations:
    keywd_translated[keywd] = gt_client.translate(keywd)["translatedText"]
    if keywd_translated[keywd] == "":
        keywd_translated[keywd] = keywd
The problem is that this code takes a long time to execute when there are a lot of keywords to translate (this part accounts for 10 of the 13 minutes of total runtime). Is there a way to optimize this code, or my use of the API, to make it faster? Any suggestions would be greatly appreciated. Thanks!
I tried converting this piece of code to use asyncio, but saw no noticeable improvement.
Answer 1
Score: 0
The API docs say you can pass multiple values per call:
from google.cloud import translate_v2 as gt

def translate(words, to_language="de"):
    client = gt.Client(target_language=to_language)
    result = {}
    # Passing a list sends all words in a single request
    for value in client.translate(words):
        original = value["input"]
        trans = value["translatedText"]
        # Keep the original word when the translation comes back empty
        result[original] = trans if trans != "" else original
    return result
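One caveat with sending everything in a single call: for a large keyword list you will probably need to split it into batches. The sketch below assumes the Basic (v2) endpoint caps the number of strings per request; 128 is the figure I recall from the docs, so treat `batch_size` as an assumption and check the current quotas. The names `chunked`, `translate_in_batches` and `translate_batch` are made up for illustration. `translate_batch` can be any callable that takes a list of strings and returns dicts with "input" and "translatedText" keys, which is the shape `client.translate` returns for a list.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def translate_in_batches(words, translate_batch, batch_size=128):
    """Translate unique words with one call to `translate_batch` per batch.

    batch_size=128 is an assumed per-request limit, not a verified one.
    """
    result = {}
    # De-duplicate while keeping order so no keyword is sent twice.
    unique_words = list(dict.fromkeys(words))
    for batch in chunked(unique_words, batch_size):
        for value in translate_batch(batch):
            original = value["input"]
            trans = value["translatedText"]
            # Fall back to the original keyword on an empty translation.
            result[original] = trans if trans != "" else original
    return result
```

With the real client this would be called roughly as `translate_in_batches(no_translations, gt.Client(target_language="de").translate)`, which turns one request per keyword into one request per 128 keywords.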