Parse HTML and replace Text Python BeautifulSoup
Question
I have the following tag structure:
<p class="" id="">
Hi, My name is Vinay Kesharwani &amp; I am the Founder of
<a class="" href="https://scriptmint.com" rel="noopener ugc nofollow" target="_blank">ScriptMint</a>
. I am a Full Stack Developer working with Laravel, Vue.js &amp; Tailwind CSS.
</p>
I want to translate the text in the whole body, even in nested tags, without breaking the tag structure.
I wrote a recursive function that takes a tag and then translates the content, as follows:
def translate_tag(tag):
    if tag.string:
        trans_text = translator.translate(
            tag.string
        )
        tag.string = trans_text
    else:
        for child in tag:
            if child.name:
                translate_tag(child)
However, the result is not entirely correct; only the contents of the "a" tag are translated:
<p class="" id="">Hi, My name is Vinay Kesharwani &amp; I am the Founder of
<a class="af np" href="https://scriptmint.com" rel="noopener ugc nofollow" target="_blank">スクリプトミント</a>
. I am a Full Stack Developer working with Laravel, Vue.js &amp; Tailwind CSS.
</p>
Please help me to fix my function for the correct translation of nested tags.
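A note on why this happens, as a minimal sketch on a shortened version of the markup above: `.string` is only set when a tag has exactly one child, so it is `None` for the `<p>`, and the `if child.name:` test then skips the text nodes, since they have no tag name. The shortened HTML and the variable names are assumptions made for illustration.

```python
from bs4 import BeautifulSoup

# Shortened version of the paragraph from the question (illustrative only)
html = '<p>Hi, I am the Founder of <a href="https://scriptmint.com">ScriptMint</a>. I am a developer.</p>'
soup = BeautifulSoup(html, "html.parser")
p = soup.p

# <p> has three children (text, <a>, text), so .string is None
p_string = p.string

# <a> has exactly one child, a string, so .string returns that string
a_string = p.a.string

# Text nodes have no tag name, so `if child.name:` only lets <a> through
child_names = [getattr(child, "name", None) for child in p.children]

print(p_string, a_string, child_names)
```

So the recursion reaches the `<a>` tag, but the two text nodes that sit directly inside `<p>` are never passed to the translator.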
Answer 1
Score: 0
If you are using BeautifulSoup, this is how I would do it: walk the contents of every node, translate each text node, and recurse into child tags.
from bs4 import BeautifulSoup, NavigableString, Tag

h = """
<p class="" id="">
Hi, My name is Vinay Kesharwani &amp; I am the Founder of
<a class="" href="https://scriptmint.com" rel="noopener ugc nofollow" target="_blank">ScriptMint</a>
. I am a Full Stack Developer working with Laravel, Vue.js &amp; Tailwind CSS.
</p>
"""
soup = BeautifulSoup(h, "html.parser")

def translate_recurse(node):
    # Iterate over a copy: replace_with() modifies node.contents as we go.
    for child in list(node.contents):
        if type(child) is NavigableString:
            # Plain text node; comments and other subclasses are skipped.
            if child.strip():  # leave whitespace-only strings untouched
                # translate() is whatever translation function or model you use
                child.replace_with(translate(str(child)))
        elif isinstance(child, Tag):
            translate_recurse(child)

translate_recurse(soup)
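As an alternative to explicit recursion, the same idea can be sketched with `find_all(string=True)`, which returns every text node in the tree as a list. This is only a sketch under stated assumptions: `fake_translate` is a hypothetical stand-in that upper-cases text so the example runs without a real translation service, and `translate_soup` is an illustrative name, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup, NavigableString


def fake_translate(text):
    # Hypothetical stand-in for a real translator: just upper-cases the text
    return text.upper()


def translate_soup(soup, translate):
    # find_all() returns a list, so replacing nodes inside the loop is safe
    for node in soup.find_all(string=True):
        if type(node) is not NavigableString:
            continue  # skip comments, CDATA and similar special strings
        if node.parent.name in ("script", "style"):
            continue  # do not translate code or CSS
        if not node.strip():
            continue  # leave whitespace-only strings untouched
        node.replace_with(translate(str(node)))
    return soup


# Shortened version of the paragraph from the question (illustrative only)
html = '<p>Hi, I am the Founder of <a href="https://scriptmint.com">ScriptMint</a>. I am a developer.</p>'
soup = BeautifulSoup(html, "html.parser")
translate_soup(soup, fake_translate)
result = str(soup)
print(result)
```

Tags and attributes such as `href` are left untouched; only the text nodes change. A real translator may strip leading or trailing whitespace, so it can be worth re-adding the original padding around each translated string.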