Parse HTML and replace text with Python BeautifulSoup

Question

I have the following tag structure:

<p class="" id="">
Hi, My name is Vinay Kesharwani & I am the Founder of 
<a class="" href="https://scriptmint.com" rel="noopener ugc nofollow" target="_blank">ScriptMint</a>
. I am a Full Stack Developer working with Laravel, Vue.js & Tailwind CSS.
</p>

I want to translate the text in the whole body, even in nested tags, without breaking the tag structure.

I wrote a recursive function that takes a tag and then translates the content, as follows:

    def translate_tag(tag):
        if tag.string:
            trans_text = translator.translate(tag.string)
            tag.string = trans_text
        else:
            for child in tag:
                if child.name:
                    translate_tag(child)

However, the result is not entirely correct; only the contents of the "a" tag are translated:

<p class="" id="">Hi, My name is Vinay Kesharwani & I am the Founder of 
<a class="af np" href="https://scriptmint.com" rel="noopener ugc nofollow" target="_blank">スクリプトミント</a>
. I am a Full Stack Developer working with Laravel, Vue.js & Tailwind CSS.
</p>

Please help me to fix my function for the correct translation of nested tags.

Answer 1

Score: 0

If you are using BeautifulSoup, this is how I would do it:

from bs4 import BeautifulSoup as bs, NavigableString, Tag

h = """
<p class="" id="">
Hi, My name is Vinay Kesharwani &amp; I am the Founder of 
<a class="" href="https://scriptmint.com" rel="noopener ugc nofollow" target="_blank">ScriptMint</a>
. I am a Full Stack Developer working with Laravel, Vue.js &amp; Tailwind CSS.
</p>
"""

soup = bs(h, "html.parser")


def translate_recurse(contents):
    # Loop over a copy, because replace_with() modifies the parent's contents list
    for child in list(contents):
        if type(child) is NavigableString:
            # translate() is a placeholder for the translation model you use
            child.replace_with(NavigableString(translate(str(child))))
        elif isinstance(child, Tag):
            # Descend into nested tags; comments and other node types are skipped
            translate_recurse(child.contents)


translate_recurse(soup.contents)
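
For context, `tag.string` is `None` whenever a tag has more than one child, which is why the function in the question never touches the text directly inside `<p>` and only translates the `<a>` tag. As an alternative to the recursion, here is a minimal sketch that collects every text node with `find_all(string=True)` and swaps each one in place with `replace_with()`. The names `fake_translate` and `translate_html` are made up for illustration, and `fake_translate` just upper-cases text so the example runs without a real translation service; pass your own function instead.

```python
from bs4 import BeautifulSoup, Comment


def fake_translate(text):
    # Hypothetical stand-in for a real translator: it only upper-cases the text.
    return text.upper()


def translate_html(html, translate=fake_translate):
    soup = BeautifulSoup(html, "html.parser")
    # find_all(string=True) returns a list of every text node, however deeply nested.
    for node in soup.find_all(string=True):
        # Leave HTML comments and whitespace-only nodes untouched.
        if isinstance(node, Comment) or not node.strip():
            continue
        # replace_with() swaps the node in the tree, so tags and attributes stay intact.
        node.replace_with(translate(str(node)))
    return str(soup)


html = (
    '<p>Hi, I am the founder of '
    '<a href="https://scriptmint.com">ScriptMint</a>'
    '. Nice to meet you.</p>'
)
print(translate_html(html))
```

Because the loop only ever replaces text nodes, the `<a>` element and its `href` are left as they were. Note that each text fragment is translated on its own, so a sentence split around a nested tag reaches the translator in pieces.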

huangapple
  • Published on June 26, 2023, 03:48:26
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76552170.html