How to transform an html ul tree to 1D xml using xslt 1.0?

Question

I want to transform a chat tree into a flat list. This is my first time working with XML; I have already done two other transformations, but I could not produce XML from HTML. How is this done?

The input is an HTML structure with a nested chat tree of names and messages:

<!DOCTYPE html>
<html>
<head>
    <title>Chat</title>
</head>
<body>
    <ul>
        <li>
            <b>Name 1</b> say: Hi!<ul>
                <li>
                    <b>Name 2</b> say: Hello<ul>
                        <li>
                            <b>Name 3</b> say: Ho ho
                        </li>
                        <li>
                            <b>Name 4</b> say: How do you do?<ul>
                                <li>
                                    <b>Name 5</b> say: I'm fine<ul>
                                        <li>
                                            <b>Name 4</b> say: Ok
                                        </li>
                                    </ul>
                                </li>
                            </ul>
                        </li>
                    </ul>
                </li>
                <li>
                    <b>Name 3</b> say: Hi. How do you do?<ul>
                        <li>
                            <b>Name 1</b> say: Fine
                        </li>
                    </ul>
                </li>
            </ul>
        </li>
    </ul>
</body>
</html>

Expected xml output (clarification: pid is the parent id):

<?xml version="1.0" encoding="UTF-8"?>
<items>
	<item id="1" pid="0" name="Name 1">Hi!</item>
	<item id="2" pid="1" name="Name 2">Hello</item>
	<item id="3" pid="2" name="Name 3">Ho ho</item>
	<item id="4" pid="2" name="Name 4">How do you do?</item>
	<item id="5" pid="4" name="Name 5">I'm fine</item>
	<item id="6" pid="5" name="Name 4">Ok</item>
	<item id="7" pid="1" name="Name 3">Hi. How do you do?</item>
	<item id="8" pid="7" name="Name 1">Fine</item>
</items>

Maybe something like this? It doesn't work with the full HTML, though.

<xsl:stylesheet version="1.0" encoding="utf-8"
	xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
	<xsl:output method="xml" indent="yes"/>
	<xsl:template match="b">
		<item>
			<xsl:attribute name="name">
				<xsl:value-of select="@*|node()"/>
			</xsl:attribute>
		</item>
	</xsl:template>
</xsl:stylesheet>

Answer 1

Score: 2

Try this as your starting point:

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:template match="/">
	<items>
		<xsl:for-each select="//li">
			<item id="{generate-id()}" pid="{generate-id(ancestor::li[1])}" name="{b[1]}">
				<xsl:value-of select="normalize-space(substring-after(text()[1], 'say: '))" />
			</item>
		</xsl:for-each>
	</items>
</xsl:template>

</xsl:stylesheet>

Note that the format of the id and pid values is processor-dependent.
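
If sequential numeric ids like those in the expected output are needed, one possible variation (a sketch, not part of the original answer) is to number each li by counting the li elements that start before it in document order. The variable names `id`, `parent`, and `pid` are illustrative, and the sketch assumes the input parses as well-formed XML, as the sample does.

```xml
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:template match="/">
	<items>
		<xsl:for-each select="//li">
			<!-- 1-based position in document order: every li that opens before this one, plus one -->
			<xsl:variable name="id" select="count(ancestor::li | preceding::li) + 1"/>
			<!-- nearest enclosing li; an empty node-set for top-level items -->
			<xsl:variable name="parent" select="ancestor::li[1]"/>
			<!-- same numbering applied to the parent; evaluates to 0 when there is no parent -->
			<xsl:variable name="pid" select="count($parent/ancestor::li | $parent/preceding::li) + count($parent)"/>
			<item id="{$id}" pid="{$pid}" name="{b[1]}">
				<xsl:value-of select="normalize-space(substring-after(text()[1], 'say: '))"/>
			</item>
		</xsl:for-each>
	</items>
</xsl:template>

</xsl:stylesheet>
```

For the sample input this should number the items 1 through 8, with pid values matching the expected output. Counting along the preceding axis for every item grows quadratically with the number of li elements, so for very large trees `xsl:number` with `level="any"` and `count="li"` may be worth trying instead.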

huangapple
  • Posted on June 12, 2023 at 21:41:18
  • When reposting, please keep this link: https://go.coder-hub.com/76457259.html