Output available content from grouped elements using XSLT 2.0

# Question


<details>
<summary>English:</summary>

I am following the solution provided in this [post][1] (credit to `Martin Honnen`), where I needed to group the elements in the XML by `id` and count the duplicates based on some rules.

Now I'm having an issue with grouped elements that share an `id` but have different content. This is expected: for the same `case id` with `cont="n | n"`, one page can have content that is not present in the other.

**Source XML:**

    <?xml version="1.0" encoding="utf-8"?>
    <cases>
        <case id="1" cont="1 | 2">
            <serial>111</serial>
            <content total="10">
              <misc val="5" />
              <misc val="5" />
            </content>
        </case>
        <case id="1" cont="2 | 2">
            <serial>111</serial>
            <message>this is a note 1</message>
        </case>
        <case id="2" cont="">
            <serial>222</serial>
            <content total="8">
              <misc val="3" />
              <misc val="5" />
            </content>
            <message>this is a note 2</message>
        </case>
    </cases>

**XSLT 2.0:**

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                    xmlns:xs="http://www.w3.org/2001/XMLSchema"
                    exclude-result-prefixes="#all"
                    version="2.0">

    <xsl:output method="xml" indent="yes"/>

    <xsl:template match="cases">
        <output>
            <xsl:for-each-group
                select="case[@cont = '' or count(distinct-values(tokenize(@cont, '\s*\|\s*'))) = 1]"
                group-by="@id">
                <xsl:for-each-group select="current-group()" group-by="if (@cont = '') then '' else tokenize(@cont, '\s*\|\s*')[2]">
                    <val id="{@id}">
                        <duplicates>
                            <xsl:value-of select="count(current-group()) - 1"/>
                        </duplicates>
                        <content total="{content/@total}">
                            <xsl:for-each select="content/misc">
                                <item>
                                    <xsl:value-of select="@val"/>
                                </item>
                            </xsl:for-each>
                        </content>
                        <message val="{message}" />
                    </val>
                </xsl:for-each-group>
            </xsl:for-each-group>
        </output>
    </xsl:template>

    </xsl:stylesheet>

**Expected output:**

    <?xml version="1.0" encoding="UTF-8"?>
    <output>
       <val id="1">
          <duplicates>0</duplicates>
          <content total="10">
             <item>5</item>
             <item>5</item>
          </content>
          <message val="this is a note 1"/>
       </val>
       <val id="2">
          <duplicates>0</duplicates>
          <content total="8">
             <item>3</item>
             <item>5</item>
          </content>
          <message val="this is a note 2"/>
       </val>
    </output>

Right now the content in the output for `id="1"` comes out empty: `<content total=""/>`. I guess the grouping needs to change: when I printed what each group contains, the content from `case id="1"` did not appear at all.

What changes do I need to make so that the same `case id` with `cont="n | n"` has all the content from all of its pages?

**From the proposed solution:**

**Source XML:**

    <?xml version="1.0" encoding="utf-8"?>
    <cases>
        <case id="2" cont="">
            <serial>222</serial>
            <content total="8">
              <misc val="3" />
              <misc val="5" />
            </content>
            <message>this is a note 2</message>
        </case>
        <case id="1" cont="1 | 2">
            <serial>111</serial>
            <content total="10">
              <misc val="5" />
              <misc val="5" />
            </content>
        </case>
        <case id="1" cont="2 | 2">
            <serial>111</serial>
            <message>this is a note 1</message>
        </case>
        <case id="2" cont="">
            <serial>222</serial>
            <content total="8">
              <misc val="3" />
              <misc val="5" />
            </content>
            <message>this is a note 2</message>
        </case>
    </cases>

**Output:**

    <?xml version="1.0" encoding="UTF-8"?>
    <output>
       <val id="2">
          <duplicates>1</duplicates>
          <content total="8 10">
             <item>3</item>
             <item>5</item>
             <item>5</item>
             <item>5</item>
          </content>
          <message val="this is a note 2"/>
       </val>
       <val id="1">
          <duplicates>0</duplicates>
          <content total="8 10">
             <item>3</item>
             <item>5</item>
             <item>5</item>
             <item>5</item>
          </content>
          <message val="this is a note 1"/>
       </val>
    </output>

This other input data does not work either:

    <?xml version="1.0" encoding="utf-8"?>
    <cases>
        <case id="1" cont="1 | 2">
            <serial>111</serial>
            <content total="10">
              <misc val="5" />
              <misc val="5" />
            </content>
        </case>
        <case id="1" cont="2 | 2">
            <serial>111</serial>
            <message>this is a note 1</message>
        </case>
        <case id="1" cont="1 | 2">
            <serial>111</serial>
            <content total="10">
              <misc val="5" />
              <misc val="5" />
            </content>
        </case>
        <case id="1" cont="2 | 2">
            <serial>111</serial>
            <message>this is a note 1</message>
        </case>
        <case id="2" cont="">
            <serial>222</serial>
            <content total="8">
              <misc val="3" />
              <misc val="5" />
            </content>
            <message>this is a note 2</message>
        </case>
    </cases>

**Output:**

    <?xml version="1.0" encoding="UTF-8"?>
    <output>
       <val id="1">
          <duplicates>1</duplicates>
          <content total="10 10">
             <item>5</item>
             <item>5</item>
             <item>5</item>
             <item>5</item>
          </content>
          <message val="this is a note 1"/>
       </val>
       <val id="2">
          <duplicates>0</duplicates>
          <content total="8">
             <item>3</item>
             <item>5</item>
          </content>
          <message val="this is a note 2"/>
       </val>
    </output>


Now the content is getting duplicated for the duplicated elements.

  [1]: https://stackoverflow.com/q/76654821/808891

</details>


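# Answer 1
**Score**: 1

With XSLT 3 (as supported by the current versions of Saxon Java, Saxon .NET, SaxonC and SaxonJS) you could use the following code. It first processes all `case` elements to identify groups that start with a `case` whose `@cont` is empty or starts with `1 | n`. For each group, the collected elements are stored in the `group` property of a map. Then the code does the grouping from the previous solution, i.e. it only looks at the `case` elements whose `@cont` is empty or is `n | n` (the last page). Inside the grouping, to reference back to the other items, the code checks the stored sequence of groups for the one that contains the current element; that is done with the `is` XPath node identity operator. All elements of that group are then used to look up `content/@total` and `content/misc`: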

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="#all"
    expand-text="yes"
    version="3.0">

  <xsl:mode on-no-match="shallow-skip"/>

  <xsl:output method="xml" indent="yes" cdata-section-elements="groups" />

  <xsl:template match="cases">
    <output>
      <xsl:variable name="groups" as="map(*)*">
          <xsl:for-each-group select="case" group-starting-with="case[@cont = '' or tokenize(@cont, '\s*\|\s*')[1] = '1']">
            <xsl:sequence select="map { 'id' : xs:integer(@id), 'pos' : position(), 'group' : current-group() }"/>
          </xsl:for-each-group>
      </xsl:variable>
      <xsl:for-each-group 
          select="case[@cont = '' or count(distinct-values(tokenize(@cont, '\s*\|\s*'))) = 1]" 
          composite="yes" 
          group-by="if (@cont = '') then (@id, '') else (@id, tokenize(@cont, '\s*\|\s*')[2])">
        <val id="{@id}">
           <duplicates>{count(current-group()) - 1}</duplicates>
           <xsl:variable name="complete-group" select="$groups[some $case in ?group satisfies current() is $case]"/>
           <content total="{$complete-group?group/content/@total}">
              <xsl:for-each select="$complete-group?group/content/misc">
                  <item>
                      <xsl:value-of select="@val"/>
                  </item>
              </xsl:for-each>
           </content>
           <message val="{message}" />
        </val>
      </xsl:for-each-group>
    </output>
  </xsl:template>
  
</xsl:stylesheet>
```

I have no clear description of the wanted result for `total` and `misc` if there are duplicates.
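Since the question targets XSLT 2.0, where maps are not available, the same idea can be approximated in two passes. What follows is only a sketch under assumptions the question does not confirm: the pages of one logical case are adjacent in document order, duplicates of a case carry identical content so only the first occurrence is reported, and the variable name `merged` is made up for illustration. Unlike the composite key above, it groups by `@id` alone.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    exclude-result-prefixes="#all"
    version="2.0">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="cases">
    <!-- Pass 1: merge all pages of one logical case into one temporary case element.
         A new group starts at a single-page case (empty cont) or at page 1 (cont starts with 1). -->
    <xsl:variable name="merged">
      <xsl:for-each-group select="case"
          group-starting-with="case[@cont = '' or tokenize(@cont, '\s*\|\s*')[1] = '1']">
        <case id="{@id}">
          <xsl:copy-of select="current-group()/content"/>
          <xsl:copy-of select="current-group()/message"/>
        </case>
      </xsl:for-each-group>
    </xsl:variable>

    <!-- Pass 2: group the merged cases by id; extra members of a group count as duplicates. -->
    <output>
      <xsl:for-each-group select="$merged/case" group-by="@id">
        <val id="{@id}">
          <duplicates>
            <xsl:value-of select="count(current-group()) - 1"/>
          </duplicates>
          <!-- Assumption: only the first occurrence's content is reported. -->
          <content total="{content/@total}">
            <xsl:for-each select="content/misc">
              <item>
                <xsl:value-of select="@val"/>
              </item>
            </xsl:for-each>
          </content>
          <message val="{message}"/>
        </val>
      </xsl:for-each-group>
    </output>
  </xsl:template>

</xsl:stylesheet>
```

With the last input from the question, this should report `<duplicates>1</duplicates>` for `id="1"` while listing its two items only once. If duplicates may differ in content, or the number of pages has to be part of the grouping key, the second pass would need to change accordingly.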


Posted by huangapple on 2023-07-13 23:29:49. Source: https://go.coder-hub.com/76681092.html