XSLT 3.0 streaming (Saxon): processing multiple XML files in a folder using saxon:while throws an error

Question

I have multiple XML files in a folder. The number of files is passed to the XSLT at runtime, along with the file name and path. When I try to process each file using saxon:while, I get a NullPointerException. When I tried the same code without the while loop, on a single file, it worked.

XSLT:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:f="http://www.test.com/function" version="3.0" exclude-result-prefixes="#all" xmlns:saxon="http://saxon.sf.net/" extension-element-prefixes="saxon">
<xsl:output method="text" omit-xml-declaration="no" indent="yes" />
<xsl:mode streamable="yes" on-no-match="shallow-skip" />
<xsl:param name="inputPath"/>
<xsl:param name="filename"/>
<xsl:param name="numberOfFiles"/>
<xsl:variable name="fileCount" select="0" saxon:assignable="yes"/>
<xsl:variable name="fullFileName" select="''" saxon:assignable="yes"/>

<xsl:template match="Input">
<saxon:while test="$numberOfFiles &gt; $fileCount">
<saxon:assign  name="fullFileName" select="concat($inputPath,$filename,'_',$fileCount,'.xml')"/>
<out><xsl:value-of select="$fullFileName"/></out>
<xsl:apply-templates select="doc($fullFileName)/Input/InData/InfoArray/copy-of(.)" mode="s" />
<saxon:assign name="fileCount" select="$fileCount+1"/>
</saxon:while>
</xsl:template>

<xsl:template  match="Info/UnitSellPrice" mode="s">
<price><xsl:value-of select="."/></price>
</xsl:template>

</xsl:stylesheet>

XML 1:

<?xml version="1.0" encoding="UTF-8"?>
<Input>
<InData>
<InfoArray>
<Info>
<UnitSellPrice>100</UnitSellPrice>
</Info>
<Info>
<UnitSellPrice>324.19</UnitSellPrice>
</Info>
</InfoArray>
</InData>
</Input>

XML 2:

<?xml version="1.0" encoding="UTF-8"?>
<Input>
<InData>
<InfoArray>
<Info>
<UnitSellPrice>500</UnitSellPrice>
</Info>
<Info>
<UnitSellPrice>200</UnitSellPrice>
</Info>
</InfoArray>
</InData>
</Input>

error:

Looking for function Q{http://www.w3.org/2005/xpath-functions}concat#5
Trying net.sf.saxon.functions.registry.XSLT30FunctionSet
Looking for function Q{http://www.w3.org/2005/xpath-functions}doc#1
Trying net.sf.saxon.functions.registry.XSLT30FunctionSet
Looking for function Q{http://www.w3.org/2005/xpath-functions}copy-of#1
Trying net.sf.saxon.functions.registry.XSLT30FunctionSet
Looking for function Q{http://www.w3.org/2005/xpath-functions}concat#5
Trying net.sf.saxon.functions.registry.XSLT30FunctionSet
Looking for function Q{http://www.w3.org/2005/xpath-functions}doc#1
Trying net.sf.saxon.functions.registry.XSLT30FunctionSet
Looking for function Q{http://www.w3.org/2005/xpath-functions}copy-of#1
Trying net.sf.saxon.functions.registry.XSLT30FunctionSet
java.lang.reflect.InvocationTargetException
at sun.reflect.GeneratedMethodAccessor2720.invoke(Unknown Source)

Caused by: java.lang.NullPointerException
com.saxonica.ee.stream.Streamability.getItemType(Streamability.java:267)

I am sharing a simplified version of the XML and XSLT to show the issue I am facing. I am using Saxon 11.2 EE.

Answer 1

Score: 2

Without relying on extensions, I would think you want e.g.

<xsl:param name="input-uris" as="xs:string*" select="'file1.xml', 'file2.xml', 'file3.xml'"/>

and then start with the initial template (option -it on the command line, no -s option; from s9api using callTemplate(null,..)) using e.g.

<xsl:template name="xsl:initial-template">
  <xsl:for-each select="$input-uris">
    <xsl:source-document href="{.}" streamable="yes">
      <xsl:apply-templates select="Input/InData/InfoArray/copy-of(.)" mode="s"/>
    </xsl:source-document>
  </xsl:for-each>
</xsl:template>
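
For readers who want to see the pieces of this answer assembled, here is a minimal sketch of a complete stylesheet. It is an illustration under assumptions, not the answerer's own code: it assumes the files follow the naming pattern from the question (inputPath, then filename, then an underscore and a number starting at 0, then .xml), it builds input-uris from the question's three parameters instead of a hard-coded list, and the prices wrapper element and the declaration of mode s are hypothetical additions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                version="3.0" exclude-result-prefixes="#all">

  <xsl:output method="xml" indent="yes"/>

  <!-- Mode s only ever sees in-memory copies, so it need not be streamable -->
  <xsl:mode name="s" on-no-match="shallow-skip"/>

  <!-- Same three parameters as in the question, now typed -->
  <xsl:param name="inputPath" as="xs:string" required="yes"/>
  <xsl:param name="filename" as="xs:string" required="yes"/>
  <xsl:param name="numberOfFiles" as="xs:integer" required="yes"/>

  <!-- Assumption: files are numbered 0 .. numberOfFiles - 1, as in the question's loop -->
  <xsl:variable name="input-uris" as="xs:string*"
      select="for $i in 0 to $numberOfFiles - 1
              return concat($inputPath, $filename, '_', $i, '.xml')"/>

  <!-- Entry point: no principal source document is needed -->
  <xsl:template name="xsl:initial-template">
    <prices>
      <xsl:for-each select="$input-uris">
        <!-- Each file is read with streaming; copy-of(.) hands one
             InfoArray subtree at a time to ordinary template rules -->
        <xsl:source-document href="{.}" streamable="yes">
          <xsl:apply-templates select="Input/InData/InfoArray/copy-of(.)" mode="s"/>
        </xsl:source-document>
      </xsl:for-each>
    </prices>
  </xsl:template>

  <xsl:template match="Info/UnitSellPrice" mode="s">
    <price><xsl:value-of select="."/></price>
  </xsl:template>

</xsl:stylesheet>
```

On the Saxon command line this would be run with -it and -xsl:, without -s, passing the three parameters as name=value pairs. Relative URIs in href are resolved against the stylesheet's base URI, so inputPath would normally need to be an absolute path or file: URI.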

Answer 2

Score: 2

Firstly, I would strongly encourage you to avoid using saxon:assign and saxon:while. These extensions were added to the product very early in its life and they really aren't needed now that XSLT has constructs like tunnel parameters, xsl:for-each-group and xsl:iterate. You could for example write:

<xsl:template match="Input">
  <!-- Files are numbered from 0, as in the question's loop.
       The xs prefix must be bound to http://www.w3.org/2001/XMLSchema. -->
  <xsl:for-each select="0 to xs:integer($numberOfFiles) - 1">
     <xsl:variable name="fullFileName"
         select="concat($inputPath,$filename,'_',.,'.xml')"/>
     <!-- The {...} text value template needs expand-text="yes" on an ancestor element -->
     <out>{$fullFileName}</out>
     <xsl:apply-templates
         select="doc($fullFileName)/Input/InData/InfoArray/copy-of(.)"
         mode="s" />
  </xsl:for-each>
</xsl:template>

I don't actually know whether this has any bearing on the problem, but I'm pretty sure that no-one has tested whether these extensions work in a streamable stylesheet - from the limited stacktrace supplied, it seems to have failed during the streamability analysis, and this might well mean that the streamability analysis for saxon:assign and saxon:while is defective. If that's the case and we fixed it, we would almost certainly decide that the instruction isn't streamable.

Looking at your code more carefully, it's not clear how you are trying to use streaming. You've got a template rule that matches Input elements in streaming mode, but it doesn't actually seem to make use of any data within the Input element; while the secondary input files (accessed using doc()) are processed without streaming.
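
To illustrate the xsl:iterate construct named at the start of this answer, here is a minimal sketch of how state that would otherwise be held in saxon:assign variables can be carried in an iteration parameter. It is an assumption-laden illustration rather than part of the original answer: the summary, file and total elements and the running total of prices are hypothetical, the file numbering from 0 follows the question's loop, and the files are read with doc(), so nothing here is streamed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                version="3.0" exclude-result-prefixes="#all">

  <xsl:output method="xml" indent="yes"/>

  <xsl:param name="inputPath" as="xs:string" required="yes"/>
  <xsl:param name="filename" as="xs:string" required="yes"/>
  <xsl:param name="numberOfFiles" as="xs:integer" required="yes"/>

  <xsl:template name="xsl:initial-template">
    <summary>
      <xsl:iterate select="0 to $numberOfFiles - 1">
        <!-- Running state lives in an iteration parameter,
             not in an assignable variable -->
        <xsl:param name="total" as="xs:decimal" select="0"/>

        <!-- Evaluated once, after the last file has been processed -->
        <xsl:on-completion>
          <total><xsl:value-of select="$total"/></total>
        </xsl:on-completion>

        <xsl:variable name="fullFileName"
            select="concat($inputPath, $filename, '_', ., '.xml')"/>
        <!-- Non-streamed read of one file; prices are cast to xs:decimal -->
        <xsl:variable name="prices" as="xs:decimal*"
            select="doc($fullFileName)/Input/InData/InfoArray/Info/UnitSellPrice/xs:decimal(.)"/>

        <file name="{$fullFileName}" count="{count($prices)}"/>

        <!-- Pass the updated total to the next iteration -->
        <xsl:next-iteration>
          <xsl:with-param name="total" select="$total + sum($prices)"/>
        </xsl:next-iteration>
      </xsl:iterate>
    </summary>
  </xsl:template>

</xsl:stylesheet>
```

The loop counter is simply the context item of xsl:iterate, so no separate fileCount variable is needed, and the only mutable-looking value (total) is replaced on each pass through xsl:next-iteration.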

Posted on March 7, 2023 at 18:25:47
Original link: https://go.coder-hub.com/75660735.html