XSL 1.0: How to split a string into lines without slicing words
Question
I have to improve the way long strings are split in XSL. The line size is 60 characters. When a fairly long string appears, it is broken into lines in an inelegant way.
I am trying to implement a mechanism that takes spaces into account, to avoid cutting words in the middle.
Currently the code looks like this:

    <xsl:template name="split_text">
      <xsl:param name="sText"/>
      <xsl:param name="lineSize">60</xsl:param>
      <xsl:variable name="toDisplay" saxon:assignable="yes"/>
      <xsl:variable name="toProcess" saxon:assignable="yes" select="$sText"/>
      <saxon:while test="string-length($toProcess) > $lineSize">
        <saxon:assign name="toDisplay" select="substring($toProcess, 1, $lineSize)"/>
        <saxon:assign name="toProcess" select="substring($toProcess, $lineSize + 1)"/>
        <xsl:value-of select="$toDisplay"/><br/>
      </saxon:while>
      <xsl:value-of select="$toProcess"/>
    </xsl:template>

It simply cuts the text every 60 characters whenever it is longer than the line capacity.
I want to handle the case where the line capacity ends in the middle of a word. I have read about tokenizers, substring-before-last, and so on, but I got exceptions in Java when I tried them. I am probably working with too old an XSL version, but upgrading it is not possible, so I have to use what I have.
I am wary of relying only on the last space character in each line, because the input can be a long character sequence without any spaces, and in that case the best option is still the code I pasted above.
Is there a simple way to tokenize in XSL?
Should I tokenize the full string and keep appending the next token as long as the total length stays below the line capacity?
Or should I check whether the last character of each line is a space, and do some additional processing if it is not?
I am quite confused; this is my first encounter with XSL.
ADDITIONAL EDIT:
I found the function saxon:tokenize, which looks interesting. Its description in the documentation sounds great; it is exactly what I need. But can it be used with XSL 1.0 and my version of Saxon? Here is a paste from the manifest:
```
Manifest-Version: 1.0
Main-Class: com.icl.saxon.StyleSheet
Created-By: 1.3.1_16 (Sun Microsystems Inc.)
```
If so, how do I iterate over the result? I found various styles of iteration on the web, and I do not understand the differences between them or their pros and cons.
# Answer 1
**Score**: 0
Okay, I have solved it, so I will share my solution in case somebody runs into a similar problem.
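```xml
<!-- Requires the saxon prefix to be declared and listed in extension-element-prefixes -->
<xsl:template name="split_text">
  <xsl:param name="sText"/>
  <xsl:param name="lineSize">60</xsl:param>
  <!-- Line currently being assembled; starts empty -->
  <xsl:variable name="remainder" saxon:assignable="yes" select="''"/>
  <!-- saxon:tokenize splits the text on whitespace -->
  <xsl:variable name="textTokens" select="saxon:tokenize($sText)"/>
  <xsl:for-each select="$textTokens">
    <xsl:variable name="currentToken" select="string(.)"/>
    <xsl:choose>
      <!-- First word of a line -->
      <xsl:when test="$remainder = ''">
        <saxon:assign name="remainder" select="$currentToken"/>
      </xsl:when>
      <!-- The word still fits: append it to the current line -->
      <xsl:when test="string-length($remainder) + 1 + string-length($currentToken) &lt;= $lineSize">
        <saxon:assign name="remainder" select="concat($remainder, ' ', $currentToken)"/>
      </xsl:when>
      <!-- The line is full: print it and start a new line with this word -->
      <xsl:otherwise>
        <xsl:value-of select="$remainder"/><br/>
        <saxon:assign name="remainder" select="$currentToken"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:for-each>
  <!-- Print the last, partially filled line.
       Note: a single token longer than $lineSize is printed unsplit. -->
  <xsl:value-of select="$remainder"/>
</xsl:template>
```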
I used Saxon's tokenize function and iterate over the list of tokens, checking the line length on each iteration.
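For comparison, the same wrapping can probably be done in plain XSLT 1.0 with recursive named templates, without any Saxon extensions. The sketch below is not the approach used above. The names `wrap_text`, `last_space_index`, `text`, `s` and `pos` are made up for illustration, and the `match="/"` template is only a hypothetical driver. It assumes the text has been passed through `normalize-space()` first, so words are separated by single spaces. A run of characters longer than the line size with no space in it is cut hard at the limit, which covers the case raised in the question.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Hypothetical driver: wraps the string value of the whole input document -->
  <xsl:template match="/">
    <xsl:call-template name="wrap_text">
      <xsl:with-param name="text" select="normalize-space(.)"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Emits $text as lines of at most $lineSize characters separated by <br/> -->
  <xsl:template name="wrap_text">
    <xsl:param name="text"/>
    <xsl:param name="lineSize" select="60"/>
    <xsl:choose>
      <!-- The rest fits on one line: print it and stop recursing -->
      <xsl:when test="string-length($text) &lt;= $lineSize">
        <xsl:value-of select="$text"/>
      </xsl:when>
      <xsl:otherwise>
        <!-- Look at lineSize + 1 characters, so that a space directly
             after the limit still counts as a break position -->
        <xsl:variable name="breakAt">
          <xsl:call-template name="last_space_index">
            <xsl:with-param name="s" select="substring($text, 1, $lineSize + 1)"/>
          </xsl:call-template>
        </xsl:variable>
        <xsl:choose>
          <!-- A space was found: break there and drop the space itself -->
          <xsl:when test="$breakAt &gt; 0">
            <xsl:value-of select="substring($text, 1, $breakAt - 1)"/>
            <br/>
            <xsl:call-template name="wrap_text">
              <xsl:with-param name="text" select="substring($text, $breakAt + 1)"/>
              <xsl:with-param name="lineSize" select="$lineSize"/>
            </xsl:call-template>
          </xsl:when>
          <!-- No space in the window: cut the long word at the limit -->
          <xsl:otherwise>
            <xsl:value-of select="substring($text, 1, $lineSize)"/>
            <br/>
            <xsl:call-template name="wrap_text">
              <xsl:with-param name="text" select="substring($text, $lineSize + 1)"/>
              <xsl:with-param name="lineSize" select="$lineSize"/>
            </xsl:call-template>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Returns the position of the last space in $s, or 0 if there is none -->
  <xsl:template name="last_space_index">
    <xsl:param name="s"/>
    <xsl:param name="pos" select="string-length($s)"/>
    <xsl:choose>
      <xsl:when test="$pos &lt; 1">0</xsl:when>
      <xsl:when test="substring($s, $pos, 1) = ' '">
        <xsl:value-of select="$pos"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:call-template name="last_space_index">
          <xsl:with-param name="s" select="$s"/>
          <xsl:with-param name="pos" select="$pos - 1"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

The recursion goes one level deeper per output line, so very long inputs could hit the processor's stack limit; for ordinary paragraph-sized text that should not matter.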