July 6, 2023, 14:16:01

entity translation to customized entity

Question

There are some user-defined entities in the XML data. To unescape those entities, we are using the code below:

<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="no" use-character-maps="mdash"/>
  <xsl:character-map name="mdash">
    <xsl:output-character character="&#x2014;" string="&amp;mdash;"/>
    <xsl:output-character character="&amp;" string="&amp;amp;"/>
    <xsl:output-character character="&quot;" string="&amp;quot;"/>
    <xsl:output-character character="&apos;" string="&amp;apos;"/>
    <xsl:output-character character="&#167;" string="&amp;sect;"/>
    <xsl:output-character character="&#36;" string="&amp;dollar;"/>
    <xsl:output-character character="&#47;" string="&amp;sol;"/>
    <xsl:output-character character="&#45;" string="&amp;hyphen;"/>
  </xsl:character-map>
  <!--=================================================================-->
  <xsl:template match="@* | node()">
    <!--=================================================================-->
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

But there is a special case where &sect; appears twice in the data, for example:

Ex- The number &sect;&sect; 1234

The above example should be converted to a special user-defined entity, i.e.

Output- The number &multisect; 1234

The &sect;&sect; should be converted to &multisect;

Answer 1

Score: 1

You can't achieve this directly in the serializer, as you can with single characters. You will either have to recognise "§§" in the transformation proper (perhaps converting it to some private-use-area character, which is then picked up by xsl:output-character), or you could do it by post-processing the output at the character-stream level.
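
As a minimal sketch of the first option, assuming XSLT 3.0 and assuming that U+E000 (a private-use code point picked here purely for illustration) never occurs in the data, a text() template could collapse each "§§" pair into the placeholder, and the character map could then serialize the placeholder as &multisect;. The map name entities is hypothetical; the &sect; entry is taken from the question's map.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" use-character-maps="entities"/>

  <xsl:character-map name="entities">
    <!-- A single section sign is still written as the sect entity. -->
    <xsl:output-character character="&#xA7;" string="&amp;sect;"/>
    <!-- U+E000 is an assumed placeholder that stands for a "§§" pair. -->
    <xsl:output-character character="&#xE000;" string="&amp;multisect;"/>
  </xsl:character-map>

  <!-- Copy everything that has no more specific template. -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Collapse each pair of section signs into the placeholder
       before the serializer sees the text. -->
  <xsl:template match="text()">
    <xsl:value-of select="replace(., '&#xA7;&#xA7;', '&#xE000;')"/>
  </xsl:template>

</xsl:stylesheet>
```

With the question's sample text, The number §§ 1234 would be serialized as The number &multisect; 1234, while a lone § still comes out as &sect;. Only text nodes are handled here; attribute values would need a similar template.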

Answer 2

Score: 1

If you want to use a character map, you first need to process the text nodes where you expect the two sect characters and replace them with a single character that you don't expect to be used elsewhere; the map can then convert that character to the string &multisect;. For example, the stylesheet

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
	xmlns:xs="http://www.w3.org/2001/XMLSchema"
	xmlns:fn="http://www.w3.org/2005/xpath-functions"
	exclude-result-prefixes="#all"
	expand-text="yes"
	version="3.0">

  <xsl:param name="multisect-sub" static="yes" as="xs:string" select="'«'"/>

  <xsl:character-map name="sub">
    <xsl:output-character _character="{$multisect-sub}" string="&amp;multisect;"/>
  </xsl:character-map>

  <xsl:mode on-no-match="shallow-copy"/>

  <xsl:output method="xml" indent="yes" use-character-maps="sub"/>

  <xsl:template match="text()">
    <xsl:apply-templates mode="analyze" select="analyze-string(., '&#xA7;&#xA7;')"/>
  </xsl:template>

  <xsl:template mode="analyze" match="fn:match">
    <xsl:text>{$multisect-sub}</xsl:text>
  </xsl:template>

</xsl:stylesheet>

transforms the input

<!DOCTYPE text [
  <!ENTITY sect "&#xA7;">
]>
<text>&sect;&sect; 1234</text>

into the output

<?xml version="1.0" encoding="UTF-8"?>
<text>&multisect; 1234</text>

Note that I used '«' primarily as an example; you might want to use a private-use character or some other character that you are sure doesn't occur in your input/output data.

If you want the result to be well-formed you would also need to add a doctype to the output with e.g. xsl:output doctype-system="some.dtd" where you ensure that some.dtd declares e.g. <!ENTITY multisect "&#xA7;&#xA7;">
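
The doctype suggestion might look like the following sketch. It assumes the '«' placeholder from the stylesheet above (written as &#xAB;) and the file name some.dtd from the example; the DTD itself is a separate, hand-maintained file that the transformation does not generate. For brevity the text() template uses replace() instead of analyze-string().

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- doctype-system asks the serializer to write a DOCTYPE declaration
       that points at some.dtd ahead of the root element.
       some.dtd (assumed name) would need to contain a declaration such as:
         <!ENTITY multisect "&#xA7;&#xA7;">
  -->
  <xsl:output method="xml" doctype-system="some.dtd" use-character-maps="sub"/>

  <xsl:character-map name="sub">
    <!-- U+00AB is the placeholder for a "§§" pair. -->
    <xsl:output-character character="&#xAB;" string="&amp;multisect;"/>
  </xsl:character-map>

  <xsl:mode on-no-match="shallow-copy"/>

  <xsl:template match="text()">
    <xsl:value-of select="replace(., '&#xA7;&#xA7;', '&#xAB;')"/>
  </xsl:template>

</xsl:stylesheet>
```

For a root element named text, the serialized result would then carry a declaration along the lines of <!DOCTYPE text SYSTEM "some.dtd">, so a parser that reads the external DTD can resolve &multisect;.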
