June 9, 2023, 08:47:19

From R xml2 library, I don't understand how xml_find_all and xml_find_first work

Question

I am trying to mimic a simple example to retrieve named nodes with xml_find_first() and xml_find_all() functions. The simple example works very well:

library(xml2)
x <- read_xml("<foo><bar><baz/></bar><baz/></foo>")
xml_find_all(x, ".//baz")
xml_find_all(x, ".//bar")
xml_find_first(x, ".//bar")

As expected, the output for the three cases is:

{xml_nodeset (2)}
[1] <baz/>
[2] <baz/>

{xml_nodeset (1)}
[1] <bar>\n <baz/>\n</bar>

{xml_node}
<bar>
[1] <baz/>

With a more complex production example, however, the same functions find nothing:

library(xml2)
yy <- read_xml(
'<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
<fileVersion appName="xl" lastEdited="3" lowestEdited="5" rupBuild="9302"/>
<workbookPr/>
<workbookProtection/>
<bookViews>
<workbookView windowWidth="27090" windowHeight="8700" tabRatio="500" activeTab="1"/>
</bookViews>
<sheets>
<sheet name="PARTICIPANTES" sheetId="1" r:id="rId1"/>
<sheet name="ORDENADOS" sheetId="2" r:id="rId2"/>
</sheets>
<calcPr calcId="144525"/>
</workbook>'
)

xml_find_first(yy, ".//sheets")
xml_find_first(yy, "//sheets")
xml_find_all(yy, "//sheets")

In all cases nothing is found; the result is a missing node or an empty node set:

{xml_missing}
<NA>

{xml_missing}
<NA>

{xml_nodeset (0)}

Is there something I am missing about these functions?

Answer 1

Score: 1

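The root element declares a default namespace (xmlns="..."), so workbook, sheets, and every other unprefixed element belong to that namespace. In XPath 1.0 an unprefixed name such as sheets matches only elements in no namespace, which is why the searches return nothing. Consider xml_ns_rename() to give the default namespace, which xml_ns() lists as d1, a prefix of your choice; this is distinct from the already prefixed namespace xmlns:r="...". You can then use that temporary prefix in any XPath expression by passing the renamed namespaces as the ns argument:

ns <- xml_ns_rename(xml_ns(yy), d1 = "doc")
xml_find_first(yy, ".//doc:sheets", ns)
xml_find_first(yy, "//doc:sheets", ns)
xml_find_all(yy, "//doc:sheets", ns)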
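A few other ways to reach the same nodes may be worth comparing. The sketch below is not part of the original answer: it uses a shortened copy of the workbook (the variable names wb_xml, stripped, and sheet_names are made up for illustration) and assumes xml2 1.2.0 or later, where xml_ns_strip() is available.

```r
library(xml2)

# Shortened copy of the workbook from the question (only <sheets> is kept).
wb_xml <- '<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
  <sheets>
    <sheet name="PARTICIPANTES" sheetId="1" r:id="rId1"/>
    <sheet name="ORDENADOS" sheetId="2" r:id="rId2"/>
  </sheets>
</workbook>'
yy <- read_xml(wb_xml)

# Inspect the namespaces: the default one appears under the generated prefix d1.
print(xml_ns(yy))

# Option 1: use the generated d1 prefix directly.
# The ns argument defaults to xml_ns(x), so no renaming is required.
sheets_d1 <- xml_find_all(yy, "//d1:sheets")

# Option 2: ignore namespaces by matching on the local element name.
sheets_local <- xml_find_all(yy, "//*[local-name() = 'sheets']")

# Option 3: strip the default namespace and use plain names.
# xml_ns_strip() modifies the document in place, so a fresh parse is used here.
stripped <- read_xml(wb_xml)
xml_ns_strip(stripped)
sheets_stripped <- xml_find_all(stripped, "//sheets")

# Once the nodes are found, attributes are read as usual.
sheet_names <- xml_attr(xml_find_all(yy, "//d1:sheet"), "name")
print(sheet_names)
```

Option 1 is the shortest when only one default namespace is involved. Option 2 avoids namespaces altogether but also matches same-named elements from any other namespace. Option 3 changes the document itself, which may matter if it is written back to disk afterwards.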
