How to use AND operator in Google Sheets IMPORTXML XPath?
Question
I'm trying to write an XPath query for Google Sheets that gathers links from a specific type of page.
I thought I could use an "AND" operator (the pipe character), but I can't quite figure out how.
Here's what I have so far, but it's wrong:
=IMPORTXML(B2,"//a[not(starts-with(@href, '/'))]/@href | //a[not(contains(@href, 'example.com'))]/@href")
The idea is to gather all links except those that contain example.com and those that begin with a forward slash.
Surprisingly, it still extracts every link on the page, completely ignoring my conditions.
Any help would be greatly appreciated.
Answer 1
Score: 2
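You are mistaken. The | operator does not mean "and" in XPath; it means "merge node-sets" (union). Your formula therefore merged the results of the first expression (every link that does not start with a slash) with the results of the second (every link that does not contain example.com). A link is left out of that union only if it fails both tests, that is, if it starts with a slash and also contains example.com, which is why nearly every link came back.
To get what you want, put both conditions inside a single predicate: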
=IMPORTXML(B2,"//a[not(starts-with(@href, '/') or contains(@href, 'example.com'))]/@href")
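To see why the union returns almost everything while the single predicate filters correctly, here is a minimal Python sketch of the same set logic. It does not call IMPORTXML or any XPath engine; the sample hrefs and helper names are made up for illustration, and Python's set `|` happens to mean union as well.

```python
# Hypothetical sample hrefs, invented for this illustration only.
hrefs = [
    "/about",
    "https://example.com/pricing",
    "https://other.org/page",
    "mailto:someone@other.org",
]

def starts_with_slash(href):
    # Mirrors XPath starts-with(@href, '/')
    return href.startswith("/")

def contains_example(href):
    # Mirrors XPath contains(@href, 'example.com')
    return "example.com" in href

# What the original formula did: evaluate each filter separately,
# then merge (union) the two result sets.
first = {h for h in hrefs if not starts_with_slash(h)}
second = {h for h in hrefs if not contains_example(h)}
union = first | second

# What the corrected formula does: one predicate, both tests combined.
combined = {h for h in hrefs if not (starts_with_slash(h) or contains_example(h))}

print(sorted(union))     # every sample href survives the union
print(sorted(combined))  # only links passing both tests remain
```

With this sample data the union contains all four hrefs, while the combined predicate keeps only the two that neither start with a slash nor contain example.com. The combined result equals the intersection of the two separate filters, which is what the question was really asking for.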