Whole word regex matching and hyperlinking in JavaScript
Question
I need a little help with regular expressions.
I'm using JavaScript and jQuery to hyperlink terms within an HTML document. To do this, I'm using the following code. I'm doing this for a number of terms in a massive document.
var searchterm = "Water";
jQuery('#content p').each(function() {
    var content = jQuery(this),
        txt = content.html(),
        found = content.find(searchterm).length,
        regex = new RegExp('(' + searchterm + ')(?![^(<a.*?>).]*?<\/a>)','gi');
    if (found != -1) {
        //hyperlink the search term
        txt = txt.replace(regex, '<a href="/somelink">$1</a>');
        content.html(txt);
    }
});
There are, however, a number of instances I do not want to match, and due to time constraints and brain melt, I'm reaching out for some assistance.
EDIT: I've updated the codepen below based on the excellent example provided by @ggorlen, thank you!
Example: https://codepen.io/julian-young/pen/KKwyZMr
Answer 1
Score: 2
Dumping the entire DOM to raw text and parsing it with regex circumvents the primary purpose of jQuery (and JS, by extension), which is to traverse and manipulate the DOM as an abstract tree of nodes.
Text nodes have a nodeType of Node.TEXT_NODE, which we can use in a traversal to identify the non-link nodes you're interested in.
After obtaining a text node, regex can be applied appropriately (parsing text, not HTML). I used <mark> for demonstration purposes, but you can make this an anchor tag or whatever you need.
jQuery gives you a replaceWith method that replaces the text node with the result of the regex substitution.
<!-- begin snippet: js hide: false console: true babel: false -->
<!-- language: lang-js -->
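$('#content li').contents().each(function () {
  // Only direct text-node children of each <li> are touched;
  // text inside <a> or <strong> elements is skipped.
  if (this.nodeType === Node.TEXT_NODE) {
    // Whole word "water"/"waters", not followed by a hyphen.
    var pattern = /(\b[Ww]aters?(?!-)\b)/g;
    var replacement = '<mark>$1</mark>';
    // replaceWith parses the string as HTML, so <mark> becomes an element.
    $(this).replaceWith(this.nodeValue.replace(pattern, replacement));
  }
});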
<!-- language: lang-html -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<h1>Example Content</h1>
<div id="content">
<ul>
<li>Water is a fascinating subject. - <strong>match</strong></li>
<li>We all love water. - <strong>match</strong></li>
<li>ice; water; steam - <strong>match</strong></li>
<li>The beautiful waters of the world - <strong>match</strong> (including the s)</li>
<li>and all other water-related subjects - <strong>no match</strong></li>
<li>and this watery topic of - <strong>no match</strong></li>
<li>of WaterStewardship looks at how best - <strong>no match</strong></li>
<li>On the topic of <a href="/governance">water governance</a> - <strong>no match</strong></li>
<li>and other <a href="/water">water</a> related things - <strong>no match</strong></li>
<li>the best of <a href="/allthingswater">all things water</a> - <strong>no match</strong></li>
</ul>
</div>
<!-- end snippet -->
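The pattern above hard-codes the term water, while the question mentions a number of terms. A minimal sketch of building the same kind of whole-word pattern from an arbitrary term might look like the following. The helper names escapeRegExp and buildTermPattern are hypothetical (they are not part of jQuery or of the answer), the optional trailing "s" is an assumption carried over from the demo, and the sketch uses the gi flags from the question, so unlike the answer's pattern it would also match WATER.

```javascript
// Escape characters that have special meaning in a regular expression,
// so a term containing "." or "(" is matched literally.
function escapeRegExp(term) {
  return term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a whole-word, case-insensitive pattern for one term.
// Assumptions: an optional trailing "s" is an acceptable plural, and a
// match directly followed by a hyphen (e.g. "water-related") is skipped.
// Note: \b only acts as a word boundary when the term itself starts and
// ends with a word character (letter, digit or underscore).
function buildTermPattern(term) {
  return new RegExp("\\b(" + escapeRegExp(term) + "s?)(?!-)\\b", "gi");
}

const samples = [
  "We all love water.",
  "The beautiful waters of the world",
  "and all other water-related subjects",
  "and this watery topic of",
  "of WaterStewardship looks at how best",
];

// Print each sample with matches wrapped in the question's placeholder link.
for (const text of samples) {
  console.log(text.replace(buildTermPattern("water"), '<a href="/somelink">$1</a>'));
}
```

A fresh pattern is built per call here, which sidesteps the lastIndex state that a shared regex with the g flag carries between calls to test or exec.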
You can also do this without jQuery and apply it to every element in the document:
<!-- begin snippet: js hide: false console: true babel: false -->
<!-- language: lang-js -->
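const pattern = /(\b[Ww]aters?(?!-)\b)/g;
const replacement = "<mark>$1</mark>";

// Visit every element except <a>, so text directly inside a link is left alone.
for (const parent of document.querySelectorAll("body *:not(a)")) {
  // Copy childNodes first: it is a live list and the loop mutates it.
  for (const child of [...parent.childNodes]) {
    if (child.nodeType === Node.TEXT_NODE) {
      // Wrap the substituted markup in a <span> that replaces the text node.
      const subNode = document.createElement("span");
      subNode.innerHTML = child.textContent.replace(pattern, replacement);
      parent.insertBefore(subNode, child);
      parent.removeChild(child);
    }
  }
}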
<!-- language: lang-html -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<div>
hello water
<div>
<div>
I love Water.
<a href="">more water</a>
</div>
watership down
<h4>watery water</h4>
<p>
waters
</p>
foobar <a href="">water</a> water
</div>
</div>
<!-- end snippet -->
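Both snippets hand the substituted string back to an HTML parser (replaceWith with a string, or innerHTML), so a text node that contains a literal < would be re-read as markup. One possible alternative, sketched here under the assumption that building the nodes directly is acceptable, is to split the text into plain and matched segments first and then create text nodes and anchors from them. splitOnMatches and linkTextNode are hypothetical helper names, and /somelink is the placeholder URL from the question.

```javascript
// Split text into segments of the form { text, match }, where match is
// true for the parts that matched the pattern. Pure function, no DOM.
// The pattern must have the g flag (required by matchAll).
function splitOnMatches(text, pattern) {
  const segments = [];
  let last = 0;
  for (const m of text.matchAll(pattern)) {
    if (m.index > last) {
      segments.push({ text: text.slice(last, m.index), match: false });
    }
    segments.push({ text: m[0], match: true });
    last = m.index + m[0].length;
  }
  if (last < text.length) {
    segments.push({ text: text.slice(last), match: false });
  }
  return segments;
}

// Browser-only: replace one text node with text nodes and <a> elements.
// Nothing is parsed as HTML, so "<" in the text stays plain text.
function linkTextNode(textNode, pattern, href) {
  const segments = splitOnMatches(textNode.nodeValue, pattern);
  if (!segments.some((s) => s.match)) return; // nothing to link
  const fragment = document.createDocumentFragment();
  for (const s of segments) {
    if (s.match) {
      const a = document.createElement("a");
      a.href = href;
      a.textContent = s.text;
      fragment.appendChild(a);
    } else {
      fragment.appendChild(document.createTextNode(s.text));
    }
  }
  textNode.replaceWith(fragment);
}

// The splitting step can be tried outside the browser:
const pattern = /\b[Ww]aters?(?!-)\b/g;
console.log(splitOnMatches("foobar water <b> waters", pattern));
```

In the browser, linkTextNode(child, pattern, "/somelink") could be called for each text node found by either traversal above, in place of replaceWith with a string or the innerHTML assignment.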