NodeJS regex returns 0 on string.search()
Question
I'm working on a NodeJS script (launched from CMD with the node command) that receives some HTML content as a string, from which I need to extract the data inside a specific <div> element. I'm having a hard time figuring out why this portion of code doesn't give me the desired output.
const input = '<div class="some_class">Some data</div><div class="some_other_class">< class="some_other_other_class">...</div></div>';
const regex = new RegExp(/<div class="some_class">(.*?)<\/div>/g);
let obj = {
'tmp': input.search(regex),
}
console.log(obj) // outputs { tmp: 0}
console.log(input.search(/<div class="some_class">(.*?)<\/div>/g)) // outputs 0
const x = input.search(/<div class="some_class">(.*?)<\/div>/g);
console.log(x) // outputs 0
I know this looks like a common question here, but I tried passing the regex as a string (between single quotes '), as a regex literal (between / delimiters), and finally as a new RegExp object, all without success. I always get 0 as the output.
However, when I test it in an online tool, it does match and captures the desired data in group #1: https://www.regextester.com/?fam=131034
I don't know if I'm missing something or doing something wrong, but after several hours on this issue, I'm struggling to think straight.
Answer 1
Score: 1
String::search() returns the position of the first match, which is 0 in your case, and that is perfectly correct: the match starts at the very first character of the string.
You need String::match() instead, and don't forget to read the right capture group index:
const input = '<div class="some_class">Some data</div><div class="some_other_class">< class="some_other_other_class">...</div></div>'
console.log(input.match(/<div class="some_class">(.*?)<\/div>/)?.[1]) // outputs Some data
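As a rough illustration of the difference, the sketch below runs both methods on the sample string from the question. The variable names (pattern, foundAt, laterAt, missing, captured) and the extra /<span>/ pattern are only illustrative and do not come from the original post.

```javascript
// Same sample string as in the question.
const input = '<div class="some_class">Some data</div><div class="some_other_class">< class="some_other_other_class">...</div></div>';
const pattern = /<div class="some_class">(.*?)<\/div>/;

// search() only reports WHERE the first match starts, or -1 if there is none.
const foundAt = input.search(pattern);                           // 0: the match starts at the first character
const laterAt = input.search(/<div class="some_other_class">/);  // a positive index: the match starts later
const missing = input.search(/<span>/);                          // -1: no match at all

// match() without the g flag returns the match details, including capture groups.
const result = input.match(pattern);
const captured = result ? result[1] : null;                      // 'Some data'

console.log({ foundAt, laterAt, missing, captured });
```

So a return value of 0 from search() means "found at index 0", not "nothing found"; the "not found" signal is -1.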
To avoid dealing with groups, I sometimes prefer to use lookaround assertions:
const input = '<div class="some_class">Some data</div><div class="some_other_class">< class="some_other_other_class">...</div></div>'
// ?.[0] reads the full match and yields undefined instead of throwing when nothing matches
console.log(input.match(/(?<=<div class="some_class">).*?(?=<\/div>)/)?.[0]) // outputs Some data
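One place where assertions can be convenient is the g flag that the question's regex uses: with g, match() returns only the full matches and drops the capture groups. The following sketch assumes a hypothetical string containing two some_class divs (html, withTags, viaLookaround, and viaGroups are made-up names) and compares the assertion approach with matchAll(), which keeps the groups. Lookbehind assertions need a reasonably recent Node.js version.

```javascript
// Hypothetical input with two matching <div> elements (not from the question).
const html = '<div class="some_class">Some data</div><div class="other">x</div><div class="some_class">More data</div>';

// With the g flag, match() returns every full match and ignores capture groups,
// so the surrounding tags stay in the result.
const withTags = html.match(/<div class="some_class">(.*?)<\/div>/g) ?? [];

// Lookaround assertions keep the tags out of the full match.
const viaLookaround = html.match(/(?<=<div class="some_class">).*?(?=<\/div>)/g) ?? [];

// matchAll() keeps the capture groups for each match (it requires the g flag).
const viaGroups = [...html.matchAll(/<div class="some_class">(.*?)<\/div>/g)].map((m) => m[1]);

console.log(withTags);
console.log(viaLookaround); // [ 'Some data', 'More data' ]
console.log(viaGroups);     // [ 'Some data', 'More data' ]
```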
If your HTML changes often, I recommend using https://www.npmjs.com/package/jsdom so that you can access the content of the tags you need through the DOM.