How do I quote the special character + and a one-letter-plus-two-digit string?
Question
I have the following text to modify in Perl:
"tti_d01_DA_" + it + "H"
In my replacement text, I want to replace d01 with a variable ${Dom} and DA with ${simul}, so that I have:
"tti_${Dom}_${simul}_" + it + "H"
I have tried the following searches for the original string:
s/"tti_.\d{2}_\w+_" \+ \w+ \+ "H"/"tti_${Dom}_${simul}_" + it + "H"/;
s/"tti_(?=.*[a-z])(?=.*[0-9]) (?=.*[_]) \w+ (?=.*[_]) (?=.*["])\s(?=.*[+])\s\w+\s(?=.*[+])\s"H"/"tti_${Dom}_${simul}_" + it + "H"/;
I thought this should be a very simple issue, but somehow I am not able to get it. Neither of the above finds what I am looking for.
Answer 1
Score: 3
This works in my tests:
s/"tti_\Kd[0-9]{2}_\w+_(?=" \+ it \+ "H")/\${Dom}_\${simul}_/
In the original string there is a DA pattern which, as the comments say, can be NODA instead. That is matched by \w+ above, following the attempt in the question. However, if it can only be either DA or NODA, as a comment says, then one may want to use DA|NODA (instead of \w+), so that the match fails for anything else. Or something in between, like [A-Z]+.
This is a choice to make that depends on the context -- either make a pattern as restrictive as possible, to protect from unexpected (and unacceptable) input, or make it more permissive to catch more of what is acceptable.
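To make that trade-off concrete, here is a minimal sketch that runs the three variants against a few inputs. The names %pattern and accepts, and the inputs XY and da9, are made up for illustration; only DA and NODA come from the question and its comments. One detail worth noting: the alternation has to be grouped, as in (?:DA|NODA), otherwise the | would split the whole regex into two alternatives.

```perl
use strict;
use warnings;
use feature 'say';

# Three versions of the middle sub-pattern, from permissive to strict.
# Note the (?:...) group: a bare DA|NODA would split the whole regex in two.
my %pattern = (
    word  => qr/"tti_\Kd[0-9]{2}_\w+_(?=" \+ it \+ "H")/,
    upper => qr/"tti_\Kd[0-9]{2}_[A-Z]+_(?=" \+ it \+ "H")/,
    exact => qr/"tti_\Kd[0-9]{2}_(?:DA|NODA)_(?=" \+ it \+ "H")/,
);

# Returns 1 if the named pattern matches the text, 0 otherwise.
sub accepts {
    my ($kind, $text) = @_;
    return $text =~ $pattern{$kind} ? 1 : 0;
}

# DA and NODA come from the question and its comments;
# XY and da9 are made-up inputs that show where the variants differ.
for my $tag (qw(DA NODA XY da9)) {
    my $text = qq("tti_d01_${tag}_" + it + "H");
    my @cells;
    push @cells, "$_=" . accepts($_, $text) for qw(word upper exact);
    say join '  ', $tag, @cells;
}
```

All three variants accept DA and NODA. Only \w+ accepts da9, and (?:DA|NODA) is the only one that rejects XY.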
The question doesn't say what happens with the shown attempts, but here are some comments.
The only outright problem that I readily see is the $ characters in the replacement side, which have to be escaped if they are to be used as literal characters; otherwise Perl will look for variables $Dom and $simul.† (But then strict wouldn't let the program compile because there are no such variables. Do you have use strict; in your program?)
Next, I use \K, a form of lookbehind, so that everything matched up to that point is dropped from the match (from $&), and so kept in the string, and doesn't have to be re-entered in the replacement side.
In the same vein, I use a lookahead (?=...) for the pattern that follows what needs to be replaced. That is a zero-length assertion, which doesn't consume anything, so those subpatterns don't need to be put back into the string either.
These two tweaks 'lighten' the replacement side (but are not necessary).
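As a rough illustration of what the two tweaks buy, the sketch below applies both styles to the string from the question. The variable names $retyped and $trimmed are made up for the example; the closing quote is placed inside the lookahead and the $ signs are escaped, so that the output is the literal text the question asks for.

```perl
use strict;
use warnings;
use feature 'say';

my $original = q("tti_d01_DA_" + it + "H");

# Variant A: match everything and retype the unchanged parts in the replacement.
(my $retyped = $original)
    =~ s/"tti_d[0-9]{2}_\w+_" \+ it \+ "H"/"tti_\${Dom}_\${simul}_" + it + "H"/;

# Variant B: \K keeps the prefix and the lookahead leaves the suffix unconsumed,
# so the replacement holds only the part that actually changes.
(my $trimmed = $original)
    =~ s/"tti_\Kd[0-9]{2}_\w+_(?=" \+ it \+ "H")/\${Dom}_\${simul}_/;

say $retyped;
say $trimmed;
say $retyped eq $trimmed ? 'same result' : 'different result';
```

Both variants should produce the same string; the second one just avoids repeating the fixed text on the replacement side.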
Test program
use warnings;
use strict;
use feature 'say';
my $str = shift // q("tti_d01_DA_" + it + "H");
#say $str;
$str =~ s/"tti_\Kd[0-9]{2}_\w+_(?=" \+ it \+ "H")/\${Dom}_\${simul}_/;
say $str;
† It appears from a comment that the problem may have been with variable naming: there are arrays @Dom and @simul in the program, and the replacements need to be $Dom[index] (etc).
If that indeed is the whole problem then having use strict; at the beginning would've caught it. It would not even let the program compile and it would report that the variable $Dom does not exist.
Having use strict; and use warnings; is directly helpful, and really necessary I'd say. No wonder that they are enabled by default by many tools and, finally, in newer Perls (use v5.12 or later enables strict, and use v5.36 or later also enables warnings).
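If the footnote's reading is right and the values really live in arrays, the replacement side might look something like the sketch below. The array contents and the helper name rewrite are hypothetical, since the question does not show them; the point is only that $Dom[$i] interpolates as an element of @Dom, while a bare $Dom under use strict; would stop compilation.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical contents; the question does not show what the arrays hold.
my @Dom   = ('d02', 'd03');
my @simul = ('DA', 'NODA');

# Replaces the domain and simulation tags with the elements at index $i.
sub rewrite {
    my ($text, $i) = @_;
    # $Dom[$i] and $simul[$i] interpolate as array elements in the replacement.
    $text =~ s/"tti_\Kd[0-9]{2}_\w+_(?=" \+ it \+ "H")/$Dom[$i]_$simul[$i]_/;
    return $text;
}

say rewrite(q("tti_d01_DA_" + it + "H"), $_) for 0 .. $#Dom;
```

Here the $ signs are deliberately left unescaped, because the array values are meant to be substituted in rather than the literal text ${Dom}.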
Answer 2
Score: 3
The regexp pattern you are using is correct.
However, $ is special in double-quoted string literals, which include the second part of s///. Just escape each $ with a backslash.
Always use use strict; use warnings; or equivalent, as it would have caught this error!
In other words, all you need to do is replace
s/"tti_.\d{2}_\w+_" \+ \w+ \+ "H"/"tti_${Dom}_${simul}_" + it + "H"/;
with
s/"tti_.\d{2}_\w+_" \+ \w+ \+ "H"/"tti_\${Dom}_\${simul}_" + it + "H"/;
Demo:
$ perl -Mv5.10 -we'
$_ = q{"tti_d01_DA_" + it + "H"};
s/"tti_.\d{2}_\w+_" \+ \w+ \+ "H"/"tti_\${Dom}_\${simul}_" + it + "H"/;
say;
'
"tti_${Dom}_${simul}_" + it + "H"