sed + remove word from text without additional spaces
We want to remove the word -XX:+UseCMSInitiatingOccupancyOnly from the following file:
more hdfs.conf
SHARED_HADOOP_NAMENODE_OPTS="-server -XX:ParallelGCThreads=8 -XX:+UseCMSInitiatingOccupancyOnly -Xms{{namenode_heapsize}}"
So we did the following:
sed -i -E 's/\-XX:\+UseCMSInitiatingOccupancyOnly//g' hdfs.conf
-E enables extended regular expressions (needed for + and grouping), and I am using "\" before the "-" and "+".
Note: comments about my sed syntax, and about anything I am missing, are appreciated.
The problem with my sed command is that one additional space is left behind when the word is deleted.
Example of what we get:
more hdfs.conf
SHARED_HADOOP_NAMENODE_OPTS="-server -XX:ParallelGCThreads=8  -Xms{{namenode_heapsize}}"
instead of the line without the additional space, as in:
more hdfs.conf
SHARED_HADOOP_NAMENODE_OPTS="-server -XX:ParallelGCThreads=8 -Xms{{namenode_heapsize}}"
So how can I improve my sed syntax so that it also deletes the additional space?
Answer 1
Score: 1
The extra spaces shouldn't matter to whatever parses those options later unless it's really badly written code. Assuming that's the case and an extra space causes an error...
As always, if you want to edit a file in a script and your first inclination is to turn to sed -i, I suggest using ed instead. Unlike sed's -i option, it's standardized and behaves the same everywhere, meaning you're less likely to run into unwelcome surprises when running in different environments. You can adjust the following regular expression to work with sed if really desired, though:
ed -s hdfs.conf <<'EOF'
/^SHARED_HADOOP_NAMENODE_OPTS=/ s/\( *\)-XX:+UseCMSInitiatingOccupancyOnly */\1/
w
EOF
The trick here is to also match zero or more preceding and following spaces, but leave only one of the two runs (the first in this case) in the output.
This also only tries to substitute on the line setting the particular variable whose contents you're interested in, in case the option you're removing appears elsewhere (in a comment, say) and you want to keep that occurrence.
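If you do want to stay with sed, the same idea could be adapted roughly as follows. This is only a sketch: it assumes a sed that accepts -E (GNU and BSD sed both do), it works on a scratch copy created with mktemp rather than on the real hdfs.conf, and the variable names conf and result are just for illustration. Because -E switches to extended regular expressions, the group is written ( *) and the literal + has to be escaped.

```shell
# Work on a scratch copy so the real hdfs.conf is left untouched.
conf=$(mktemp)
printf '%s\n' 'SHARED_HADOOP_NAMENODE_OPTS="-server -XX:ParallelGCThreads=8 -XX:+UseCMSInitiatingOccupancyOnly -Xms{{namenode_heapsize}}"' > "$conf"

# ( *) captures the spaces in front of the option and \1 puts them back;
# the option itself and any spaces after it are dropped.
result=$(sed -E '/^SHARED_HADOOP_NAMENODE_OPTS=/ s/( *)-XX:\+UseCMSInitiatingOccupancyOnly */\1/' "$conf")

printf '%s\n' "$result"
rm -f "$conf"
```

Once the output looks right, the same sed expression can be pointed at the real file, with -i if your sed supports it and you accept the portability caveat mentioned above.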
And since this is tagged perl, a perl version:
perl -pi -e 's/\s*\K\Q-XX:+UseCMSInitiatingOccupancyOnly\E\s*// if /^SHARED_HADOOP_NAMENODE_OPTS/' hdfs.conf
(Stuff inside \Q ... \E is treated literally, so the + doesn't need to be escaped, and \K basically discards what matched before it from the final matched text, meaning you don't need an explicit capture group for the leading whitespace characters, which \s matches instead of a literal space.)