Simple way to get the first non whitespace character from string
Question
The first non-whitespace character in ' abc' is a. I get it with:
echo ' abc' | grep -o '^\s\+.' | tr -d ' '
a
Is there a shorter way to do this?
Answer 1
Score: 1
echo ' abc sd ggf ' | awk '{print $1}' | cut -c 1
echo ' abc sd ggf ' | sed 's/^[ \t]*//' | cut -d' ' -f 1 | cut -c 1
echo ' abc sd ggf ' | sed -r 's/\s+//' | cut -c -1
PS: Corrected to print only the first character.
Answer 2
Score: 1
Retrieving a regex match from the BASH_REMATCH[] array:
str=' abc'
regex='[^[:space:]]' # match a single non-whitespace character
[[ "${str}" =~ ${regex} ]] && char1="${BASH_REMATCH[0]}"
Result:
$ typeset -p char1
declare -- char1="a"
$ echo "${char1}"
a
NOTE: while this requires a bit more typing, it avoids the performance overhead of creating a subshell and launching an external program for each piped command (e.g., | grep, | tr, | awk, | sed, | cut).
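One caveat with the one-liner above: when the string is empty or consists only of whitespace, the match fails and char1 keeps whatever value it had before. A minimal sketch of the same BASH_REMATCH approach, wrapped in a function that reports that case through its exit status (the name first_nonspace is hypothetical and not part of the original answer):

```bash
#!/usr/bin/env bash

# first_nonspace: hypothetical helper name, used here for illustration only.
# Prints the first non-whitespace character of $1, or returns 1 if there is none.
first_nonspace() {
  local regex='[^[:space:]]'        # a single non-whitespace character
  if [[ $1 =~ $regex ]]; then
    printf '%s\n' "${BASH_REMATCH[0]}"
  else
    return 1                        # empty or whitespace-only input
  fi
}

first_nonspace '  abc'                                   # prints: a
first_nonspace '   ' || echo 'no non-whitespace character found'
```

Calling it as char1=$(first_nonspace "$str") does reintroduce one subshell, so inside a tight loop the inline [[ ... ]] form from the answer is still the cheaper choice.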
Another idea using parameter expansion to strip out whitespace characters plus a substring call:
str=' abc'
newstr="${str//[[:space:]]/}" # strip out whitespace
char1="${newstr:0:1}" # extract 1st character via substring (start position 0, length of 1)
Result:
$ typeset -p char1
declare -- char1="a"
$ echo "${char1}"
a
NOTE: this takes a little less typing and likewise avoids the performance overhead of creating a subshell and launching an external program for each piped command (e.g., | grep, | tr, | awk, | sed, | cut).
Answer 3
Score: 1
Let's first shorten your own command by removing unneeded spaces and using a here-string instead of echo and a pipe. Let's also make it independent of the string by using a bash variable:
s=' abc'
grep -o '^\s\+.'<<<$s|tr -d ' '
It is now 32 characters long. Can we go shorter? Yes, with tr and cut. If the whitespace consists of real spaces only (no tabs), here is a 22-character script:
tr -d ' '<<<$s|cut -c1
If the whitespace can also include tabs, we need two more characters, for a total of 24:
tr -d ' \t'<<<$s|cut -c1
If it can be any horizontal whitespace, we need 6 more (30):
tr -d '[:blank:]'<<<$s|cut -c1
And if it can be any horizontal or vertical whitespace (although if the string contains newlines, your grep-based solution no longer works), we still have a 30-character script:
tr -d '[:space:]'<<<$s|cut -c1
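To see why the choice of tr set matters, here is a small sketch that runs the same tr and cut pipeline over a string that starts with a tab. The helper name first_via_tr and the test string are made up for illustration, and printf with a pipe stands in for the here-string so the sketch does not depend on bash-only syntax:

```bash
# first_via_tr: hypothetical helper. $1 is the set handed to tr -d, $2 is the input string.
first_via_tr() {
  printf '%s\n' "$2" | tr -d "$1" | cut -c1
}

tab_led=$(printf '\t abc')            # a tab, a space, then abc

first_via_tr ' '         "$tab_led"   # prints a tab: only the space was deleted
first_via_tr ' \t'       "$tab_led"   # prints a
first_via_tr '[:blank:]' "$tab_led"   # prints a
first_via_tr '[:space:]' "$tab_led"   # prints a
```

Note that all of these variants delete the chosen whitespace everywhere in the string, not only at the start; that is harmless here because only the first remaining character is kept.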
Answer 4
Score: 0
You can also do it with just parameter expansion forms:
shopt -s extglob # Turn on extended globs to get +()
foo=' abc'
trimmed=${foo##+([[:space:]])} # Remove leading whitespace if any
echo "${trimmed:0:1}" # And display the first character of the remaining text
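If extglob is not available (for example in a plain POSIX sh), the same trim-then-take-first idea can be sketched with standard pattern removal only. This is an alternative technique rather than what the answer above uses, and first_char_posix is a hypothetical name:

```bash
# first_char_posix: hypothetical helper, written with POSIX parameter expansion only.
# Variables are global here because POSIX sh has no "local".
first_char_posix() {
  s=$1
  leading=${s%%[![:space:]]*}         # the leading whitespace run (all of s if s is only whitespace)
  trimmed=${s#"$leading"}             # s without that leading run
  rest=${trimmed#?}                   # everything after the first character
  printf '%s\n' "${trimmed%"$rest"}"  # removing "rest" leaves just the first character
}

first_char_posix '  abc'              # prints: a
```

For an empty or whitespace-only argument this sketch prints an empty line, which matches what the extglob version does.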
Answer 5
Score: 0
UPDATE 1: much cleaner
mawk '_ {exit} _ = _$(NF = 1)' RS='[ \t-\r]+' FS=
It assigns the first character to _, so as long as the input records are empty, it does not exit.
The _$( ) isn't anything special; it is merely a straight string concatenation of the variable _ with the field $( … ).
echo ' abc' |
gawk '_ {exit} _ = "" < $(NF=1)' RS='[ \t-\r]+' FS=
a
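For comparison, here is a longer but more conventional awk spelling of the same goal. It is a sketch of a different technique, not a rewrite of the one-liners above: it relies on awk's default field splitting (which skips leading blanks) instead of a regex RS and an empty FS, so it only treats spaces, tabs, and newlines as whitespace. The function name first_char_awk is hypothetical:

```bash
# first_char_awk: hypothetical helper that reads from standard input.
first_char_awk() {
  # NF is non-zero only on a line that has at least one field;
  # print the first character of that field and stop reading.
  awk 'NF { print substr($1, 1, 1); exit }'
}

echo '  abc' | first_char_awk         # prints: a
```

Lines that are empty or contain only blanks are skipped, so the result comes from the first line that has any visible content.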