How to fill an array with strings that can contain multiple lines in Bash?

Question

I have a variable holding multiple strings, some of which can span multiple lines:

    var="foo 'bar baz' 'lorem
    ipsum'"

I need each of them as a separate array element, so my idea was to use `xargs -n1` to read every quoted or unquoted string into its own array element:

    mapfile -t arr < <(xargs -n1 <<< "$(echo "$var")")

But this causes the following error:

    xargs: unmatched single quote; by default quotes are special to xargs unless you use the -0 option

In the end, the only idea I had was to replace each line feed with a carriage return and restore it afterwards:

    # fill the array; tr preserves the line feeds as carriage returns (dirty)
    mapfile -t arr < <(xargs -n1 <<< "$(echo "$var" | tr '\n' '\r')")

    # restore the line feeds
    for (( i=0; i<${#arr[@]}; i++ )); do
      arr[i]=$(echo "${arr[$i]}" | tr '\r' '\n')
    done

It works:

    # for (( i=0; i<${#arr[@]}; i++ )); do echo "index: $i, value: ${arr[$i]}"; done
    index: 0, value: foo
    index: 1, value: bar baz
    index: 2, value: lorem
    ipsum

But only as long as the input variable does not contain a carriage return.

I assume I need `xargs` to output every result delimited by a null byte and import them with `mapfile -d ''`, but it seems `xargs` is missing a `print0` option (`tr '\n' '\0'` would alter the multi-line string itself).
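The consuming half of that idea can be sketched on its own. This is a minimal example, assuming Bash 4.4 or newer (the first release in which `mapfile` accepts `-d`). The `printf '%s\0'` call is only a stand-in producer for NUL-delimited strings; it is not the xargs-based splitter the question is asking for.

```shell
#!/usr/bin/env bash
# Assumption: Bash >= 4.4, where mapfile supports the -d option.
# printf is a stand-in producer here: it writes each argument followed by a
# NUL byte, so embedded line feeds stay inside their element.
mapfile -d '' -t arr < <(printf '%s\0' foo 'bar baz' $'lorem\nipsum')

# Show what ended up in the array.
declare -p arr
```

With an empty delimiter, `mapfile` ends each element at a NUL byte, so neither line feeds nor carriage returns in the data need to be substituted.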

Answer 1

Score: 1

This ShellCheck-clean code demonstrates one way to do it, using Bash regular expressions to extract the parts from the string:

    #! /bin/bash -p

    var="foo 'bar baz' 'lorem
    ipsum'"

    # Leading whitespace, followed by the rest of the string
    leadspace_rx='^[[:space:]]+(.*)$'
    # A bare (unquoted) word, followed by the rest of the string
    bare_rx="^([^'[:space:]]+)(.*)$"
    # A single-quoted string, followed by the rest of the string
    quoted_rx="^'([^']*)'(.*)$"

    arr=()
    while [[ -n $var ]]; do
        if [[ $var =~ $leadspace_rx ]]; then
            # Drop the leading whitespace
            var=${BASH_REMATCH[1]}
        elif [[ $var =~ $bare_rx ]]; then
            arr+=( "${BASH_REMATCH[1]}" )
            var=${BASH_REMATCH[2]}
        elif [[ $var =~ $quoted_rx ]]; then
            # Store the content without the surrounding quotes
            arr+=( "${BASH_REMATCH[1]}" )
            var=${BASH_REMATCH[2]}
        else
            printf 'ERROR: Cannot handle: %s\n' "$var" >&2
            exit 1
        fi
    done

    declare -p arr

  • The output is `declare -a arr=([0]="foo" [1]="bar baz" [2]=$'lorem\nipsum')`.
  • The code for splitting up a string could easily be encapsulated in a function if you think this idea is worth pursuing.
  • The current code does things that you might not expect. For instance, the string `a'b'c` is converted to the array `(a b c)`. If you can provide a more precise specification for the format of input strings, I'll see if the code can be modified to handle it.
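The function mentioned in the notes could look roughly like the sketch below. The name `split_quoted` and its calling convention (target array name first, input string second) are assumptions made for illustration and are not part of the original answer. It relies on namerefs, so it assumes Bash 4.3 or newer.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name and interface are illustrative assumptions).
# Usage: split_quoted ARRAY_NAME STRING
# Splits STRING into bare and single-quoted words and stores them in the
# array named ARRAY_NAME. Requires Bash >= 4.3 for 'local -n'.
# Caveat: ARRAY_NAME must not be '_out' or 'rest' (the local names below).
split_quoted() {
    local -n _out=$1
    local rest=$2
    local leadspace_rx='^[[:space:]]+(.*)$'
    local bare_rx="^([^'[:space:]]+)(.*)$"
    local quoted_rx="^'([^']*)'(.*)$"

    _out=()
    while [[ -n $rest ]]; do
        if [[ $rest =~ $leadspace_rx ]]; then
            # Skip whitespace between words
            rest=${BASH_REMATCH[1]}
        elif [[ $rest =~ $bare_rx ]]; then
            _out+=( "${BASH_REMATCH[1]}" )
            rest=${BASH_REMATCH[2]}
        elif [[ $rest =~ $quoted_rx ]]; then
            # Keep the quoted content, line feeds included, without the quotes
            _out+=( "${BASH_REMATCH[1]}" )
            rest=${BASH_REMATCH[2]}
        else
            # For example, an unterminated single quote
            printf 'ERROR: Cannot handle: %s\n' "$rest" >&2
            return 1
        fi
    done
}

# Example call with the string from the question
var="foo 'bar baz' 'lorem
ipsum'"
arr=()
split_quoted arr "$var" || exit 1
declare -p arr
```

Returning a non-zero status instead of calling `exit` leaves it to the caller to decide how to react to input the patterns cannot handle. The splitting rules are unchanged, so `a'b'c` still yields three elements.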

huangapple
  • Posted on 2023-03-21 01:51:11
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75793665-3.html