Creating json from two bash arrays

Question

I have two bash arrays:

arr1="200 300 400"
arr2=(1000 10001 10002)

I would like to produce a json file with a bash script:

{
  "200": {"feature1": 1000},
  "300": {"feature1": 1001},
  "400": {"feature1": 1002}
}

I tried doing it with jq:

jq -n --arg arg1 "${arr1[*]}" \
      --arg arg2 "${arr2[*]}" \
      '{$arg1: {"feature1": $arg2}}'

but this only expands the whole arrays into one entry.

[EDIT]: What if I have a third array with file paths and would like to add them as feature2:

{
  "200": {
    "feature1": 1000,
    "feature2": "path-to-files/1000.log"
  },
  "300": {
    "feature1": 1001,
    "feature2": "path-to-files/1001.log"
  },
  "400": {
    "feature1": 1002,
    "feature2": "path-to-files/1002.log"
  }
}

[EDIT2]: My workflow:

arr1="200 300 400"
arr2=()
arr3=()
arr2_content=1000
for i in $arr1; do
    arr2+=("$arr2_content")
    touch "$arr2_content.log"
    arr3+=("$PWD/$arr2_content.log")
    arr2_content=$((arr2_content+1))
done

Answer 1

Score: 2

Only arr2 is an array; arr1 is just a string containing spaces. You can therefore pass arr1 with the --arg option and split it at the spaces using / " ", and use the --args option for the "real" array arr2:

jq -n --arg arg "$arr1" '[$arg / " ", $ARGS.positional]
  | reduce transpose[] as [$key, $feature1] ({}; .[$key] = {$feature1})
' --args "${arr2[@]}"
{
  "200": {
    "feature1": "1000"
  },
  "300": {
    "feature1": "10001"
  },
  "400": {
    "feature1": "10002"
  }
}
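To see the string-versus-array distinction from the Bash side, here is a minimal sketch. The variable names count1, count2 and keys are only illustrative and not part of the question. It also shows one way to turn the string into a real array, assuming the items themselves contain no whitespace:

```shell
#!/usr/bin/env bash
arr1="200 300 400"          # a plain string, as in the question
arr2=(1000 10001 10002)     # a real indexed array

# ${#name[@]} counts elements; a scalar behaves like a one-element array.
count1=${#arr1[@]}
count2=${#arr2[@]}

# Split the string on whitespace into a real array
# (assumption: no item contains spaces of its own).
read -ra keys <<< "$arr1"

printf '%s %s %s\n' "$count1" "$count2" "${#keys[@]}"   # prints "1 3 3"
```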

To convert the values to numbers, use tonumber:

jq -n --arg arg "$arr1" '[$arg / " ", $ARGS.positional]
  | reduce transpose[] as [$key, $val] ({}; .[$key] = {feature1: $val | tonumber})
' --args "${arr2[@]}"
{
  "200": {
    "feature1": 1000
  },
  "300": {
    "feature1": 10001
  },
  "400": {
    "feature1": 10002
  }
}
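As an alternative that is not part of the approach above, jq (1.6 or later) also has a --jsonargs option, which parses each positional argument as JSON, so the numbers arrive as numbers without tonumber. A sketch, assuming every array item is valid JSON text (otherwise jq will reject it); the result variable is only there for illustration:

```shell
#!/usr/bin/env bash
arr1="200 300 400"
arr2=(1000 10001 10002)

# --jsonargs: the remaining arguments are parsed as JSON values, not strings.
result=$(jq -nc --arg arg "$arr1" '
  [$arg / " ", $ARGS.positional]
  | reduce transpose[] as [$key, $feature1] ({}; .[$key] = {$feature1})
' --jsonargs "${arr2[@]}")

printf '%s\n' "$result"
# {"200":{"feature1":1000},"300":{"feature1":10001},"400":{"feature1":10002}}
```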

Generalizing to:

How can I import two (or more) indexed bash arrays into jq

This is less trivial because the trick of using --args (populating the argument stack with the array items) can only be used once. Or more precisely: --args simply passes all remaining arguments to the internal $ARGS object, so even if you pass on both arrays, the items will arrive but the information where the first one ends and the second one starts will be lost. Here are a few workarounds:
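A quick way to see the lost boundary is to print $ARGS.positional after passing both arrays. This is only a demonstration sketch; the flat variable is an illustrative name:

```shell
#!/usr/bin/env bash
arr1=("one" "two" "two and a half")
arr2=("three and four" "five" "six")

# Both arrays end up in one flat list; nothing marks where arr1 ends.
flat=$(jq -nc '$ARGS.positional' --args "${arr1[@]}" "${arr2[@]}")

printf '%s\n' "$flat"
# ["one","two","two and a half","three and four","five","six"]
```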

Using a separator item

You could sneak in a special item between the two arrays, so that by spotting its index from within jq the arrays can be recovered by slicing around that item. The caveat here is that this separator item must be unique(ly identifiable), otherwise the splitting may occur anywhere across the arrays. Also finding such an item may or may not be easy, depending on the type of data stored in the arrays. Let's assume the empty string "" is such a value that never occurs as item in any of the arrays. Then you may go with something along the lines of:

keys="200 300 400"
arr1=("one" "two" "two and a half")
arr2=("three and four" "five" "six")
sep=""
init=1000
jq -n --arg keys "$keys" --arg sep "$sep" --argjson init "$init" '
  [$keys / " ", ($ARGS.positional | index($sep) as $i | .[:$i], .[$i+1:])]
  | . + [[$init + range(first | length)]]
  | reduce transpose[] as [$key, $feature1, $feature2, $counter] ({};
      .[$key] = {$counter, $feature1, $feature2}
    )
' --args "${arr1[@]}" "$sep" "${arr2[@]}"    # note the special item
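Since this approach relies on the separator never occurring in the data, it may be worth checking that in Bash before calling jq. A small sketch; separator_is_safe is a hypothetical helper name, not something provided by Bash or jq:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if no item equals the separator.
# Usage: separator_is_safe SEP ITEM...
separator_is_safe() {
  local sep=$1 item
  shift
  for item in "$@"; do
    # Quoting the right-hand side forces a literal (non-pattern) comparison.
    [[ $item == "$sep" ]] && return 1
  done
  return 0
}

arr1=("one" "two" "two and a half")
arr2=("three and four" "five" "six")
sep=""

if separator_is_safe "$sep" "${arr1[@]}" "${arr2[@]}"; then
  echo "separator is safe to use"
else
  echo "separator occurs in the data, pick another one" >&2
fi
```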

Providing array lengths

Instead of determining the split index from the argument array itself, you can provide that information separately, as it is already available in the Bash context. For the use-case at hand, the arrays are assumed to have equal lengths, so instead of repeating the indexing from above (which would work too), I will use _nwise (an undocumented jq helper), which produces slices of equal length:

keys="200 300 400"
arr1=("one" "two" "two and a half")
arr2=("three and four" "five" "six")
len=${#arr1[@]}    # note the dynamic computation of the length (number of items)
init=1000
jq -n --arg keys "$keys" --argjson len "$len" --argjson init "$init" '
  [$keys / " ", ($ARGS.positional | _nwise($len))]
  | . + [[$init + range(first | length)]]
  | reduce transpose[] as [$key, $feature1, $feature2, $counter] ({};
      .[$key] = {$counter, $feature1, $feature2}
    )
' --args "${arr1[@]}" "${arr2[@]}"    # just the two arrays
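One Bash detail matters for the length: ${#name[@]} is the number of items, whereas ${#name} is the character length of the first item. With arr1 above both happen to be 3 (because "one" has three characters), which can hide the difference. A short sketch using arr2 makes it visible; the variable names are illustrative:

```shell
#!/usr/bin/env bash
arr1=("one" "two" "two and a half")
arr2=("three and four" "five" "six")

first_item_chars=${#arr2}     # characters in "${arr2[0]}"
item_count=${#arr2[@]}        # number of items in arr2

echo "$first_item_chars"      # 14
echo "$item_count"            # 3
echo "${#arr1} ${#arr1[@]}"   # 3 3 (equal only by coincidence)
```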

Preprocessing the arrays

A completely different approach is to invoke separate instances of jq just to convert the Bash arrays into JSON arrays (one call per array; the --args option can be used here as shown above). This is practical, as jq offers more ways to read in JSON arrays (after all, it's a JSON processor). Keep in mind, however, that this can become expensive performance-wise, especially when loops and iterations are involved. But if there is only a single run with two arrays, the difference between calling jq once and calling it three times will hardly be noticeable.

To import the JSON arrays, one way is to use options like --argjson for inline data (as seen in the last approach), or --argfile and --slurpfile for referenced data (using process substitution <() here is quite common). As this use-case has no (other) external input (we were using the -n option all the time), the converted arrays can also be streamed in via the "main entrance". Using -s instead of -n then takes care of collecting the stream into one big array (the counterpart of what happened in the first line of the previous approaches):

keys="200 300 400"
arr1=("one" "two" "two and a half")
arr2=("three and four" "five" "six")
init=1000
{
  jq -n '$ARGS.positional' --args "${arr1[@]}"
  jq -n '$ARGS.positional' --args "${arr2[@]}"
} |
jq -s --arg keys "$keys" --argjson init "$init" '    # no other parameters here
  [$keys / " "] + . + [[$init + range(first | length)]]
  | reduce transpose[] as [$key, $feature1, $feature2, $counter] ({};
      .[$key] = {$counter, $feature1, $feature2}
    )
'
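The --slurpfile variant mentioned above could look like the following sketch. The names a, b and result are chosen for illustration only. Note that --slurpfile wraps whatever it reads in an array, so each converted array is found at index 0:

```shell
#!/usr/bin/env bash
keys="200 300 400"
arr1=("one" "two" "two and a half")
arr2=("three and four" "five" "six")
init=1000

# Each process substitution converts one Bash array into a JSON array.
result=$(jq -nc --arg keys "$keys" --argjson init "$init" \
  --slurpfile a <(jq -n '$ARGS.positional' --args "${arr1[@]}") \
  --slurpfile b <(jq -n '$ARGS.positional' --args "${arr2[@]}") '
  # $a and $b each hold a one-item array containing the converted array
  [$keys / " ", $a[0], $b[0], [$init + range($a[0] | length)]]
  | reduce transpose[] as [$key, $feature1, $feature2, $counter] ({};
      .[$key] = {$counter, $feature1, $feature2}
    )
')

printf '%s\n' "$result"
```

This keeps the array boundaries intact without a separator or a length parameter, at the cost of the extra jq invocations.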

Output

All these approaches produce the same output:

{
  "200": {
    "counter": 1000,
    "feature1": "one",
    "feature2": "three and four"
  },
  "300": {
    "counter": 1001,
    "feature1": "two",
    "feature2": "five"
  },
  "400": {
    "counter": 1002,
    "feature1": "two and a half",
    "feature2": "six"
  }
}

huangapple
  • Published on February 23, 2023, 21:15:18
  • When reposting, please retain the link to this article: https://go.coder-hub.com/75545347.html