如何使用 jq 在连续的 JSON 记录流上调用操作

huangapple go评论74阅读模式
英文:

How to invoke actions on a continuous stream of json records using jq

问题

Here is the translated content you requested:

我有一个每秒将JSON记录写入stdout的进程,如何使用jq在写入一定数量(=x)的记录后触发操作并仍然每秒输出?

为了允许简单的重播,该进程被替换为一个带有sleep的bash shell读取循环:
`( read line ; while [[ -n "${line}" ]] ; do echo "${line}"; sleep 1 ; read line ; done ) < json.data`

x条记录后的操作是计算记录的数量,并忽略包含“null”值的行。

当x为1时,处理每行的解决方案是简单或直接的。

( read line ; while [[ -n "${line}" ]] ; do echo "${line}"; sleep 1 ; read line ; done ) < json.data |
jq -c '. , ([.] | del( .[] | select(to_entries[].value == null)) | length)'
{"date":230415072207,"kwL1Tl":0,"kwL1":0.022,"kwL2Tl":0,"kwL2":0.075,"kwL3Tl":0,"kwL3":0}
1
{"date":230415072311,"kwL1Tl":0,"kwL1":0.023,"kwL2Tl":0,"kwL2":0.08,"kwL3Tl":0,"kwL3":0}
1
{"date":230415072312,"kwL1Tl":null,"kwL1":null,"kwL2Tl":null,"kwL2":null,"kwL3Tl":null,"kwL3":null}
0
{"date":230415072415,"kwL1Tl":0,"kwL1":0.023,"kwL2Tl":0,"kwL2":0.08,"kwL3Tl":0,"kwL3":0}
1
{"date":230415072416,"kwL1Tl":0,"kwL1":0.022,"kwL2Tl":0,"kwL2":0.08,"kwL3Tl":0,"kwL3":0}
1
...


但是当我想为每5或100个记录执行此操作时该怎么办?

x=5的预期输出如下:
```shell
$ ( read line ; while [[ -n "${line}" ]] ; do echo "${line}"; sleep 1 ; read line ; done ) < json.data | \
alert
{"date":230415072207,"kwL1Tl":0,"kwL1":0.022,"kwL2Tl":0,"kwL2":0.075,"kwL3Tl":0,"kwL3":0}
{"date":230415072311,"kwL1Tl":0,"kwL1":0.023,"kwL2Tl":0,"kwL2":0.08,"kwL3Tl":0,"kwL3":0}
{"date":230415072312,"kwL1Tl":null,"kwL1":null,"kwL2Tl":null,"kwL2":null,"kwL3Tl":null,"kwL3":null}
{"date":230415072415,"kwL1Tl":0,"kwL1":0.023,"kwL2Tl":0,"kwL2":0.08,"kwL3Tl":0,"kwL3":0}
{"date":230415072416,"kwL1Tl":0,"kwL1":0.022,"kwL2Tl":0,"kwL2":0.08,"kwL3Tl":0,"kwL3":0}
4
{"date":230415072519,"kwL1Tl":0,"kwL1":0.021,"kwL2Tl":0,"kwL2":0.08,"kwL3Tl":0,"kwL3":0}
{"date":230415072520,"kwL1Tl":0,"kwL1":0.021,"kwL2Tl":0,"kwL2":0.08,"kwL3Tl":0,"kwL3":0}
{"date":230415072623,"kwL1Tl":0,"kwL1":0.022,"kwL2Tl":0,"kwL2":0.075,"kwL3Tl":0,"kwL3":0}
{"date":230415072624,"kwL1Tl":0,"kwL1":0.022,"kwL2Tl":0,"kwL2":0.075,"kwL3Tl":0,"kwL3":0}
{"date":230415072727,"kwL1Tl":0,"kwL1":0.022,"kwL2Tl":0,"kwL2":0.076,"kwL3Tl":0,"kwL3":0}
5
{"date":230415072728,"kwL1Tl":0,"kwL1":0.023,"kwL2Tl":0,"kwL2":0.076,"kwL3Tl":0,"kwL3":0}
{"date":230415072831,"kwL1Tl":0,"kwL1":0.023,"kwL2Tl":0,"kwL2":0.075,"kwL3Tl":0,"kwL3":0}
...

Bash函数 "alert" 应包含一个有效的jq命令。


<details>
<summary>英文:</summary>

So I have a process that writes a JSON record to stdout every second, how can I use jq to trigger an action after a certain amount (=x) of records are written and still have my output every second? 

The process is, to allow easy replay, replaced by a bash shell read loop with a sleep:  
`( read line ; while [[ -n &quot;${line}&quot; ]] ; do echo &quot;${line}&quot;; sleep 1 ; read line ; done ) &lt; json.data`

The action after x records is to count the amount of records, and ignore lines with &quot;null&quot; values.

When x is 1, so process each line, the solution is simple or straightforward.  

( read line ; while [[ -n "${line}" ]] ; do echo "${line}"; sleep 1 ; read line ; done ) < json.data |
jq -c '. , ([.] | del( .[] | select(to_entries[].value == null)) | length)'
{"date":230415072207,"kwL1Tl":0,"kwL1":0.022,"kwL2Tl":0,"kwL2":0.075,"kwL3Tl":0,"kwL3":0}
1
{"date":230415072311,"kwL1Tl":0,"kwL1":0.023,"kwL2Tl":0,"kwL2":0.08,"kwL3Tl":0,"kwL3":0}
1
{"date":230415072312,"kwL1Tl":null,"kwL1":null,"kwL2Tl":null,"kwL2":null,"kwL3Tl":null,"kwL3":null}
0
{"date":230415072415,"kwL1Tl":0,"kwL1":0.023,"kwL2Tl":0,"kwL2":0.08,"kwL3Tl":0,"kwL3":0}
1
{"date":230415072416,"kwL1Tl":0,"kwL1":0.022,"kwL2Tl":0,"kwL2":0.08,"kwL3Tl":0,"kwL3":0}
1
...


But what to do when I want to do this for every 5 or 100 records ?

The expected output for x=5:  

$ ( read line ; while [[ -n "${line}" ]] ; do echo "${line}"; sleep 1 ; read line ; done ) < json.data |
alert
{"date":230415072207,"kwL1Tl":0,"kwL1":0.022,"kwL2Tl":0,"kwL2":0.075,"kwL3Tl":0,"kwL3":0}
{"date":230415072311,"kwL1Tl":0,"kwL1":0.023,"kwL2Tl":0,"kwL2":0.08,"kwL3Tl":0,"kwL3":0}
{"date":230415072312,"kwL1Tl":null,"kwL1":null,"kwL2Tl":null,"kwL2":null,"kwL3Tl":null,"kwL3":null}
{"date":230415072415,"kwL1Tl":0,"kwL1":0.023,"kwL2Tl":0,"kwL2":0.08,"kwL3Tl":0,"kwL3":0}
{"date":230415072416,"kwL1Tl":0,"kwL1":0.022,"kwL2Tl":0,"kwL2":0.08,"kwL3Tl":0,"kwL3":0}
4
{"date":230415072519,"kwL1Tl":0,"kwL1":0.021,"kwL2Tl":0,"kwL2":0.08,"kwL3Tl":0,"kwL3":0}
{"date":230415072520,"kwL1Tl":0,"kwL1":0.021,"kwL2Tl":0,"kwL2":0.08,"kwL3Tl":0,"kwL3":0}
{"date":230415072623,"kwL1Tl":0,"kwL1":0.022,"kwL2Tl":0,"kwL2":0.075,"kwL3Tl":0,"kwL3":0}
{"date":230415072624,"kwL1Tl":0,"kwL1":0.022,"kwL2Tl":0,"kwL2":0.075,"kwL3Tl":0,"kwL3":0}
{"date":230415072727,"kwL1Tl":0,"kwL1":0.022,"kwL2Tl":0,"kwL2":0.076,"kwL3Tl":0,"kwL3":0}
5
{"date":230415072728,"kwL1Tl":0,"kwL1":0.023,"kwL2Tl":0,"kwL2":0.076,"kwL3Tl":0,"kwL3":0}
{"date":230415072831,"kwL1Tl":0,"kwL1":0.023,"kwL2Tl":0,"kwL2":0.075,"kwL3Tl":0,"kwL3":0}
...


Bash function &quot;alert&quot; should contain a working jq command.

</details>


# 答案1
**得分**: 0

以下“alert”函数执行预期操作:

```javascript
alert(){
    #
    # 如有需要发送警报
    jq -nc --argjson m "5" '
        def alerts(r): r | del( .[] | select(to_entries[].value == null)) | length ; 
        foreach inputs as $in (
             {"recs":[]};
             if (.recs|length) < $m then .recs += [$in] else .recs=[$in] end;
             .recs[-1], if (.recs|length) == $m then alerts(.recs) else empty end)
    '
}

解释一下:

  • “foreach inputs”读取处理输出作为流
  • 第一个“if”将输入JSON记录添加到数组“recs”中,直到数组达到$m大小。(在我们的例子中,“5”由jq程序的“-argjson m 5”选项指定)。如果大小为5,则再次覆盖“recs”数组(用新的第一个元素)。
  • “.recs[-1]”将最后接收到的记录放在输出上。(“$in”也能正常工作)
  • 第二个“if”在数组“recs”大小为5时执行内部函数“alerts”。
  • jq本地函数“alerts”在删除具有一个或多个字段的“null”值的记录后,输出数组“recs”的大小。
英文:

The following "alert" function does what is expected:

alert(){
    #
    # send alerts if needed
    jq -nc --argjson m &quot;5&quot; &#39;
        def alerts(r): r | del( .[] | select(to_entries[].value == null)) | length ; 
        foreach inputs as $in (
             {&quot;recs&quot;:[]};
             if (.recs|length) &lt; $m then .recs += [$in] else .recs=[$in] end;
             .recs[-1], if (.recs|length) == $m then alerts(.recs) else empty end)
    &#39;
}

Explaining it a bit:

  • "foreach inputs" reads process output as a stream
  • The first "if" adds input JSON records in a array "recs", until the array is $m big. (in our case "5" as specified by the "-argjson m 5" option to the jq program). If it is 5 big it overwrites the "recs" array again (with a new first element).
  • ".recs[-1]" puts the last record received on output. ("$in" would also work)
  • The second "if" executes the internal function "alerts", when the array "recs" is 5 big.
  • The jq local function "alerts" writes out the size of the array "recs", after removing records that have a "null" as value of one or more fields.

huangapple
  • 本文由 发表于 2023年4月16日 23:35:01
  • 转载请务必保留本文链接:https://go.coder-hub.com/76028685.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定