Translate non-uniqified json output into a summary table using jq
Question
I am trying to transform job data from the LSF job scheduler into a summary table of hosts and statuses. For example, this might be some sample data:
$ bjobs -q normal -a -uall -o 'exec_host stat' -json
{
  "COMMAND":"bjobs",
  "JOBS":5,
  "RECORDS":[
    {
      "EXEC_HOST":"compute-node-1",
      "STAT":"RUN"
    },
    {
      "EXEC_HOST":"compute-node-1",
      "STAT":"DONE"
    },
    {
      "EXEC_HOST":"compute-node-2",
      "STAT":"RUN"
    },
    {
      "EXEC_HOST":"compute-node-1",
      "STAT":"EXIT"
    },
    {
      "EXEC_HOST":"compute-node-2",
      "STAT":"RUN"
    }
  ]
}
And I want output that looks like:
                RUN   DONE  EXIT
compute-node-1  1     1     1
compute-node-2  2
I can accomplish this through some really awkward contortions using datamash, but minimizing the workflow to bjobs and jq would significantly improve maintainability. I'm struggling to come up with the recipe to summarize the unique EXEC_HOST/STAT values.
Is there a way within jq to summarize this data as above?
Answer 1
Score: 3
With your input, the following invocation produces TSV output, as shown:
jq -r '
  .RECORDS
  | (map(.STAT) | unique) as $statuses
  | reduce .[] as $x (null; .[$x.EXEC_HOST][$x.STAT] += 1)
  | [null, $statuses[]],
    (to_entries[] | [.key, .value[$statuses[]]])
  | @tsv
'
                DONE  EXIT  RUN
compute-node-1  1     1     1
compute-node-2              2
You can easily tweak the above, e.g. if you want more control over the ordering of the columns. Also, you might want to use @csv instead of @tsv.
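As one possible tweak for the column ordering, here is a minimal sketch that replaces the computed `unique` list with a hard-coded list of statuses, so the columns come out as RUN, DONE, EXIT as in the question. The `input` variable and the `summarize` function are names made up for this illustration, and the sample data is the question's records reduced to the fields the filter reads.

```shell
#!/bin/sh
# Sample input (assumption: same records as in the question, as valid JSON).
input='{"RECORDS":[
  {"EXEC_HOST":"compute-node-1","STAT":"RUN"},
  {"EXEC_HOST":"compute-node-1","STAT":"DONE"},
  {"EXEC_HOST":"compute-node-2","STAT":"RUN"},
  {"EXEC_HOST":"compute-node-1","STAT":"EXIT"},
  {"EXEC_HOST":"compute-node-2","STAT":"RUN"}
]}'

# summarize: hypothetical helper that reads bjobs-style JSON on stdin
# and prints a host-by-status count table as TSV.
summarize() {
  jq -r '
    # Fixed column order instead of (map(.STAT) | unique).
    # Any status not listed here is left out of the table.
    ["RUN", "DONE", "EXIT"] as $statuses
    | .RECORDS
    # Count jobs per host and status: {host: {status: count}}
    | reduce .[] as $x (null; .[$x.EXEC_HOST][$x.STAT] += 1)
    # Header row (empty first cell), then one row per host
    | [null, $statuses[]],
      (to_entries[] | [.key, .value[$statuses[]]])
    | @tsv
  '
}

printf '%s\n' "$input" | summarize
```

In real use the function would be fed directly from the scheduler, e.g. `bjobs -q normal -a -uall -o 'exec_host stat' -json | summarize`. Hosts with no jobs in a given status get an empty cell, because the missing count is `null` and `@tsv` renders `null` as an empty field.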