BASH: redirect stdin to variable number of subprocesses
Question
As the title says, I would like to redirect stdin to a variable number of output subprocesses.

If I had to redirect to output files, I could do something like

    files=(file_1 file_2 ... file_n)
    tee ${files[*]} >/dev/null

but with subprocesses (using process substitutions, specifically), doing things like

    programs=(">(exe_1 args_1)" ... ">(exe_n args_n)")
    tee ${programs[@]} >/dev/null

will not interpret the >() as process substitutions but as literal filenames (for security reasons, I assume); also, the flags within the substitutions are interpreted as flags of tee.

Is it possible to read ONE LINE from stdin and redirect it to all these processes (which, again, are variable in number: n is unknown)? Have I missed something, somewhere?

Thanks in advance and sorry for my bad English.
Answer 1
Score: 4
Instead of using process substitution, create a bunch of named pipes in a loop, and run each process with its stdin redirected to one of the pipes. Then use tee to write to all the pipes.
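
    progs=(exe_1 exe_2 ...)
    args=(args_1 args_2 ...)
    pipes=()
    arraylength=${#progs[@]}
    for (( i=0; i<arraylength; i++ ))
    do
        pipe=/tmp/pipe.$$.$i
        # Create the FIFO and record it in the parent shell. A trailing & on the
        # whole && chain would run pipes+=(...) in a background subshell and lose it.
        mkfifo "$pipe" || exit 1
        pipes+=("$pipe")
        "${progs[i]}" "${args[i]}" < "$pipe" &
    done
    tee "${pipes[@]}" > /dev/null
    # Clean up
    rm -f "${pipes[@]}"
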
This solution has each program run with exactly 1 argument. It's hard to make it more general robustly because bash doesn't have 2-dimensional arrays.
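
One way to make the named-pipe approach a little more robust, sketched under assumptions: create the FIFOs in a private directory from mktemp -d, remember each reader's PID, and wait for the readers before removing the directory. The function name fan_out and the sed filter are hypothetical stand-ins; each reader still takes exactly one argument, as in the answer above.

```shell
#!/usr/bin/env bash
# Sketch: copy stdin to one background reader per argument via named pipes.
# fan_out and the sed filter are illustrative placeholders, not part of the answer.
fan_out() {
    local dir pipe i
    local pipes=() pids=()
    dir=$(mktemp -d) || return 1          # private directory for the FIFOs
    for (( i = 1; i <= $#; i++ )); do
        pipe=$dir/pipe.$i
        mkfifo "$pipe" || { rm -rf "$dir"; return 1; }
        pipes+=("$pipe")
        # ${!i} is the i-th positional parameter; the reader prefixes each line with it.
        sed -e "s/^/[${!i}] /" < "$pipe" &
        pids+=("$!")
    done
    tee "${pipes[@]}" > /dev/null         # duplicate stdin into every FIFO
    wait "${pids[@]}"                     # let the readers drain before cleanup
    rm -rf "$dir"
}
```

For example, printf 'x\n' | fan_out a b prints the lines [a] x and [b] x, in no guaranteed order, since the readers run concurrently.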
Answer 2
Score: 2
Unfortunately, this is a job for eval. Given something like:

    all_redirections=""
    add_redirection() {
        local argv_q
        printf -v argv_q '%q ' "$@"          # generate an eval-safe escaping of "$@"
        all_redirections+=" >(${argv_q}) "   # append it, wrapped in a process substitution
    }
    to_all_redirections() { eval "tee ${all_redirections}" >/dev/null; }

...you can run:

    add_redirection exe_1 arg_1_a arg_1_b
    add_redirection exe_2 arg_2_a
    # ...
    add_redirection exe_n arg_n

...and then, when you have a program whose output is to be copied to the input of those executables:

    yourprogram | to_all_redirections
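
To see why the printf '%q ' step matters, here is a small round-trip sketch. It quotes an argument list containing spaces and bracket characters, then lets eval parse the quoted string back into an array. The variable names argv and roundtrip are made up for this illustration.

```shell
#!/usr/bin/env bash
# Sketch: %q quoting survives a round trip through eval.
argv=(sed -e 's/^/[x y] /')               # three words; the last has spaces and brackets
printf -v argv_q '%q ' "${argv[@]}"       # shell-quote each word, space-separated
eval "roundtrip=( ${argv_q} )"            # eval re-parses the quoted string into words
printf '%s\n' "${#roundtrip[@]}"          # prints 3: the word boundaries are preserved
```

Without the %q step, the embedded space would split the last argument in two when eval re-parses the string.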
Answer 3
Score: 1
This Shellcheck-clean code doesn't use eval and can handle any number of programs with any number of arguments:

    #! /bin/bash -p

    prog_args=( ::: sed -e 's/^/[1] /'
                ::: sed -e 's/^/[2] /'
                ::: sed -e 's/^/[3] /' )

    exec 3>&1

    function tee_to_progs
    {
        (( $# < 2 )) && return 1
        local -r startstr=$1
        shift
        local pargs=()
        while [[ $# -gt 0 && $1 != "$startstr" ]]; do
            pargs+=( "$1" )
            shift
        done
        if (( $# == 0 )); then
            # Last program to run. It can just read stdin and write stdout.
            "${pargs[@]}"
        else
            tee >("${pargs[@]}" >&3) | tee_to_progs "$@"
        fi
    }

    tee_to_progs "${prog_args[@]}"

- The prog_args array holds the programs and arguments to be run. Since Bash doesn't support nested arrays, each command is preceded by a marker string to enable the separate programs and arguments to be identified. I used the string ::: (because it is used for a similar purpose in GNU Parallel), but any string that isn't used as a program name or argument could be used instead. The code assumes that the first string (whatever it happens to be) in the array is the marker string, so the code doesn't need to be changed if the string is changed. I tested using _ instead of :::.
- The sed commands are just examples. I used them for testing because the outputs of each program can be easily identified.
- Note that this code only supports running simple commands. If you need to run other commands (e.g. ones that use redirections) you will need to use a different approach (probably involving the dreaded eval).
- exec 3>&1 causes file descriptor number 3 to be associated with standard output. It is used in the code to ensure that output goes to the "real" standard output.
- The tee_to_progs function runs the first program in its list of arguments with input duplicated to it with tee, and pipes the tee output to a recursive call of the function that does the same for the second and subsequent programs.
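
The role of file descriptor 3 can be seen in a two-stage sketch. Inside a pipeline, the stdout of a command started in >( ) is the pipe to the next stage, so without >&3 the first sed's output would be fed into the second sed instead of reaching the original standard output. The function name fd3_demo is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: keep a copy of the original stdout on fd 3 so that commands
# inside >( ) can bypass the pipeline they are started in.
fd3_demo() {
    exec 3>&1    # fd 3 now refers to the current stdout
    # The first sed writes straight to fd 3; the second uses the pipeline's stdout.
    printf 'line\n' | tee >(sed -e 's/^/[1] /' >&3) | sed -e 's/^/[2] /'
}
```

Calling fd3_demo prints one line tagged [1] and one tagged [2]; their relative order is not guaranteed because the process substitution runs asynchronously.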
Answer 4
Score: 1
GNU Parallel has --tee for this:

    cat input | parallel --tee --pipe my_program {} ::: arg1 arg2 arg3

Internally it works very much like Barmar's solution, but you get GNU Parallel's output control:
- the output is serialized, so output from two jobs is not mixed
- you can keep the order with --keep-order
- you can --tag each line

Temporary files are cleaned up before the job is done, so you do not need to clean them up if the script is killed.

E.g.

    seq 10000 | parallel --tee --pipe --tag --keep-order grep {} ::: {1..9}