Use variable to change flag to cut command in bash
Question
I have a file, test.geno.file.txt, which contains chromosomes, alleles, and some numbers (this one is space-separated, but I have another that is tab-separated):
marker allele1 allele2 id1 id1 id2 id2
chr11_96001606 C T 1.25893e-12 1 3.16228e-26 0.000999001
chr1_46021459 G T 0.969347 0.0306534 1.22034e-21 0.996035
I'm using a script that checks whether the file is tab- or space-delimited and stores the result in the FILESEP variable:
file=~/Desktop/test.geno.file.txt
if cat $file | awk '{exit !/\t/}'; then
echo "Tab delimited" # tab file
FILESEP=""
else
FILESEP="-d \ "
echo "Space delimited with $FILESEP" # space file
fi
onlyfile=$(basename $file)
Basically, I want to modify the original file so that the chromosome names and positions are separated. I'm using the cut command, but I would like to set its delimiter flag from the variable above (via the if statement) instead of hard-coding it. With a hard-coded delimiter it works, like this:
POSFILE=${onlyfile%.txt}.pos.gz # Next command will generate it
cat $file | cut -d' ' -f 1 | perl -p -e 's/_([^_]+$)/\t$1/' | grep -v marker | gzip > ~/Desktop/$POSFILE # Keep position only chr \t pos
This is the output:
gunzip -c ~/Desktop/$POSFILE
chr11 96001606
chr1 46021459
Then I replace -d' ' with $FILESEP:
cat $file | cut $FILESEP -f 1 | perl -p -e 's/_([^_]+$)/\t$1/' | grep -v marker | gzip > ~/Desktop/$POSFILE # Keep position only chr \t pos
But this last command doesn't work...
Answer 1
Score: 3
Assuming none of the fields contain spaces, I'd replace cut with awk, where the default delimiter is whitespace (space, contiguous spaces, tab, contiguous tabs). This would do away with the whole if/then/else construct used to determine what to put in FILESEP.
In other words ...
Replace this:
if cat $file | awk '{exit !/\t/}'; then
echo "Tab delimited" # tab file
FILESEP=""
else
FILESEP="-d \ "
echo "Space delimited with $FILESEP" # space file
fi
....
cat $file | cut $FILESEP -f 1 | perl ...
With this:
awk '{print $1}' "${file}" | perl ...
# or if using zcat (per one of OP's comments)
zcat "${file}" | awk '{print $1}' | perl ...
NOTE: I don't work with perl, but I'd be surprised if you couldn't emulate this awk script inside perl (i.e., eliminate the awk call with some additional code in the perl script).
Answer 2
Score: 2
Quoting is crucial when expanding parameters. Not quoting a parameter when expanding is almost always an error and shellcheck will rightfully complain about it.
The solution in this case is to always specify the delimiter:
if awk '{exit !/\t/}' "$file"; then
    FILESEP=$'\t' # a literal tab character (bash ANSI-C quoting). In a terminal you can also type one by pressing Ctrl-v Tab, or Ctrl-v Ctrl-i
else
    FILESEP=' '
fi
cut -d "$FILESEP" -f1 yourfile
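To make the difference concrete, here is a small sketch under stated assumptions: the temporary files and the helper name first_field are hypothetical, and printf '\t' is used in place of a literal tab so the snippet also runs in plain sh. The last part shows why the question's value fails: unquoted, "-d \ " splits into the two words -d and \, so cut is given a backslash as its delimiter and prints each line unchanged.

```shell
# Sample rows from the question, once space-delimited and once tab-delimited.
tmpdir=$(mktemp -d)
printf '%s\n' 'marker allele1 allele2' 'chr11_96001606 C T' 'chr1_46021459 G T' > "$tmpdir/space.txt"
printf 'marker\tallele1\tallele2\nchr11_96001606\tC\tT\nchr1_46021459\tG\tT\n' > "$tmpdir/tab.txt"

TAB=$(printf '\t')   # portable spelling of a literal tab

# Detect the delimiter from the first line, then always pass it quoted to cut.
first_field() {
    if awk '{exit !/\t/}' "$1"; then
        sep=$TAB
    else
        sep=' '
    fi
    cut -d "$sep" -f1 "$1"
}

space_cols=$(first_field "$tmpdir/space.txt")
tab_cols=$(first_field "$tmpdir/tab.txt")

# The question's variable, expanded unquoted on purpose to show the splitting.
FILESEP="-d \ "
set -- $FILESEP
word_count=$#        # number of words cut would receive from $FILESEP
second_word=$2       # the delimiter cut actually sees
broken_first_line=$(cut $FILESEP -f 1 "$tmpdir/space.txt" | head -n 1)

rm -rf "$tmpdir"
```

Keeping only the delimiter character in the variable, rather than the whole flag, means the expansion can always be quoted and never depends on word splitting.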