Merging two files with columns of different lengths and possible comments into a CSV-like file
Question
file1 looks like:
# dsd
# dsd
1,2,5
2,3,5
1,2,5
2,3,5
3,4,5
3,4,5
file2 looks like:
# s
1,2
1,2
I want to merge them to get
# dsd
# dsd
1,2,5,1,2
2,3,5,1,2
1,2,5,,
2,3,5,,
3,4,5,,
3,4,5,,
That is, I want to keep the comment lines (starting with #) from the first file. After the comment lines, I want to paste the columns from the second file, padding them with empty fields to the column length of the first file. If there are any comment lines in the second file, they should be ignored.
I started with:
paste $(grep -v '^#' file1) file2
but I get bash: /usr/bin/paste: Argument list too long
I guess this would be a job for awk, but I am only familiar with single-file processing, and I have only found examples that deal with files of the same length. Is there an easy way, or does one need to go to a longer bash script, Python, et al.?
Answer 1
Score: 2
You may use this awk solution:
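awk -v OFS=, '
NR == FNR {                  # first file on the command line (file2)
    if (!/^#/)               # skip comment lines in file2
        a[++i] = $0          # store data rows by index
    next
}
{
    if (/^#/)                # comment lines of file1 are printed as-is
        print
    else {
        ++NR2                # counts data rows of file1 only
        if (NR2 in a)
            print $0, a[NR2]
        else
            print $0, "", ""   # pad with two empty fields
    }
}' file2 file1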
# dsd
# dsd
1,2,5,1,2
2,3,5,1,2
1,2,5,,
2,3,5,,
3,4,5,,
3,4,5,,
Answer 2
Score: 2
Using any awk:
$ cat tst.awk
BEGIN { FS=OFS="," }
FNR == 1 {
    lineNr = 0
    dflt = a[1]
    gsub("[^"FS"]+","",dflt)
}
/^#/ {
    if ( NR != FNR ) {
        print
    }
    next
}
{ ++lineNr }
NR == FNR {
    a[lineNr] = $0
    next
}
{ print $0, (lineNr in a ? a[lineNr] : dflt) }
$ awk -f tst.awk file2 file1
# dsd
# dsd
1,2,5,1,2
2,3,5,1,2
1,2,5,,
2,3,5,,
3,4,5,,
3,4,5,,
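The dflt assignment is what lets this script pad correctly for any number of columns in file2: it copies the first data row of file2 and deletes everything that is not a separator, so only the commas needed for padding remain. Since the question also asks about Python, here is a minimal illustration of the same trick. The helper name padding_for is made up for this sketch and is not part of the awk script.

```python
import re

def padding_for(row: str, sep: str = ",") -> str:
    """Return only the separators of `row`, e.g. '1,2' -> ','."""
    # Same idea as gsub("[^"FS"]+", "", dflt) in the awk script:
    # delete every run of non-separator characters.
    return re.sub(f"[^{re.escape(sep)}]+", "", row)

# A file1 row with no partner in file2 gets the separator plus the padding.
print("1,2,5" + "," + padding_for("1,2"))
```

With a two-column file2 the padding is a single comma, which together with the output separator yields the two empty trailing fields shown above.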
Answer 3
Score: 1
Using the great Miller, paste, cat and grep, you could run
paste -d ',' <(grep -v '^#' file1.txt) <(grep -v '^#' file2.txt) | mlr --csv -N --ragged cat >output
<file1.txt grep -P '^#' | cat - output > tmp.txt && mv tmp.txt output
to get
# dsd
# dsd
1,2,5,1,2
2,3,5,1,2
1,2,5,,
2,3,5,,
3,4,5,,
3,4,5,,
The steps:
- merge the two input files horizontally, removing the comment lines (via paste and grep);
- add the missing commas (via mlr);
- add the comment lines of the first file to the merged one (via grep and cat).
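For comparison, the three steps above can be sketched in plain Python, without Miller. This is only an illustration under some assumptions: the function name merge_lines is hypothetical, the input lines have already been read into lists without trailing newlines, fields contain no quoted commas, and file1 has at least as many data rows as file2.

```python
from itertools import zip_longest

def merge_lines(lines1, lines2, sep=","):
    """Mimic the paste / mlr --ragged / grep+cat steps on lists of lines."""
    comments = [l for l in lines1 if l.startswith("#")]
    data1 = [l for l in lines1 if not l.startswith("#")]
    data2 = [l for l in lines2 if not l.startswith("#")]

    # Step 1: join data rows side by side; an exhausted file contributes "".
    pasted = [a + sep + b for a, b in zip_longest(data1, data2, fillvalue="")]

    # Step 2: pad short rows with empty fields up to the width of the first row.
    width = pasted[0].count(sep) + 1 if pasted else 0
    padded = [row + sep * (width - (row.count(sep) + 1)) for row in pasted]

    # Step 3: put the comment lines of the first file back on top.
    return comments + padded

if __name__ == "__main__":
    file1 = ["# dsd", "# dsd", "1,2,5", "2,3,5", "1,2,5", "2,3,5", "3,4,5", "3,4,5"]
    file2 = ["# s", "1,2", "1,2"]
    print("\n".join(merge_lines(file1, file2)))
```

Like the shell pipeline, this takes the target width from the first merged row, so it relies on that row being complete.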
Answer 4
Score: 0
Here is a Ruby solution using the CSV module:
ruby -r csv -e '
f1 = CSV.read(ARGV[0])
f2 = CSV.read(ARGV[1]).select { |row| !row.join("")[/^\s*#/] }
f2 = [""] * f1.slice_when { |a, b| b.to_s[/\d/] }.first.length + f2
f2c = f2.max_by { |row| row.length }.length
puts CSV.generate { |csv|
  f1.zip(f2).each { |row|
    if row.flatten.join("")[/^\s*#/]
      csv << row[0]
    elsif row[-1].nil?
      csv << row[0] + [nil] * f2c
    else
      csv << row.flatten
    end
  }
}
' file1 file2
This is not limited to the assumption that file2 has only 2 columns.
It DOES assume that file1 is the longer of the two files. That is easily changed if it is not true.
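If file1 is not guaranteed to be the longer file, both sides need padding. The Python sketch below shows one possible way to handle that case. It is not a port of the Ruby code above: the name merge_either_longer is hypothetical, and it assumes the lines are already read into lists without trailing newlines and contain no quoted fields.

```python
from itertools import zip_longest

def merge_either_longer(lines1, lines2, sep=","):
    """Merge data rows of two files, padding whichever side runs out first."""
    comments = [l for l in lines1 if l.startswith("#")]
    rows1 = [l.split(sep) for l in lines1 if not l.startswith("#")]
    rows2 = [l.split(sep) for l in lines2 if not l.startswith("#")]

    # Column counts are taken from the widest row on each side (0 if no data).
    w1 = max((len(r) for r in rows1), default=0)
    w2 = max((len(r) for r in rows2), default=0)

    merged = []
    for r1, r2 in zip_longest(rows1, rows2):
        r1, r2 = r1 or [], r2 or []          # None means that file ran out
        left = r1 + [""] * (w1 - len(r1))    # pad file1 side with empty fields
        right = r2 + [""] * (w2 - len(r2))   # pad file2 side with empty fields
        merged.append(sep.join(left + right))
    return comments + merged
```

Padding the left side as well keeps the columns of file2 aligned when file1 runs out first, instead of shifting them into the columns of file1.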
Prints:
# dsd
# dsd
1,2,5,1,2
2,3,5,1,2
1,2,5,,
2,3,5,,
3,4,5,,
3,4,5,,