macOS - Combine multiple text files into one spreadsheet with one file per column?
# Question
Using the macOS terminal, is there a way to take a directory of text files and combine them all into one spreadsheet (either CSV or Numbers format) so that:
- Every file is in a separate column.
- Every line of each txt file is in a separate row.
- The files are placed in the spreadsheet in alphabetical order (using the first letter of the file name of the text file).
Example 1: here is how my text files look before combining:
Example 2: here is how my text files should look in a spreadsheet after combining:
(These examples are a partial extract. I actually have hundreds of files to combine.)
<hr>
Steps I have tried:
- I searched Stack Overflow for an answer, but all the other questions about this task use Python or pandas. I would prefer a solution that can be run directly from the macOS terminal without needing to install packages like Python or pandas.
- From researching, I believe the `paste` command could be used: `paste -d '\t' *.txt > ^0-merged.csv`. However, when I try this, it produces the following error message: `paste: Too many open files`. It also produces a CSV file that is completely blank.
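# Answer 1
**Score**: 1

You could loop through the files, pasting each one onto the merged file in turn.

```shell
touch merged.csv
for f in *.txt; do paste -d '\t' "$f" merged.csv > temp; cp temp merged.csv; done; rm temp
```

You have to create the file first, as `paste` will fail if it can't find the file.

https://unix.stackexchange.com/questions/205642/combining-large-amount-of-files

Adding a new idea for files with spaces in their names.

```shell
#!/bin/bash
touch merged.csv
# save and change IFS so names are split on newlines only
OLDIFS=$IFS
IFS=$'\n'
# read all file names into an array
fileArray=($(find ./ -name "*.txt" | sort))
# restore IFS
IFS=$OLDIFS
# get the length of the array
tLen=${#fileArray[@]}
# loop over all file names
for (( i=0; i<${tLen}; i++ ));
do
  paste -d '\t' "${fileArray[$i]}" merged.csv > temp;
  cp temp merged.csv;
done
rm temp
```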
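One thing to watch in the loop above: because each new file is given to `paste` before `merged.csv`, later files land to the left of earlier ones, and the first pass against the empty `merged.csv` leaves an empty trailing column. A minimal sketch of an order-preserving variant follows. The function name `merge_columns` and the output name `merged.tsv` are made up for illustration, and rows only stay aligned if every file has the same number of lines. Like the loop above, it hands `paste` only two files at a time, which is presumably why it avoids the `Too many open files` error from the question.

```shell
#!/bin/bash
# merge_columns OUT FILE...
# Hypothetical helper: paste the given files side by side into OUT,
# in the order given, with paste reading only two files per call.
# Assumes every input file has the same number of lines.
merge_columns() {
  local out=$1 tmp f
  shift
  [ "$#" -gt 0 ] || return 1        # nothing to merge
  tmp=$(mktemp) || return 1
  cp "$1" "$out"                    # the first file becomes the first column
  shift
  for f in "$@"; do
    # existing columns stay on the left, the new file goes on the right
    paste "$out" "$f" > "$tmp" && cp "$tmp" "$out"
  done
  rm -f "$tmp"
}
```

It could be called as `merge_columns merged.tsv *.txt`. The output name deliberately does not end in `.txt`, so the glob never picks up the merged file itself.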
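# Answer 2
**Score**: 1

Ruby is part of macOS.

Given:

```shell
head -n 3 *.txt
==> GOOD THINGS IN LIFE.txt <==
Art
Fun
Hugs
==> IN THE BACKYARD.txt <==
Hose
Tree
Soil
==> KITCHEN CUPBOARD ESSENTIALS.txt <==
Tea
Rice
Milk
==> KNITTING STITCHES.txt <==
Rib
Dip
Seed
# and the rest of your lines in each case...
```

You can do:

```shell
ruby -e '
a=[]
# one column per file: the file name, then its lines
ARGV.sort.each{|fn|
  a<<[fn]+File.read(fn).split(/\R/)
}
# turn the columns into rows and print each row comma-separated
a.transpose.each{|sa|
  puts sa.join(",")
}
' *.txt
```

Prints:

```
GOOD THINGS IN LIFE.txt,IN THE BACKYARD.txt,KITCHEN CUPBOARD ESSENTIALS.txt,KNITTING STITCHES.txt
Art,Hose,Tea,Rib
Fun,Tree,Rice,Dip
Hugs,Soil,Milk,Seed
Earth,Fence,Salt,Tile
Honor,Porch,Pesto,Linen
Space,Patio,Flour,Cable
Sport,Grass,Honey,Wicker
Intelligence,Wading Pool,Baking Powder,Knotted Boxes
Innovation,Welcome Mat,Vegetable Oil,Chinese Wave
Confidence,Back Stoop,Tomato Paste,Checkerboard
Good Deeds,Fruit Tree,Black Pepper,Herringbone
Creativity,Downspout,Baking Soda,Stockinette
Education,Birdbath,Ketchup,Garter
Kindness,Terrace,Surer,Waffle
Integrity,Planter,Sugar,Puri Ridge
Faith,Carport,Coffee,Netted
Friends,Flowerbed,Cinnamon,Elongated
Respect,Shovel,Cheese,Farrow Rib
People,Hedges,Bread,Plaited
Yourself,Rocks,Olive Oil,Clamshell
Happiness,Lawnmower,Crackers,Bamboo
Heart,Hot Tub,Pasta,English Rib
Religion,Garden,Scissors,Basket
Wisdom,Stoop,Garlic,Raspberry
```

If you want a 'proper' CSV with quoted fields that works better with Excel, you can use the CSV module included with Ruby:

```shell
ruby -r csv -e '
a=[]
# one column per file: the file name, then its lines
ARGV.sort.each{|fn|
  a<<[fn]+File.read(fn).split(/\R/)
}
a=a.transpose
# write the header row, then every data row, with all fields quoted
puts CSV.generate(**{headers:true, quote_empty:true, force_quotes:true}){|csv|
  csv<<a[0]
  a[1..].each{|row|
    csv<<row
  }
}
' *.txt
```

Prints:

```
"GOOD THINGS IN LIFE.txt","IN THE BACKYARD.txt","KITCHEN CUPBOARD ESSENTIALS.txt","KNITTING STITCHES.txt"
"Art","Hose","Tea","Rib"
"Fun","Tree","Rice","Dip"
"Hugs","Soil","Milk","Seed"
"Earth","Fence","Salt","Tile"
"Honor","Porch","Pesto","Linen"
"Space","Patio","Flour","Cable"
"Sport","Grass","Honey","Wicker"
"Intelligence","Wading Pool","Baking Powder","Knotted Boxes"
"Innovation","Welcome Mat","Vegetable Oil","Chinese Wave"
"Confidence","Back Stoop","Tomato Paste","Checkerboard"
"Good Deeds","Fruit Tree","Black Pepper","Herringbone"
"Creativity","Downspout","Baking Soda","Stockinette"
"Education","Birdbath","Ketchup","Garter"
"Kindness","Terrace","Surer","Waffle"
"Integrity","Planter","Sugar","Puri Ridge"
"Faith","Carport","Coffee","Netted"
"Friends","Flowerbed","Cinnamon","Elongated"
"Respect","Shovel","Cheese","Farrow Rib"
"People","Hedges","Bread","Plaited"
"Yourself","Rocks","Olive Oil","Clamshell"
"Happiness","Lawnmower","Crackers","Bamboo"
"Heart","Hot Tub","Pasta","English Rib"
"Religion","Garden","Scissors","Basket"
"Wisdom","Stoop","Garlic","Raspberry"
```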
Comments:
> It also puts the file name at the top of every column. Is there any
> way to omit the file name? Also, it seems to treat uppercase A-Z and
> lowercase a-z as separate (e.g. file names with A-Z come first
> and then file names with a-z). Thanks!

To omit the file names, leave `[fn]` out of each column, and to ignore case when ordering, sort by the downcased name. If you have files of different lengths, you can also pad the end of the shorter files so that you still have a proper matrix to transpose:
```shell
ruby -r csv -e '
a=[]
ARGV.sort_by{|s| s.downcase}.each{|fn|
  a<<File.read(fn).split(/\R/)
}
max_length=a.max_by{|sa| sa.length}.length
a.each.with_index{|sa,i|
  if sa.length<max_length then a[i].concat [""]*(max_length-sa.length) end
}
a=a.transpose
puts CSV.generate(**{headers:true, quote_empty:true, force_quotes:true}){|csv|
  csv<<a[0]
  a[1..].each{|row|
    csv<<row
  }
}
' *.txt
```
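The padding idea can also be sketched without Ruby, using the `awk` that comes with macOS. This is an illustration under stated assumptions, not a drop-in replacement for the script above: `columns_to_tsv` is a hypothetical name, the output is tab-separated with no quoting and no header row, and an empty file is skipped instead of producing an empty column. awk reads its input files one after another, so the number of files should not run into the open-file limit from the question.

```shell
#!/bin/bash
# columns_to_tsv FILE...
# Hypothetical helper: print the files side by side, tab-separated,
# padding shorter files with empty cells so every row has one cell per file.
columns_to_tsv() {
  awk '
    FNR == 1 { ncols++ }                 # first line of a file starts a new column
    {
      sub(/\r$/, "")                     # tolerate CRLF line endings
      cell[FNR, ncols] = $0
      if (FNR > nrows) nrows = FNR       # remember the longest file
    }
    END {
      for (r = 1; r <= nrows; r++) {
        line = ""
        for (c = 1; c <= ncols; c++)
          line = line (c > 1 ? "\t" : "") cell[r, c]   # missing cells are empty
        print line
      }
    }
  ' "$@"
}
```

It could be called as `columns_to_tsv *.txt > merged.tsv`. The columns follow the order of the arguments, so the shell's glob ordering decides how uppercase and lowercase names are interleaved.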