April 17, 2023 18:51:27

macOS - Combine multiple text files into one spreadsheet with one file per column?

Question


Using the macOS terminal, is there a way to take a directory of text files and combine them all into one spreadsheet (either CSV or Numbers format) so that:

Every file is in a separate column
Every line of the txt file is in a separate row.
The files are placed in the spreadsheet in alphabetical order (using the first letter of file name of the text file).

Example 1: here is how my text files look before combining:

Example 2: here is how my text files should look in a spreadsheet after combining:

(These examples are a partial extract. I actually have 100s of files to combine).

<hr>

Steps I have tried:

I searched Stack Overflow for an answer, but all the other questions about this task use Python or pandas. I would prefer a solution that can be run directly from the macOS terminal without needing to install packages like Python or pandas.
From my research, I believe the paste command could be used:

paste -d '\t' *.txt > ^0-merged.csv

However, when I try this, it produces the following error message: paste: Too many open files. It also produces a CSV file that is completely blank.

Answer 1

Score: 1

You could loop through the files, pasting each one onto the merged file in turn.

```shell
touch merged.csv
for f in *.txt; do paste -d '\t' "$f" merged.csv > temp; cp temp merged.csv; done; rm temp
```

You have to create the file first, as paste will fail if it can't find the file. Note that each file is pasted to the left of the existing columns, so the last file processed ends up in the first column, and the initially empty merged.csv leaves an empty last column.

https://unix.stackexchange.com/questions/205642/combining-large-amount-of-files

Adding a new idea for file names that contain spaces.

```shell
#!/bin/bash
touch merged.csv
# save and change IFS so that file names are split on newlines only
OLDIFS=$IFS
IFS=$'\n'
# read all file names into an array
fileArray=($(find ./ -name "*.txt" | sort))
# restore IFS
IFS=$OLDIFS
# get the length of the array
tLen=${#fileArray[@]}
# loop over all file names
for (( i=0; i<${tLen}; i++ ));
do
  paste -d '\t' "${fileArray[$i]}" merged.csv > temp;
  cp temp merged.csv;
done
rm temp
```
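
A likely reason the loop avoids the `paste: Too many open files` error from the question is that `paste *.txt` opens every file at once, while each `paste` call in the loop opens only two. As a rough sketch in Ruby, the check below compares the number of `.txt` files with the per-process open-file limit. The helper name `files_fit_limit?` and its `reserved` parameter are made up for illustration, and counting three descriptors (stdin, stdout, stderr) as already in use is a simplifying assumption.

```ruby
# Sketch: does a single paste call over all .txt files fit under the
# per-process open-file limit? Names here are hypothetical.

# Returns true when file_count files plus the reserved descriptors
# (assumed: stdin, stdout, stderr) fit under soft_limit.
def files_fit_limit?(file_count, soft_limit, reserved = 3)
  file_count + reserved <= soft_limit
end

if __FILE__ == $PROGRAM_NAME
  # getrlimit returns the soft and hard limits for open file descriptors
  soft_limit, _hard_limit = Process.getrlimit(Process::RLIMIT_NOFILE)
  file_count = Dir.glob("*.txt").size
  puts "soft limit: #{soft_limit}, .txt files: #{file_count}"
  if files_fit_limit?(file_count, soft_limit)
    puts "a single paste call should fit"
  else
    puts "too many files for a single paste call"
  end
end
```

If the file count exceeds the soft limit, pasting one file at a time (as in the loop) or reading the files one by one in a script keeps the number of simultaneously open files small.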

# Answer 2
**Score**: 1
Ruby is part of macOS.

Given:

```shell
head -n 3 *.txt
==> GOOD THINGS IN LIFE.txt <==
Art
Fun
Hugs
==> IN THE BACKYARD.txt <==
Hose
Tree
Soil
==> KITCHEN CUPBOARD ESSENTIALS.txt <==
Tea
Rice
Milk
==> KNITTING STITCHES.txt <==
Rib
Dip
Seed
# and the rest of your lines in each case...
```

You can do:

```shell
ruby -e '
a=[]
ARGV.sort.each{|fn|
    a<<[fn]+File.read(fn).split(/\R/)
}
a.transpose.each{|sa|
    puts sa.join(",")
}
' *.txt
```

Prints:

```
GOOD THINGS IN LIFE.txt,IN THE BACKYARD.txt,KITCHEN CUPBOARD ESSENTIALS.txt,KNITTING STITCHES.txt
Art,Hose,Tea,Rib
Fun,Tree,Rice,Dip
Hugs,Soil,Milk,Seed
Earth,Fence,Salt,Tile
Honor,Porch,Pesto,Linen
Space,Patio,Flour,Cable
Sport,Grass,Honey,Wicker
Intelligence,Wading Pool,Baking Powder,Knotted Boxes
Innovation,Welcome Mat,Vegetable Oil,Chinese Wave
Confidence,Back Stoop,Tomato Paste,Checkerboard
Good Deeds,Fruit Tree,Black Pepper,Herringbone
Creativity,Downspout,Baking Soda,Stockinette
Education,Birdbath,Ketchup,Garter
Kindness,Terrace,Surer,Waffle
Integrity,Planter,Sugar,Puri Ridge
Faith,Carport,Coffee,Netted
Friends,Flowerbed,Cinnamon,Elongated
Respect,Shovel,Cheese,Farrow Rib
People,Hedges,Bread,Plaited
Yourself,Rocks,Olive Oil,Clamshell
Happiness,Lawnmower,Crackers,Bamboo
Heart,Hot Tub,Pasta,English Rib
Religion,Garden,Scissors,Basket
Wisdom,Stoop,Garlic,Raspberry
```

If you want a 'proper' CSV with quoted fields that works better with Excel, you can use the CSV module included with Ruby:

```shell
ruby -r csv -e '
a=[]
ARGV.sort.each{|fn|
    a<<[fn]+File.read(fn).split(/\R/)
}
a=a.transpose
puts CSV.generate(**{headers:true, quote_empty:true, force_quotes:true}){|csv|
    csv<<a[0]
    a[1..].each{|row|
        csv<<row
    }
}
' *.txt
```

Prints:

```
"GOOD THINGS IN LIFE.txt","IN THE BACKYARD.txt","KITCHEN CUPBOARD ESSENTIALS.txt","KNITTING STITCHES.txt"
"Art","Hose","Tea","Rib"
"Fun","Tree","Rice","Dip"
"Hugs","Soil","Milk","Seed"
"Earth","Fence","Salt","Tile"
"Honor","Porch","Pesto","Linen"
"Space","Patio","Flour","Cable"
"Sport","Grass","Honey","Wicker"
"Intelligence","Wading Pool","Baking Powder","Knotted Boxes"
"Innovation","Welcome Mat","Vegetable Oil","Chinese Wave"
"Confidence","Back Stoop","Tomato Paste","Checkerboard"
"Good Deeds","Fruit Tree","Black Pepper","Herringbone"
"Creativity","Downspout","Baking Soda","Stockinette"
"Education","Birdbath","Ketchup","Garter"
"Kindness","Terrace","Surer","Waffle"
"Integrity","Planter","Sugar","Puri Ridge"
"Faith","Carport","Coffee","Netted"
"Friends","Flowerbed","Cinnamon","Elongated"
"Respect","Shovel","Cheese","Farrow Rib"
"People","Hedges","Bread","Plaited"
"Yourself","Rocks","Olive Oil","Clamshell"
"Happiness","Lawnmower","Crackers","Bamboo"
"Heart","Hot Tub","Pasta","English Rib"
"Religion","Garden","Scissors","Basket"
"Wisdom","Stoop","Garlic","Raspberry"
```

Comments:

> It also puts the file name at the top of every column. Is there any
> way to omit the file name? Also, it seems to treat Uppercase A-Z and
> lowercase a-z as separate (e.g. so file names with A-Z will come first
> and then file names with a-z) Thanks!

The version below leaves out the file names and sorts them case-insensitively. If you have files of different lengths, it also pads the end of the shorter files so that you still have a proper matrix to transpose:

```shell
ruby -r csv -e '
a=[]
# sort file names case-insensitively; no file name header is added
ARGV.sort_by{|s| s.downcase}.each{|fn|
    a<<File.read(fn).split(/\R/)
}
# pad shorter columns with empty strings so transpose gets a rectangular matrix
max_length=a.max_by{|sa| sa.length}.length
a.each{|sa|
    sa.concat([""]*(max_length-sa.length)) if sa.length<max_length
}
a=a.transpose
puts CSV.generate(**{headers:true, quote_empty:true, force_quotes:true}){|csv|
    csv<<a[0]
    a[1..].each{|row|
        csv<<row
    }
}
' *.txt
```
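
The pad-and-transpose step can also be pulled out into small functions, which makes it easier to try on sample data before running it on hundreds of files. This is only a sketch of the same idea: the names `columns_to_rows` and `rows_to_csv` are hypothetical and not part of the answer above, and it uses only the `force_quotes` option of `CSV.generate`.

```ruby
require "csv"

# Pads every column with empty strings to the length of the longest column,
# then transposes so that each inner array is one spreadsheet row.
def columns_to_rows(columns)
  max_length = columns.map(&:length).max || 0
  padded = columns.map { |col| col + [""] * (max_length - col.length) }
  padded.transpose
end

# Renders the rows as CSV text with every field quoted.
def rows_to_csv(rows)
  CSV.generate(force_quotes: true) { |csv| rows.each { |row| csv << row } }
end

if __FILE__ == $PROGRAM_NAME
  # One column per file, ordered case-insensitively by file name.
  columns = ARGV.sort_by(&:downcase).map { |fn| File.read(fn).split(/\R/) }
  print rows_to_csv(columns_to_rows(columns))
end
```

Saved under a name of your choice, say `merge_columns.rb`, it could be run as `ruby merge_columns.rb *.txt > merged.csv`. Because the files are read one at a time, the number of files open at once stays small.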
