Merge multiple CSV files with the same header using PowerShell
Question
I have multiple CSV files in a folder, with data like the following:
file1
"Index","Response","Status","Time"
"32190","2","Succeeded","2023-01-18 08:31:34.9"
"32189","3","Succeeded","2023-01-18 08:26:34.9"
"32188","3","Succeeded","2023-01-18 08:21:34.9"
file2
"Index","Response","Status","Time"
"32190","2","Succeeded","2023-01-19 08:31:34.9"
"32189","3","Succeeded","2023-01-19 08:26:34.9"
"32188","3","Succeeded","2023-01-19 08:21:34.9"
I need to merge these files into a single CSV file with only one header row:
"Index","Response","Status","Time"
"32190","2","Succeeded","2023-01-18 08:31:34.9"
"32189","3","Succeeded","2023-01-18 08:26:34.9"
"32188","3","Succeeded","2023-01-18 08:21:34.9"
"32190","2","Succeeded","2023-01-19 08:31:34.9"
"32189","3","Succeeded","2023-01-19 08:26:34.9"
"32188","3","Succeeded","2023-01-19 08:21:34.9"
I have the code below, but its output repeats the header row from every file:
$folder = 'D:\reports\daily_csv'
$files = Get-ChildItem $folder\*.csv
Get-Content $files | Set-Content "D:\Monthly\Merged_$prev_month.csv"
Please let me know what I need to add here to avoid multiple header rows.
Answer 1
Score: 3
Here is one way to do it using StreamReader and an anonymous function. Note that .OpenText() initializes the StreamReader with UTF-8 encoding; if that's a problem, you can use StreamReader(String, Encoding) instead.
$folder = 'D:\reports\daily_csv'
Get-ChildItem $folder\*.csv | & {
    begin { $isFirstObject = $true }
    process {
        try {
            $reader = $_.OpenText()
            $headers = $reader.ReadLine()
            if($isFirstObject) {
                $headers
                $isFirstObject = $false
            }
            while(-not $reader.EndOfStream) {
                $reader.ReadLine()
            }
        }
        finally {
            if($reader) {
                $reader.Dispose()
            }
        }
    }
} | Set-Content path\to\mergedCsv.csv
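The StreamReader(String, Encoding) alternative mentioned above could look roughly like the sketch below. The function name Merge-CsvText, its parameters, and the ISO-8859-1 default are assumptions made for illustration, not part of the original answer; pass whatever encoding your files actually use.

```powershell
# Hypothetical helper (not from the original answer): merges CSV files as
# plain text, reading each one with an explicitly chosen encoding.
function Merge-CsvText {
    param(
        [Parameter(Mandatory)] [string] $Folder,
        [Parameter(Mandatory)] [string] $Destination,
        # Assumption: the input files are ISO-8859-1; override as needed.
        [System.Text.Encoding] $Encoding = [System.Text.Encoding]::GetEncoding('iso-8859-1')
    )

    Get-ChildItem -Path $Folder -Filter *.csv | Sort-Object Name | & {
        begin { $isFirstFile = $true }
        process {
            $reader = $null
            try {
                # StreamReader(String, Encoding) instead of .OpenText()
                $reader = [System.IO.StreamReader]::new($_.FullName, $Encoding)
                $header = $reader.ReadLine()
                if ($isFirstFile -and $null -ne $header) {
                    $header                # emit the header line once only
                    $isFirstFile = $false
                }
                while (-not $reader.EndOfStream) {
                    $reader.ReadLine()     # emit every data row
                }
            }
            finally {
                if ($reader) { $reader.Dispose() }
            }
        }
    } | Set-Content -LiteralPath $Destination -Encoding UTF8
}
```

A call such as Merge-CsvText -Folder 'D:\reports\daily_csv' -Destination 'D:\Monthly\merged.csv' would then write the merged file. Keep the destination outside the source folder so the output is not picked up as an input file.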
Answer 2
Score: 3
Santiago Squarzon's helpful plain-text-processing answer is definitely your best option, both in terms of performance and because it preserves the formatting specifics (whether all, only some, or none of the fields are double-quoted).
A slower but more convenient alternative that doesn't preserve the formatting specifics (which should not matter, however) is to use Import-Csv's support for multiple input files, via its -LiteralPath parameter:
Import-Csv -LiteralPath (Get-ChildItem D:\reports\daily_csv -Filter *.csv).FullName |
Export-Csv -NoTypeInformation -Encoding utf8 "D:\Monthly\Merged_$prev_month.csv"
Note that neither -NoTypeInformation nor -Encoding utf8 is required anymore for Export-Csv in PowerShell (Core) 7+, unless you need a different encoding (BOM-less UTF-8 is now the consistent default; if you do need a BOM, use -Encoding utf8bom).
Also note that a bug has been fixed in PowerShell (Core) 7+ that enables providing the Get-ChildItem results to Import-Csv via the pipeline:
# PS 7+ ONLY - a bug in WinPS prevents Get-ChildItem input to Import-Csv
Get-ChildItem D:\reports\daily_csv -Filter *.csv |
Import-Csv |
Export-Csv "D:\Monthly\Merged_$prev_month.csv"
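To see what "doesn't preserve the formatting specifics" means in practice, here is a small self-contained sketch that runs the -LiteralPath variant against two throwaway files in a temporary folder. The temp paths, file names, and sample rows are invented for the demo; only the Import-Csv / Export-Csv call mirrors the approach described above.

```powershell
# Self-contained demo in a temporary folder (paths and sample rows are made up).
$demo = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $demo | Out-Null

# file1 quotes every field, file2 quotes none.
Set-Content -LiteralPath (Join-Path $demo 'file1.csv') -Value '"Index","Status"', '"32190","Succeeded"'
Set-Content -LiteralPath (Join-Path $demo 'file2.csv') -Value 'Index,Status', '32189,Succeeded'

# Write the merged file outside the input folder.
$merged = Join-Path ([System.IO.Path]::GetTempPath()) ("merged_" + [System.Guid]::NewGuid().ToString() + ".csv")

# Sort by name so the row order is predictable, then merge via -LiteralPath.
$paths = (Get-ChildItem -LiteralPath $demo -Filter *.csv | Sort-Object Name).FullName
Import-Csv -LiteralPath $paths |
    Export-Csv -NoTypeInformation -Encoding utf8 -LiteralPath $merged

# Show the result: one header row, followed by the rows of both files.
Get-Content -LiteralPath $merged
```

With default settings, the merged file should contain a single header followed by both data rows, all fields double-quoted, including the row from file2 that was unquoted in the input. The data is intact; only the quoting style is normalized.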