Stream PowerShell output objects to Format-Table
Question
I'm writing a PowerShell function to process lines from a log file. The log file is being written to by another process. To make viewing the logs easier, I'm parsing each log line into an object and passing that along the pipeline. I've added a -Wait parameter, which makes the function keep watching for changes instead of exiting.

Note: This is a trimmed-down version with hard-coded fields, but it shows the same behavior.

    function Watch-Log($Path, [switch]$Wait) {
        [System.IO.FileStream]$log = $null
        [System.IO.StreamReader]$logReader = $null

        $logPath = Split-Path -Parent $Path
        $logFilter = Split-Path -Leaf $Path
        $watcher = [System.IO.FileSystemWatcher]::new($logPath, $logFilter)
        $watcher.IncludeSubdirectories = $false
        $createdEventIdentifier = "Service_LogFileCreated"
        Register-ObjectEvent $watcher Created -SourceIdentifier $createdEventIdentifier
        $watcher.EnableRaisingEvents = $true

        function Open-ServiceLogStream([ref][System.IO.FileStream]$log, [ref][System.IO.StreamReader]$logReader) {
            if ($log.Value) { $log.Value.Close() }
            $log.Value = [System.IO.File]::Open($Path, "Open", "Read", "ReadWrite,Delete");
            $logReader.Value = [System.IO.StreamReader]::new($log.Value)
        }

        try {
            Open-ServiceLogStream ([ref]$log) ([ref]$logReader)
            while ($true) {
                $line = $logReader.ReadLine()
                if (!$line) {
                    if (!$Wait) { return }
                    $newLogEvent = Get-Event -SourceIdentifier $createdEventIdentifier -ErrorAction SilentlyContinue
                    if ($newLogEvent) {
                        Remove-Event -EventIdentifier $newLogEvent.EventIdentifier
                        Open-ServiceLogStream ([ref]$log) ([ref]$logReader)
                    }
                    else {
                        Start-Sleep -Milliseconds 100
                    }
                    continue;
                }
                $lineParts = $line -split " "
                [PSCustomObject]@{ PSTypeName = "Test.LogLine"; Date = $lineParts[0]; Priority = $lineParts[1]; LogMessage = $lineParts[2]; }
            }
        }
        finally {
            Unregister-Event -SourceIdentifier $createdEventIdentifier
        }
    }

So it might be used like this:

    > Watch-Log test.log | Select Date, Priority, LogMessage

    Date               Priority LogMessage
    ----               -------- ----------
    7/10/2023 19:18:50 Message  New Log Opened
    7/10/2023 19:19:33 Message  Line Here!
    7/10/2023 19:19:33 Message  Line Here!
    ...

Unfortunately, if I use the -Wait parameter, the output of the pipeline is buffered forever and never gets printed. Ideally, I would like it to start printing the table right away and add new rows as logs are emitted. I'm okay if the column widths stay fixed based on the first log line it sees.

I can trick it into printing immediately by using Out-Host:

    Watch-Log test.log -Wait | Out-Host

But it's frustrating to require other users of this to know they need to append Out-Host, and if it's wrapped in another function, then you can't force that.
Update
I've finally uncovered the source of the problem, after being unable to reproduce the issue on a machine without my default PowerShell profile.

A few years ago I was doing something in PowerShell and ended up overriding Out-Default in my PowerShell profile. I've been dragging this profile around for quite some time (since the Windows PowerShell-only days, I think?) and totally forgot about this.

    function Out-Default {
        $input | Tee-Object -var global:last | Microsoft.PowerShell.Core\out-default
    }

Since this isn't a pipeline function (it has no process block), it blocks while it gathers everything into $input before passing anything on to Tee-Object. With -Wait the input never ends, so nothing ever reaches the default formatting.
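
For anyone who still wants to capture the last output without the blocking, a streaming override can be sketched as a proxy function built on a steppable pipeline. This is a minimal sketch under assumptions, not what my profile contained: collecting into a List stored in $global:last is my own choice for illustration, and the parameter block simply mirrors the two parameters of the built-in Out-Default.

```powershell
function Out-Default {
    [CmdletBinding()]
    param(
        [switch] $Transcript,

        [Parameter(ValueFromPipeline = $true)]
        [psobject] $InputObject
    )

    begin {
        # Assumption: keep the captured objects in a list named $global:last.
        $global:last = [System.Collections.Generic.List[object]]::new()

        # Resolve the real cmdlet so this function does not call itself.
        $wrapped = $ExecutionContext.InvokeCommand.GetCommand(
            'Microsoft.PowerShell.Core\Out-Default',
            [System.Management.Automation.CommandTypes]::Cmdlet)

        # A steppable pipeline lets us feed the real cmdlet one object at a time.
        $steppable = { & $wrapped @PSBoundParameters }.GetSteppablePipeline($MyInvocation.CommandOrigin)
        $steppable.Begin($PSCmdlet)
    }

    process {
        # Runs once per input object, so each object is forwarded as it arrives.
        $global:last.Add($InputObject)
        $steppable.Process($_)
    }

    end {
        $steppable.End()
    }
}
```

Because the work happens in the process block, each object is handed to the real Out-Default as soon as it arrives instead of after the upstream command finishes.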
Yes, I've been trying to figure this out for days and FINALLY only moments after submitting a bug about it I uncovered the problem! Thanks everyone for helping get me pointed in the right direction.
Answer 1
Score: 2
The likely culprit is the infamous 300-millisecond delay that Format-Table uses to collect enough input data to make a reasonable guess at how wide the output columns should be, before it starts to produce display output.

- See this answer for a detailed explanation.

- The most common gripe about this behavior is that - when Format-Table is used implicitly - it can interfere with output ordering across PowerShell's different output streams - see this answer.

However, as the following example shows, the 300-millisecond delay can result in potentially much longer delays, and potentially even in display output never being produced:<sup>[1]</sup>

    # Note: Omitting Format-Table exhibits the same behavior, because
    #       an object with 4 or fewer properties is *implicitly* table-formatted,
    #       unless the default formatting is governed by predefined formatting
    #       data associated with the object's type.
    & { [pscustomobject] @{ foo = 1 }; Start-Sleep 10 } | Format-Table

This doesn't print the table until 10 seconds later, namely after the Start-Sleep call returns and the input command has terminated.

The problem also occurs with cmdlets that wait for an unspecified period for an event to occur, namely Wait-Job, Wait-Process and Wait-Event.

The problem is that Format-Table doesn't regain control until it either receives the next input object or the input command terminates.

Workarounds:
The 300-millisecond delay can be avoided if Format-Table knows the column widths ahead of time, which can be achieved in one of two ways:

- Specify the column widths via calculated properties, as zett42 recommends:

      # Starts to produce display output *instantly*,
      # due to the predefined column width.
      & { [pscustomobject] @{ foo = 1 }; Start-Sleep 10 } |
        Format-Table -Property @{ Expression='foo'; Width = 10 }

  - Unfortunately, you'll have to do that for every output property (column).

  - Also, any explicit use of a Format-* cmdlet invariably interferes with using the output as data - see this answer.

- Associate predefined formatting data with the .NET type of your output objects and define a table view with fixed column widths.

  - This is the preferable solution - not least because it doesn't interfere with outputting data - but requires the nontrivial effort of authoring a formatting file (*.Format.ps1xml), which must be loaded into the session first.

  - Additionally, you must make Watch-Log either use a specific .NET type associated with your formatting data or add a PSTypeName property to [pscustomobject] output objects (e.g., [pscustomobject] @{ PSTypeName = 'My.Type'; foo = 1 }).

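
The second workaround can be sketched as follows. This is only an illustration under assumptions: the type name Test.LogLine and the three properties are taken from the question's Watch-Log, while the column widths, the view name, and the temp-file location are made up for the example and should be adapted.

```powershell
# Formatting data for objects whose PSTypeName is 'Test.LogLine'.
# The widths below are illustrative assumptions.
$formatXml = @'
<?xml version="1.0" encoding="utf-8"?>
<Configuration>
  <ViewDefinitions>
    <View>
      <Name>Test.LogLine.Table</Name>
      <ViewSelectedBy>
        <TypeName>Test.LogLine</TypeName>
      </ViewSelectedBy>
      <TableControl>
        <TableHeaders>
          <TableColumnHeader><Label>Date</Label><Width>20</Width></TableColumnHeader>
          <TableColumnHeader><Label>Priority</Label><Width>10</Width></TableColumnHeader>
          <TableColumnHeader><Label>LogMessage</Label><Width>60</Width></TableColumnHeader>
        </TableHeaders>
        <TableRowEntries>
          <TableRowEntry>
            <TableColumnItems>
              <TableColumnItem><PropertyName>Date</PropertyName></TableColumnItem>
              <TableColumnItem><PropertyName>Priority</PropertyName></TableColumnItem>
              <TableColumnItem><PropertyName>LogMessage</PropertyName></TableColumnItem>
            </TableColumnItems>
          </TableRowEntry>
        </TableRowEntries>
      </TableControl>
    </View>
  </ViewDefinitions>
</Configuration>
'@

# Write the formatting file to a temporary location (hypothetical path) and load it.
$formatFile = Join-Path ([System.IO.Path]::GetTempPath()) 'Test.LogLine.Format.ps1xml'
Set-Content -LiteralPath $formatFile -Value $formatXml -Encoding UTF8
Update-FormatData -PrependPath $formatFile

# With the view loaded, the row should appear right away rather than after the sleep.
& {
    [pscustomobject] @{
        PSTypeName = 'Test.LogLine'
        Date       = '7/10/2023'
        Priority   = 'Message'
        LogMessage = 'Line Here!'
    }
    Start-Sleep 5
}
```

Since no Format-* cmdlet is involved, callers still receive plain objects and can filter or export them as data; only the default display changes.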
<sup>[1] Strictly speaking, an initial, empty line always prints, to be followed by the actual table only after the 300-millisecond delay has elapsed.</sup>