2023-06-29 22:21:25

Is this a good "pattern" for processing collection-based parameters which belong to parameter sets of a function?

Question

I've been writing advanced functions for many years now and have even written quite a few modules at this point. But there's one question for which I have never really been able to find an answer.

Let's look at a Cmdlet that Microsoft provides in the MSMQ module, as an example, and "re-implement" it as an advanced PowerShell function: Send-MsmqQueue. But this function will be a bit different than the one provided by the MSMQ module in that not only will it accept multiple MSMQ queues for the $InputObject parameter, but also multiple MSMQ queue names for the $Name parameter, where these two parameters belong to different parameter sets. (The Cmdlet version of this function normally only accepts a single string value for the $Name parameter.) I won't be showing a complete re-implementation, just enough to illustrate what I, at times, find myself doing when this situation arises. (NOTE: one other slight difference is that I will be using the classes from System.Messaging namespace instead of the PowerShell-provided ones in Microsoft.Msmq.PowerShell.Commands namespace. So assume that implicitly, somewhere, Add-Type -AssemblyName System.Messaging has been executed.)

function Send-MsmqQueue {
    [CmdletBinding(DefaultParameterSetName = 'Name')]
    [OutputType([Messaging.Message])]
    Param (
        [Parameter(
            Mandatory,
            ValueFromPipeline,
            ParameterSetName = 'InputObject')
        ]
        [Messaging.MessageQueue[]] $InputObject,
        [Parameter(
            Mandatory,
            ValueFromPipeline,
            ParameterSetName = 'Name')
        ]
        [string[]] $Name,
        # Below is the original parameter name, not mine ;)
        [Messaging.Message] $MessageObject
        # All other normal Send-MsmqQueue parameters elided as they are not
        # needed to illustrate the premise of my question.
    )
    Process {
        # When I have parameters defined as above, the first thing I do in my
        # Process block is "homogenize" the data so I don't have to implement
        # two foreach loops or do the branching on each foreach loop iteration
        # which can obscure the main logic that is being executed, i.e., I get
        # this done all "up-front".
        #
        # One aspect of my question is, from purely a PowerShell perspective,
        # is this hurting performance in any meaningful way? (I know that when it
        # comes to specific implementation details, there are INFINITE ways to
        # write non-performant code, so from purely a PowerShell perspective,
        # as far as the language design/inner-workings, is this hurting
        # performance?)
        #
        # NOTE: I don't normally need the wrapping "force this thing to be an
        # array" construct (,<array_items>), BUT, in this case, the C#
        # System.Messaging.MessageQueue class implements IEnumerable,
        # which PowerShell (unhelpfully) iterates over automatically, and results
        # in the messages in the queues being iterated over instead of the queues
        # themselves, so this is an implementation detail specific to this
        # particular function.
        $Queues = (,@(
            if ($PSCmdlet.ParameterSetName -ieq 'Name') {
                # Handle when the parameter is NOT passed by the pipeline...
                foreach ($n in $Name) { [Messaging.MessageQueue]::new($n) }
            } else {
                $InputObject
            }
        ))
        # I like using 'foreach (...) { ... }' instead of ForEach-Object because
        # oftentimes, I will need to break or continue based on implementation
        # details, and using ForEach-Object in combination with break/continue
        # causes the pipeline to prematurely exit.
        foreach ($q in $Queues) {
            $q.Send($MessageObject)
            # Normally, I wouldn't return this, especially since it wasn't
            # modified, but this is a re-implementation of MSFT's Send-MsmqQueue,
            # and it returns the sent message.
            $MessageObject
        }
    }
}

As I stated in the introduction to this question, I have written many functions which take varying collection-based parameters belonging to different parameter sets which can be piped into the function, and this is the pattern that I use. I'm hoping someone can either confirm that this is OK from a PowerShell language/style perspective and/or help me understand why I should not do this and what I ought to consider instead.

Thank you!

Answer 1

Score: 2

<details>
<summary>English:</summary>
A fundamental performance decision is whether you want to **optimize for _argument-passing_ vs. _pipeline input_**:
* Declaring your parameters _as arrays_ (e.g. `[string[]] $Name`) allows efficient passing of _multiple_ input objects by _argument_ (parameter value).
* However, doing so _hurts pipeline performance_, because a single-element array is then created for every pipeline input object, as the following example demonstrates: it outputs `String[]` for _each_ of the scalar string elements of the array passed via the pipeline:
      'one', 'two' |
        & {
          param(
            [Parameter(Mandatory, ValueFromPipeline)]
            [string[]] $Name
          )
          process {
            $Name.GetType().Name # -> 'String[]' *for each* input string
          }
        }
  * **Note**: For brevity, the example above as well as all others in this answer use a [script block](https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_Script_Blocks) in lieu of a `function` definition. That is, a function declaration (`function foo { ... }`) followed by its invocation (`... | foo`) is shortened to the functionally equivalent `... | & { ... }`.
See [GitHub issue #4242](https://github.com/PowerShell/PowerShell/issues/4242) for a related discussion.
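
The trade-off above can be made visible with a small sketch. `Test-ArrayBinding` is a hypothetical function name used only for illustration; it reports how many elements `$Name` holds each time the `process` block runs, first with argument passing and then with pipeline input:

```powershell
# Hypothetical demo function (not part of the original question or answer).
function Test-ArrayBinding {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [string[]] $Name
    )
    process {
        # Emit one line per process-block invocation.
        "process saw $($Name.Count) element(s)"
    }
}

# By argument: the process block runs once and receives the whole array.
$byArgument = Test-ArrayBinding -Name 'one', 'two', 'three'

# By pipeline: the process block runs once per input object, and each
# invocation receives a freshly created single-element array.
$byPipeline = 'one', 'two', 'three' | Test-ArrayBinding

$byArgument
$byPipeline
```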
---
With _array_ parameters, you indeed need to ensure element-by-element processing yourself, notably inside the `process` block if they're also _pipeline-binding_.
As for **"homogenizing" parameter values of different types** so that only _one_ processing loop is required, two fundamental optimizations are possible:
* **Declare only a _single_ parameter** and rely either on PowerShell to _automatically_ convert values of other types to that parameter's type, or implement an automatically applied _custom conversion_, which obviates the need for "homogenizing" altogether:
  * The **conversion is _automatic_** if the parameter type has a public, single-parameter constructor that accepts an instance of the other type as its (only) argument or, in case the other type is `[string]`, if the type has a static `::Parse()` method with a single `[string]` parameter; e.g.:
        # Sample class with a single-parameter
        # public constructor that accepts [int] values.
        class Foo {
          [int] $n
          Foo([int] $val) {
            $this.n = $val
          }
        }
        # [int] values (whether provided via the pipeline or as an argument)
        # auto-convert to [Foo] instances
        42, 43 | & {
          [CmdletBinding()]
          param(
            [Parameter(ValueFromPipeline)]
            [Foo[]] $Foo
          )
          process {
            $Foo # Diagnostic output.
          }
        }
    * In your case, `[Messaging.MessageQueue]` _does_ have a public single-parameter constructor that accepts a string (as evidenced by your `[Messaging.MessageQueue]::new($n)` call), so you could simply _omit_ the `$Name` parameter declaration, and rely on the automatic conversion of `[string]` inputs.
    * A _general caveat_:
      * This automatic conversion - which also happens with _casts_ (e.g., `[Foo[]] (0x2a, 43)`, see below) and the (rarely used) type-conversion form of the [intrinsic `.ForEach()`](https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_Arrays#foreach) (e.g., `(0x2a, 43).ForEach([Foo])`) - is _stricter_ than calling a single-parameter constructor directly with respect to matching the constructor's parameter type.
      * I'm unclear on the exact rules, but using a `[double]` value, for instance, succeeds with `[Foo]::new(42.1)` (that is, conversion to `[int]` is automatically performed), but *fails* with both `[Foo] 42.1` and `(42.1).ForEach([Foo])` (the latter currently produces an obscure error message).
  * If the conversion _isn't_ automatic, **implement a _custom_ conversion** that PowerShell then applies automatically, by way of decorating your parameter with a custom attribute that derives from the abstract [`ArgumentTransformationAttribute`](https://docs.microsoft.com/en-US/dotnet/api/System.Management.Automation.ArgumentTransformationAttribute) class; e.g.:
        using namespace System.Management.Automation
        
        # Sample class with a single-parameter
        # public constructor that accepts [int] values.
        class Foo {
          [int] $n
          Foo([int] $val) {
            $this.n = $val
          }
        }
        # A sample argument-conversion (transformation) attribute class that
        # converts strings that can be interpreted as [int] to [Foo] instances.
        class CustomTransformationAttribute : ArgumentTransformationAttribute  {
          [object] Transform([EngineIntrinsics] $engineIntrinsics, [object] $inputData) {            
            # Note: If the inputs were passed as an *array argument*, $inputData is an array.
            return $(foreach ($o in $inputData) {
              if ($null -ne ($int = $o -as [int])) { [Foo]::new($int) }
              else                                 { $o }
            })
          }
        }
        
        # [string] values (whether provided via the pipeline or as an argument)
        # that can be interpreted as [int] now auto-convert to [Foo] instances,
        # thanks to the custom [ArgumentTransformationAttribute]-derived attribute.
        '0x2a', '43' | & {
          [CmdletBinding()]
          param(
            [Parameter(ValueFromPipeline)]
            [CustomTransformation()] # This implements the custom transformation.
            [Foo[]] $Foo
          )
          process {
            $Foo # Diagnostic output.
          }
        }
* If you _do_ want _separate_ parameters, **optimize the conversion process**:
  * The auto type-conversion rules described above also apply to _explicit casts_ (including support for _arrays_ of values), so you can simplify your code as follows:
        if ($PSCmdlet.ParameterSetName -eq 'Name') {
          # Simply use an array cast.
          $Queues = [Messaging.MessageQueue[]] $Name
        } else {
          $Queues = $InputObject
        }
  * In cases where element-by-element construction to effect conversion is required:
        if ($PSCmdlet.ParameterSetName -eq 'Name') {
          # Note the ","
          $Queues = foreach ($n in $Name) { , [Messaging.MessageQueue]::new($n) }
        } else {
          $Queues = $InputObject
        }
    * Note the use of the unary form of `,`, the [array constructor ("comma") operator](https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_Operators#comma-operator-), as in your attempt, albeit:
      * _inside_ the `foreach` loop, and
      * _without_ `@(...)` enclosure of the object to wrap in a single-element array, as `@(...)` _itself_ would trigger enumeration.
      * While `Write-Output -NoEnumerate ([Messaging.MessageQueue]::new($n))`, as shown in Mathias' answer, works too, it is _slower_. It comes down to a tradeoff between performance / concision vs. readability / signaling the intent explicitly.
    * The need to wrap _each_ [`[System.Messaging.MessageQueue]`](https://learn.microsoft.com/en-us/dotnet/api/system.messaging.messagequeue) instance in an auxiliary single-element wrapper with unary `,` / to use `Write-Output -NoEnumerate` stems from the fact that this type implements the [`System.Collections.IEnumerable`](https://learn.microsoft.com/en-US/dotnet/api/System.Collections.IEnumerable) interface, which means that PowerShell automatically _enumerates_ instances of the type by default.<sup>[1]</sup> Applying either technique ensures that the `[System.Messaging.MessageQueue]` is output _as a whole_ to the pipeline (for details, see [this answer](https://stackoverflow.com/a/48360724/45375)).
      * Note that this is _not_ necessary in the first snippet, because `$Queues = [Messaging.MessageQueue[]] $Name` is an _expression_, to which automatic enumeration does _not_ apply.
      * The above also implies that you need the same technique if you want to pass a _single_ `[System.Messaging.MessageQueue]` instance or a *single-element* array containing such an instance _via the pipeline_; e.g.:
            # !! Without `,` this command would *break*, because
            # !! PowerShell would try to enumerate the elements of the queue
            # !! which fails with an empty one.
            , [System.Messaging.MessageQueue]::new('foo') | Get-Member
    * By *not* using an `if` statement as a single *assignment expression* (`$Queues = if ...`) and instead assigning to `$Queues` in the _branches_ of the `if` statement, you additionally prevent subjecting `$InputObject` to unnecessary enumeration.
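
The enumeration points in the list above can be tried without MSMQ installed. The following is a minimal sketch that assumes `System.Collections.Generic.List[int]` as a stand-in for `MessageQueue`, chosen here only because it also implements `IEnumerable`. It contrasts the unary `,` wrapper with plain output inside `foreach`, and an `if` used as an assignment expression with assignment inside the branch:

```powershell
# Stand-in for MessageQueue (an assumption for illustration only).
$list = [System.Collections.Generic.List[int]]::new([int[]] (1, 2, 3))

# With unary ",": each list is emitted as a single object.
$wrapped = foreach ($i in 1..2) { , $list }

# Without ",": each list is enumerated, so its elements are collected instead.
$flattened = foreach ($i in 1..2) { $list }

# "if" as an assignment expression: the list is enumerated and the
# elements are re-collected in a regular PowerShell array.
$viaExpression = if ($true) { $list }

# Assigning inside the branch: the original list object is kept as-is.
if ($true) { $viaBranch = $list }

$wrapped.Count                    # 2 lists
$flattened.Count                  # 6 integers
$viaExpression.GetType().Name     # Object[]
$viaBranch.GetType().Name         # List`1
```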
---
<sup>[1] There are some exceptions, notably strings and dictionaries. See the bottom section of [this answer](https://stackoverflow.com/a/65530467/45375) for details.</sup>
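
The exceptions named in the footnote can be checked with a short sketch that counts how many objects each value contributes to the pipeline, using the built-in `Measure-Object` cmdlet. The sample values are arbitrary illustrations:

```powershell
# An array is enumerated: each element becomes its own pipeline object.
$arrayCount = (1, 2, 3 | Measure-Object).Count

# A string implements IEnumerable too, but is sent as a single object.
$stringCount = ('abc' | Measure-Object).Count

# A hashtable (a dictionary) is likewise sent as a single object.
$hashCount = (@{ a = 1; b = 2 } | Measure-Object).Count

$arrayCount, $stringCount, $hashCount
```

To process the entries of a dictionary one by one, enumeration has to be requested explicitly, for example with its `.GetEnumerator()` method.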
</details>

# Answer 2
**Score**: 1

This pattern ("homogenizing" the input entities based on the chosen parameter set) is perfectly valid, and constitutes - in my personal opinion at least - good parameter design.

That being said, you might want to use `Write-Output -NoEnumerate` to avoid the clunky `,@(...)` unwrapped-wrapped-array unpacking trick:

```powershell
if ($PSCmdlet.ParameterSetName -ieq 'Name') {
    # Handle when the parameter is NOT passed by the pipeline...
    $Queues = foreach ($n in $Name) {
        $queue = [Messaging.MessageQueue]::new($n)
        Write-Output $queue -NoEnumerate
    }
}
else {
    # Input is already [MessageQueue[]], avoid pipeline boundaries entirely
    $Queues = $InputObject
}
```
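
For readers who want to try the overall pattern end to end without MSMQ, here is a minimal sketch under stated assumptions: `[version]` stands in for `MessageQueue` (it can be constructed from a string but, unlike `MessageQueue`, is not enumerable, so no wrapping is needed), and `Get-MajorVersion` is a hypothetical function name. Only argument passing is exercised here; pipeline input is left out of the sketch.

```powershell
# Hypothetical function illustrating the "homogenize up front" pattern.
function Get-MajorVersion {
    [CmdletBinding(DefaultParameterSetName = 'Name')]
    param(
        [Parameter(Mandatory, ValueFromPipeline, ParameterSetName = 'InputObject')]
        [version[]] $InputObject,

        [Parameter(Mandatory, ValueFromPipeline, ParameterSetName = 'Name')]
        [string[]] $Name
    )
    process {
        # Homogenize: after this block, $versions is always [version[]].
        if ($PSCmdlet.ParameterSetName -eq 'Name') {
            $versions = [version[]] $Name   # array cast converts each string
        } else {
            $versions = $InputObject
        }

        # A single processing loop, independent of the parameter set.
        foreach ($v in $versions) { $v.Major }
    }
}

$fromNames   = Get-MajorVersion -Name '1.2', '3.4'
$fromObjects = Get-MajorVersion -InputObject ([version] '5.6'), ([version] '7.8')

$fromNames     # 1, 3
$fromObjects   # 5, 7
```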
