Is this a good "pattern" for processing collection-based parameters which belong to parameter sets of a function?

Question

I've been writing advanced functions for many years now and have even written quite a few modules at this point. But there's one question for which I have never really been able to find an answer.

Let's look at a Cmdlet that Microsoft provides in the MSMQ module, as an example, and "re-implement" it as an advanced PowerShell function: `Send-MsmqQueue`. But this function will be a bit different from the one provided by the MSMQ module in that not only will it accept multiple MSMQ queues for the `$InputObject` parameter, but also multiple MSMQ queue names for the `$Name` parameter, where these two parameters belong to different parameter sets. (The Cmdlet version of this function normally only accepts a single string value for the `$Name` parameter.) I won't be showing a complete re-implementation, just enough to illustrate what I, at times, find myself doing when this situation arises. (NOTE: one other slight difference is that I will be using the classes from the `System.Messaging` namespace instead of the PowerShell-provided ones in the `Microsoft.Msmq.PowerShell.Commands` namespace. So assume that implicitly, somewhere, `Add-Type -AssemblyName System.Messaging` has been executed.)

    function Send-MsmqQueue {
        [CmdletBinding(DefaultParameterSetName = 'Name')]
        [OutputType([Messaging.Message])]
        Param (
            [Parameter(
                Mandatory,
                ValueFromPipeline,
                ParameterSetName = 'InputObject')
            ]
            [Messaging.MessageQueue[]] $InputObject,

            [Parameter(
                Mandatory,
                ValueFromPipeline,
                ParameterSetName = 'Name')
            ]
            [string[]] $Name,

            # Below is the original parameter name, not mine ;)
            [Messaging.Message] $MessageObject

            # All other normal Send-MsmqQueue parameters elided as they are not
            # needed to illustrate the premise of my question.
        )

        Process {
            # When I have parameters defined as above, the first thing I do in my
            # Process block is "homogenize" the data so I don't have to implement
            # two foreach loops or do the branching on each foreach loop iteration
            # which can obscure the main logic that is being executed, i.e., I get
            # this done all "up-front".
            #
            # One aspect of my question is, from purely a PowerShell perspective,
            # is this hurting performance in any meaningful way? (I know that when it
            # comes to specific implementation details, there are INFINITE ways to
            # write non-performant code, so from purely a PowerShell perspective,
            # as far as the language design/inner-workings, is this hurting
            # performance?)
            #
            # NOTE: I don't normally need the wrapping "force this thing to be an
            # array" construct (,<array_items>), BUT, in this case, the C#
            # System.Messaging.MessageQueue class implements IEnumerable,
            # which PowerShell (unhelpfully) iterates over automatically, and results
            # in the messages in the queues being iterated over instead of the queues
            # themselves, so this is an implementation detail specific to this
            # particular function.
            $Queues = (,@(
                if ($PSCmdlet.ParameterSetName -ieq 'Name') {
                    # Handle when the parameter is NOT passed by the pipeline...
                    foreach ($n in $Name) { [Messaging.MessageQueue]::new($n) }
                } else {
                    $InputObject
                }
            ))

            # I like using 'foreach (...) { ... }' instead of ForEach-Object because
            # oftentimes, I will need to break or continue based on implementation
            # details, and using ForEach-Object in combination with break/continue
            # causes the pipeline to prematurely exit.
            foreach ($q in $Queues) {
                $q.Send($MessageObject)

                # Normally, I wouldn't return this, especially since it wasn't
                # modified, but this is a re-implementation of MSFT's Send-MsmqQueue,
                # and it returns the sent message.
                $MessageObject
            }
        }
    }
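To make the `break`/`continue` remark in the code comments concrete, here is a minimal sketch that is separate from the function above. The numbers are arbitrary sample data and the variable names are made up for illustration. In a `foreach` statement, `continue` and `break` affect only that loop, whereas inside a `ForEach-Object` script block the usual way to skip the current item is `return`.

```powershell
# foreach statement: continue/break act on this loop only.
$viaStatement = foreach ($n in 1..5) {
    if ($n -eq 2) { continue }   # skip 2
    if ($n -eq 4) { break }      # stop the loop at 4
    $n
}

# ForEach-Object: 'return' leaves the script block for the current item,
# which behaves like 'continue'; there is no direct equivalent of 'break'.
$viaCmdlet = 1..5 | ForEach-Object {
    if ($_ -eq 2) { return }     # skip 2
    $_
}

$viaStatement -join ','   # 1,3
$viaCmdlet -join ','      # 1,3,4,5
```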

As I stated in the introduction to this question, I have written many functions which take varying collection-based parameters belonging to different parameter sets which can be piped into the function, and this is the pattern that I use. I'm hoping someone can either confirm that this is OK from a PowerShell language/style perspective and/or help me understand why I should not do this and what I ought to consider instead.

Thank you!

Answer 1

Score: 2

<details>
<summary>English:</summary>

A fundamental performance decision is whether you want to **optimize for _argument-passing_ vs. _pipeline input_**:

* Declaring your parameters _as arrays_ (e.g. `[string[]] $Name`) allows efficient passing of _multiple_ input objects by _argument_ (parameter value).

* However, doing so _hurts pipeline performance_, because a single-element array is then created for each pipeline input object, as the following example demonstrates: it outputs `String[]` for _each_ of the scalar string elements of the array passed via the pipeline:

      'one', 'two' |
        & {
          param(
            [Parameter(Mandatory, ValueFromPipeline)]
            [string[]] $Name
          )
          process {
            $Name.GetType().Name # -> 'String[]' *for each* input string
          }
        }

  * **Note**: For brevity, the example above as well as all others in this answer use a [script block](https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_Script_Blocks) in lieu of a `function` definition. That is, a function declaration (`function foo { ... }`) followed by its invocation (`... | foo`) is shortened to the functionally equivalent `... | & { ... }`.

See [GitHub issue #4242](https://github.com/PowerShell/PowerShell/issues/4242) for a related discussion.

---

With _array_ parameters, you indeed need to ensure element-by-element processing yourself, notably inside the `process` block if they're also _pipeline-binding_.
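As a rough illustration of why the inner loop is needed, the sketch below uses a hypothetical `Test-Binding` function (not part of the question's code) that reports how many elements `$Name` holds each time the `process` block runs. With pipeline input the block runs once per input object with a one-element array; with argument input it runs once with the whole array.

```powershell
# Hypothetical helper, for illustration only.
function Test-Binding {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [string[]] $Name
    )
    process {
        # Report how many elements $Name holds in this process invocation.
        "process block received $($Name.Count) element(s)"
    }
}

# Pipeline input: the process block runs twice, each time with one element.
$fromPipeline = @('one', 'two' | Test-Binding)

# Argument input: the process block runs once, with both elements.
$fromArgument = @(Test-Binding -Name 'one', 'two')

$fromPipeline
$fromArgument
```

This is why a `foreach ($n in $Name) { ... }` loop inside `process` covers both invocation styles.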
As for **"homogenizing" parameter values of different types** so that only _one_ processing loop is required, two fundamental optimizations are possible:

* **Declare only a _single_ parameter** and rely either on PowerShell to _automatically_ convert values of other types to that parameter's type, or implement an automatically applied _custom conversion_, which obviates the need for "homogenizing" altogether:

  * The **conversion is _automatic_** if the parameter type has a public, single-parameter constructor that accepts an instance of the other type as its (only) argument or, in case the other type is `[string]`, if the type has a static `::Parse()` method with a single `[string]` parameter; e.g.:

        # Sample class with a single-parameter
        # public constructor that accepts [int] values.
        class Foo {
          [int] $n
          Foo([int] $val) {
            $this.n = $val
          }
        }

        # [int] values (whether provided via the pipeline or as an argument)
        # auto-convert to [Foo] instances
        42, 43 | & {
          [CmdletBinding()]
          param(
            [Parameter(ValueFromPipeline)]
            [Foo[]] $Foo
          )
          process {
            $Foo # Diagnostic output.
          }
        }

    * In your case, `[Messaging.MessageQueue]` _does_ have a public single-parameter constructor that accepts a string (as evidenced by your `[Messaging.MessageQueue]::new($n)` call), so you could simply _omit_ the `$Name` parameter declaration, and rely on the automatic conversion of `[string]` inputs.

    * A _general caveat_:

      * This automatic conversion - which also happens with _casts_ (e.g., `[Foo[]] (0x2a, 43)`, see below) and the (rarely used) type-conversion form of the [intrinsic `.ForEach()`](https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_Arrays#foreach) (e.g., `(0x2a, 43).ForEach([Foo])`) - is _stricter_ than calling a single-parameter constructor with respect to matching the constructor's parameter type.

      * I'm unclear on the exact rules, but using a `[double]` value, for instance, succeeds with `[Foo]::new(42.1)` (that is, conversion to `[int]` is automatically performed), but *fails* with both `[Foo] 42.1` and `(42.1).ForEach([Foo])` (the latter currently produces an obscure error message).

  * If the conversion _isn't_ automatic, **implement a _custom_ conversion** that PowerShell then applies automatically, by way of decorating your parameter with a custom attribute that derives from the abstract [`ArgumentTransformationAttribute`](https://docs.microsoft.com/en-US/dotnet/api/System.Management.Automation.ArgumentTransformationAttribute) class; e.g.:

        using namespace System.Management.Automation

        # Sample class with a single-parameter
        # public constructor that accepts [int] values.
        class Foo {
          [int] $n
          Foo([int] $val) {
            $this.n = $val
          }
        }

        # A sample argument-conversion (transformation) attribute class that
        # converts strings that can be interpreted as [int] to [Foo] instances.
        class CustomTransformationAttribute : ArgumentTransformationAttribute {
          [object] Transform([EngineIntrinsics] $engineIntrinsics, [object] $inputData) {
            # Note: If the inputs were passed as an *array argument*, $inputData is an array.
            return $(foreach ($o in $inputData) {
                if ($null -ne ($int = $o -as [int])) { [Foo]::new($int) }
                else { $o }
              })
          }
        }

        # [string] values (whether provided via the pipeline or as an argument)
        # that can be interpreted as [int] now auto-convert to [Foo] instances,
        # thanks to the custom [ArgumentTransformationAttribute]-derived attribute.
        '0x2a', '43' | & {
          [CmdletBinding()]
          param(
            [Parameter(ValueFromPipeline)]
            [CustomTransformation()] # This implements the custom transformation.
            [Foo[]] $Foo
          )
          process {
            $Foo # Diagnostic output.
          }
        }

* If you *do* want ***separate* parameters, optimize the conversion process**:

  * The auto type-conversion rules described above also apply to _explicit casts_ (including support for _arrays_ of values), so you can simplify your code as follows:

        if ($PSCmdlet.ParameterSetName -eq 'Name') {
          # Simply use an array cast.
          $Queues = [Messaging.MessageQueue[]] $Name
        } else {
          $Queues = $InputObject
        }

  * In cases where element-by-element construction to effect conversion is required:

        if ($PSCmdlet.ParameterSetName -eq 'Name') {
          # Note the ","
          $Queues = foreach ($n in $Name) { , [Messaging.MessageQueue]::new($n) }
        } else {
          $Queues = $InputObject
        }

    * Note the use of the unary form of `,`, the [array constructor ("comma") operator](https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_Operators#comma-operator-), as in your attempt, albeit:
      * _inside_ the `foreach` loop, and
      * _without_ `@(...)` enclosure of the object to wrap in a single-element array, as `@(...)` _itself_ would trigger enumeration.

    * While `Write-Output -NoEnumerate ([Messaging.MessageQueue]::new($n))`, as shown in Mathias' answer, works too, it is _slower_. It comes down to a tradeoff between performance / concision vs. readability / signaling the intent explicitly.

    * The need to wrap _each_ [`[System.Messaging.MessageQueue]`](https://learn.microsoft.com/en-us/dotnet/api/system.messaging.messagequeue) instance in an auxiliary single-element wrapper with unary `,` or to use `Write-Output -NoEnumerate` stems from the fact that this type implements the [`System.Collections.IEnumerable`](https://learn.microsoft.com/en-US/dotnet/api/System.Collections.IEnumerable) interface, which means that PowerShell automatically _enumerates_ instances of the type by default.<sup>[1]</sup> Applying either technique ensures that the `[System.Messaging.MessageQueue]` is output _as a whole_ to the pipeline (for details, see [this answer](https://stackoverflow.com/a/48360724/45375)).

      * Note that this is _not_ necessary in the first snippet, because `$Queues = [Messaging.MessageQueue[]] $Name` is an _expression_, to which automatic enumeration does _not_ apply.

      * The above also implies that you need the same technique if you want to pass a _single_ `[System.Messaging.MessageQueue]` instance or a *single-element* array containing such an instance _via the pipeline_; e.g.:

            # !! Without `,` this command would *break*, because
            # !! PowerShell would try to enumerate the elements of the queue
            # !! which fails with an empty one.
            , [System.Messaging.MessageQueue]::new('foo') | Get-Member

  * By *not* using an `if` statement as a single *assignment expression* (`$Queues = if ...`) and instead assigning to `$Queues` in the _branches_ of the `if` statement, you additionally prevent subjecting `$InputObject` to unnecessary enumeration.
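
The array-cast variant of the separate-parameters approach can be tried without MSMQ installed. The sketch below is only an illustration under stated assumptions: `Get-AddressFamily` is a hypothetical function, and `[ipaddress]` stands in for `[Messaging.MessageQueue]` purely because PowerShell can convert strings to it automatically and it is available everywhere. Pipeline binding is left out to keep the example focused on the cast.

```powershell
# Hypothetical function; [ipaddress] is a stand-in for [Messaging.MessageQueue].
function Get-AddressFamily {
    [CmdletBinding(DefaultParameterSetName = 'Name')]
    param(
        [Parameter(Mandatory, ParameterSetName = 'InputObject')]
        [ipaddress[]] $InputObject,

        [Parameter(Mandatory, ParameterSetName = 'Name')]
        [string[]] $Name
    )
    process {
        # Homogenize up front: one array cast converts every string.
        if ($PSCmdlet.ParameterSetName -eq 'Name') {
            $addresses = [ipaddress[]] $Name
        } else {
            $addresses = $InputObject
        }

        # A single loop then serves both parameter sets.
        foreach ($a in $addresses) { $a.AddressFamily.ToString() }
    }
}

$byName   = Get-AddressFamily -Name '127.0.0.1', '::1'
$byObject = Get-AddressFamily -InputObject ([ipaddress]::Loopback)

$byName     # InterNetwork, InterNetworkV6
$byObject   # InterNetwork
```

Because `[ipaddress]` does not implement `IEnumerable`, no `,` wrapping is needed here; with a type like `MessageQueue`, the wrapping considerations described above still apply whenever instances are emitted to the pipeline.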

---

<sup>[1] There are some exceptions, notably strings and dictionaries. See the bottom section of [this answer](https://stackoverflow.com/a/65530467/45375) for details.</sup>
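
The exceptions named in the footnote can be observed with a small sketch; the variable names and sample values are made up for illustration. Each value is emitted from a script block and the resulting objects are counted.

```powershell
# A string implements IEnumerable, but is passed through as one object.
$fromString = @(& { 'abc' })

# A hashtable (a dictionary) is likewise passed through as one object.
$fromTable = @(& { @{ a = 1; b = 2 } })

# An ordinary array, by contrast, is enumerated element by element.
$fromArray = @(& { 1, 2, 3 })

$fromString.Count   # 1
$fromTable.Count    # 1
$fromArray.Count    # 3
```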
</details>

Answer 2

Score: 1

This pattern ("homogenizing" the input entities based on chosen parameter set) is perfectly valid, and constitutes - in my personal opinion at least - good parameter design.

That being said, you might want to use `Write-Output -NoEnumerate` to avoid the clunky `,@(...)` unwrapped-wrapped-array unpacking trick:

    if ($PSCmdlet.ParameterSetName -ieq 'Name') {
        # Handle when the parameter is NOT passed by the pipeline...
        $Queues = foreach ($n in $Name) {
            $queue = [Messaging.MessageQueue]::new($n)
            Write-Output $queue -NoEnumerate
        }
    }
    else {
        # Input is already [MessageQueue[]], avoid pipeline boundaries entirely
        $Queues = $InputObject
    }
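
The effect of `Write-Output -NoEnumerate` can be compared with the unary `,` without MSMQ installed. In this sketch, which rests on an assumption for illustration, a `[System.Collections.Generic.List[int]]` stands in for `MessageQueue` simply because it is also enumerable; the variable names are made up.

```powershell
# No protection: each list is enumerated, so its integers are collected.
$flattened = foreach ($i in 1, 2) {
    $list = [System.Collections.Generic.List[int]]::new()
    $list.Add($i); $list.Add($i * 10)
    $list
}

# Unary comma: each list is emitted as a whole.
$viaComma = foreach ($i in 1, 2) {
    $list = [System.Collections.Generic.List[int]]::new()
    $list.Add($i); $list.Add($i * 10)
    , $list
}

# Write-Output -NoEnumerate: each iteration also emits a single object.
$viaCmdlet = foreach ($i in 1, 2) {
    $list = [System.Collections.Generic.List[int]]::new()
    $list.Add($i); $list.Add($i * 10)
    Write-Output $list -NoEnumerate
}

$flattened.Count   # 4 (individual integers)
$viaComma.Count    # 2 (two lists)
$viaCmdlet.Count   # 2 (two collections)
```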

huangapple
  • Published on June 29, 2023 at 22:21:25
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76581956.html