PowerShell - get content between 2 strings in multiple lines

Question

I have a file, 1.wpl:

    <?wpl version="1.0"?>
    <smil>
    <head>
    <meta name="Generator" content="Microsoft Windows Media Player -- 12.0.22621.1"/>
    <meta name="ItemCount" content="2"/>
    <title>Untitled playlist</title>
    </head>
    <body>
    <seq>
    <media src="C:\Users\user\Downloads\Katzianerova vojna.mp3"/>
    <media src="C:\Users\user\Downloads\Rat i mir u povijesti III- dio.mp3"/>
    </seq>
    </body>
    </smil>

I want to get the content between <seq> and </seq>, with each element on its own line.

Desired output:

    <media src="C:\Users\user\Downloads\Katzianerova vojna.mp3"/>
    <media src="C:\Users\user\Downloads\Rat i mir u povijesti III- dio.mp3"/>

I have this code, which gives me the output on a single line:

    $fileName = "C:\Users\user\Music\Playlists.wpl"
    # Get content from file
    $file = Get-Content $fileName
    # Regex pattern to match the text between the two strings
    $pattern = "<seq>(.*?)</seq>"
    # Perform the operation
    $results = [regex]::Match($file,$pattern).Groups[1].Value -split [System.Environment]::NewLine
    return $results

Actual output:

    <media src="C:\Users\user\Downloads\Katzianerova vojna.mp3"/> <media src="C:\Users\user\Downloads\Rat i mir u povijesti III- dio.mp3"/>
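
A likely explanation for the single-line output, sketched under assumptions: Get-Content without -Raw returns an array of lines, and when that array is passed where a single string is expected, PowerShell joins the elements with a space (the default $OFS). No newline is left for -split to find. The $lines array below is a hypothetical stand-in for the file content, not the real playlist.

```powershell
# Hypothetical stand-in for what Get-Content returns: one string per line.
$lines = '<seq>', '<media src="a.mp3"/>', '<media src="b.mp3"/>', '</seq>'

# Converting the array to a string joins its elements with $OFS
# (a single space unless $OFS has been changed).
$joined = [string]$lines

# [regex]::Match expects a string, so the same conversion happens here and
# the captured group contains spaces where the line breaks used to be.
$captured = [regex]::Match($lines, '<seq>(.*?)</seq>').Groups[1].Value

# Splitting on a newline finds nothing to split on, so one element comes back.
$parts = @($captured -split [System.Environment]::NewLine)
$parts.Count
```

Reading the file with -Raw, or avoiding the string conversion altogether, keeps the line breaks available.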

Answer 1

Score: 2

There is no reason to use regex here, since what you have is valid XML:

    ($xml = [xml]::new()).Load('C:\Users\user\Music\Playlists.wpl')
    $xml.SelectNodes('smil/body/seq/media') | ForEach-Object OuterXml
    # Outputs:
    # <media src="C:\Users\user\Downloads\Katzianerova vojna.mp3" />
    # <media src="C:\Users\user\Downloads\Rat i mir u povijesti III- dio.mp3" />

Or, using a wildcard in the XPath:

    ($xml = [xml]::new()).Load('C:\Users\user\Music\Playlists.wpl')
    $xml.SelectNodes("//seq/*") | ForEach-Object OuterXml
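
If only the file paths are needed rather than the full elements, the same XPath approach can read the src attribute directly. This is a minimal sketch, assuming an in-memory copy of the playlist with shortened, hypothetical paths in place of the real file:

```powershell
# Hypothetical in-memory playlist; a real script would call Load() on the .wpl file.
$xml = [xml]@'
<?wpl version="1.0"?>
<smil>
  <body>
    <seq>
      <media src="C:\Music\first.mp3"/>
      <media src="C:\Music\second.mp3"/>
    </seq>
  </body>
</smil>
'@

# Select every <media> element under <seq> and read its src attribute.
$paths = $xml.SelectNodes('//seq/media') | ForEach-Object { $_.GetAttribute('src') }
$paths
```

Because the parser handles the attribute values, the result does not depend on how the elements are laid out across lines.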

Answer 2

Score: 1

You could try using a "switch":

    Get-Content -Path "C:\Users\user\Music\Playlists.wpl" |
        ForEach-Object {
            switch -Regex ($_) {
                # Opening tag: start emitting the lines that follow
                '\<seq\>$' {
                    $break = 1
                }
                # Closing tag: stop emitting
                '\<\/seq\>$' {
                    $break = 0
                }
                # Any other line: emit it without leading whitespace while inside <seq>
                default {
                    if ($break -eq 1) {
                        $_ -replace '^\s+'
                    }
                }
            }
        }

Answer 3

Score: 0

Found a solution:

    $fileContent = Get-Content -Path "C:\Users\user\Music\Playlists.wpl" -Raw
    # (?s) lets '.' match newlines so the pattern can span several lines
    $regexPattern = "(?s)<seq>(.*?)</seq>"
    # Named $match rather than $matches, which is a PowerShell automatic variable
    $match = [regex]::Match($fileContent, $regexPattern)
    if ($match.Success) {
        $seqContent = $match.Groups[1].Value
        # Split on LF or CRLF so no stray carriage returns remain
        $lines = $seqContent -split '\r?\n'
        $output = ($lines | Where-Object { $_.Trim() -ne '' }) -join "`n"
        Write-Output $output
    } else {
        Write-Output "No <seq> content found in the file."
    }
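
Two things make this version work: -Raw returns the file as one string with its line breaks intact, and the (?s) inline option lets . match newline characters. Here is a small sketch of the second point, using a hypothetical three-line string in place of the file:

```powershell
# Hypothetical multi-line input standing in for Get-Content -Raw output.
$text = "<seq>`n<media src=`"a.mp3`"/>`n</seq>"

# Without (?s), '.' stops at newlines, so the pattern cannot span lines.
$withoutSingleline = [regex]::Match($text, '<seq>(.*?)</seq>').Success

# With (?s), '.' also matches newlines, so the lazy group can reach </seq>.
$withSingleline = [regex]::Match($text, '(?s)<seq>(.*?)</seq>').Success

"Without (?s): $withoutSingleline"
"With (?s):    $withSingleline"
```

The first match fails and the second succeeds, which is why the inline option is needed once the file is read as a single string.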

huangapple
  • Posted on June 16, 2023 at 01:59:56
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76484360.html