I have a huge CSV file, and some of its lines are broken because they contain stray line breaks. When the file is imported into Excel, I have to correct hundreds of lines manually. I have a regex that works in Notepad++: it removes a line break when the following line does not start with a specific string, in this case ";". However, the same regex does not work in my PowerShell script.

Example of input:

15:00:00;;;;Jhon Name;;;;;;;;9444253;;;;;;;;;;;;;"Jhon Name";;;;;;;;;;Jhon Name;;;;;;;;Final Check Approved;;;;;;;;;09.01.2023;;;;;Approve;;;;;;12077;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

How it should look:

;BP;7165378;XX_RAW;200SSS952;EU-PL;PL02;PL02;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;15:00:00;;;;Jhon Name;;;;;;;;9444253;;;;;;;;;;;;;"Jhon Name";;;;;;;;;;Jhon Name;;;;;;;;Final Check Approved;;;;;;;;;09.01.2023;;;;;Approve;;;;;;12077;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;


$content = Get-Content -path "C:\Users\TUF17\Desktop\File\Fix\xx_fix_temp.csv" 
$content -Replace '"\R(?!;)"', ' ' |  Out-File "C:\Users\TUF17\Desktop\File\Fix\xx_noenters.csv" 


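It has to do with how line breaks are matched in your PowerShell script: \R is not a supported escape sequence in .NET regular expressions, which is what PowerShell's -replace operator uses.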

I would also suggest adding -Raw so that you get the content of the file as a single string rather than an array of strings, which makes replacing across lines possible.

I'm assuming it's a .csv file you are using.

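# -Raw reads the whole file as one string, so the line breaks are still there to match.
$content = Get-Content -Path "C:\Users\TUF17\Desktop\File\Fix\xx_fix_temp.csv" -Raw
# Replace each line break that is not followed by ';' (or by the end of the file) with a space.
$content -replace '\r?\n(?!;|\z)', ' ' | Out-File "C:\Users\TUF17\Desktop\File\Fix\xx_noenters.csv"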
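
To see why -Raw matters, here is a minimal sketch that compares the two ways of reading. The temp file name and the sample rows are made up for illustration, and the replacement here removes the line break outright (pass ' ' as the replacement if you want a separator instead):

```powershell
# Hypothetical sample: the second record is broken across two lines.
$tmp = Join-Path ([System.IO.Path]::GetTempPath()) 'raw_demo.csv'
[System.IO.File]::WriteAllText($tmp, ";A;1`r`n;B;`r`n2`r`n;C;3`r`n")

# Without -Raw: an array of lines. The line breaks are already stripped,
# so a regex that looks for them has nothing to match.
$lines = Get-Content -LiteralPath $tmp
$lineCount = $lines.Count                          # 4
$withBreaks = @($lines -match '\r|\n').Count       # 0

# With -Raw: a single string that still contains every line break.
$raw = Get-Content -LiteralPath $tmp -Raw
$fixed = $raw -replace '\r?\n(?!;|\z)'             # joins ";B;" and "2"
$fixed

Remove-Item -LiteralPath $tmp
```

The (?!;|\z) lookahead also leaves the final line break of the file alone, which a plain (?!;) would remove.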


Building on the helpful comments on the question:

  • In order to perform replacements across lines of a text file, you need to either read the file in full - with Get-Content -Raw - or perform stateful line-by-line processing, such as with the -File parameter of a switch statement.

    • Note: While you could also do stateful line-by-line processing by combining Get-Content (without -Raw) with a ForEach-Object call, such a solution would be much slower - see this answer.
  • Your regex, '"\R(?!;)"', has two problems:

    • It accidentally uses embedded " quoting. Use only '...' quoting. PowerShell has no special syntax for regex literals - it simply uses strings.
      To avoid confusion with PowerShell's own up-front string interpolation, it is better to use verbatim '...' strings rather than expandable (interpolating) "..." strings - see the conceptual about_Quoting_Rules help topic.

    • \R is an unsupported regex escape sequence; you presumably meant \r, i.e. a CR character (CARRIAGE RETURN, U+000D).

      • If you instead want to match CRLF, a Windows-format newline sequence, use \r\n

      • If you want to match LF (LINE FEED, U+000A) alone (a Unix-format newline), use \n

      • If you want to match both newline formats, use \r?\n

      • As an aside: While use of CR alone is rare in practice, PowerShell treats stand-alone CR characters as newlines as well, which is why Get-Content without -Raw (as you've tried) wouldn't work: it reads line by line and strips the line breaks, so there is nothing left for the regex to match.
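
As a rough illustration of the newline patterns above, here is a small sketch on in-memory strings. The sample records and variable names are invented for the demo; only the -replace operator and the regex constructs come from the discussion:

```powershell
# Invented two-record sample; the first record is broken after ';A;'.
$crlf = ";A;`r`n1`r`n;B;2"    # Windows-format newlines (CRLF)
$lf   = ";A;`n1`n;B;2"        # Unix-format newlines (LF)

# '\r?\n' matches both formats; (?!;) skips breaks that precede a new record.
$joinedCrlf = $crlf -replace '\r?\n(?!;)'    # ";A;1" + CRLF + ";B;2"
$joinedLf   = $lf   -replace '\r?\n(?!;)'    # ";A;1" + LF + ";B;2"

# '\r\n' matches CRLF only, so LF-only input passes through unchanged.
$untouched = ($lf -replace '\r\n(?!;)') -ceq $lf    # True

# Caveat: on CRLF input, '\r(?!;)' matches every CR (each one is followed
# by LF, not ';'), so it strips the CRs but joins nothing.
$crOnly = $crlf -replace '\r(?!;)'           # ";A;" + LF + "1" + LF + ";B;2"
```

If you are unsure which newline format the file uses, '\r?\n' is the safer choice of the three.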

Get-Content -Raw solution (easier and faster than switch -File, but requires the whole file to fit into memory twice):

# Adjust the '\r' part as needed (see above).
(Get-Content -Raw -LiteralPath $inFile) -replace '\r(?!;)' |
  Set-Content -NoNewLine -Encoding utf8 -LiteralPath $outFile


  • By not specifying a substitution operand to -replace, the command removes all newlines not followed by a ; ((?!;)), effectively joining the line that follows the CR directly to the previous line, which is the desired behavior based on your sample output.

  • For saving text, Set-Content is a bit faster than Out-File (it'll make no appreciable difference here, given that only a single, large string is written).

    • -NoNewLine prevents a(n additional) trailing newline from getting appended to the file.
    • -Encoding utf8 specifies the output character encoding. Note that PowerShell never preserves the input character encoding, so unless you use -Encoding on output, you'll get the respective cmdlet's default character encoding, which in Windows PowerShell varies from cmdlet to cmdlet; in PowerShell (Core) 7+, the consistent default is now BOM-less UTF-8. Note that in Windows PowerShell, -Encoding utf8 always creates a file with a BOM; see this answer for background information and workarounds.
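
For files too large to hold in memory twice, the stateful line-by-line alternative mentioned earlier could look roughly like the sketch below. It is only an illustration under assumptions: the temp file names, the sample rows, and the $pending and $records variables are hypothetical, and it assumes that every genuine record starts with ";", as in the question.

```powershell
# Hypothetical demo files; substitute your own paths.
$inFile  = Join-Path ([System.IO.Path]::GetTempPath()) 'switch_demo_in.csv'
$outFile = Join-Path ([System.IO.Path]::GetTempPath()) 'switch_demo_out.csv'
[System.IO.File]::WriteAllText($inFile, ";A;1`r`n;B;`r`n2`r`n;C;3`r`n")

$records = & {
    $pending = $null                    # the record currently being assembled
    switch -Regex -File $inFile {
        '^;' {                          # a line starting with ';' begins a new record
            if ($null -ne $pending) { $pending }   # emit the finished record
            $pending = $_
        }
        default {                       # any other line continues the current record
            $pending += $_
        }
    }
    if ($null -ne $pending) { $pending }           # emit the last record
}

# Set-Content writes one record per line.
Set-Content -LiteralPath $outFile -Value $records -Encoding utf8
$records

Remove-Item -LiteralPath $inFile, $outFile
```

Because switch -File hands over one line at a time with the line break already removed, the join is done by remembering the previous record rather than by a regex that spans lines.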

