non-ASCII characters in Powershell scripts
Question
I need to do some file operations in a file structure that contains special non-Latin characters.
PowerShell crashes when I try to use any of those characters.
For example, this doesn't work:

```powershell
$TestPath = "C:\Examples\Folder_1ĀČ\"
$ExampleFileName = "Test.txt"
Copy-Item ($PSScriptRoot + "\" + $ExampleFileName) -Destination ($TestPath) -Force
```
But this works:

```powershell
$TestPath = "C:\Examples\Folder_1AC\"
$ExampleFileName = "Test.txt"
Copy-Item ($PSScriptRoot + "\" + $ExampleFileName) -Destination ($TestPath) -Force
```
I tried debugging with

```powershell
Write-Output $TestPath
```

and the result returned in the console was:

```
C:\Examples\Folder_1Ä€Ä\
```
Is it possible to use PowerShell with paths containing these characters?
How can I do that?
Answer 1
Score: 1
It looks as if you have a hassle with your code page in PowerShell.
Check it and switch to UTF-8.
Have a look at this:
StackOverflow: Changing PowerShell's default output encoding to UTF-8
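As a quick check, the snippet below shows which code page and encodings are currently in effect and how to switch the session to UTF-8; this is a minimal sketch along the lines of the linked answer, and the exact defaults differ between Windows PowerShell 5.1 and PowerShell 7+:

```powershell
# Show the active console code page and the encodings PowerShell is using
chcp                            # e.g. 850 / 1252 for a local OEM/ANSI code page, 65001 for UTF-8
[Console]::OutputEncoding       # how output from native programs is decoded
$OutputEncoding                 # how text piped to native programs is encoded

# Switch the current session to UTF-8 (see the linked answer for details and caveats)
$OutputEncoding = [Console]::InputEncoding = [Console]::OutputEncoding = [System.Text.UTF8Encoding]::new()
```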
Update
As you write:

> I found out that `-encoding default` works for my case
`Default` is your system code page.
The simplest way to display it is to just execute `chcp`.
=> What is your code page?
I suppose you have a small typo in your question:
You write: C:\Examples\Folder_1Ä€Ä\
I would expect: C:\Examples\Folder_1ĀČ\
These are typical character-code translation problems that appear when a UTF-8 character's binary encoding is interpreted with a local ANSI code page: Ā => Ä€ and Č => ÄŒ.
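This mangling can be reproduced directly; a minimal sketch, assuming Windows-1252 is the local ANSI code page:

```powershell
# UTF-8 encodes Ā (U+0100) as C4 80 and Č (U+010C) as C4 8C
$bytes = [System.Text.Encoding]::UTF8.GetBytes('ĀČ')
[System.BitConverter]::ToString($bytes)                        # C4-80-C4-8C

# Decoding those same bytes with the Windows-1252 ANSI code page yields the garbled text
[System.Text.Encoding]::GetEncoding(1252).GetString($bytes)    # Ä€ÄŒ
```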
Please note, it is of interest:
- Which code page is your environment using?
- Which character encoding (code page) was used to store the cmdlet's text?
- Which encoding does PowerShell use for reading the cmdlet's text and for executing it?
Seeing your info updates, the logic says:
- As you can successfully execute the script with `-encoding Default`:
  => your script has been stored using your local code page.
- Not using `-encoding Default` results in "extended" characters:
  => PowerShell assumes UTF-8 as the encoding, converts the read binary values of the file into the corresponding UTF-8 characters (the characters ĀČ as proper UTF-8-coded characters), but in the end the converted characters' binary representation is interpreted using the local ANSI code page.
  The result is Ä€ÄŒ, as the characters ĀČ are 2-byte-encoded in UTF-8.
As a consequence you should take care that all your environments (also your GUI editor) and PowerShell's defaults are set to the same code page.
Regarding this
> PowerShell is now cross-platform, via its PowerShell Core edition, whose encoding - sensibly - defaults to BOM-less UTF-8, in line with Unix-like platforms.
(citation from link above)
I would suggest migrating everything to UTF-8, so `-encoding Default` becomes the same as `-encoding UTF8`.
But be sure to do brief testing of your stored file/directory names and content, as currently they are all written using your local ANSI code page.
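For the script file itself, re-saving it as UTF-8 (with BOM, so Windows PowerShell 5.1 recognizes it unambiguously) could look roughly like this; the path is only a placeholder, and the `Default`/`UTF8` encoding names assume Windows PowerShell 5.1:

```powershell
# Hypothetical script path; adjust to your file.
$script = 'C:\Examples\MyScript.ps1'

# Read with the current ANSI code page ("Default" in Windows PowerShell 5.1)
# and write back as UTF-8 with BOM. The parentheses force the read to finish first.
(Get-Content -LiteralPath $script -Raw -Encoding Default) |
    Set-Content -LiteralPath $script -Encoding UTF8 -NoNewline
```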
In the meantime you have to tell PowerShell, via `-encoding Default`, not to assume your cmdlet is stored using UTF-8.
> How do I use this encoding for other functions like Copy-Item?
By using `mycmdlet.ps1 -encoding Default` you tell PowerShell to read everything with your currently used local ANSI code page, so everything handled by the commands will fit that code page.
When something comes in or leaves the cmdlet processing (because it is read or written), the system's code page (local ANSI) will be used, and there, too, everything should be OK.
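Once the encodings line up (the script file is stored in the encoding PowerShell actually reads it with), the copy from the question should work with the non-ASCII path unchanged; a small sketch based on the original code, with `-LiteralPath` added to avoid wildcard interpretation:

```powershell
$TestPath        = "C:\Examples\Folder_1ĀČ\"
$ExampleFileName = "Test.txt"

# Join-Path builds the source path; -LiteralPath treats it verbatim (no wildcard expansion)
Copy-Item -LiteralPath (Join-Path $PSScriptRoot $ExampleFileName) -Destination $TestPath -Force
```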