2023年6月15日 10:35:34go评论64阅读模式

英文:

How to output multibyte unicode character?

问题

I want to output superscript characters in a scientific report. For example, I want to display 'A/cm²' (with '²' as a superscript character). I found a Unicode character map, and in this case, the superscript character '²' is represented as a 3-byte character.

Environment

Windows 10 64-bit
Python v3.10.10

What I did

I read the Python documentation for the chr function, and I tried the following command:

chr(0xe28282)

However, Python gave me the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: chr() arg not in range(0x110000)

How can I output a 3-byte character in Python?

英文:

What I want to do

I want to output superscript character in making a scientific report. It is like 'A/cm2'('2' in this case). I managed to find a character map in unicode shown below. As for this case superscript charcter of '2' is shown to be a 3-byte character.

Environment

Windows10 64bit
Python v3.10.10

What I did

I read the reference of Python (description for chr function), I tried following command.

chr(0xe28282)

Python showed me following error.

Traceback (most recent call last)
File &quot;&lt;stdin&gt;&quot; line 1 in &lt;module&gt;
ValueError: chr() arg not in range(0x110000)

How to describe to output a 3-byte character in Python ?

答案1

得分: 1

Unicode代码点（或字符）在第二行中给出，你可以在Python中使用chr函数来表示它。第一行是UTF-8编码（可以从第二行派生出来）。你可以将它用作二进制字符串，例如b'\xe2\x82\x82'。

但是对于Unicode（而Python中的源默认为UTF-8），你可以直接使用"₂"。这比在表格中来回转换要容易得多。让你的编辑器/界面和Python来为你完成这项工作。

英文:

You are confusing several concepts (and probably the reason was the original site you used).

Unicode codepoint (or character) is given in the second line, and you should use that for chr in Python. The first line is the UTF-8 encoding (which can be derived from the second line). You can use it as binary string e.g. b'\xe2\x82\x82'.

But with Unicode (and sources in Python are by default UTF-8), you can directly use "₂". It is much easier then going back and forth with tables. Let's your editor/GUI and Python do the job for you.

答案2

得分: 0

U+2082是您想要的Unicode代码点。可以使用转义代码，直接粘贴到您的代码中，或者在数字上使用chr()：

print('\u2082')
₂

print('₂')
₂

print(chr(0x2082))
₂

''.join(chr(i) for i in range(0x2070, 0x209d) if i not in (0x2072, 0x2073, 0x208f))
'⁰ⁱ⁴⁵⁶⁷⁸⁹⁺⁻⁼⁽⁾ⁿ₀₁₂₃₄₅₆₇₈₉₊₋₌₍₎ₐₑₒₓₔₕₖₗₘₙₚₛₜ'

从Python 3.3开始，Windows Unicode API用于在cmd.exe窗口中打印，因此终端的编码不重要，但您需要有一个支持该字形的字体；否则，您可能会看到该字体的Unicode REPLACEMENT CHARACTER（�）。在Chrome浏览器的字体上，上面的所有字符在Windows上都正确呈现。

以上代码是在Windows上使用Python 3.11从cmd.exe运行的，join中的最后八个字符在我使用的Consolas字体中呈现为替代字符：

如何输出多字节的Unicode字符？

参考链接：Unicode.org: Superscripts and Subscripts

英文:

U+2082 is the Unicode code point you want. Use an escape code, paste it directly into your code, or use chr() on the number:

&gt;&gt;&gt; print(&#39;\u2082&#39;)
₂
&gt;&gt;&gt; print(&#39;₂&#39;)
₂
&gt;&gt;&gt; print(chr(0x2082))
₂
&gt;&gt;&gt; &#39;&#39;.join(chr(i) for i in range(0x2070, 0x209d) if i not in (0x2072, 0x2073, 0x208f))
&#39;⁰ⁱ⁴⁵⁶⁷⁸⁹⁺⁻⁼⁽⁾ⁿ₀₁₂₃₄₅₆₇₈₉₊₋₌₍₎ₐₑₒₓₔₕₖₗₘₙₚₛₜ&#39;

As of Python 3.3 Windows Unicode APIs are used to print to a cmd.exe window, so the encoding of the terminal won't matter, but you will need a font that supports the glyph; otherwise, you'll likely see the font's Unicode REPLACEMENT CHARACTER (�). All the characters above rendered correctly on Windows in the Chrome browser's font.

Above code was run from cmd.exe on Windows with Python 3.11. The last eight characters from the join were rendered as replacement characters in the Consolas font I was using:

如何输出多字节的Unicode字符？

Ref: Unicode.org: Superscripts and Subscripts

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

如何输出多字节的Unicode字符？

问题

Environment

What I did

What I want to do

Environment

What I did

答案1

答案2

Python逻辑以在笛卡尔坐标系上找到两组点之间的最小总距离。

从 pandas 数据框中提取相关行，当存在重复列数值时。

当我运行我的代码时，没有任何返回值，但没有错误，并且我调用了方法。

通过与另一个DataFrame进行比较来筛选DataFrame

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

发表评论