July 27, 2023 22:42:30

Is there already something in Python or NumPy to determine a number's format?

Question

I need to determine whether a string is a plain int, a plain float, a float using `e`, or not parsable as a number. Here's what I came up with, but this feels like something that probably already exists, perhaps in numpy. I did a brief scan of the libraries and Google and didn't see anything. Is this already a thing and I'm just not seeing it?

```python
PLAIN_INT, PLAIN_FLOAT, E_FLOAT, STRING = range(4)

# should be just an optional '-' followed by digits
sample_plain_ints = ['1', '0', '-5', '333333333']
# need to contain a dot
plain_floats = ['1.0', '-5.0', '-33.212', '0.0', '-1.', '-3.']
# do not need to contain a dot
e_floats = ['1.3e5', '-1.2e5', '0.0e0', '5e-3', '3e23', '3E5', '-3E-12']
# other
strings = ['aether', '1ee3', 'buzz', 'eeep', '121212beep']

def determine_str_type(item):
    try:
        float(item)
        try:
            int(item)
            return PLAIN_INT
        except ValueError:
            return E_FLOAT if 'E' in item.upper() else PLAIN_FLOAT
    except ValueError:
        return STRING

assert all([determine_str_type(item) == PLAIN_INT for item in sample_plain_ints])
assert all([determine_str_type(item) == PLAIN_FLOAT for item in plain_floats])
assert all([determine_str_type(item) == E_FLOAT for item in e_floats])
assert all([determine_str_type(item) == STRING for item in strings])
```
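One thing worth checking with a `float()`/`int()` based approach is which inputs those built-ins accept beyond the sample lists. The sketch below is only an illustration: the helper name `accepted_by` and the list of edge cases are assumptions chosen for this example, not part of the original question.

```python
def accepted_by(text):
    """Return (int_ok, float_ok): whether int(text) and float(text) succeed."""
    results = []
    for parser in (int, float):
        try:
            parser(text)
            results.append(True)
        except ValueError:
            results.append(False)
    return tuple(results)

# Hypothetical edge cases: surrounding whitespace, explicit '+', digit
# separators (Python 3.6+), special float values, and a leading dot.
edge_cases = [' 5 ', '+7', '1_000', 'nan', 'inf', '.5']

for text in edge_cases:
    int_ok, float_ok = accepted_by(text)
    print(f'{text!r:>8}  int: {int_ok!s:<5}  float: {float_ok}')
```

With the function from the question, inputs that `int()` accepts (such as `' 5 '` or `'1_000'`) would be classified as `PLAIN_INT`, and inputs that only `float()` accepts (such as `'nan'` or `'inf'`) would be classified as `PLAIN_FLOAT`. Whether that is desirable depends on the data being parsed.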

Answer 1

Score: 5

I would use a regex for that (`(-?\d+)(\.\d*)?([eE]-?\d+)?$`), capture the different parts, and decide depending on which groups matched:
```python
import re

lst = ['1', '0', '-5', '333333333', '1.0', '-5.0', '-33.212', '0.0', '-1.', '-3.', '1.3e5', '-1.2e5', '0.0e0', '5e-3', '3e23', '3E5', '-3E-12', 'aether', '1ee3', 'buzz', 'eeep', '121212beep']

# Groups: integer part, optional fraction, optional exponent.
pat = re.compile(r'(-?\d+)(\.\d*)?([eE]-?\d+)?$')

def determine_str_type(s):
    # The match statement requires Python 3.10+.
    match m.groups() if (m := pat.match(s)) else None:
        case None:
            return 'STRING'
        case (_, None, None):
            return 'PLAIN_INT'
        case (_, _, None):
            return 'PLAIN_FLOAT'
        case (_, _, _):
            return 'E_FLOAT'

for s in lst:
    print(f'{s: <11}: {determine_str_type(s)}')
```

Output:

```
1          : PLAIN_INT
0          : PLAIN_INT
-5         : PLAIN_INT
333333333  : PLAIN_INT
1.0        : PLAIN_FLOAT
-5.0       : PLAIN_FLOAT
-33.212    : PLAIN_FLOAT
0.0        : PLAIN_FLOAT
-1.        : PLAIN_FLOAT
-3.        : PLAIN_FLOAT
1.3e5      : E_FLOAT
-1.2e5     : E_FLOAT
0.0e0      : E_FLOAT
5e-3       : E_FLOAT
3e23       : E_FLOAT
3E5        : E_FLOAT
-3E-12     : E_FLOAT
aether     : STRING
1ee3       : STRING
buzz       : STRING
eeep       : STRING
121212beep : STRING
```

[regex demo](https://regex101.com/r/J41gmM/1)
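For interpreters older than Python 3.10, where the `match` statement is not available, the same regex idea can be written with plain `if` checks. The following is a minimal sketch under some assumptions: the names `NUMBER_RE` and `classify` are made up for this example, it returns the integer constants from the question instead of strings, and it uses `fullmatch()` in place of the trailing `$`.

```python
import re

PLAIN_INT, PLAIN_FLOAT, E_FLOAT, STRING = range(4)

# Same three groups as the pattern in the answer: integer part,
# optional fraction, optional exponent. fullmatch() anchors both ends.
NUMBER_RE = re.compile(r'(-?\d+)(\.\d*)?([eE]-?\d+)?')

def classify(text):
    """Classify text as one of the four constants defined above."""
    m = NUMBER_RE.fullmatch(text)
    if m is None:
        return STRING
    _, fraction, exponent = m.groups()
    if exponent is not None:
        return E_FLOAT
    if fraction is not None:
        return PLAIN_FLOAT
    return PLAIN_INT

# Checked against the sample lists from the question.
assert all(classify(s) == PLAIN_INT for s in ['1', '0', '-5', '333333333'])
assert all(classify(s) == PLAIN_FLOAT for s in ['1.0', '-5.0', '-33.212', '0.0', '-1.', '-3.'])
assert all(classify(s) == E_FLOAT for s in ['1.3e5', '-1.2e5', '0.0e0', '5e-3', '3e23', '3E5', '-3E-12'])
assert all(classify(s) == STRING for s in ['aether', '1ee3', 'buzz', 'eeep', '121212beep'])
```

One small behavioral difference: `$` also matches just before a trailing newline, so `'1\n'` matches the pattern from the answer, while `fullmatch()` rejects it. Like the original pattern, this sketch does not accept a leading `+` or a float without an integer part such as `.5`.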


