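Is there already something in Python or NumPy to determine a number's format?
Question
I need to determine whether a string is a plain int, a plain float, a float using `e`, or not parsable as a number. Here's what I came up with, but this feels like something that probably already exists, perhaps in NumPy. I did a brief scan of the libraries and Google and didn't see anything. Is this already a thing and I'm just not seeing it?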
```python
PLAIN_INT, PLAIN_FLOAT, E_FLOAT, STRING = range(4)

# should be just an optional '-' followed by digits
sample_plain_ints = ['1', '0', '-5', '333333333']
# need to contain a dot
plain_floats = ['1.0', '-5.0', '-33.212', '0.0', '-1.', '-3.']
# do not need to contain a dot
e_floats = ['1.3e5', '-1.2e5', '0.0e0', '5e-3', '3e23', '3E5', '-3E-12']
# other
strings = ['aether', '1ee3', 'buzz', 'eeep', '121212beep']


def determine_str_type(item):
    try:
        float(item)
        try:
            int(item)
            return PLAIN_INT
        except ValueError:
            return E_FLOAT if 'E' in item.upper() else PLAIN_FLOAT
    except ValueError:
        return STRING


assert all([determine_str_type(item) == PLAIN_INT for item in sample_plain_ints])
assert all([determine_str_type(item) == PLAIN_FLOAT for item in plain_floats])
assert all([determine_str_type(item) == E_FLOAT for item in e_floats])
assert all([determine_str_type(item) == STRING for item in strings])
```
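A note on the `float()`/`int()` approach above: both constructors accept more than the sample lists cover, so it may be worth checking a few extra inputs before relying on it. The sketch below repeats the same logic (only flattened slightly) and probes some strings that Python parses as numbers; the probe list is an assumption about what might show up in real data, not something taken from the question.

```python
PLAIN_INT, PLAIN_FLOAT, E_FLOAT, STRING = range(4)


def determine_str_type(item):
    # Same decisions as the function in the question, with the nesting flattened.
    try:
        float(item)
    except ValueError:
        return STRING
    try:
        int(item)
        return PLAIN_INT
    except ValueError:
        return E_FLOAT if 'E' in item.upper() else PLAIN_FLOAT


# Hypothetical probe inputs: all of these are accepted by float().
for s in ['nan', 'inf', '1_000', ' 7 ', '+5']:
    print(repr(s), determine_str_type(s))
```

With this logic, `'nan'` and `'inf'` are reported as `PLAIN_FLOAT`, while `'1_000'`, `' 7 '`, and `'+5'` are reported as `PLAIN_INT`. Whether that is acceptable depends on the data; if not, a stricter pattern-based check is probably the safer route.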
Answer 1
Score: 5
I would use a regex for that (`(-?\d+)(\.\d*)?([eE]-?\d+)?$`), capture the different parts, and decide depending on which groups matched:
```python
import re

lst = ['1', '0', '-5', '333333333', '1.0', '-5.0', '-33.212', '0.0', '-1.', '-3.', '1.3e5', '-1.2e5', '0.0e0', '5e-3', '3e23', '3E5', '-3E-12', 'aether', '1ee3', 'buzz', 'eeep', '121212beep']

# Compile once at module level rather than on every call.
pat = re.compile(r'(-?\d+)(\.\d*)?([eE]-?\d+)?$')


def determine_str_type(s):
    # Structural pattern matching requires Python 3.10+.
    match m.groups() if (m := pat.match(s)) else None:
        case None:
            return 'STRING'
        case (_, None, None):
            return 'PLAIN_INT'
        case (_, _, None):
            return 'PLAIN_FLOAT'
        case (_, _, _):
            return 'E_FLOAT'


for s in lst:
    print(f'{s: <11}: {determine_str_type(s)}')
```
Output:
```
1          : PLAIN_INT
0          : PLAIN_INT
-5         : PLAIN_INT
333333333  : PLAIN_INT
1.0        : PLAIN_FLOAT
-5.0       : PLAIN_FLOAT
-33.212    : PLAIN_FLOAT
0.0        : PLAIN_FLOAT
-1.        : PLAIN_FLOAT
-3.        : PLAIN_FLOAT
1.3e5      : E_FLOAT
-1.2e5     : E_FLOAT
0.0e0      : E_FLOAT
5e-3       : E_FLOAT
3e23       : E_FLOAT
3E5        : E_FLOAT
-3E-12     : E_FLOAT
aether     : STRING
1ee3       : STRING
buzz       : STRING
eeep       : STRING
121212beep : STRING
```
[regex demo](https://regex101.com/r/J41gmM/1)
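The pattern in this answer covers every sample in the question, but it is narrower than what `float()` accepts: it rejects a leading `+`, a signed-plus exponent such as `1e+5`, and a bare fraction such as `.5`. Also, with `re.match`, `$` still matches just before a trailing newline. If those cases matter, a widened variant could look like the sketch below. The names `NUMBER_RE` and `classify` and the group names are hypothetical, and the grammar is an assumption about what should count as a number, not part of the original answer.

```python
import re

# Hypothetical widened pattern: optional sign, then either digits with an
# optional fraction ("12", "12.", "12.5") or a bare fraction (".5"),
# followed by an optional exponent with an optional sign.
NUMBER_RE = re.compile(
    r'[+-]?(?:(?P<int>\d+)(?P<frac>\.\d*)?|(?P<bare>\.\d+))(?P<exp>[eE][+-]?\d+)?'
)


def classify(s):
    # fullmatch must consume the whole string, so a trailing newline is rejected.
    m = NUMBER_RE.fullmatch(s)
    if m is None:
        return 'STRING'
    if m['exp'] is not None:
        return 'E_FLOAT'
    if m['frac'] is not None or m['bare'] is not None:
        return 'PLAIN_FLOAT'
    return 'PLAIN_INT'
```

Named groups keep the decision readable without relying on tuple positions, and the plain `if` chain runs on Python versions older than 3.10.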