Is there already something in Python or NumPy to determine a number's format?

Question


I need to determine whether a string is a plain int, a plain float, a float using `e`, or not parsable as a number. Here's what I came up with, but it feels like something that probably already exists, perhaps in NumPy. I did a brief scan of the libraries and Google and didn't see anything. Is this already a thing and I'm just not seeing it?

    PLAIN_INT, PLAIN_FLOAT, E_FLOAT, STRING = range(4)

    # should be just an optional '-' followed by digits
    sample_plain_ints = ['1', '0', '-5', '333333333']

    # need to contain a dot
    plain_floats = ['1.0', '-5.0', '-33.212', '0.0', '-1.', '-3.']

    # do not need to contain a dot
    e_floats = ['1.3e5', '-1.2e5', '0.0e0', '5e-3', '3e23', '3E5', '-3E-12']

    # other
    strings = ['aether', '1ee3', 'buzz', 'eeep', '121212beep']


    def determine_str_type(item):
        try:
            float(item)
            try:
                int(item)
                return PLAIN_INT
            except ValueError:
                return E_FLOAT if 'E' in item.upper() else PLAIN_FLOAT
        except ValueError:
            return STRING


    assert all([determine_str_type(item) == PLAIN_INT for item in sample_plain_ints])
    assert all([determine_str_type(item) == PLAIN_FLOAT for item in plain_floats])
    assert all([determine_str_type(item) == E_FLOAT for item in e_floats])
    assert all([determine_str_type(item) == STRING for item in strings])
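
One thing that may be worth checking with the `float()`/`int()` approach: Python's built-in parsers accept more than the sample formats listed, including surrounding whitespace, a leading `+`, underscores between digits, and the special values `nan` and `inf`. The sketch below shows how such inputs would be classified. The helper name `classify` and the `NAMES` table are only illustrative; `classify` repeats the same logic so that the block runs on its own.

```python
PLAIN_INT, PLAIN_FLOAT, E_FLOAT, STRING = range(4)
NAMES = {
    PLAIN_INT: 'PLAIN_INT',
    PLAIN_FLOAT: 'PLAIN_FLOAT',
    E_FLOAT: 'E_FLOAT',
    STRING: 'STRING',
}


def classify(item):
    """Illustrative copy of the float()/int() classification logic."""
    try:
        float(item)
    except ValueError:
        return STRING
    try:
        int(item)
        return PLAIN_INT
    except ValueError:
        return E_FLOAT if 'E' in item.upper() else PLAIN_FLOAT


# Inputs that float()/int() accept even though they are not in the samples.
edge_cases = [' 12 ', '+5', '1_000', 'nan', 'inf']

for s in edge_cases:
    print(f'{s!r:<10}: {NAMES[classify(s)]}')
```

If inputs like these can occur and should be rejected, the try/except approach would need an extra check in front of it, or a stricter pattern-based test.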

Answer 1

Score: 5

I would use a regex for that (`(-?\d+)(\.\d*)?([eE]-?\d+)?$`), capture the different parts, and decide depending on which groups matched:

```python
import re

lst = ['1', '0', '-5', '333333333', '1.0', '-5.0', '-33.212', '0.0', '-1.', '-3.',
       '1.3e5', '-1.2e5', '0.0e0', '5e-3', '3e23', '3E5', '-3E-12',
       'aether', '1ee3', 'buzz', 'eeep', '121212beep']

# Groups: integer part, optional fractional part, optional exponent.
pat = re.compile(r'(-?\d+)(\.\d*)?([eE]-?\d+)?$')


def determine_str_type(s):
    # The match statement requires Python 3.10 or later.
    match m.groups() if (m := pat.match(s)) else None:
        case None:
            return 'STRING'
        case (_, None, None):
            return 'PLAIN_INT'
        case (_, _, None):
            return 'PLAIN_FLOAT'
        case (_, _, _):
            return 'E_FLOAT'


for s in lst:
    print(f'{s: <11}: {determine_str_type(s)}')
```

Output:

    1          : PLAIN_INT
    0          : PLAIN_INT
    -5         : PLAIN_INT
    333333333  : PLAIN_INT
    1.0        : PLAIN_FLOAT
    -5.0       : PLAIN_FLOAT
    -33.212    : PLAIN_FLOAT
    0.0        : PLAIN_FLOAT
    -1.        : PLAIN_FLOAT
    -3.        : PLAIN_FLOAT
    1.3e5      : E_FLOAT
    -1.2e5     : E_FLOAT
    0.0e0      : E_FLOAT
    5e-3       : E_FLOAT
    3e23       : E_FLOAT
    3E5        : E_FLOAT
    -3E-12     : E_FLOAT
    aether     : STRING
    1ee3       : STRING
    buzz       : STRING
    eeep       : STRING
    121212beep : STRING

[regex demo](https://regex101.com/r/J41gmM/1)
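
On Python versions before 3.10, where the `match` statement is not available, the same idea could be sketched with named groups and plain `if` checks. This is a variant, not the answer's code: the names `NUMBER_RE` and `classify_number` are illustrative, and `re.fullmatch` is swapped in for `match` plus a trailing `$`.

```python
import re

# Same grammar as the answer's pattern, with named groups:
# integer part, optional fractional part, optional exponent.
NUMBER_RE = re.compile(r'(?P<whole>-?\d+)(?P<frac>\.\d*)?(?P<exp>[eE]-?\d+)?')


def classify_number(s):
    """Return a label describing the textual format of s."""
    m = NUMBER_RE.fullmatch(s)
    if m is None:
        return 'STRING'
    if m.group('exp') is not None:
        return 'E_FLOAT'
    if m.group('frac') is not None:
        return 'PLAIN_FLOAT'
    return 'PLAIN_INT'


for s in ['-5', '-1.', '5e-3', '1ee3', '.5', '1e+5']:
    print(f'{s: <6}: {classify_number(s)}')
```

Like the original pattern, this rejects inputs such as `.5` or `1e+5`, which `float()` would accept, so the pattern may need loosening if those forms should count as numbers.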


huangapple
  • Posted on July 27, 2023, 22:42:30
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76780886.html