How to create a pandas Series with a dtype which is a subclass of float?
Question
I would like to create a pandas Series whose values are of a type derived from float. However, pandas automatically recasts them to float:
import pandas as pd

class PValue(float):
    def __str__(self):
        if self < 1e-4:
            return '<1e-4'
        return super().__str__()

s = pd.Series([0.1, 0.12e-5])
s = s.map(PValue)
print(s.apply(type))  # -> returns `float`, but I want to get `PValue`
Answer 1
Score: 1
I think you'd need to use an extension type to get it to work how you want.
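Short of writing a full extension type, one workaround that seems to do the job is object dtype: pandas then stores the Python objects themselves instead of packing them into a float64 array. This is only a sketch and not something from the question; the name s_obj is mine, object columns give up fast vectorized float operations, and the Series repr may still not go through __str__.

```python
import pandas as pd

class PValue(float):
    def __str__(self):
        if self < 1e-4:
            return '<1e-4'
        return super().__str__()

# Assumption: dtype=object asks pandas to keep each element as the object passed in
s_obj = pd.Series([PValue(0.1), PValue(0.12e-5)], dtype=object)

print(s_obj.apply(type))   # element types as stored in the Series
print(str(s_obj.iloc[1]))  # calls PValue.__str__ on the stored element
```

Note that the values are built as PValue up front; calling s.map(PValue) on a float64 Series converts the result back to float, as the question shows.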
That said, a class with only one method probably shouldn't be a class. Check out "Stop Writing Classes" by Jack Diederich from PyCon 2012. You can do the same thing with a formatter function:
def pvalue(x: float) -> str:
    if x < 1e-4:
        return '<1e-4'
    return str(x)
Then, for example:
s = pd.Series([0.1, 0.12e-5])
with pd.option_context('display.float_format', pvalue):
    print(s)

0   0.1
1 <1e-4
dtype: float64
Or, for use in a DataFrame, if you don't want to format every column with pvalue, use a style:
pd.DataFrame({'p': s}).style.format({'p': pvalue})
This is shown in Jupyter as an HTML table like this:
      p
0   0.1
1 <1e-4
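Outside Jupyter, where the Styler's HTML output isn't much use, a per-column formatter can also be passed to DataFrame.to_string. A small sketch, with a made-up second column x added only to show that p alone is affected:

```python
import pandas as pd

def pvalue(x: float) -> str:
    if x < 1e-4:
        return '<1e-4'
    return str(x)

# 'x' is a hypothetical extra column, included only to show it is left untouched
df = pd.DataFrame({'p': [0.1, 0.12e-5], 'x': [1.5, 2.5]})

# formatters maps column names to one-argument functions that return strings
text = df.to_string(formatters={'p': pvalue})
print(text)
```

The underlying data stay float64 either way; only the rendered text changes.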