Python: Can I use if else function when creating a new dataset?
Question
I'm a beginner in Python.
I need to find out which function I should use to create a new dataset (df2) when the original dataset (df) is empty. For example:
df2 = df['count_days'].dt.days >= 1
When there are no days to count in my dataset (empty data), I get an error.
Can I use an if/else statement to resolve this issue? For example:
if df['count_days'].dt.days >= 1:
    df2
else:
    print("No Data")
BTW, I will sometimes have an empty dataset because no records are available.
Thank you!
Answer 1
Score: 2
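Yes, you can use if and else statements, checking df.empty first:
import pandas as pd
if df.empty:
    # nothing to compute when the frame has no rows
    print("No Data")
else:
    # boolean Series: True where count_days is at least one day
    df2 = df['count_days'].dt.days >= 1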
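Note that the `if` form from the question would not work as written: a comparison on a Series yields a Series of booleans, and pandas generally raises a `ValueError` when such a Series is used directly as an `if` condition. The check above can be wrapped in a small function. This is only a sketch: the helper name `days_mask` and the convention of returning `None` for an empty frame are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

def days_mask(df, column="count_days", min_days=1):
    """Hypothetical helper: boolean Series marking rows with at least
    `min_days` days, or None when the frame has no rows."""
    if df.empty:
        return None
    return df[column].dt.days >= min_days

# Sample data (made up for this sketch): 0, 1 and 3 days.
df = pd.DataFrame({"count_days": pd.to_timedelta([0, 1, 3], unit="D")})
mask = days_mask(df)
df2 = df[mask]  # keep only the rows that satisfy the condition

# An empty frame is handled without touching the .dt accessor.
empty_result = days_mask(pd.DataFrame({"count_days": []}))
```

Using the mask to index `df` gives the filtered rows, which may be closer to what "a new dataset" means in the question than the boolean Series itself.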
Answer 2
Score: 1
Instead of working around it, you could create a dataframe which doesn't produce these errors:
import pandas as pd
df = pd.DataFrame({'count_days': pd.Series([], dtype='timedelta64[ns]')})
df['count_days'].dt.days >= 1 # no error
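The likely reason this works is the column dtype: an empty column built without a dtype is not datetime-like, so the `.dt` accessor is unavailable. The sketch below contrasts the two cases and, as an assumption about how the empty frame arrives in practice (for example from a query that returned no rows), converts an existing column with `pd.to_timedelta`. The variable names are illustrative only.

```python
import pandas as pd

# Empty column with no explicit dtype: not datetime-like.
untyped = pd.DataFrame({"count_days": []})
# Empty column declared as timedelta, as in the answer above.
typed = pd.DataFrame({"count_days": pd.Series([], dtype="timedelta64[ns]")})

try:
    untyped["count_days"].dt.days
    untyped_ok = True
except AttributeError:
    # pandas refuses the .dt accessor on non-datetime-like values
    untyped_ok = False

# Works on the typed frame and yields an empty boolean Series.
result = typed["count_days"].dt.days >= 1

# Converting an existing column also makes the accessor available.
converted = pd.to_timedelta(untyped["count_days"])
converted_result = converted.dt.days >= 1
```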
Answer 3
Score: 0
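If you need to fill in data that does not exist, [NaN](https://en.wikipedia.org/wiki/NaN) (`np.nan`, "not a number") and `pd.NaT` ("not a time") are the usual placeholders. NaN is defined by the [IEEE 754 specification](https://en.wikipedia.org/wiki/IEEE_754); NaT is the analogous marker for datetime-like values.
```python
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'a':[1,2],'b':[3,4]})
>>> df
   a  b
0  1  3
1  2  4
>>> df["c"] = np.nan  # NumPy
>>> df
   a  b   c
0  1  3 NaN
1  2  4 NaN
>>> df = pd.merge(df, pd.DataFrame({'a':[5],'b':[6],'c':[7]}), how='outer')  # keep the merged result
>>> df
   a  b    c
0  1  3  NaN
1  2  4  NaN
2  5  6  7.0
>>> df[df['c'].isna()]  # rows where c is NaN
   a  b   c
0  1  3 NaN
1  2  4 NaN
>>> df[~df['c'].isna()]  # rows where c is not NaN
   a  b    c
2  5  6  7.0
```
In Python, if you need an object to exist but hold nothing yet, `None` is frequently the right choice:
```python
df = None
...  # intermediate logic
if df is None:  # special case
    ...
```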
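For the timedelta column in the question, the missing-value marker would be `pd.NaT` rather than `np.nan`. The following is a minimal sketch, with made-up sample values, of how a missing entry behaves when the days are extracted and compared.

```python
import pandas as pd

# One missing entry (None) is stored as NaT in a timedelta column.
df = pd.DataFrame({"count_days": pd.to_timedelta(["2 days", None, "0 days"])})

missing = df["count_days"].isna()   # True only for the NaT row
days = df["count_days"].dt.days     # NaT becomes NaN in the extracted days
mask = days >= 1                    # a comparison against NaN is False
```

Rows with a missing duration therefore drop out of the `>= 1` filter without raising an error.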