How to drop rows based on value of index
Question
I want to drop rows based on their index values, but without listing the labels one by one. Is there a way to avoid writing out the years manually?
I want to drop the rows whose index is below 2000. Can this be done with a single expression? I'm thinking of something like `drop(label=[df.index<2000]`.
Obviously that code is incorrect, but I hope it gives an idea of what I want to happen.
Answer 1
Score: 0
Here is one way to do it:
import numpy as np
import pandas as pd
# Set the random seed for reproducibility
np.random.seed(42)
# Generate a random DataFrame
index_values = np.arange(1000, 3001) # Index values between 1000 and 3000
data = np.random.randn(len(index_values), 3) # Random data
columns = ['A', 'B', 'C'] # Column names
df = pd.DataFrame(data, index=index_values, columns=columns)
# Drop rows where index is below 2000
df_filtered = df.drop(df[df.index < 2000].index)
# Print the resulting DataFrame
print(df_filtered)
Before filtering:
             A         B         C
1000  0.496714 -0.138264  0.647689
1001  1.523030 -0.234153 -0.234137
1002  1.579213  0.767435 -0.469474
1003  0.542560 -0.463418 -0.465730
1004  0.241962 -1.913280 -1.724918
...        ...       ...       ...
2996  0.434941 -0.393987  0.537768
2997  0.306389 -0.998307  0.518793
2998  0.863528  0.171469  1.152648
2999 -1.217404  0.467950 -1.170281
3000 -1.114081 -0.630931 -0.942060
After filtering:
             A         B         C
2000 -1.907808 -0.860385 -0.413606
2001  1.887688  0.556553 -1.335482
2002  0.486036 -1.547304  1.082691
2003 -0.471125 -0.093636  1.325797
2004 -1.287164 -1.397118 -0.583599
...        ...       ...       ...
2996  0.434941 -0.393987  0.537768
2997  0.306389 -0.998307  0.518793
2998  0.863528  0.171469  1.152648
2999 -1.217404  0.467950 -1.170281
3000 -1.114081 -0.630931 -0.942060
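A possibly simpler variant, as a rough sketch: instead of looking up the labels to drop, keep the rows you want with a boolean mask. The small frame and the names `years`, `via_drop` and `via_mask` are made up for illustration and are not part of the question's data.

```python
import numpy as np
import pandas as pd

# Hypothetical small frame: one row per year from 1997 to 2003
years = np.arange(1997, 2004)
df = pd.DataFrame({"A": np.arange(len(years))}, index=years)

# Approach used above: collect the labels below 2000, then drop them
via_drop = df.drop(df[df.index < 2000].index)

# Equivalent here: keep only the rows whose index is 2000 or above
via_mask = df[df.index >= 2000]

print(via_mask)
print(via_drop.equals(via_mask))  # both approaches give the same frame
```

On a frame like this one the two results are identical, so the choice is mostly about readability.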
Answer 2
Score: 0
To select all rows whose index is greater than 2000, use the mask `df.index > 2000`. To include 2000 itself, which is what "drop everything below 2000" asks for, use `df.index >= 2000`. Indexing the DataFrame with the mask returns a new DataFrame containing only the matching rows. The original is left unchanged, so you can compare the two, or assign the result back to `df` to replace it.
import pandas as pd
df = pd.DataFrame({'a': [0, 1, 2, 3, 4]}, index=[1998.0, 1999, 2000, 2001, 2002])
dropped_df = df[df.index > 2000].copy()
>>> dropped_df
        a
2001.0  3
2002.0  4
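To make the difference between the two comparisons concrete, here is a small sketch using the same frame. The names `strictly_after` and `from_2000_on` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [0, 1, 2, 3, 4]}, index=[1998.0, 1999, 2000, 2001, 2002])

# '>' leaves out the row labelled 2000
strictly_after = df[df.index > 2000]

# '>=' keeps the row labelled 2000, i.e. drops only what is below 2000
from_2000_on = df[df.index >= 2000]

print(list(strictly_after.index))
print(list(from_2000_on.index))

# Neither selection modified df itself
print(len(df))
```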
Answer 3
Score: 0
You can try boolean indexing:
df = df.drop(df[df.index < 2000].index)
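If the index is sorted, a label slice with `.loc` might also do the job. This is only a sketch under that assumption; the frame and the names `value` and `recent` are invented for the example.

```python
import pandas as pd

# Hypothetical frame with years as the index, deliberately out of order
df = pd.DataFrame({"value": [30, 10, 40, 20]}, index=[2000, 1998, 2001, 1999])

# Label slicing is reliable on a sorted index, so sort first
df = df.sort_index()

# .loc slices by label, and the start label is included
recent = df.loc[2000:]

print(recent)
```

Unlike the boolean mask, this depends on the index being sorted, so the mask is the safer default when you are not sure of the order.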