How to drop rows based on index values

Question

I want to drop rows based on their index values, but without listing those values one by one. Is there a way to do this without writing out the years manually?

I want to drop the rows whose index is below 2000. Is there a way to do this with a single expression? I'm thinking of something like `drop(label=[df.index<2000]`.

Obviously that code is incorrect, but I hope it gives an idea of what I want to happen.

Answer 1

Score: 0

Here is one way to do it:

import numpy as np
import pandas as pd

# Set the random seed for reproducibility
np.random.seed(42)

# Generate a random DataFrame
index_values = np.arange(1000, 3001)  # Index values between 1000 and 3000
data = np.random.randn(len(index_values), 3)  # Random data
columns = ['A', 'B', 'C']  # Column names

df = pd.DataFrame(data, index=index_values, columns=columns)

# Drop rows whose index is below 2000
df_filtered = df.drop(df[df.index < 2000].index)

# Print the resulting DataFrame
print(df_filtered)

Before filtering:

             A         B         C
1000  0.496714 -0.138264  0.647689
1001  1.523030 -0.234153 -0.234137
1002  1.579213  0.767435 -0.469474
1003  0.542560 -0.463418 -0.465730
1004  0.241962 -1.913280 -1.724918
...        ...       ...       ...
2996  0.434941 -0.393987  0.537768
2997  0.306389 -0.998307  0.518793
2998  0.863528  0.171469  1.152648
2999 -1.217404  0.467950 -1.170281
3000 -1.114081 -0.630931 -0.942060

After filtering:

             A         B         C
2000 -1.907808 -0.860385 -0.413606
2001  1.887688  0.556553 -1.335482
2002  0.486036 -1.547304  1.082691
2003 -0.471125 -0.093636  1.325797
2004 -1.287164 -1.397118 -0.583599
...        ...       ...       ...
2996  0.434941 -0.393987  0.537768
2997  0.306389 -0.998307  0.518793
2998  0.863528  0.171469  1.152648
2999 -1.217404  0.467950 -1.170281
3000 -1.114081 -0.630931 -0.942060
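
As a shorter variant of the same idea, the boolean mask can also be used to keep rows directly instead of passing labels to `drop`. The sketch below is only an illustration on a small made-up DataFrame (the names `via_drop` and `via_mask` are not from the answer above); it checks that both forms give the same result for this data.

```python
import pandas as pd

# Small illustrative frame; the index plays the role of the years
df = pd.DataFrame({"A": [10, 20, 30, 40]}, index=[1998, 1999, 2000, 2001])

# Approach from the answer: collect the labels below 2000 and drop them
via_drop = df.drop(df[df.index < 2000].index)

# Equivalent here: keep the rows whose index is 2000 or above
via_mask = df[df.index >= 2000]

print(via_mask)
print(via_drop.equals(via_mask))
```

Keeping the rows you want is usually easier to read than dropping the ones you don't, but both return a new DataFrame and leave `df` itself untouched.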

Answer 2

Score: 0

To select all rows whose index is greater than 2000, use df[df.index > 2000]. To keep 2000 as well, which is what dropping everything below 2000 means, use df[df.index >= 2000]. Boolean indexing returns a new DataFrame that contains only the matching rows, so every row with a smaller index is left out while the original DataFrame stays unchanged. To see the difference, store the result as a copy and compare it with the original data.

import pandas as pd

df = pd.DataFrame({'a': [0, 1, 2, 3, 4]}, index=[1998.0, 1999, 2000, 2001, 2002])
# Keep only the rows whose index is greater than 2000
dropped_df = df[df.index > 2000].copy()
print(dropped_df)

Output:

        a
2001.0  3
2002.0  4

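
To make the difference between the two comparisons concrete, here is a minimal sketch on the same kind of data, using integer index labels for readability. The variable names `strictly_after` and `from_2000_on` are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [0, 1, 2, 3, 4]}, index=[1998, 1999, 2000, 2001, 2002])

strictly_after = df[df.index > 2000].copy()  # 2000 itself is dropped
from_2000_on = df[df.index >= 2000].copy()   # 2000 is kept

print(strictly_after.index.tolist())  # [2001, 2002]
print(from_2000_on.index.tolist())    # [2000, 2001, 2002]
print(len(df))                        # 5, the original frame is unchanged
```

For the question as asked (drop everything below 2000), the `>=` form is the one that matches.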

Answer 3

Score: 0

You can try boolean indexing:

df = df.drop(df[df.index < 2000].index)
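
Two possible variations on this, sketched under assumptions: the labels to drop can be taken from the index itself, without building an intermediate DataFrame first, and if the index happened to hold dates rather than integer years (the question does not say), the mask could be built from the `.year` attribute. The frames and the names `labels_to_drop` and `dated` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [0, 1, 2, 3]}, index=[1998, 1999, 2000, 2001])

# Filter the index itself to get the labels to drop,
# without first building an intermediate DataFrame
labels_to_drop = df.index[df.index < 2000]
df = df.drop(labels_to_drop)
print(df.index.tolist())  # [2000, 2001]

# Assumption: the index is a DatetimeIndex instead of integer years;
# the same kind of mask can then be built from the .year attribute
dated = pd.DataFrame(
    {"a": [0, 1, 2]},
    index=pd.to_datetime(["1999-06-01", "2000-01-15", "2001-03-10"]),
)
dated = dated[dated.index.year >= 2000]
print(dated.index.year.tolist())  # [2000, 2001]
```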

huangapple
  • Posted on May 13, 2023, 16:22:11
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76241768.html