How to drop rows based on the value of the index

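Question

I want to drop rows based on their index values, but without listing the labels one by one. I would like to know whether there is a way to avoid listing the years manually.

Specifically, I want to drop the rows whose index is below 2000. Is there a way to do this with a single expression? I'm thinking of something like `drop(label=[df.index<2000])`.

Obviously that code is incorrect, but I hope it gives an idea of what I want to happen.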

Answer 1

Score: 0

Here is one way to do it:

    import numpy as np
    import pandas as pd

    # Set the random seed for reproducibility
    np.random.seed(42)

    # Generate a random DataFrame
    index_values = np.arange(1000, 3001)  # Index values from 1000 to 3000 inclusive
    data = np.random.randn(len(index_values), 3)  # Random data
    columns = ['A', 'B', 'C']  # Column names
    df = pd.DataFrame(data, index=index_values, columns=columns)

    # Drop rows where the index is below 2000
    df_filtered = df.drop(df[df.index < 2000].index)

    # Print the resulting DataFrame
    print(df_filtered)

Before filtering:

                 A         B         C
    1000  0.496714 -0.138264  0.647689
    1001  1.523030 -0.234153 -0.234137
    1002  1.579213  0.767435 -0.469474
    1003  0.542560 -0.463418 -0.465730
    1004  0.241962 -1.913280 -1.724918
    ...        ...       ...       ...
    2996  0.434941 -0.393987  0.537768
    2997  0.306389 -0.998307  0.518793
    2998  0.863528  0.171469  1.152648
    2999 -1.217404  0.467950 -1.170281
    3000 -1.114081 -0.630931 -0.942060

After filtering:

                 A         B         C
    2000 -1.907808 -0.860385 -0.413606
    2001  1.887688  0.556553 -1.335482
    2002  0.486036 -1.547304  1.082691
    2003 -0.471125 -0.093636  1.325797
    2004 -1.287164 -1.397118 -0.583599
    ...        ...       ...       ...
    2996  0.434941 -0.393987  0.537768
    2997  0.306389 -0.998307  0.518793
    2998  0.863528  0.171469  1.152648
    2999 -1.217404  0.467950 -1.170281
    3000 -1.114081 -0.630931 -0.942060
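
As a possible simplification of the `drop` call above, the same rows can usually be kept directly with a boolean mask or a label slice. The sketch below is not from the original answers: the `years` index and the `value` column are made-up example data, and the `.loc` slice assumes the index is sorted in ascending order.

```python
import numpy as np
import pandas as pd

# Hypothetical example data: an integer "year" index from 1995 to 2004.
years = np.arange(1995, 2005)
df = pd.DataFrame({"value": np.arange(len(years))}, index=years)

# Keep rows whose index is 2000 or later
# (equivalent to dropping every row with index < 2000).
kept_by_mask = df[df.index >= 2000]

# Label-based slice; assumes the index is sorted in ascending order.
kept_by_slice = df.loc[2000:]

# Drop-based form, as used in the answer above, for comparison.
kept_by_drop = df.drop(df.index[df.index < 2000])

print(kept_by_mask.index.tolist())
```

All three results contain the same rows. The mask form skips the intermediate DataFrame that `df[df.index < 2000].index` builds, which may make it easier to read.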

Answer 2

Score: 0

To select all rows whose index is greater than 2000, use the mask `df.index > 2000`. To include 2000 itself, use `df.index >= 2000`. Indexing the DataFrame with that mask returns only the matching rows, so every row with a smaller index is left out; the original DataFrame itself is not modified. To see the difference, store the result as a copy and compare it with the original data.

    import pandas as pd

    df = pd.DataFrame({'a': [0, 1, 2, 3, 4]}, index=[1998.0, 1999, 2000, 2001, 2002])
    dropped_df = df[df.index > 2000].copy()

    >>> dropped_df
            a
    2001.0  3
    2002.0  4

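
Because the question asks to drop rows *below* 2000, the choice between `>` and `>=` decides whether the row labelled 2000 survives. The following minimal sketch reuses the small DataFrame from this answer; the variable names `strictly_after` and `from_2000_on` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"a": [0, 1, 2, 3, 4]}, index=[1998.0, 1999, 2000, 2001, 2002])

# Strict comparison: the row labelled 2000 is excluded.
strictly_after = df[df.index > 2000]

# Inclusive comparison: the row labelled 2000 is kept,
# which matches "drop everything below 2000".
from_2000_on = df[df.index >= 2000]

# Boolean indexing returned new objects; df still has all five rows.
print(len(df), len(strictly_after), len(from_2000_on))  # prints: 5 2 3
```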

Answer 3

Score: 0

You can try boolean indexing:

    df = df.drop(df[df.index < 2000].index)
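
The question mentions years, so the index might hold dates rather than plain integers. That is an assumption, since the question does not show the data. If it does, comparing a `DatetimeIndex` with 2000 would not work directly, but the same pattern can be applied to its `.year` attribute. A minimal sketch with made-up dates and a hypothetical `value` column:

```python
import pandas as pd

# Hypothetical data: one row per year-end date, 1997 through 2002.
dates = pd.to_datetime([
    "1997-12-31", "1998-12-31", "1999-12-31",
    "2000-12-31", "2001-12-31", "2002-12-31",
])
df = pd.DataFrame({"value": range(len(dates))}, index=dates)

# DatetimeIndex exposes .year, so the boolean pattern from the answer
# above applies to the year component of each label.
df = df.drop(df[df.index.year < 2000].index)

print(df.index.year.tolist())  # prints: [2000, 2001, 2002]
```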

huangapple
  • Published on May 13, 2023 at 16:22:11
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76241768.html