Pandas MultiIndex Lookup By Equality and Set Membership


Question

Given a pandas Series or DataFrame with a MultiIndex:

import pandas as pd

first_key = ['a', 'b', 'c']
second_key = [1, 2, 3]

m_index = pd.MultiIndex.from_product([first_key, second_key],
                                     names=['first_key', 'second_key'])

series_with_index = pd.Series(0.0, index=m_index)

How can the MultiIndex be indexed to select rows by equality on the first level and by set membership (isin) on the second level?

For example, how can all values where the first level equals 'a' and the second level is in the set {2, 3, 4} be set to 1.0?

Thank you in advance for your consideration and response.

Answer 1

Score: 1

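Try this:

You can use index.get_level_values() to find all the entries in the first level that equal 'a'.

index.isin() has a level parameter, so you can pass your set to it and test only the second level.

Lastly, change the values in the series to 1 where both conditions are True. Note that mask() returns a new Series and leaves series_with_index itself unchanged.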

m1 = series_with_index.index.get_level_values(0) == 'a'
m2 = series_with_index.index.isin({2, 3, 4}, level=1)

series_with_index.mask(m1 & m2, 1)

Output:

first_key  second_key
a          1             0.0
           2             1.0
           3             1.0
b          1             0.0
           2             0.0
           3             0.0
c          1             0.0
           2             0.0
           3             0.0
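
Because mask() returns a copy, the original Series still holds only zeros after the call above. The following is a minimal sketch, assuming the setup from the question, that contrasts the returned copy with an in-place assignment through .loc. The names cond, masked, and total_before are illustrative and not part of the original answer.

```python
import pandas as pd

m_index = pd.MultiIndex.from_product(
    [['a', 'b', 'c'], [1, 2, 3]], names=['first_key', 'second_key']
)
series_with_index = pd.Series(0.0, index=m_index)

# Boolean mask: first level equals 'a' AND second level is in {2, 3, 4}.
first_is_a = series_with_index.index.get_level_values(0) == 'a'
second_in_set = series_with_index.index.isin({2, 3, 4}, level=1)
cond = first_is_a & second_in_set

# mask() returns a modified copy; the original is not touched.
masked = series_with_index.mask(cond, 1.0)
total_before = series_with_index.sum()  # original still sums to zero here

# Assigning through .loc changes the original Series in place.
series_with_index.loc[cond] = 1.0
```

If the goal is to update the existing object, the .loc assignment (or reassigning the result of mask() back to the same name) is the form to use.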

Answer 2

Score: 1

Combine one condition per level into a single boolean mask:

s = pd.Series(0.0, index=m_index)
s[(s.index.get_level_values(0) == 'a') & s.index.get_level_values(1).isin({2, 3, 4})] = 1.0

first_key  second_key
a          1             0.0
           2             1.0
           3             1.0
b          1             0.0
           2             0.0
           3             0.0
c          1             0.0
           2             0.0
           3             0.0

Answer 3

Score: 1

You can use get_level_values and perform boolean indexing:

m1 = series_with_index.index.get_level_values('first_key') == 'a'
m2 = series_with_index.index.get_level_values('second_key').isin([2, 3, 4])
series_with_index[m1 & m2] = 1

Updated Series:

first_key  second_key
a          1             0.0
           2             1.0
           3             1.0
b          1             0.0
           2             0.0
           3             0.0
c          1             0.0
           2             0.0
           3             0.0
dtype: float64
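
The question also mentions DataFrames. The same two masks carry over unchanged; only the assignment needs a column label. This is a sketch under the assumption of a single column, here given the hypothetical name 'value', which does not appear in the original question.

```python
import pandas as pd

m_index = pd.MultiIndex.from_product(
    [['a', 'b', 'c'], [1, 2, 3]], names=['first_key', 'second_key']
)
# 'value' is an illustrative column name, not from the question.
df = pd.DataFrame({'value': [0.0] * len(m_index)}, index=m_index)

# Same per-level conditions as for the Series, looked up by level name.
m1 = df.index.get_level_values('first_key') == 'a'
m2 = df.index.get_level_values('second_key').isin([2, 3, 4])

# Row selection by boolean mask, column selection by label.
df.loc[m1 & m2, 'value'] = 1.0
```

Looking levels up by name rather than by position keeps the code working if the level order of the index ever changes.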

Answer 4

Score: 0

query can refer to named index levels, but it is only available on DataFrame, so convert the Series first:

my_set = {2, 3, 4}
series_with_index.to_frame()\
    .query('first_key=="a" & second_key in @my_set')\
    .iloc[:, 0]
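
The query above only selects the matching rows; it does not change series_with_index. One possible way to finish the job, sketched here under the question's setup, is to reuse the index of the selected rows for the assignment. The names wanted and selected are illustrative, and a list is used in place of the set purely as a conservative choice.

```python
import pandas as pd

m_index = pd.MultiIndex.from_product(
    [['a', 'b', 'c'], [1, 2, 3]], names=['first_key', 'second_key']
)
series_with_index = pd.Series(0.0, index=m_index)
wanted = [2, 3, 4]

# Select the matching rows via query on the named index levels.
selected = (
    series_with_index.to_frame()
    .query('first_key == "a" and second_key in @wanted')
    .iloc[:, 0]
)

# Use the selected rows' index to assign back into the original Series.
series_with_index[series_with_index.index.isin(selected.index)] = 1.0
```

This takes a detour through a temporary DataFrame, so the boolean-mask approaches from the other answers are more direct when only an assignment is needed.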

huangapple
  • Published on February 13, 2023, 23:16:07
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75437773.html