从底部向顶部筛选表格以仅移除一个列中的重复项。

huangapple go评论67阅读模式
英文:

Filter a table to remove duplicates from one column only, bottom to top

问题

你可以尝试以下公式来实现你的需求:

=FILTER(A2:C5, COUNTIF(B2:B5, B2:B5) > 1)

这个公式将会过滤掉列B中的重复值,只保留最后一个重复的行。

英文:

I have some data that looks like this:

Column A Column B Column C
a z ba
b c ba
d w vo
g z ba

There are duplicate values in column B and C, but I only want to remove duplicates from col B. The result should look like this, keeping the last row that contains a duplicate in column B:

Column A Column B Column C
b c ba
d w vo
g z ba

I need to use =filter and not the Remove Duplicates tool. So far, my formulas only either return col B or display an error. I've tried:

=unique(B2:B5)

=filter(A2:C5,unique(B2:B5)=B2:B5)

答案1

得分: 7

Use XMATCH with a fourth parameter of -1:

=LET(
ζ, A2:C5,
ξ, INDEX(ζ, , 2),
FILTER(ζ, XMATCH(ξ, ξ, , -1) = SEQUENCE(ROWS(ζ)))
)

英文:

Use XMATCH with a fourth parameter of -1:

=LET(
    ζ, A2:C5,
    ξ, INDEX(ζ, , 2),
    FILTER(ζ, XMATCH(ξ, ξ, , -1) = SEQUENCE(ROWS(ζ)))
)

答案2

得分: 4

Formula in E1:

=CHOOSEROWS(A1:C4,SORT(XMATCH(UNIQUE(B1:B4),B1:B4,,-1)))

英文:

Alternatively, you could try:

从底部向顶部筛选表格以仅移除一个列中的重复项。

Formula in E1:

=CHOOSEROWS(A1:C4,SORT(XMATCH(UNIQUE(B1:B4),B1:B4,,-1)))

答案3

得分: 3

以下是翻译好的部分:

可能有更简单的方法来实现它:

=LET(in, A1:C4, seq,SEQUENCE(ROWS(in)), B,INDEX(in,,2),
 cnt, MAP(B,seq,LAMBDA(x,s, ROWS(FILTER(B, (seq <= s) * (B=x))))),
 idx,UNIQUE(MAP(B,LAMBDA(x, FILTER(seq, (B=x) * (cnt=MAX(FILTER(cnt, B=x))))))),
 CHOOSEROWS(in,SORT(idx)))

注意:添加SORT确保输出中的预期排序。感谢@JvdV提供的解决方案。

这是输出结果:
从底部向顶部筛选表格以仅移除一个列中的重复项。

名称cnt通过MAP计算每个B值的总重复值,包括当前和之前的值。名称idx找到具有给定B值的最大计数发生的索引位置。最后,我们通过CHOOSEROWS选择了已识别的索引位置的行。

英文:

Probably there are simpler ways of achieving it:

=LET(in, A1:C4, seq,SEQUENCE(ROWS(in)), B,INDEX(in,,2),
 cnt, MAP(B,seq,LAMBDA(x,s, ROWS(FILTER(B, (seq <= s) * (B=x))))),
 idx,UNIQUE(MAP(B,LAMBDA(x, FILTER(seq, (B=x) * (cnt=MAX(FILTER(cnt, B=x))))))),
 CHOOSEROWS(in,SORT(idx)))

Note: Adding SORT ensures the expected ordering in the output. Credit to @JvdV from his solution.

Here is the output:
从底部向顶部筛选表格以仅移除一个列中的重复项。

The name cnt, counts total current and previous repeated values for each value of B via MAP. The name idx, finds the index position where the max counts happens for a given B value. Finally, we select the rows via CHOOSEROWS for the index positions identified.

huangapple
  • 本文由 发表于 2023年6月29日 07:14:55
  • 转载请务必保留本文链接:https://go.coder-hub.com/76577220.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定