Remove duplicate coordinates from the X and Y columns
Question
Based on the data below, how can I remove the rows with duplicate X and Y coordinates? In the example below, you will notice that the X coordinate -1.52 appears twice, but those rows are not duplicates because their corresponding Y coordinates are different.
I don't know if it matters, but please note that the original dataset has more than 2 decimal places for the X and Y values.
Sample data:
structure(list(id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), X = c(-1.01,
-1.11, -1.11, -2.13, -2.13, -1.52, -1.52, -1.98, -3.69, -4.79),
Y = c(2.11, 3.33, 3.33, 6.66, 6.66, 7.77, 8.88, 9.99, 1.11,
6.68)), class = "data.frame", row.names = c(NA, -10L))
Desired data:
id X Y
1 -1.01 2.11
2 -1.11 3.33
4 -2.13 6.66
6 -1.52 7.77
7 -1.52 8.88
8 -1.98 9.99
9 -3.69 1.11
10 -4.79 6.68
Answer 1
Score: 1
Use duplicated, where df1 is the sample data and df1[-1] drops the id column so that only X and Y are compared:
subset(df1, !duplicated(df1[-1]))
Output:
id X Y
1 1 -1.01 2.11
2 2 -1.11 3.33
4 4 -2.13 6.66
6 6 -1.52 7.77
7 7 -1.52 8.88
8 8 -1.98 9.99
9 9 -3.69 1.11
10 10 -4.79 6.68
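The df1[-1] form relies on id being the first column. If the column order might change, selecting the coordinate columns by name is a possible alternative. The following is a minimal sketch that rebuilds the sample data under the name df1 (the name is taken from the answer, and the variable result is only illustrative):

```r
# Sample data from the question, stored as df1
df1 <- structure(list(id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                      X = c(-1.01, -1.11, -1.11, -2.13, -2.13,
                            -1.52, -1.52, -1.98, -3.69, -4.79),
                      Y = c(2.11, 3.33, 3.33, 6.66, 6.66,
                            7.77, 8.88, 9.99, 1.11, 6.68)),
                 class = "data.frame", row.names = c(NA, -10L))

# duplicated() flags a row only if the same (X, Y) pair appeared earlier,
# so the two rows with X = -1.52 are both kept (their Y values differ)
result <- df1[!duplicated(df1[c("X", "Y")]), ]

print(result$id)  # ids kept: 1 2 4 6 7 8 9 10
```

Selecting by name gives the same rows as subset(df1, !duplicated(df1[-1])) on this data, but does not depend on where id sits in the data frame.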
Or with distinct from dplyr (.keep_all = TRUE retains the id column):
library(dplyr)
df1 %>%
  distinct(X, Y, .keep_all = TRUE)
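On the point about extra decimal places: both duplicated and distinct compare values exactly, so coordinates that differ only in later decimals are treated as different points. If that is what you want, nothing needs to change. If instead points should count as duplicates once rounded to 2 decimals, one option is to deduplicate on a rounded key. The sketch below uses made-up data (df2 and its values are hypothetical, not from the question) and assumes 2 decimals is the intended precision:

```r
# Hypothetical data: rows 1 and 2 differ only in the 4th decimal place,
# rows 3 and 4 are exact duplicates
df2 <- data.frame(
  id = 1:4,
  X  = c(-1.1101, -1.1102, -2.13, -2.13),
  Y  = c(3.3301, 3.3302, 6.66, 6.66)
)

# Exact comparison: only row 4 is dropped
exact <- df2[!duplicated(df2[c("X", "Y")]), ]

# Comparison on coordinates rounded to 2 decimals (assumed precision):
# the key is used only for matching, the original values are kept
key <- data.frame(X = round(df2$X, 2), Y = round(df2$Y, 2))
rounded <- df2[!duplicated(key), ]

print(exact$id)    # 1 2 3
print(rounded$id)  # 1 3
```

Because the rounded key is a separate data frame, the rows that survive still carry their full-precision X and Y values.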