Remove duplicate coordinates from X and Y columns

Question

Based on the data below, how can I remove the rows with duplicate X and Y coordinates? In the example below, the X coordinate -1.52 appears twice, but those two rows are not duplicates because their corresponding Y coordinates are different.

I don't know if it matters, but please note that the original dataset has more than 2 decimal places for the X and Y values.

Sample data:

  df1 <- structure(list(id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), X = c(-1.01,
  -1.11, -1.11, -2.13, -2.13, -1.52, -1.52, -1.98, -3.69, -4.79),
  Y = c(2.11, 3.33, 3.33, 6.66, 6.66, 7.77, 8.88, 9.99, 1.11,
  6.68)), class = "data.frame", row.names = c(NA, -10L))

Desired data:

  id     X    Y
   1 -1.01 2.11
   2 -1.11 3.33
   4 -2.13 6.66
   6 -1.52 7.77
   7 -1.52 8.88
   8 -1.98 9.99
   9 -3.69 1.11
  10 -4.79 6.68

Answer 1

Score: 1

Use duplicated(). Here df1[-1] drops the first column (id), so only X and Y are compared:

  subset(df1, !duplicated(df1[-1]))

Output:

     id     X    Y
  1   1 -1.01 2.11
  2   2 -1.11 3.33
  4   4 -2.13 6.66
  6   6 -1.52 7.77
  7   7 -1.52 8.88
  8   8 -1.98 9.99
  9   9 -3.69 1.11
  10 10 -4.79 6.68
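
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It assumes the sample data is stored in df1 (the name used in this answer). Selecting the coordinate columns by name with c("X", "Y") instead of df1[-1] is a variation, not part of the original answer; it keeps the result independent of where the id column sits.

```r
# Sample data from the question, assigned to df1 (name taken from the answer)
df1 <- data.frame(
  id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  X = c(-1.01, -1.11, -1.11, -2.13, -2.13, -1.52, -1.52, -1.98, -3.69, -4.79),
  Y = c(2.11, 3.33, 3.33, 6.66, 6.66, 7.77, 8.88, 9.99, 1.11, 6.68)
)

# duplicated() flags every row whose (X, Y) pair already appeared in an earlier row
is_dup <- duplicated(df1[, c("X", "Y")])

# Keep only the first occurrence of each coordinate pair
result <- df1[!is_dup, ]
print(result)
```

Rows 6 and 7 both survive because they share X but not Y; rows 3 and 5 are dropped because both coordinates repeat.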

Or with distinct() from dplyr:

  library(dplyr)
  df1 %>%
    distinct(X, Y, .keep_all = TRUE)
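
On the question's note about having more than 2 decimal places: both duplicated() and distinct() compare values exactly, so two points that differ only in a far decimal place are kept as separate rows. If such near-identical points should count as duplicates, one option is to round before comparing. The following is only a sketch under assumptions: df2, its values, and the precision digits = 2 are hypothetical and not from the question.

```r
# Hypothetical data: rows 1 and 2 differ only in the fifth decimal place of X
df2 <- data.frame(
  id = c(1, 2, 3),
  X = c(-1.11001, -1.11002, -2.13),
  Y = c(3.33, 3.33, 6.66)
)

# Exact comparison: no row is flagged, so all three rows are kept
exact <- df2[!duplicated(df2[, c("X", "Y")]), ]

# Assumed tolerance: treat coordinates as equal when they match to 2 decimals
digits <- 2
rounded_xy <- round(df2[, c("X", "Y")], digits)

# Flag duplicates on the rounded copy, but return the original, unrounded rows
deduped <- df2[!duplicated(rounded_xy), ]
print(deduped)
```

Rounding is a blunt tolerance: two values that sit on either side of a rounding boundary still count as different, so pick the precision based on what the coordinates represent.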

huangapple
  • Posted on February 16, 2023, 10:52:42
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75467375.html