Remove float duplicates from a list of tuples created by zip

Question


I create a list of tuples by zipping three lists together:

    XYZip = list(zip(XaData, Y1aData, Y2aData))

The result is a list of data points like this:

    [
        (0.001625625, 4.782947316198166, -0.011032947316198166),
        (-2.5e-06, 4.783447358402665, 0.020216552641597337),
        (0.0008137499999999999, 4.782997384780477, -0.017282997384780476),
        (0.00081, 4.783247405882726, 0.020216752594117274),
        (0.001625625, 4.782066023993667, -0.011032066023993668),
        (0.00324625, 4.780809700135795, 0.03271919029986421),
        ...,
        (19.4121325, 4.653511649011105, 1.1703464883509889)
    ]

I need to get rid of the tuples where the XaData value is a duplicate, like 0.001625625 above. The whole tuple (0.001625625, 4.782066023993667, -0.011032066023993668) needs to go. Order doesn't matter; I can sort in a second step. I tried set() to no avail.

The data will be fed to scipy.interpolate.CubicSpline, where x needs to be strictly increasing, so duplicates are not accepted!
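The strictly increasing requirement can be checked before the data reaches the spline. The following is a minimal sketch; the helper name `is_strictly_increasing` is hypothetical and not part of scipy, and the sample values are taken from the data above.

```python
def is_strictly_increasing(values):
    """Return True if every value is strictly greater than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


# Sorted x values that still contain a duplicate fail the check.
x_with_duplicate = sorted([0.001625625, -2.5e-06, 0.00081, 0.001625625])
# The same values with the duplicate removed pass it.
x_unique = sorted({0.001625625, -2.5e-06, 0.00081})

print(is_strictly_increasing(x_with_duplicate))  # False
print(is_strictly_increasing(x_unique))          # True
```

Sorting alone is not enough: equal neighbours make the sequence non-decreasing rather than strictly increasing, which is why the duplicates have to go first.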

I tried...

    XYZip = list(zip(XaData, Y1aData, Y2aData))
    XYUnique = set(XYZip)
    XYSorted = sorted(XYUnique)
    XaData, Y1aData, Y2aData = zip(*XYSorted)

...which obviously does not remove the tuples with a duplicate XaData value.

This is what I need in the first step:

    [
        (0.001625625, 4.782947316198166, -0.011032947316198166),
        (-2.5e-06, 4.783447358402665, 0.020216552641597337),
        (0.0008137499999999999, 4.782997384780477, -0.017282997384780476),
        (0.00081, 4.783247405882726, 0.020216752594117274),
        (0.00324625, 4.780809700135795, 0.03271919029986421),
        ...,
        (19.4121325, 4.653511649011105, 1.1703464883509889)
    ]

Answer 1

Score: 1


Why a set of tuples does not give the right answer in this case

A set compares whole tuples, so two tuples that share only their first element count as different and are both kept:

    (0.001625625, 4.782947316198166, -0.011032947316198166)
          |               |                    |
        same          not same             not same
          |               |                    |
    (0.001625625, 4.782066023993667, -0.011032066023993668)

    # so
    (0.001625625, 4.782947316198166, -0.011032947316198166)
    # is not equal to
    (0.001625625, 4.782066023993667, -0.011032066023993668)
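The comparison above can be reproduced directly. This is a small illustrative sketch using the two tuples from the question; the variable names `a` and `b` are just placeholders.

```python
a = (0.001625625, 4.782947316198166, -0.011032947316198166)
b = (0.001625625, 4.782066023993667, -0.011032066023993668)

# Tuples are equal only if every element is equal.
print(a == b)             # False
print(a[0] == b[0])       # True: only the x values match

# So a set of whole tuples keeps both entries...
print(len({a, b}))        # 2

# ...while a set of the x values alone collapses them into one.
print(len({a[0], b[0]}))  # 1
```

Deduplication therefore has to be keyed on the x value only, not on the whole tuple.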

It is easy and straightforward if you use pandas instead.

Example:

    import pandas as pd

    x = 1, 1, 2, 3, 4, 5
    y = 1, 2, 3, 4, 5, 6
    z = 1, 3, 5, 7, 9, 11
    pd.DataFrame(zip(x, y, z), columns=["x", "y", "z"]).drop_duplicates(subset="x", keep="last")

Step 1
Create the DataFrame

    df = pd.DataFrame(zip(x, y, z), columns=["x", "y", "z"])

       x  y   z
    0  1  1   1
    1  1  2   3
    2  2  3   5
    3  3  4   7
    4  4  5   9
    5  5  6  11

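Step 2
Drop the duplicates (see the pandas drop_duplicates reference)

    df = df.drop_duplicates(subset="x", keep="last")

       x  y   z
    1  1  2   3
    2  2  3   5
    3  3  4   7
    4  4  5   9
    5  5  6  11

The row (1, 1, 1) is dropped because keep="last" keeps only the last row for each duplicated x value. Use keep="first" if you want to keep the first occurrence instead, as in the expected output in the question.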

Applied to your code:

    import pandas as pd

    XYZip = list(zip(XaData, Y1aData, Y2aData))
    # create the data frame
    df = pd.DataFrame(XYZip, columns=["XaData", "Y1aData", "Y2aData"])
    # remove rows with a duplicate XaData value
    df = df.drop_duplicates(subset="XaData", keep="last")
    # if you want to convert back to a list of tuples
    result = [tuple(i) for i in df.values]
    # with the small x, y, z example above, result holds the rows
    # (1, 2, 3), (2, 3, 5), (3, 4, 7), (4, 5, 9), (5, 6, 11)

Alternatively, use a dictionary of tuples instead:

    temp = {i_x: (i_y, i_z) for i_x, i_y, i_z in zip(x, y, z)}
    [(i,) + temp[i] for i in temp]

Step 1
Convert x, y, z to a dictionary (keyed on x, because the duplicates to remove are in x)

    temp = {i_x: (i_y, i_z) for i_x, i_y, i_z in zip(x, y, z)}

Step 2
Convert back to a list of tuples

    [(i,) + temp[i] for i in temp]

Result

    [(1, 2, 3), (2, 3, 5), (3, 4, 7), (4, 5, 9), (5, 6, 11)]
    # (1, 1, 1) is dropped because it shares its first element with (1, 2, 3);
    # the later entry overwrites the earlier one in the dictionary.
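The dictionary comprehension above keeps the last row for each x. If the first row should survive instead, as in the expected output in the question, a variant along these lines may help. It is a sketch only: the function name `dedupe_keep_first` is hypothetical, and it assumes x values are duplicates only when they compare exactly equal.

```python
def dedupe_keep_first(xs, y1s, y2s):
    """Drop rows whose x value was already seen, keeping the first one."""
    first_seen = {}
    for x, y1, y2 in zip(xs, y1s, y2s):
        # setdefault stores the row only if x is not a key yet.
        first_seen.setdefault(x, (x, y1, y2))
    return list(first_seen.values())


x = 1, 1, 2, 3, 4, 5
y = 1, 2, 3, 4, 5, 6
z = 1, 3, 5, 7, 9, 11

rows = dedupe_keep_first(x, y, z)
print(rows)  # [(1, 1, 1), (2, 3, 5), (3, 4, 7), (4, 5, 9), (5, 6, 11)]

# Sort by x, then split back into three sequences.
xs, ys, zs = zip(*sorted(rows, key=lambda row: row[0]))
```

Because dictionaries preserve insertion order in Python 3.7 and later, the rows come back in the order their x values first appeared; the final sort then puts them in increasing x order.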

Answer 2

Score: 0


I solved it by checking whether the value is already in the list, like this:

    if x not in x_list:
        x_list.append(x)
        y_list.append(y)
        z_list.append(z)

That solved it. All constructs with set() failed, and the scipy spline interpolation threw an error about x-values not being strictly increasing.
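The snippet above omits the surrounding loop. A complete version might look like the sketch below, which is an assumption about how the pieces fit together rather than the answerer's actual code. The function name `unique_by_x` is hypothetical, and a set is used for the membership test because `x not in x_list` scans the whole list on every iteration.

```python
def unique_by_x(xs, ys, zs):
    """Keep only the first (x, y, z) row for each distinct x value."""
    seen = set()  # x values already emitted
    x_list, y_list, z_list = [], [], []
    for x, y, z in zip(xs, ys, zs):
        if x in seen:
            continue  # duplicate x: skip the whole row
        seen.add(x)
        x_list.append(x)
        y_list.append(y)
        z_list.append(z)
    return x_list, y_list, z_list


x_list, y_list, z_list = unique_by_x(
    [0.001625625, -2.5e-06, 0.001625625],
    [4.782947316198166, 4.783447358402665, 4.782066023993667],
    [-0.011032947316198166, 0.020216552641597337, -0.011032066023993668],
)
print(x_list)  # [0.001625625, -2.5e-06]
```

The lists come back deduplicated but in their original order, so they still need to be sorted by x before being passed to the spline.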

huangapple
  • Published on April 11, 2023 at 10:53:19
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75982083.html