February 7, 2023, 04:43:15

Finding shortest distance of every point to a non-straight line in Python

Question

I have created figures similar to this one here:

[Figure: scatter plot of the blue data points with the red Baraffe track overlaid]

My goal here is to take each blue point and calculate the shortest distance it would take to get to any point on the red line. Ideally, this could be used to select the x% closest points or those falling within a certain distance, but the primary issue here is calculating each distance in the first place.

The points were taken from a data file and plotted as such:
data = np.loadtxt('gr.dat')
...
ax.scatter(data[:,0],data[:,1])

whereas the red line is a calculated Baraffe track where all points used to create the line were stored in a dat file and plotted via:

df = pd.read_csv('baraffe.dat', sep=r"\s+", names=['mass', 'age', 'g', 'r', 'i'])
df2 = pd.DataFrame(df, columns=["mass", "age", "g", "r", "i"])
df2['b_color'] = df2['g'] - df2['r']  # g - r color, used as the x coordinate
df2.plot(ax=ax, x='b_color', y='g', color="r")
...

This is my first attempt at using pandas, so I know my code could be optimized and is likely redundant, but it does produce the attached figure.

Essentially, I want to calculate the smallest distance each dot would have to move (in both x and y) to reach any point on the red line.
I tried to mimic the answer in (here), but I am unsure how to apply that function to a DataFrame or a larger array without getting a TypeError. I would greatly appreciate any insight. Thank you!

Answer 1

Score: 1

Use scipy.spatial.KDTree.

Once you have built the KDTree on the points of the Baraffe track, you can use the different methods of the KDTree instance to compute all the quantities that interest you.
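For instance, the two selections mentioned in the question (points within a given distance, and the closest x%) could be sketched roughly as follows. The arrays `track` and `points` are synthetic stand-ins, not the real data, and the threshold values `max_dist` and the 10% cut are arbitrary assumptions for illustration.

```python
import numpy as np
from scipy.spatial import KDTree

# Hypothetical stand-ins for the question's data: `track` plays the role of
# the (b_color, g) points of the Baraffe track, `points` the role of the
# scattered data points.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 3.0, 200)
track = np.column_stack((t, 2.0 * t**2))                      # shape (N, 2)
points = rng.uniform([0.0, 0.0], [3.0, 18.0], size=(500, 2))  # shape (M, 2)

tree = KDTree(track)
# Distance to, and index of, the nearest stored track point for each data point
distances, indices = tree.query(points, k=1)

# Points lying within a fixed distance of the track (threshold is arbitrary)
max_dist = 0.5
within = points[distances <= max_dist]

# The closest 10% of the points, using a percentile of the distances as cutoff
cutoff = np.percentile(distances, 10)
closest = points[distances <= cutoff]
```

With the question's variables, one would presumably pass something like `df2[['b_color', 'g']].to_numpy()` as `track` and `data[:, :2]` as `points`, assuming the first two columns of `gr.dat` are the plotted x and y values. Note that the distance is a plain Euclidean distance in plot coordinates, so the two axes are weighted equally.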

The example below, for simplicity, only shows how to use the query method to build a one-to-one correspondence between each data point and its nearest point on the track.

import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import KDTree

np.random.seed(20230307)

# synthetic "track": a sine curve sampled at 51 points, then rotated
x = np.linspace(0, 10, 51)
y = np.sin(x)*0.7
x, y = +x*0.6+y*0.8, -0.8*x+0.6*y

# synthetic data points scattered near the curve, rotated the same way
xp = np.linspace(1, 9, 21)
yp = -1+np.random.rand(21)*0.4
xp, yp = +xp*0.6+yp*0.8, -0.8*xp+0.6*yp

kdt = KDTree(np.vstack((x, y)).T) # the array that is indexed must be N×2
# distance to, and index of, the nearest track point for each data point
distances, indices = kdt.query(np.vstack((xp, yp)).T, k=1)

fig, ax = plt.subplots()
ax.set_aspect(1)
ax.plot(x, y, color='k', lw=0.8)
ax.scatter(xp, yp, color='r')
# join each data point to its nearest track point
for x0, y0, i in zip(xp, yp, indices):
    ax.plot((x0, x[i]), (y0, y[i]), color='g', lw=0.5)
plt.show()
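One caveat: `query` returns the distance to the nearest stored point of the track, not to the line segments drawn between those points, so it can overestimate the distance when the track is sparsely sampled. If that matters, a possible refinement is to project each point onto every segment of the track. The following is only a sketch of that idea, not part of the original answer; `distance_to_polyline` is a hypothetical helper name, and it builds an M×(N−1) intermediate array, so it is meant for moderately sized inputs.

```python
import numpy as np

def distance_to_polyline(points, track):
    """Shortest distance from each point to a piecewise-linear track.

    points: (M, 2) array; track: (N, 2) array of consecutive vertices.
    """
    a = track[:-1]                                  # segment starts, (N-1, 2)
    ab = track[1:] - a                              # segment vectors, (N-1, 2)
    ap = points[:, None, :] - a[None, :, :]         # (M, N-1, 2)
    seg_len2 = np.einsum('ij,ij->i', ab, ab)        # squared segment lengths
    seg_len2 = np.where(seg_len2 == 0.0, 1.0, seg_len2)  # guard zero-length segments
    # Projection parameter of each point on each segment, clipped to [0, 1]
    # so that the closest point stays on the segment.
    t = np.clip(np.einsum('mnj,nj->mn', ap, ab) / seg_len2, 0.0, 1.0)
    closest = a[None, :, :] + t[..., None] * ab[None, :, :]
    d = np.linalg.norm(points[:, None, :] - closest, axis=2)
    return d.min(axis=1)

# A single horizontal segment from (0, 0) to (10, 0)
track = np.array([[0.0, 0.0], [10.0, 0.0]])
points = np.array([[5.0, 1.0], [-3.0, 4.0], [12.0, 0.0]])
result = distance_to_polyline(points, track)  # distances are 1, 5 and 2
```

In this toy case the first point is at distance 1 from the segment, whereas its nearest vertex is about 5.1 away, which is the kind of gap the vertex-only query would report.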
