Question

I have 2 astronomical catalogs containing galaxies with their respective sky coordinates (ra, dec). I handle the catalogs as data frames. The catalogs come from different observational surveys, and some galaxies appear in both. I want to cross-match these galaxies and put them in a new catalog. How can I do this with Python? I thought there should be an easy way with numpy, pandas, astropy or another package, but I couldn't find a solution. Thanks.

Answer 1

Score: 0

After a lot of research, the easiest way I have found is to use a package called astroML (a tutorial is available). I have used it in the notebooks cross_math_data_and_colour_cuts_.ipynb and PS_data_cleaning_and_processing.ipynb.

import numpy as np
import pandas as pd
from astroML.crossmatch import crossmatch_angular
# if you are using Google Colab, first run the line "!pip install astroml"

df_1 = pd.read_csv('catalog_1.csv')
df_2 = pd.read_csv('catalog_2.csv')

# crossmatch catalogs
max_radius = 1. / 3600  # 1 arcsec, expressed in degrees
# note that for the below to work, the first 2 columns of the catalogs should be ra, dec (in degrees)
# also, I pass the longer of the 2 catalogs as df_1; otherwise I ran into index errors
dist, ind = crossmatch_angular(df_1.iloc[:, :2].to_numpy(), df_2.iloc[:, :2].to_numpy(), max_radius)
match = ~np.isinf(dist)  # True for every row of df_1 that has a counterpart in df_2
# THE DESIRED SOLUTION IS THEN:
df_crossed = df_1[match]


# ALTERNATIVELY:
# ind contains, for each row of df_1, the index of the matched galaxy in the second catalog;
# when there is no match, the ind value is the length of the second catalog
# so if you need values from the second catalog, look them up through ind:
df_1['new_var'] = [df_2['old_var'].iloc[i] if i < len(df_2) else -999 for i in ind]
# that way, whenever you have a match, 'new_var' will contain the correct value from 'old_var',
# and whenever you have a mismatch, it will contain -999 as a flag
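If you want the new catalog to carry the columns of both surveys side by side, the matching and merging steps can be sketched with numpy and pandas alone. This is only a minimal illustration, not astroML's implementation: the helper names (angular_separation_deg, crossmatch_brute_force), the column names ra, dec and mag, and the toy coordinates are all assumptions made for the example. It compares every pair of galaxies directly, so it is only practical for small catalogs, and it assumes the second catalog is not empty. It returns dist and ind using the same convention as above (inf and the length of the second catalog for unmatched rows).

```python
import numpy as np
import pandas as pd


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(np.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = np.sin((dec2 - dec1) / 2.0)
    sin_dra = np.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + np.cos(dec1) * np.cos(dec2) * sin_dra ** 2
    return np.degrees(2.0 * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0))))


def crossmatch_brute_force(df_a, df_b, max_radius_deg):
    """For each row of df_a, find the nearest row of df_b within max_radius_deg.

    Returns (dist, ind), both of length len(df_a). Rows without a match get
    dist = inf and ind = len(df_b). Assumes columns named 'ra' and 'dec'.
    """
    ra_a = df_a["ra"].to_numpy()[:, None]
    dec_a = df_a["dec"].to_numpy()[:, None]
    ra_b = df_b["ra"].to_numpy()[None, :]
    dec_b = df_b["dec"].to_numpy()[None, :]
    sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b)  # shape (N_a, N_b)
    ind = sep.argmin(axis=1)                    # nearest df_b row for each df_a row
    dist = sep[np.arange(len(df_a)), ind]       # separation to that nearest row
    unmatched = dist > max_radius_deg
    dist = np.where(unmatched, np.inf, dist)
    ind = np.where(unmatched, len(df_b), ind)
    return dist, ind


# Toy catalogs (made-up values); two galaxies are shared between them.
cat_a = pd.DataFrame({"ra": [10.0, 150.0, 200.0],
                      "dec": [-5.0, 2.0, 45.0],
                      "mag": [18.1, 19.4, 20.2]})
cat_b = pd.DataFrame({"ra": [200.0001, 10.0001, 300.0, 359.0],
                      "dec": [45.0, -5.0001, 0.0, 10.0],
                      "mag": [20.3, 18.0, 21.0, 17.5]})

dist, ind = crossmatch_brute_force(cat_a, cat_b, 1.0 / 3600)  # 1 arcsec
match = np.isfinite(dist)

# Put the matched rows of both catalogs side by side in a new catalog.
matched_a = cat_a[match].reset_index(drop=True).add_suffix("_a")
matched_b = cat_b.iloc[ind[match]].reset_index(drop=True).add_suffix("_b")
combined = pd.concat([matched_a, matched_b], axis=1)
combined["sep_arcsec"] = dist[match] * 3600
print(combined)
```

The pairwise comparison needs memory proportional to the product of the two catalog lengths, so for real survey catalogs a tree-based matcher such as crossmatch_angular is the better choice. The merging step with ind[match] works the same way with its output.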
