Is there a Python library that can produce exactly the same output as R KMeans?

Question

I need to run K-means clustering on my data. I have done this in both R and Python, using the sklearn and SciPy libraries on the Python side. However, the clustering results differ between the two languages: Python seems to generate an outlier that is not present in R.

I have made sure that both use the same input data and parameters (e.g., number of clusters, initialization method). Even so, I cannot reproduce the cluster centers that R generates. I have also tried the k-means++ initialization method in Python, but the issue persists.

I would appreciate any insights or suggestions on how to resolve this discrepancy and get consistent clustering results between R and Python. Is there a specific consideration or parameter setting that I might be missing?

Answer 1

Score: 1

I am not a machine learning expert, so take this with a grain of salt, but have you considered that the results may differ because K-means is not deterministic? Its outcome depends on the initial centers, which are usually chosen at random.

If I were you, I would run the algorithm with a lower tolerance and a higher number of iterations in both languages, and check whether the results still differ.
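To make that check concrete, here is a minimal sketch on the Python side, assuming scikit-learn's `KMeans`. It removes the randomness by passing explicit starting centers, tightens `tol`, raises `max_iter`, and sorts the resulting centers so that two runs can be compared even if the cluster labels are permuted. The helper names `fit_kmeans` and `sorted_centers` and the synthetic data are illustrative assumptions, not part of the original question. One more thing that may be worth checking: R's `kmeans` defaults to the Hartigan-Wong algorithm with random starting rows, whereas scikit-learn runs a Lloyd-style algorithm with k-means++ initialization by default, so matching results likely requires passing the same `centers` matrix and `algorithm = "Lloyd"` in R as well.

```python
import numpy as np
from sklearn.cluster import KMeans


def fit_kmeans(X, init_centers, tol=1e-10, max_iter=1000):
    """Fit K-means from explicit starting centers (hypothetical helper).

    Passing an array as `init` with n_init=1 removes the random
    initialization, so repeated runs start from the same state.
    """
    init_centers = np.asarray(init_centers, dtype=float)
    model = KMeans(
        n_clusters=init_centers.shape[0],
        init=init_centers,
        n_init=1,           # a single run from the given centers
        tol=tol,            # lower tolerance than the default
        max_iter=max_iter,  # more iterations than the default
    )
    model.fit(X)
    return model


def sorted_centers(centers):
    """Sort centers row-wise (by first coordinate, then second, ...).

    Cluster numbering is arbitrary, so centers should be put into a
    canonical order before two results are compared.
    """
    centers = np.asarray(centers)
    order = np.lexsort(centers.T[::-1])
    return centers[order]


# Synthetic stand-in for the real data: three well-separated 2-D blobs.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=(0, 0), scale=0.3, size=(50, 2)),
    rng.normal(loc=(5, 5), scale=0.3, size=(50, 2)),
    rng.normal(loc=(0, 5), scale=0.3, size=(50, 2)),
])

# Use the same rows as starting centers that would be passed to R.
init = X[[0, 50, 100]]
model = fit_kmeans(X, init)
print(sorted_centers(model.cluster_centers_))
```

The sorted centers printed here can then be compared against the R output, for example with `np.allclose` after exporting the R centers to a file. If the centers still differ with identical starting points and a Lloyd-type algorithm on both sides, the cause is more likely in the data preparation (scaling, row order, missing values) than in the randomness of K-means.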
