Spark KMeans produces deterministic results instead of random ones
Question
I am running Spark KMeans and would like each run to use a different random seed so that the results differ every time, but that is not what happens. This is the code I am using:
KMeans kmeans = new KMeans().setK(4).setInitMode("random");
KMeansModel model = kmeans.fit(ds);
Dataset<Row> predictions = model.transform(ds);
I always get the same score and the same clusters. Am I missing something in the code?
Answer 1
Score: 0
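I think you're missing the random seed. As far as I know, Spark ML's KMeans falls back to a fixed default seed when none is set, so every run starts from the same initial centers. Passing a seed that changes per run should give different results: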
// Set the random seed
long seed = System.currentTimeMillis();
// Create the KMeans instance and set the random seed
KMeans kmeans = new KMeans().setK(4).setInitMode("random").setSeed(seed);
KMeansModel model = kmeans.fit(ds);
Dataset<Row> predictions = model.transform(ds);
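To see why the seed matters, here is a small standalone sketch in plain Java (no Spark needed). It is not Spark's actual initialization code. `SeedDemo` and `pickInitialCenters` are hypothetical names made up for this illustration, and the helper only mimics the idea of a "random" init mode: choosing k starting rows with a seeded generator.

```java
import java.util.Arrays;
import java.util.Random;

public class SeedDemo {
    // Hypothetical helper (not part of Spark): picks k distinct row indices
    // out of n to serve as initial cluster centers, driven by the given seed.
    public static int[] pickInitialCenters(int n, int k, long seed) {
        Random rng = new Random(seed);
        int[] indices = new int[n];
        for (int i = 0; i < n; i++) {
            indices[i] = i;
        }
        // Partial Fisher-Yates shuffle: after k steps the first k slots
        // hold a random sample without replacement.
        for (int i = 0; i < k; i++) {
            int j = i + rng.nextInt(n - i);
            int tmp = indices[i];
            indices[i] = indices[j];
            indices[j] = tmp;
        }
        return Arrays.copyOf(indices, k);
    }

    public static void main(String[] args) {
        // Same seed twice: the same starting points, so the same clustering.
        System.out.println(Arrays.toString(pickInitialCenters(100, 4, 42L)));
        System.out.println(Arrays.toString(pickInitialCenters(100, 4, 42L)));
        // A seed that changes per run: the starting points can differ.
        System.out.println(Arrays.toString(pickInitialCenters(100, 4, System.nanoTime())));
    }
}
```

The first two printed lines always match each other, which is the behavior described in the question. Only the third line can change from run to run, and that is the effect of calling setSeed with a per-run value.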