Spark KMeans produces deterministic results and not random

Question

I am running Spark KMeans and would like each run to use a different random seed so that the results differ every time, but that is not what happens. This is the code I am using:

KMeans kmeans = new KMeans().setK(4).setInitMode("random");
KMeansModel model = kmeans.fit(ds);
Dataset<Row> predictions = model.transform(ds);

I always get the same score and the same clusters. Am I missing something in the code?

Answer 1

Score: 0

I think you're missing the random seed. When no seed is set, Spark's KMeans falls back to a fixed default seed, so even with setInitMode("random") every run on the same data draws the same initial centers. Passing a seed that changes between runs, such as the current time, gives different results each time:

// Use the current time as the seed so that each run differs
long seed = System.currentTimeMillis();

// Create the KMeans instance and set the random seed
KMeans kmeans = new KMeans().setK(4).setInitMode("random").setSeed(seed);
KMeansModel model = kmeans.fit(ds);
Dataset<Row> predictions = model.transform(ds);
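
To see why the seed matters, here is a minimal sketch in plain Java with no Spark dependency. It is not Spark's actual initialization code: the class `SeedDemo` and the method `pickInitialCenters` are hypothetical names made up for this illustration, and the sampling scheme is a simplification. It only shows the general principle that a seeded generator picks the same "random" starting points whenever the seed is the same.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper class for illustration only; not part of Spark.
public class SeedDemo {

    // Picks k distinct point indices to act as initial centers.
    // The choice is driven entirely by the seed. Requires k <= numPoints.
    static int[] pickInitialCenters(int numPoints, int k, long seed) {
        Random rng = new Random(seed);
        return rng.ints(0, numPoints).distinct().limit(k).toArray();
    }

    public static void main(String[] args) {
        // Same seed twice: both calls select exactly the same indices.
        int[] first = pickInitialCenters(100, 4, 42L);
        int[] second = pickInitialCenters(100, 4, 42L);
        System.out.println(Arrays.equals(first, second)); // prints true

        // A time-based seed changes between runs, so the selection usually does too.
        int[] varying = pickInitialCenters(100, 4, System.currentTimeMillis());
        System.out.println(Arrays.toString(varying));
    }
}
```

The same trade-off applies to the Spark code above: a fixed seed is useful when runs need to be reproducible, while a time-based seed is one simple way to get variation between runs.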

Posted by huangapple on May 15, 2023, 15:06:35
Source: https://go.coder-hub.com/76251623.html