Dtaidistance SSE and Silhouette score

Question

I'm looking for an easy way to print the SSE (inertia) and the silhouette score of a Dtaidistance (https://dtaidistance.readthedocs.io/en/latest/index.html) k-means model after it has been trained on my data. The tslearn k-means exposes inertia_ and labels_, from which I can retrieve what I need, but I can't find an equivalent in the Dtaidistance library. I'd like to avoid running the training again because I have a huge dataset of time series.
Thank you, everyone.

# k-means with k = 4 (dtaidistance settings)
# Assumption: dtaikm is an alias for dtaidistance's KMeans class
from dtaidistance.clustering.kmeans import KMeans as dtaikm

km0 = dtaikm(
    k=4,
    max_it=5,
    max_dba_it=5,
    thr=0.0001,
    drop_stddev=3,
    initialize_with_kmeanspp=True,
    initialize_sample_size=4,
    show_progress=True,
)

# fit
cluster_idx, performed_it = km0.fit_fast(x_red)

# km0.means[i] now holds centroid i, and
# cluster_idx[i] holds the ids of the rows assigned to cluster i

Answer 1

Score: 1

This is not supported. We (authors here) have added an extra argument to the fit function (monitor_distances) that accepts a function in which you can compute inertia. This is available in the master branch on GitHub (and will be part of the next release).

This allows you to do something like:

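def mymonitor(clusters_distances, clustering_ended):
    # One (cluster, distance) pair per series
    clusters, distances = zip(*clusters_distances)
    # Inertia (SSE), assuming each distance is the distance to the assigned centroid
    inertia = sum(d ** 2 for d in distances)
    print(f"inertia={inertia:.4f} (final assignment: {clustering_ended})")
    return True  # True keeps the clustering going
cluster_idx, performed_it = km0.fit_fast(x_red, monitor_distances=mymonitor)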
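If the model has already been trained and retraining is not an option, both metrics can in principle be recomputed from what the question already has: the centroids in km0.means and the assignments in cluster_idx. The following is only a sketch under stated assumptions, not part of the Dtaidistance API: the helper names (dtw_distance, sse, silhouette) are hypothetical, cluster_idx is assumed to map each cluster index to a collection of row ids, and the small pure-Python DTW is there only to keep the example self-contained. On real data you would likely pass dtaidistance.dtw.distance as the dist argument instead.

```python
import math


def dtw_distance(a, b):
    """Plain DTW with squared point costs; returns the sqrt of the accumulated cost."""
    prev = [math.inf] * (len(b) + 1)
    prev[0] = 0.0
    for x in a:
        curr = [math.inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            curr[j] = (x - y) ** 2 + min(prev[j], prev[j - 1], curr[j - 1])
        prev = curr
    return math.sqrt(prev[len(b)])


def sse(series, means, cluster_idx, dist=dtw_distance):
    """Sum of squared distances from each series to its assigned centroid."""
    return sum(
        dist(series[i], means[c]) ** 2
        for c, members in cluster_idx.items()
        for i in members
    )


def silhouette(series, cluster_idx, dist=dtw_distance):
    """Mean silhouette coefficient; needs at least two non-empty clusters."""
    labels = {i: c for c, members in cluster_idx.items() for i in members}
    ids = sorted(labels)
    # Pairwise distances, computed once per unordered pair (O(n^2) calls).
    pairwise = {(i, j): dist(series[i], series[j]) for i in ids for j in ids if i < j}

    def d(i, j):
        return 0.0 if i == j else pairwise[(min(i, j), max(i, j))]

    scores = []
    for i in ids:
        own = cluster_idx[labels[i]]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        a = sum(d(i, j) for j in own if j != i) / (len(own) - 1)
        b = min(
            sum(d(i, j) for j in members) / len(members)
            for c, members in cluster_idx.items()
            if c != labels[i] and members
        )
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return sum(scores) / len(scores)


# Toy data standing in for x_red, km0.means and cluster_idx.
toy_series = [[0, 0, 0], [0, 0, 1], [5, 5, 5], [5, 5, 6]]
toy_means = [[0, 0, 0], [5, 5, 5]]
toy_clusters = {0: {0, 1}, 1: {2, 3}}

print("SSE:", sse(toy_series, toy_means, toy_clusters))
print("Silhouette:", silhouette(toy_series, toy_clusters))
```

The SSE needs only one distance per series, so it stays cheap. The silhouette needs all pairwise distances, which grows quadratically with the number of series; on a huge dataset it may be more practical to compute it on a random subsample.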

huangapple
  • Published on June 19, 2023, 14:49:34
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76504235.html