Performing k-means cluster analysis: how can I reorganize the data into individual clusters?

Question


I am performing a k-means cluster analysis on a data frame with 62 variables (Tapping number 1-62) and 75000 columns. How can I organize the data frame into individual clusters?

I used fviz_cluster to visualize the clusters:

library(factoextra)  # provides fviz_cluster
r_fit <- kmeans(pressure_rotate, 5, nstart = 25)
fviz_cluster(r_fit, data = pressure_rotate)

I was able to get a table showing which variable belongs to which cluster with r_fit$cluster, but how can I reorganize the data so that I can see what each cluster contains? Something along the lines of:

cluster 1: Tapping number 3, Tapping number 5, Tapping number 12, ...
cluster 2: Tapping number 7, Tapping number 9, ...
etc.

Answer 1

Score: 1


You have 62 rows (observations) and 75000 columns (variables), is that correct? That is not the same as 62 variables. It is also not clear whether "Tapping number" is a column in your data or just the row number. Here is an example using the iris data included in R:

library(factoextra)  # provides fviz_cluster
data(iris)  # 150 rows, 4 numeric variables, one species variable
iris.km <- kmeans(iris[, -5], 3, nstart = 25)  # Exclude species variable
fviz_cluster(iris.km, iris[, -5])       # Make a plot showing the clusters
split(rownames(iris), iris.km$cluster)  # Show cluster membership by row name
# $`1`
#  [1] "51"  "52"  "54"  "55"  "56"  "57"  "58"  "59"  "60"  "61"  "62"  "63"  "64"  "65"  "66"  "67"  "68"  "69"  "70"  "71"  "72"  "73"  "74"  "75"  "76"  "77"
# [27] "79"  "80"  "81"  "82"  "83"  "84"  "85"  "86"  "87"  "88"  "89"  "90"  "91"  "92"  "93"  "94"  "95"  "96"  "97"  "98"  "99"  "100" "102" "107" "114" "115"
# [53] "120" "122" "124" "127" "128" "134" "139" "143" "147" "150"
#
# $`2`
#  [1] "53"  "78"  "101" "103" "104" "105" "106" "108" "109" "110" "111" "112" "113" "116" "117" "118" "119" "121" "123" "125" "126" "129" "130" "131" "132" "133"
# [27] "135" "136" "137" "138" "140" "141" "142" "144" "145" "146" "148" "149"
#
# $`3`
#  [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "20" "21" "22" "23" "24" "25" "26" "27" "28" "29" "30" "31" "32"
# [33] "33" "34" "35" "36" "37" "38" "39" "40" "41" "42" "43" "44" "45" "46" "47" "48" "49" "50"
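
The same idea can be carried over to data shaped like the question's. What follows is only a sketch under assumptions: `pressure_demo` is a made-up stand-in for `pressure_rotate` (62 rows, but only 20 random columns instead of 75000), and the row names are assumed to be "Tapping number 1" through "Tapping number 62". The names `pressure_demo`, `demo_fit`, `members`, `cluster_lines` and `by_cluster` are illustrative and do not come from the original post.

```r
set.seed(42)  # kmeans uses random starts; fix the seed so reruns agree

# Hypothetical stand-in for pressure_rotate: 62 rows, 20 columns of noise
pressure_demo <- as.data.frame(matrix(rnorm(62 * 20), nrow = 62))
rownames(pressure_demo) <- paste("Tapping number", 1:62)  # assumed labels

demo_fit <- kmeans(pressure_demo, centers = 5, nstart = 25)

# Named list with one character vector of row names per cluster
members <- split(rownames(pressure_demo), demo_fit$cluster)

# One text line per cluster, in the "cluster 1: ..." layout from the question
cluster_lines <- sprintf("cluster %s: %s", names(members),
                         sapply(members, paste, collapse = ", "))
cat(cluster_lines, sep = "\n")

# List of data frames, one per cluster, each holding the original rows
by_cluster <- split(pressure_demo, demo_fit$cluster)
```

Keep in mind that kmeans clusters rows. If the tapping numbers turn out to be the columns of the data frame, transpose it with t() before calling kmeans so that each tapping number becomes a row. The cluster labels (1, 2, ...) are arbitrary and can change between runs unless the seed is fixed.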

huangapple
  • Posted on February 23, 2023, 22:12:38
  • Please keep this link when reposting: https://go.coder-hub.com/75545972.html