Count occurrence of value in repeated measure

# Question

Hi, I have the dataset below:

```R
ID <- c(1,1,1,2,2,3,3,3,4,4,4)
diagnosis <- c("A","A","B","C","C","B","A","A","C","C","B")
df <- data.frame(ID,diagnosis)
```

```
ID diagnosis
1  A
1  A
1  B
2  C
2  C
3  B
3  A
3  A
4  C
4  C
4  B
```

I would like to count how many people had each type of diagnosis. Some people have the same diagnosis multiple times, and I would like each of them to count only once per diagnosis.

e.g. Only two people would have diagnosis "A" (ID 1 and ID 3).

e.g. Only two people would have diagnosis "C" (ID 2 and ID 4).

e.g. Only three people would have diagnosis "B" (ID 1, ID 3 and ID 4).

I'm wondering if there's a way of summarizing the above into a table.

I would appreciate any help. Thanks!


# Answer 1
**Score**: 3

You could `group_by` on diagnosis and `summarise` with `n_distinct` to count the distinct IDs per group, like this:

``` r
library(dplyr)
df %>%
  group_by(diagnosis) %>%
  summarise(n = n_distinct(ID))
#> # A tibble: 3 × 2
#>   diagnosis     n
#>   <chr>     <int>
#> 1 A             2
#> 2 B             3
#> 3 C             2
```

<sup>Created on 2023-03-31 with [reprex v2.0.2](https://reprex.tidyverse.org)</sup>
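
As a rough illustration of what `n_distinct` is doing here (not part of the original answer), the same counts can be obtained by first dropping repeated (ID, diagnosis) pairs with dplyr's `distinct()` and then counting rows per diagnosis with `count()`. The sketch below rebuilds `df` from the question so it runs on its own; the name `per_diagnosis` is just an illustrative choice.

```r
library(dplyr)

# Data from the question
ID <- c(1,1,1,2,2,3,3,3,4,4,4)
diagnosis <- c("A","A","B","C","C","B","A","A","C","C","B")
df <- data.frame(ID, diagnosis)

# Keep one row per (ID, diagnosis) pair, then count rows per diagnosis
per_diagnosis <- df %>%
  distinct(ID, diagnosis) %>%
  count(diagnosis)

per_diagnosis
```

This should give one row per diagnosis with the number of distinct people in column `n`, which may be easier to extend if more columns need to be de-duplicated later.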

# Answer 2
**Score**: 3

``` r
cols <- c("ID", "diagnosis")

# Drop repeated (ID, diagnosis) rows, then tabulate the remaining diagnoses
table(unique(df[cols])$diagnosis)

# A B C 
# 2 3 2 
```
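
If a data frame is preferred over a `table` object, one possible base R variant (a sketch, not from the original answer) is `aggregate()` with a function that counts unique IDs. The column name `n_people` is made up for this example, and `df` is rebuilt from the question so the block runs on its own.

```r
# Data from the question
ID <- c(1,1,1,2,2,3,3,3,4,4,4)
diagnosis <- c("A","A","B","C","C","B","A","A","C","C","B")
df <- data.frame(ID, diagnosis)

# For each diagnosis, count how many distinct IDs appear
counts <- aggregate(ID ~ diagnosis, data = df,
                    FUN = function(x) length(unique(x)))

# Rename the aggregated column (hypothetical name, for readability)
names(counts)[2] <- "n_people"

counts
```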

# Answer 3
**Score**: 2

Try `table` + `colSums`:

``` r
colSums(table(df) > 0)
# A B C
# 2 3 2
```
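
To see why this one-liner works, it may help to look at the intermediate steps. The sketch below (variable names `tab` and `had_diagnosis` are illustrative, and `df` is rebuilt from the question) splits it into the cross-tabulation, the comparison, and the column sums.

```r
# Data from the question
ID <- c(1,1,1,2,2,3,3,3,4,4,4)
diagnosis <- c("A","A","B","C","C","B","A","A","C","C","B")
df <- data.frame(ID, diagnosis)

# Cross-tabulate ID (rows) by diagnosis (columns);
# each cell holds how many times that ID had that diagnosis
tab <- table(df)
tab
#    diagnosis
# ID  A B C
#   1 2 1 0
#   2 0 0 2
#   3 2 1 0
#   4 0 1 2

# TRUE where an ID had the diagnosis at least once
had_diagnosis <- tab > 0

# TRUE counts as 1, so each column sum is the number of distinct people
colSums(had_diagnosis)
```

Because repeated diagnoses only increase a cell's count, comparing against 0 collapses them to a single TRUE per person.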

huangapple
  • Posted on 2023-03-31 20:43:24
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75898676.html