Count valid/non-NA observations grouped by variable
Question
I have been googling for about two hours now trying to find the solution to a simple problem, but I am not getting anywhere. I am not even finding the right function I could use to get where I want to get, so I have to resort to asking for help even though it seems a very basic question.
I have a survey spanning various countries, and I am in the process of creating a dataframe with statistics derived from the data (such as national mean on a given variable).
Let's say the dataframe df looks like this:
country <- c(1,1,1,1,1,1,2,2,2,2,2,2)
var <- c(1,NA,2,1,2,2,3,3,1,3,4,NA)
df <- cbind.data.frame(country, var)
I can easily count the nominal n's:
df %>% group_by(country) %>% summarize(n=n())
But how do I count valid observations on the variable var?
Answer 1
Score: 2
Any of these will do it.
The three function calls group_by()/summarise()/n() can be replaced by a single call to count().
country <- c(1,1,1,1,1,1,2,2,2,2,2,2)
var <- c(1,NA,2,1,2,2,3,3,1,3,4,NA)
df <- cbind.data.frame(country, var)
suppressPackageStartupMessages(
library(dplyr)
)
df %>%
na.exclude() %>%
count(country)
#> country n
#> 1 1 5
#> 2 2 5
df %>%
na.omit() %>%
count(country)
#> country n
#> 1 1 5
#> 2 2 5
df %>%
tidyr::drop_na() %>%
count(country)
#> country n
#> 1 1 5
#> 2 2 5
Created on 2023-08-08 with reprex v2.0.2
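One caveat about the approaches above: na.exclude(), na.omit() and tidyr::drop_na() (called without column arguments) drop every row that has an NA in any column, so in a real survey with many variables they would also discard rows where var is present but some other variable is missing. A minimal sketch of an alternative, assuming only the df from the question, counts the non-NA values of var directly inside summarise(). The column names n_total and n_valid are illustrative choices, not part of the original answer.

```r
library(dplyr)

# Example data from the question
country <- c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2)
var <- c(1, NA, 2, 1, 2, 2, 3, 3, 1, 3, 4, NA)
df <- data.frame(country, var)

valid_counts <- df %>%
  group_by(country) %>%
  summarise(
    n_total = n(),               # nominal n: all rows in the group
    n_valid = sum(!is.na(var))   # rows where var is not NA
  )

print(valid_counts)
```

This keeps the nominal and valid counts side by side in one data frame, which may be convenient when building a table of per-country statistics.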