April 19, 2023, 17:41:15

How do I test if the values of a variable are NAs while using the paste() function to designate the variable?

Question

I'd like to create a loop that replaces the values of several dummy variables with NA wherever the corresponding value of the original variable is also NA (in order to recode an MCQ survey).

I have 19 questions, labeled Q1 through Q19, with dummy variables labeled Q1\_[answer1], Q1\_[answer2], etc. I made the dummy variables with values 1 and 0, and instead of nesting another ifelse() call that looks at the value of Q1, Q2, etc., I'd like to create a loop that picks up the dummy variables automatically (by using grep("Q", [n], "_"), where n increases as the loop progresses).

Here is essentially what my data frame looks like:

```R
df1 <- data.frame(Q1 = c("a", "b,c", NA, "a,b", "b", NA, "c"))

# this is done for the purposes of the loop, which I'm not including here
n <- 1
```

To check whether the values of Q1 are missing, I'd like to use the following code (or an equivalent):

```R
is.na(paste0("df1$Q", n))
```

```
[1] FALSE
```

This would let me cycle through the different questions. However, it tests whether the string "df1$Q1" is NA rather than treating Q1 as a variable. I would like it to behave as if I had typed df1$Q1 directly, which returns the result of the is.na() test for every value of the variable:

```R
is.na(df1$Q1)
```

```
[1] FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE
```

Is there a function like is.na() or paste0() that would let me do this easily?
# Answer 1

**Score**: 1

How you tackle this may depend on what you intend to do with the result. The simplest approach, as given in the comments, is to pass the vector itself to `is.na()`.

If you have a large data.frame, doing this one column at a time can be laborious. Instead, a loop or `sapply()` may give you what you need.

Extending your data.frame to demonstrate:

```R
df1 <- data.frame(Q1 = c("a", "b,c", NA, "a,b", "b", NA, "c"),
                  Q2 = c("a", "b,c", "a.b", "a,b", "b", "b", "c"),
                  Q3 = c("a", "b,c", NA, "a,b", "b", "c", NA))

# Build each column name as a string and extract that column with [[ ]]
for (i in seq_along(names(df1))) {
  print(is.na(df1[[paste0("Q", i)]]))
}
```

Gives:

```
[1] FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[1] FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE
```
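
The reason the loop works where `is.na(paste0(...))` did not is that `[[` accepts a column name as a string, whereas `paste0()` on its own only produces a string. The following is a minimal sketch of that difference using the single-column data frame from the question; the names `col_text`, `col_name`, and `q_missing` are introduced here purely for illustration.

```R
df1 <- data.frame(Q1 = c("a", "b,c", NA, "a,b", "b", NA, "c"))
n <- 1

# paste0() only builds the text "df1$Q1"; is.na() then tests that one string.
col_text <- paste0("df1$Q", n)
is.na(col_text)
#> [1] FALSE

# [[ ]] looks the column up by name, so the name can be built at run time.
col_name <- paste0("Q", n)
q_missing <- is.na(df1[[col_name]])
q_missing
#> [1] FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE
```

`df1[[col_name]]` returns the same vector as `df1$Q1`, so anything that works on `df1$Q1` should work on it as well.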

Alternatively, `sapply(df1, is.na)` tests every column at once:

```
        Q1    Q2    Q3
[1,] FALSE FALSE FALSE
[2,] FALSE FALSE FALSE
[3,]  TRUE FALSE  TRUE
[4,] FALSE FALSE FALSE
[5,] FALSE FALSE FALSE
[6,]  TRUE FALSE FALSE
[7,] FALSE FALSE  TRUE
```

One trick is to use `sapply(df1, is.na) * 1`, which converts the logical values to 0 and 1 and allows easy summation.
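
Returning to the goal stated in the question, the same `[[` lookup can drive the NA replacement in the dummy variables. This is only a sketch under assumptions: the data below is made up, and the dummy columns are assumed to be named `Q<n>_<answer>` (for example `Q1_a`), which may not match the real survey's naming.

```R
# Hypothetical data: two questions, each with 0/1 dummy columns.
df1 <- data.frame(
  Q1   = c("a", "b,c", NA, "a,b"),
  Q1_a = c(1, 0, 0, 1),
  Q1_b = c(0, 1, 0, 1),
  Q2   = c("a", NA, "b", "b"),
  Q2_a = c(1, 0, 0, 0),
  Q2_b = c(0, 0, 1, 1)
)

for (n in 1:2) {
  question <- paste0("Q", n)
  # Names of the columns starting with "Q<n>_", e.g. Q1_a and Q1_b.
  dummies <- grep(paste0("^Q", n, "_"), names(df1), value = TRUE)
  # Rows where the original answer is missing get NA in every dummy column.
  df1[is.na(df1[[question]]), dummies] <- NA
}

df1
```

Anchoring the pattern with `^` and including the underscore keeps `Q1_` from also matching columns such as `Q10_a` once the loop runs over all 19 questions.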
