2023-03-09 23:51:54

How to select columns based on their properties?

Question

The following examples filter the data frame described in the question:

In base R:

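# Sample data frame matching the illustration in the question
# (all values are character strings, as the question states)
df <- data.frame(
  c1 = rep("0", 7),
  c2 = rep("1", 7),
  c3 = c("0", "0", "0", "0", "0", "0", "1"),
  c4 = c("1", "1", "1", "1", "1", "1", "0"),
  c5 = c("?", "1", "1", "1", "1", "1", "1"),
  c6 = c("?", "0", "0", "0", "0", "0", "0"),
  c7 = c("?", "?", "0", "0", "0", "0", "0"),
  c8 = c("?", "?", "1", "1", "1", "1", "1"),
  c9 = c("0", "0", "1", "1", "?", "?", "0"),
  c10 = c("1", "1", "0", "0", "1", "0", "0"),
  row.names = paste0("r", 1:7)
)
# Drop columns that are uniformly "0" or uniformly "1"
# (drop = FALSE keeps the result a data frame even if one column survives)
df_filtered <- df[, !sapply(df, function(x) all(x == "0") || all(x == "1")), drop = FALSE]
# Drop columns that contain at least one "?"
df_filtered <- df_filtered[, !sapply(df_filtered, function(x) any(x == "?")), drop = FALSE]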

In tidyverse:

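library(dplyr)
# Sample data frame matching the illustration in the question
# (all values are character strings, as the question states)
df <- data.frame(
  c1 = rep("0", 7),
  c2 = rep("1", 7),
  c3 = c("0", "0", "0", "0", "0", "0", "1"),
  c4 = c("1", "1", "1", "1", "1", "1", "0"),
  c5 = c("?", "1", "1", "1", "1", "1", "1"),
  c6 = c("?", "0", "0", "0", "0", "0", "0"),
  c7 = c("?", "?", "0", "0", "0", "0", "0"),
  c8 = c("?", "?", "1", "1", "1", "1", "1"),
  c9 = c("0", "0", "1", "1", "?", "?", "0"),
  c10 = c("1", "1", "0", "0", "1", "0", "0"),
  row.names = paste0("r", 1:7)
)
# Drop columns that are uniformly "0" or uniformly "1"
df_filtered <- df %>%
  select_if(~ !(all(.x == "0") || all(.x == "1")))
# Drop columns that contain at least one "?"
df_filtered <- df_filtered %>%
  select_if(~ !any(.x == "?"))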

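Whether you use base R or the tidyverse, the code above produces a new data frame, df_filtered, that contains neither the columns that are uniformly 0 or uniformly 1 nor the columns that contain "?".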

I have a data frame with the following three values: 0, 1, and ?. The 0s and 1s are characters and not numeric data. I am trying to subset the data frame to remove the following:

All columns that are uniformly 0 or 1
All columns that have at least one ?

So the dataset should have no invariant columns and no columns with missing values.

Here is an illustration of the data frame:

   c1 c2 c3 c4 c5 c6 c7 c8 c9 c10
r1 0  1  0  1  ?  ?  ?  ?  0  1
r2 0  1  0  1  1  0  ?  ?  0  1
r3 0  1  0  1  1  0  0  1  1  0
r4 0  1  0  1  1  0  0  1  1  0
r5 0  1  0  1  1  0  0  1  ?  1
r6 0  1  0  1  1  0  0  1  ?  0
r7 0  1  1  0  1  0  0  1  0  0

So I want to exclude c1, c2, c5, c6, c7, c8, and c9. How do I do this in base R or tidyverse?
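
To try the answers on the exact table shown above, the illustration can be read straight into R. This is only a sketch: the variable name txt is made up here, and colClasses = "character" is an assumption based on the statement that the 0s and 1s are characters.

```r
# Paste the illustrated table as text. The header line has one field fewer
# than the data rows, so read.table() uses the first column (r1..r7) as
# row names.
txt <- "
   c1 c2 c3 c4 c5 c6 c7 c8 c9 c10
r1 0  1  0  1  ?  ?  ?  ?  0  1
r2 0  1  0  1  1  0  ?  ?  0  1
r3 0  1  0  1  1  0  0  1  1  0
r4 0  1  0  1  1  0  0  1  1  0
r5 0  1  0  1  1  0  0  1  ?  1
r6 0  1  0  1  1  0  0  1  ?  0
r7 0  1  1  0  1  0  0  1  0  0
"

# Assumption: every column is kept as character, as the question describes.
df <- read.table(text = txt, header = TRUE, colClasses = "character")

# Inspect the column types and the first values of each column
str(df)
```

Without colClasses, read.table() would convert the "?"-free columns to integer, which changes how some of the answers behave.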

Answer 1

Score: 3

In tidyverse:

df %>% select_if(~!any(.x == '?') & !all(.x == 1) & !all(.x == 0))
   c3 c4 c10
r1  0  1   1
r2  0  1   1
r3  0  1   0
r4  0  1   0
r5  0  1   1
r6  0  1   0
r7  1  0   0
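
select_if() is superseded in dplyr 1.0.0 and later, where the same predicate is usually written as select(where(...)). A sketch of that form, assuming dplyr >= 1.0.0 is installed; the helper name is_informative is invented for this example, and the data frame is rebuilt from the question's illustration with all columns read as character.

```r
library(dplyr)

# Rebuild the question's table (all columns as character, an assumption
# taken from the question's description)
df <- read.table(text = "
   c1 c2 c3 c4 c5 c6 c7 c8 c9 c10
r1 0  1  0  1  ?  ?  ?  ?  0  1
r2 0  1  0  1  1  0  ?  ?  0  1
r3 0  1  0  1  1  0  0  1  1  0
r4 0  1  0  1  1  0  0  1  1  0
r5 0  1  0  1  1  0  0  1  ?  1
r6 0  1  0  1  1  0  0  1  ?  0
r7 0  1  1  0  1  0  0  1  0  0
", header = TRUE, colClasses = "character")

# Hypothetical helper: TRUE for a column that has no "?" and is neither
# all "0" nor all "1"
is_informative <- function(x) {
  !any(x == "?") && !all(x == "0") && !all(x == "1")
}

# where() applies the predicate to each column and keeps those returning TRUE
df_filtered <- df %>% select(where(is_informative))
names(df_filtered)
```

On this data the call should keep c3, c4 and c10, the same columns as the select_if() output above.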

Answer 2

Score: 2

    > sapply(df, is.integer) & colMeans(df == 0) < 1 & colMeans(df == 1) < 1
       c1    c2    c3    c4    c5    c6    c7    c8    c9   c10 
    FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE  TRUE
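
The expression above returns a logical vector rather than a data frame, and sapply(df, is.integer) only flags the "?"-free columns when R has converted them to integer on import. The sketch below makes both points explicit. The names txt, df_typed, df_chr, keep and keep_chr are invented for this example, and the all-character variant is a suggested adaptation, not part of the answer.

```r
txt <- "
   c1 c2 c3 c4 c5 c6 c7 c8 c9 c10
r1 0  1  0  1  ?  ?  ?  ?  0  1
r2 0  1  0  1  1  0  ?  ?  0  1
r3 0  1  0  1  1  0  0  1  1  0
r4 0  1  0  1  1  0  0  1  1  0
r5 0  1  0  1  1  0  0  1  ?  1
r6 0  1  0  1  1  0  0  1  ?  0
r7 0  1  1  0  1  0  0  1  0  0
"

# Case 1: default type conversion. Columns without "?" become integer,
# columns containing "?" do not, so is.integer() doubles as a "?" check.
df_typed <- read.table(text = txt, header = TRUE)
keep <- sapply(df_typed, is.integer) &
  colMeans(df_typed == 0) < 1 &
  colMeans(df_typed == 1) < 1
# Use the logical vector to subset; drop = FALSE keeps a data frame
df_typed[, keep, drop = FALSE]

# Case 2 (adaptation): every column is character, as in the question,
# so the "?" check is done by counting "?" cells per column instead.
df_chr <- read.table(text = txt, header = TRUE, colClasses = "character")
keep_chr <- colSums(df_chr == "?") == 0 &
  colMeans(df_chr == "0") < 1 &
  colMeans(df_chr == "1") < 1
df_chr[, keep_chr, drop = FALSE]
```

Both variants should select the same columns on this data; the second does not depend on how the file was imported.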
