June 16, 2023 15:13:54

Call a variable created within a dplyr pipe - R

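Question

Reproducible data:

library(dplyr)  # provides %>% and select()
library(tidyr)  # provides spread()
# Truncated exponential distributions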
lambda_1 <- 1/2
lambda_2 <- 1/10
ff1 <- function(x) pexp(x, lambda_1)
f1.inv <- function(q) qexp(q, lambda_1)
ff2 <- function(x) pexp(x, lambda_2)
f2.inv <- function(q) qexp(q, lambda_2)
a <- 0
n <- 50
x1 <- f1.inv(runif(n))
x1.trunc <- f1.inv(runif(n, ff1(a)))
x2 <- f2.inv(runif(n))
x2.trunc <- f2.inv(runif(n, ff2(a)))
T_Phone <- c(x1.trunc, x2.trunc)
# Normal data - equal variances
Normal_1_Eq <- rnorm(n = 50, mean = 24.6, sd = .95)
Normal_2_Eq <- rnorm(n = 50, mean = 38, sd = 1.05)
Weight <- c(Normal_1_Eq, Normal_2_Eq)
# Normal data - unequal variances
Normal_1_Uneq <- rnorm(n = 50, mean = 24.6, sd = .23)
Normal_2_Uneq <- rnorm(n = 50, mean = 38, sd = 2.95)
Head_Circumference <- c(Normal_1_Uneq, Normal_2_Uneq)
# Poisson
Poisson_1 <- rpois(n = 50, lambda = 4.5)
Poisson_2 <- rpois(n = 50, lambda = 14.5)
Daily_Snacks <- c(Poisson_1, Poisson_2)
# Assign groups
Group <- rep(c("A", "B"), each = 50)
ID <- rep(c(1:50), each = 1, times = 2)
# Combine into a data frame
df <- data.frame(ID, Group, Weight, Head_Circumference, Daily_Snacks, T_Phone)
df[,c(1:2)] <- lapply(df[,c(1:2)], as.factor)
df[,c(3:6)] <- lapply(df[,c(3:6)], as.numeric)
df <- df %>% janitor::clean_names()

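Problem:
I am trying to work with the long-format data above and reshape it into wide format only when needed in a dplyr pipe chain. I've managed this for the weight variable: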

df %>% select(id, group, weight) %>% spread(key = "group", value = "weight")

Now I want to call the new variables, A and B, and test homogeneity of variances between them:

df %>% select(id, group, weight) %>% spread(key = "group", value = "weight") %>%
  var.test(.$A, .$B)

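However, with the last command (var.test(.$)), the only variables I can access are the ones I originally selected from df (e.g., id and group).

If I save the result to a new data frame: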

t_frame <- df %>% select(id, group, weight) %>% spread(key = "group", value = "weight")
var.test(t_frame$A, t_frame$B)

then everything works. How can I make the newly created A and B variables available to var.test within the pipe?

Answer 1

Score: 3

Replace your last pipe with:

%>% {var.test(.$A, .$B)}

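Without the {}, the pipe passes the whole data frame as the first argument, so the call is evaluated as var.test(., .$A, .$B). The curly braces suppress that behavior, so only the columns you select with $ are passed.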
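
The behavior described above can be checked with a small sketch. It assumes dplyr and tidyr are installed; the data frame `toy` and the fixed seed are hypothetical stand-ins for the question's df, not part of the original post. The `with()` variant is an alternative that was not in the original answer, shown only for comparison.

```r
library(dplyr)  # provides %>% and select()
library(tidyr)  # provides spread()

set.seed(1)  # assumption: fixed seed so the sketch is repeatable

# Hypothetical stand-in for the question's df (50 ids, two groups)
toy <- data.frame(
  id     = factor(rep(1:50, times = 2)),
  group  = factor(rep(c("A", "B"), each = 50)),
  weight = c(rnorm(50, mean = 24.6, sd = 0.95),
             rnorm(50, mean = 38, sd = 1.05))
)

# Reshape to wide format: one column per group
wide <- toy %>%
  select(id, group, weight) %>%
  spread(key = "group", value = "weight")

# Without braces, wide %>% var.test(.$A, .$B) is evaluated as
# var.test(wide, wide$A, wide$B), which is not the intended call.

# Braces: `.` is not injected as the first argument,
# so only the A and B columns reach var.test()
res_braces <- wide %>% { var.test(.$A, .$B) }

# The same test written without a pipe, for comparison
res_plain <- var.test(wide$A, wide$B)

# Alternative: with() evaluates the call inside the data frame,
# so A and B resolve as column names
res_with <- wide %>% with(var.test(A, B))

print(res_braces)
```

All three forms run the same F test on the same two columns, so their statistics and p-values should agree; only the `data.name` label in the printed output differs.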

