March 3, 2023, 21:04:55

Adding a factor into the cut() function

Question

I have a data frame `Base` such as:

      ID Gender Strength
    1  1      0      230
    2  2      1       20
    3  3      1       30
    4  4      0       40
    5  5      0       40

I want to create a new variable with the `cut` function that classifies people as stronger or weaker, using a different cut-off point for each gender: above 28 counts as stronger for Gender 1, and above 10 counts as stronger for Gender 0.

I can create a new variable, but I don't know where to put the second vector so that the new variable depends on both columns. I'm using these lines of code but don't know how to go on:

    vec1 <- Base$Strength
    vec2 <- Base$Gender
    Base$newvariable <- cut(vec1, breaks=c(0.00, 29.00, 60.00), labels=c("Stronger", "Weaker"))
Answer 1

Score: 2

You can group by Gender and then use the current group's value via cur_group():

    library(dplyr)
    Base %>%
      group_by(Gender) %>%
      # explicit levels keep both labels valid even if a group is all TRUE or all FALSE
      mutate(newVariable = factor(Strength > (if (cur_group()$Gender == 1) 29 else 10),
                                  levels = c(FALSE, TRUE), labels = c("Weaker", "Stronger")))
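The per-gender threshold can also be looked up row by row, without grouping. This is a minimal base-R sketch, assuming the cut-offs stated in the question (28 for Gender 1, 10 for Gender 0); the names `cutoffs` and `thr` are made up for illustration.

```r
Base <- data.frame(ID = 1:5, Gender = c(0, 1, 1, 0, 0),
                   Strength = c(230, 20, 30, 40, 40))

# Hypothetical lookup table: names are Gender codes, values are cut-offs
cutoffs <- c("0" = 10, "1" = 28)

# One threshold per row, selected by that row's Gender
thr <- unname(cutoffs[as.character(Base$Gender)])

# Compare each Strength with its own threshold and label the result
Base$newvariable <- factor(Base$Strength > thr,
                           levels = c(FALSE, TRUE),
                           labels = c("Weaker", "Stronger"))
Base
```

Adding another gender code or changing a cut-off then only means editing `cutoffs`.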

Answer 2

Score: 0

I'm not sure what the 60 is for, but the last break needs to be max(Strength) or Inf; otherwise values above 60 fall outside every interval.

    transform(Base, str_cat=cut(Strength, c(0, 29, max(Strength)), labels=c('weaker', 'strong')))
    #   ID Gender Strength str_cat
    # 1  1      0      230  strong
    # 2  2      1       20  weaker
    # 3  3      1       30  strong
    # 4  4      0       40  strong
    # 5  5      0       40  strong

If the 60 meant you want three categories, do:

    transform(Base, str_cat=cut(Strength, c(0, 29, 60, Inf), labels=c('weaker', 'normal', 'strong')))
    #   ID Gender Strength str_cat
    # 1  1      0      230  strong
    # 2  2      1       20  weaker
    # 3  3      1       30  normal
    # 4  4      0       40  normal
    # 5  5      0       40  normal

Data:

    Base <- structure(list(ID = 1:5, Gender = c(0, 1, 1, 0, 0), Strength = c(230,
    20, 30, 40, 40)), class = "data.frame", row.names = c(NA, -5L
    ))
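To see why the upper break matters, this small sketch compares the question's breaks with an open-ended upper break on the same Strength values. The variable names `narrow` and `open_ended` are only for illustration.

```r
strength <- c(230, 20, 30, 40, 40)

# Upper break of 60: 230 lies outside every interval, so cut() returns NA for it
narrow <- cut(strength, breaks = c(0, 29, 60), labels = c("weaker", "strong"))

# Upper break of Inf: every positive value falls into some interval
open_ended <- cut(strength, breaks = c(0, 29, Inf), labels = c("weaker", "strong"))

is.na(narrow)
as.character(open_ended)
```

By default cut() uses right-closed intervals, so a value of exactly 29 lands in the lower category.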

Answer 3

Score: 0

Here is a different cut() for each gender group. Note that "factor" is a specific R term for a categorical variable, whereas your gender dummy is simply 0 and 1. So we can test its value and apply a different set of breaks in each case:

    library(tidyverse)
    df <- data.frame(id = c(1,2,3,4,5),
                     gender = c(0,1,1,0,0),
                     strength = c(30,20,30,40,40))
    df %>%
      mutate(cut_group =
               ifelse(gender == 1,
                      # as.character() so ifelse() returns labels rather than integer factor codes
                      cut(strength, breaks=c(0.00, 20.00, 60.00), labels = c("Weaker", "Stronger")) %>% as.character(),
                      cut(strength, breaks=c(0.00, 39.00, 60.00), labels = c("Weaker", "Stronger")) %>% as.character())
      )

Output:

      id gender strength cut_group
    1  1      0       30    Weaker
    2  2      1       20    Weaker
    3  3      1       30  Stronger
    4  4      0       40  Stronger
    5  5      0       40  Stronger

For gender == 0 a strength of 30 counts as weak, whereas for gender == 1 a strength of 30 counts as strong.
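The same idea can be applied to the question's own data by cutting each gender's rows separately. This is a sketch in base R, assuming the cut-offs from the question (28 for Gender 1, 10 for Gender 0), an upper break of Inf, and strictly positive Strength values; `labs`, `is_one` and `grp` are illustrative names.

```r
Base <- data.frame(ID = 1:5, Gender = c(0, 1, 1, 0, 0),
                   Strength = c(230, 20, 30, 40, 40))

labs <- c("Weaker", "Stronger")
is_one <- Base$Gender == 1   # rows that use the Gender 1 cut-off

grp <- character(nrow(Base))
# Gender 1: (0, 28] is Weaker, (28, Inf] is Stronger
grp[is_one] <- as.character(cut(Base$Strength[is_one],
                                breaks = c(0, 28, Inf), labels = labs))
# Gender 0: (0, 10] is Weaker, (10, Inf] is Stronger
grp[!is_one] <- as.character(cut(Base$Strength[!is_one],
                                 breaks = c(0, 10, Inf), labels = labs))

Base$newvariable <- factor(grp, levels = labs)
Base
```

Each subset is cut only with its own breaks, so no row is evaluated against the other gender's intervals.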

Answer 4

Score: 0

I usually prefer cut, but since you simply have two factor levels and two conditions, you can consider this as well. Each condition returns TRUE or FALSE; combine them and convert the result to a factor with the labels you want.

    library(dplyr)
    df %>%
      mutate(grp = factor((gender == 0 & strength > 10) | (gender == 1 & strength > 28),
                          levels = c(TRUE, FALSE), labels = c("Strong", "Weak")))
