Make a group index for a data frame

Question

You can achieve this in R using the mutate function from the dplyr package. Here's the code to add the new "group" column to your tibble:

library(dplyr)

# Assuming your tibble is named "df"
df <- df %>%
  mutate(group = rep(1:4, each = 5))

This code will add a new column named "group" with values 1, 2, 3, and 4 in the pattern you described.
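
The hard-coded `1:4` above only fits a table of exactly 20 rows. As a minimal sketch in base R, assuming a hypothetical helper named `group_index` (not part of dplyr or of the original post), the number of groups can be derived from the row count instead:

```r
# Hypothetical helper: block index for `n` rows in blocks of `size`.
group_index <- function(n, size = 5L) {
  n_groups <- ceiling(n / size)
  # `each` repeats every group id `size` times;
  # `length.out` trims the last group when n is not a multiple of size.
  rep(seq_len(n_groups), each = size, length.out = n)
}

group_index(20)           # four groups of five
group_index(7, size = 3)  # last group is shorter
```

Inside a pipeline this could be called as `mutate(group = group_index(n()))`, again assuming the helper is defined as above.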

Say I have a df with 20 addresses, how do I add an index of four groups of five to it? The new column should be 1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4.

E.g. how do I turn this tibble:

tibble::tribble(
              ~num_street,           ~city, ~sate, ~zip_code,
        "976 FAIRVIEW DR",   "SPRINGFIELD",  "OR",    97477L,
          "19843 HWY 213",   "OREGON CITY",  "OR",    97045L,
            "402 CARL ST",         "DRAIN",  "OR",    97435L,
           "304 WATER ST",        "WESTON",  "OR",    97886L,
   "5054 TECHNOLOGY LOOP",     "CORVALLIS",  "OR",    97333L,
         "3401 YACHT AVE",  "LINCOLN CITY",  "OR",    97367L,
      "135 ROOSEVELT AVE",          "BEND",  "OR",    97702L,
         "3631 FENWAY ST",  "FOREST GROVE",  "OR",    97116L,
       "92250 HILLTOP LN",      "COQUILLE",  "OR",    97423L,
          "6920 92ND AVE",        "TIGARD",  "OR",    97223L,
          "591 LAUREL ST", "JUNCTION CITY",  "OR",    97448L,
   "32035 LYNX HOLLOW RD",      "CRESWELL",  "OR",    97426L,
          "6280 ASTER ST",   "SPRINGFIELD",  "OR",    97478L,
      "17533 VANGUARD LN",     "BEAVERTON",  "OR",    97007L,
      "59937 CHEYENNE RD",          "BEND",  "OR",    97702L,
          "2232 42ND AVE",         "SALEM",  "OR",    97317L,
         "3100 TURNER RD",         "SALEM",  "OR",    97302L,
       "3495 CHAMBERS ST",        "EUGENE",  "OR",    97405L,
          "585 WINTER ST",         "SALEM",  "OR",    97301L,
        "23985 VAUGHN RD",        "VENETA",  "OR",    97487L
  )

Into this:

tibble::tribble(
~group,             ~num_street,           ~city, ~state, ~zip_code,
    1L,       "976 FAIRVIEW DR",   "SPRINGFIELD",   "OR",    97477L,
    1L,         "19843 HWY 213",   "OREGON CITY",   "OR",    97045L,
    1L,           "402 CARL ST",         "DRAIN",   "OR",    97435L,
    1L,          "304 WATER ST",        "WESTON",   "OR",    97886L,
    1L,  "5054 TECHNOLOGY LOOP",     "CORVALLIS",   "OR",    97333L,
    2L,        "3401 YACHT AVE",  "LINCOLN CITY",   "OR",    97367L,
    2L,     "135 ROOSEVELT AVE",          "BEND",   "OR",    97702L,
    2L,        "3631 FENWAY ST",  "FOREST GROVE",   "OR",    97116L,
    2L,      "92250 HILLTOP LN",      "COQUILLE",   "OR",    97423L,
    2L,         "6920 92ND AVE",        "TIGARD",   "OR",    97223L,
    3L,         "591 LAUREL ST", "JUNCTION CITY",   "OR",    97448L,
    3L,  "32035 LYNX HOLLOW RD",      "CRESWELL",   "OR",    97426L,
    3L,         "6280 ASTER ST",   "SPRINGFIELD",   "OR",    97478L,
    3L,     "17533 VANGUARD LN",     "BEAVERTON",   "OR",    97007L,
    3L,     "59937 CHEYENNE RD",          "BEND",   "OR",    97702L,
    4L,         "2232 42ND AVE",         "SALEM",   "OR",    97317L,
    4L,        "3100 TURNER RD",         "SALEM",   "OR",    97302L,
    4L,      "3495 CHAMBERS ST",        "EUGENE",   "OR",    97405L,
    4L,         "585 WINTER ST",         "SALEM",   "OR",    97301L,
    4L,       "23985 VAUGHN RD",        "VENETA",   "OR",    97487L
)

I know this is incredibly easy; I'm still getting to grips with R, though, and I have gaps in some of this basic stuff...

Answer 1

Score: 2

We could use the gl() function:

library(dplyr)

# gl(n, k, length) makes a factor of n levels, each repeated k times;
# as.integer() keeps only the level codes 1, 1, 1, 1, 1, 2, ...
df %>%
  mutate(group = as.integer(gl(n(), 5, n())))
# A tibble: 20 × 5
   num_street           city          sate  zip_code group
   <chr>                <chr>         <chr>    <int> <int>
 1 976 FAIRVIEW DR      SPRINGFIELD   OR       97477     1
 2 19843 HWY 213        OREGON CITY   OR       97045     1
 3 402 CARL ST          DRAIN         OR       97435     1
 4 304 WATER ST         WESTON        OR       97886     1
 5 5054 TECHNOLOGY LOOP CORVALLIS     OR       97333     1
 6 3401 YACHT AVE       LINCOLN CITY  OR       97367     2
 7 135 ROOSEVELT AVE    BEND          OR       97702     2
 8 3631 FENWAY ST       FOREST GROVE  OR       97116     2
 9 92250 HILLTOP LN     COQUILLE      OR       97423     2
10 6920 92ND AVE        TIGARD        OR       97223     2
11 591 LAUREL ST        JUNCTION CITY OR       97448     3
12 32035 LYNX HOLLOW RD CRESWELL      OR       97426     3
13 6280 ASTER ST        SPRINGFIELD   OR       97478     3
14 17533 VANGUARD LN    BEAVERTON     OR       97007     3
15 59937 CHEYENNE RD    BEND          OR       97702     3
16 2232 42ND AVE        SALEM         OR       97317     4
17 3100 TURNER RD       SALEM         OR       97302     4
18 3495 CHAMBERS ST     EUGENE        OR       97405     4
19 585 WINTER ST        SALEM         OR       97301     4
20 23985 VAUGHN RD      VENETA        OR       97487     4
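
To see what the `gl()` call is doing, here is a small base R sketch outside of `mutate()`. The variable names (`n_rows`, `f`, `g`, `g2`) are only illustrative. It shows that `gl(n, 5, n)` declares more levels than it uses, and that asking for just enough levels gives the same integer codes:

```r
n_rows <- 20L

# 20 levels are declared, but the result is cut to 20 elements,
# so only levels 1 to 4 actually appear.
f <- gl(n_rows, 5, n_rows)
nlevels(f)

# as.integer() drops the factor structure and keeps the level codes.
g <- as.integer(f)

# Same codes when only ceiling(n / 5) levels are requested.
g2 <- as.integer(gl(ceiling(n_rows / 5), 5, n_rows))
identical(g, g2)
```

Since the factor is converted to integer straight away, the unused levels do no harm here; they would only matter if the factor itself were kept.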

Answer 2

Score: 2

We can use %/% (integer division) as well:

df %>%
  mutate(group = (row_number() - 1) %/% 5 + 1)
# # A tibble: 20 × 5
#    num_street           city          sate  zip_code group
#    <chr>                <chr>         <chr>    <int> <dbl>
#  1 976 FAIRVIEW DR      SPRINGFIELD   OR       97477     1
#  2 19843 HWY 213        OREGON CITY   OR       97045     1
#  3 402 CARL ST          DRAIN         OR       97435     1
#  4 304 WATER ST         WESTON        OR       97886     1
#  5 5054 TECHNOLOGY LOOP CORVALLIS     OR       97333     1
#  6 3401 YACHT AVE       LINCOLN CITY  OR       97367     2
#  7 135 ROOSEVELT AVE    BEND          OR       97702     2
#  8 3631 FENWAY ST       FOREST GROVE  OR       97116     2
#  9 92250 HILLTOP LN     COQUILLE      OR       97423     2
# 10 6920 92ND AVE        TIGARD        OR       97223     2
# 11 591 LAUREL ST        JUNCTION CITY OR       97448     3
# 12 32035 LYNX HOLLOW RD CRESWELL      OR       97426     3
# 13 6280 ASTER ST        SPRINGFIELD   OR       97478     3
# 14 17533 VANGUARD LN    BEAVERTON     OR       97007     3
# 15 59937 CHEYENNE RD    BEND          OR       97702     3
# 16 2232 42ND AVE        SALEM         OR       97317     4
# 17 3100 TURNER RD       SALEM         OR       97302     4
# 18 3495 CHAMBERS ST     EUGENE        OR       97405     4
# 19 585 WINTER ST        SALEM         OR       97301     4
# 20 23985 VAUGHN RD      VENETA        OR       97487     4
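
The `group` column above prints as `<dbl>` because the literals `1` and `5` are doubles. A minimal base R sketch, with `seq_len(20)` standing in for `row_number()` (an assumption made so the example runs without dplyr), shows that integer literals keep the result an integer:

```r
i <- seq_len(20)                  # stand-in for row_number()

grp_dbl <- (i - 1) %/% 5 + 1      # double, as in the answer above
grp_int <- (i - 1L) %/% 5L + 1L   # integer, thanks to the L suffixes

typeof(grp_dbl)
typeof(grp_int)
all(grp_dbl == grp_int)           # same values either way
```

If the integer type matters, the same change should carry over to the pipeline as `mutate(group = (row_number() - 1L) %/% 5L + 1L)`.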

huangapple
  • Posted on June 1, 2023, 01:40:54
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76376060.html