Use column names from a dataframe as the values of a column in a new dataframe
Question
I have the dataframe below and want to create a new one with two columns: Amenities, whose values are either "nn_bank" or "nn_hospital", and Name, which holds the name of the corresponding bank or hospital.
df <- structure(list(
  state = c("West Bengal", "West Bengal", "West Bengal", "West Bengal", "West Bengal"),
  nn_hospital = c("Khundkuri Hospital", "Khundkuri Hospital", "Mankar Rural Hospital",
    "Khundkuri Hospital", "Khundkuri Hospital"),
  distance_nn_hospital = c(8949.68646563084, 17217.1419457099, 16939.2318150416,
    15812.9872649418, 1408.372117616),
  nn_bank = c("contai", "contai", "Allahabad Bank", "contai", "contai"),
  distance_nn_bank = c(13959.9950089655, 20598.4763042432, 19688.6296071566,
    20537.3799009137, 11385.8738290783)
), class = "data.frame", row.names = c(NA, 5L))
The result should look like this:
Answer 1
Score: 1
When you have multiple value columns, one option is to use the names_pattern argument of tidyr::pivot_longer together with the special ".value" sentinel to reshape your data. Afterwards you have to do some additional cleaning.
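library(tidyr)
library(dplyr, warn.conflicts = FALSE)
library(stringr)

df |>
  # Split each column name at the "_" after "nn": the first part (".value")
  # becomes the output columns nn and distance_nn, the second fills Amenities.
  pivot_longer(-state,
    names_to = c(".value", "Amenities"),
    names_pattern = "(.*?nn)_(.*)"
  ) |>
  # Drop the unneeded columns and rename nn to Name.
  select(-c(state, distance_nn), Name = nn) |>
  # Title-case both columns, e.g. "hospital" -> "Hospital".
  mutate(across(c(Amenities, Name), str_to_title))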
#> # A tibble: 10 × 2
#> Amenities Name
#> <chr> <chr>
#> 1 Hospital Khundkuri Hospital
#> 2 Bank Contai
#> 3 Hospital Khundkuri Hospital
#> 4 Bank Contai
#> 5 Hospital Mankar Rural Hospital
#> 6 Bank Allahabad Bank
#> 7 Hospital Khundkuri Hospital
#> 8 Bank Contai
#> 9 Hospital Khundkuri Hospital
#> 10 Bank Contai
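
To see what names_pattern does with each column name, you can try the same regular expression on the names directly. This is only an illustrative sketch: it uses stringr::str_match as a stand-in for the matching pivot_longer performs internally, and the variable names cols and parts are made up for the example.

```r
library(stringr)

# Column names being reshaped (everything except `state`).
cols <- c("nn_hospital", "distance_nn_hospital", "nn_bank", "distance_nn_bank")

# Apply the same regex as names_pattern; columns 2 and 3 of the result
# hold the two capture groups (column 1 is the full match).
parts <- str_match(cols, "(.*?nn)_(.*)")[, 2:3]

# Label the groups the way names_to does in the answer.
colnames(parts) <- c(".value", "Amenities")
parts
```

The first group comes out as either "nn" or "distance_nn", which is why the reshaped data has the columns nn and distance_nn, and the second group ("hospital" or "bank") is what ends up in Amenities.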