R: Is it possible to merge two variables into one?
Question
Is it possible to merge two variables into a single one?
This is my dataset:
`City Name` `Average amount` `Nominal Difference` `%` `Real Wage` `Real Difference`
Barletta 2457 1007 41.0 27.5 11.3
Caserta 2445 910 37.2 27.4 10.2
Avellino 2363 1016 43.0 26.5 11.4
Lecce 2342 981 41.9 26.2 11.0
Benevento 2335 1157 49.6 26.1 13.0
Isernia 2334 1078 46.2 26.1 12.1
L'Aquila 2324 1010 43.5 26.0 11.3
Catanzaro 2310 1533 66.3 25.9 17.2
Campobasso 2259 1106 49.0 25.3 12.4
Enna 2242 922 41.1 25.1 10.3
I'd like to have the percentage (with a plus sign and inside parentheses) in the Nominal Difference column.
What I'm looking for is:
`City Name` `Average amount` `Nominal Difference` `Real Wage` `Real Difference`
Barletta 2457 1007 (+41.0%) 27.5 11.3
Caserta 2445 910 (+37.2%) 27.4 10.2
Avellino 2363 1016 (+43.0%) 26.5 11.4
Lecce 2342 981 (+41.9%) 26.2 11.0
Benevento 2335 1157 (+49.6%) 26.1 13.0
Isernia 2334 1078 (+46.2%) 26.1 12.1
L'Aquila 2324 1010 (+43.5%) 26.0 11.3
Catanzaro 2310 1533 (+66.3%) 25.9 17.2
Campobasso 2259 1106 (+49.0%) 25.3 12.4
Enna 2242 922 (+41.1%) 25.1 10.3
How can I do that?
UPDATE
> dput(f)
structure(list(`City Name` = c("Barletta -Andria-Trani", "Caserta",
"Avellino", "Lecce", "Benevento", "Isernia", "L'Aquila", "Catanzaro",
"Campobasso", "Enna"), `Average amount` = c(2456.92, 2444.58,
2363.48, 2341.57, 2334.63, 2334.01, 2323.97, 2310.46, 2259.03,
2242.38), `Nominal Difference` = c(1006.8, 909.62, 1016.28, 980.7,
1157.25, 1077.51, 1010.32, 1532.79, 1106.31, 922.35), `%` = c(40.97,
37.2, 42.99, 41.88, 49.56, 46.16, 43.47, 66.34, 48.97, 41.13),
`Real Wage` = c(27.51, 27.37, 26.46, 26.22, 26.14, 26.13,
26.02, 25.87, 25.29, 25.11), `Real Difference` = c(11.27,
10.18, 11.38, 10.98, 12.95, 12.06, 11.31, 17.16, 12.38, 10.32
)), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"
))
Answer 1
Score: 2
From my comment:
# Combine the rounded difference with the signed percentage from the `%` column
with(dataset, sprintf("%.0f (%+.1f%%)", `Nominal Difference`, `%`))
# [1] "1007 (+41.0%)" "910 (+37.2%)" "1016 (+43.0%)" "981 (+41.9%)" "1157 (+49.6%)" "1078 (+46.2%)" "1010 (+43.5%)"
# [8] "1533 (+66.3%)" "1106 (+49.0%)" "922 (+41.1%)"
If you need to align each substring, sprintf supports padding to a field width: the '0' flag pads numbers with leading zeros, and the '-' flag switches from right to left alignment:
'-' Left adjustment of converted argument in its field.
'0' For numbers, pad to the field width with leading zeros. For
characters, this zero-pads on some platforms and is ignored
on others.
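A numeric conversion can be space-padded directly by giving it a field width (for example %6.0f), but the parentheses here belong to the second piece, so it is simpler to format each piece individually and then call sprintf again on the resulting strings.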
with(dataset,
  sprintf("%-6s %8s",
          sprintf("%.0f", `Nominal Difference`),  # left-aligned in a field of width 6
          sprintf("(%+.1f%%)", `%`))              # right-aligned in a field of width 8
)
# [1] "1007   (+41.0%)" "910    (+37.2%)" "1016   (+43.0%)" "981    (+41.9%)" "1157   (+49.6%)" "1078   (+46.2%)" "1010   (+43.5%)"
# [8] "1533   (+66.3%)" "1106   (+49.0%)" "922    (+41.1%)"
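To make the effect of the width and the two flags quoted above concrete, here is a minimal sketch on two of the values. The square brackets are only there to make the padding visible, and the variable names are made up for illustration.

```r
x <- c(1007, 910)

# A plain field width pads with spaces on the left (right-aligned)
right <- sprintf("[%6.0f]", x)
right
# [1] "[  1007]" "[   910]"

# The "-" flag pads on the right instead (left-aligned)
left <- sprintf("[%-6.0f]", x)
left
# [1] "[1007  ]" "[910   ]"

# The "0" flag pads with leading zeros
zero <- sprintf("[%06.0f]", x)
zero
# [1] "[001007]" "[000910]"
```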
Data
dataset <- structure(list("City Name" = c("Barletta", "Caserta", "Avellino", "Lecce", "Benevento", "Isernia", "L'Aquila", "Catanzaro", "Campobasso", "Enna"), "Average amount" = c(2457L, 2445L, 2363L, 2342L, 2335L, 2334L, 2324L, 2310L, 2259L, 2242L), "Nominal Difference" = c(1007L, 910L, 1016L, 981L, 1157L, 1078L, 1010L, 1533L, 1106L, 922L), "%" = c(41, 37.2, 43, 41.9, 49.6, 46.2, 43.5, 66.3, 49, 41.1), "Real Wage" = c(27.5, 27.4, 26.5, 26.2, 26.1, 26.1, 26, 25.9, 25.3, 25.1), "Real Difference" = c(11.3, 10.2, 11.4, 11, 13, 12.1, 11.3, 17.2, 12.4, 10.3)), class = "data.frame", row.names = c(NA, -10L))
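The calls above only return a character vector. If the goal is to end up with the merged column inside the data frame, one possible way (a sketch in base R, using a two-row stand-in for the question's unrounded dput values) is to overwrite the column and then drop `%`:

```r
# Small stand-in for the question's data (two rows, three columns)
dataset <- data.frame(
  "City Name" = c("Barletta", "Caserta"),
  "Nominal Difference" = c(1006.8, 909.62),
  "%" = c(40.97, 37.2),
  check.names = FALSE  # keep the non-syntactic column names as they are
)

# Overwrite the column with the combined string, then drop `%`
dataset[["Nominal Difference"]] <- with(
  dataset,
  sprintf("%.0f (%+.1f%%)", `Nominal Difference`, `%`)
)
dataset[["%"]] <- NULL

dataset[["Nominal Difference"]]
# [1] "1007 (+41.0%)" "910 (+37.2%)"
```

Note that the column becomes character after this step, so it is best done last, once no further arithmetic on it is needed.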
Answer 2
Score: 1
We can use paste0() inside mutate(), adjusting the decimals with sprintf() and the padding with stringr::str_pad():
library(dplyr)
library(stringr)
mydat <- tribble(~ `City Name`, ~ `Nominal Difference`, ~`%`,
"Barletta", 1007, 41.0,
"Caserta", 910, 37.2
)
mydat %>%
mutate(`Nominal Difference` = paste0(
str_pad(
`Nominal Difference`,
width = max(nchar(`Nominal Difference`))),
sprintf(" (+%.1f%%)",`%`))
)
#> # A tibble: 2 x 3
#> `City Name` `Nominal Difference` `%`
#> <chr> <chr> <dbl>
#> 1 Barletta "1007 (+41.0%)" 41
#> 2 Caserta " 910 (+37.2%)" 37.2
Created on 2023-02-27 by the reprex package (v2.0.1)
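A hard-coded "+" in the format string would print "+-3.2%" for a negative percentage. The following base R sketch shows the same pad-then-paste idea with the sign left to sprintf. The negative value is made up for illustration, and formatC() is used here in place of str_pad(), which is a substitution of mine and not part of the answer.

```r
nominal <- c(1007, 910)
pct     <- c(41.0, -3.2)   # second value is hypothetical, to show a negative change

# Format the numbers first, then pad every string to the widest one
num_str <- sprintf("%.0f", nominal)
padded  <- formatC(num_str, width = max(nchar(num_str)))  # right-aligned by default

# "%+.1f" writes "+" or "-" as needed instead of a hard-coded "+"
combined <- paste0(padded, sprintf(" (%+.1f%%)", pct))
combined
# [1] "1007 (+41.0%)" " 910 (-3.2%)"
```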
Answer 3
Score: 1
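I think you can use tidyr::unite(), for example:
library(dplyr)
library(tidyr)
df %>%
  mutate(perc = sprintf("(+%.1f%%)", perc)) %>%   # format the percentage as "(+68.7%)"
  unite(val, val:perc, remove = TRUE, sep = " ")  # paste val and perc into one column
which produces output like: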
id val
1 1 68 (+68.7%)
2 2 39 (+38.4%)
3 3 1 (+77.0%)
4 4 34 (+49.8%)
5 5 87 (+71.8%)
6 6 43 (+99.2%)
7 7 14 (+38.0%)
8 8 82 (+77.7%)
9 9 59 (+93.5%)
10 10 51 (+21.2%)
Dummy Data
set.seed(1)
df <- data.frame(
id = 1:10,
val = sample(100, 10),
perc = 100 * runif(10)
)
# > df
# id val perc
# 1 1 68 68.70228
# 2 2 39 38.41037
# 3 3 1 76.98414
# 4 4 34 49.76992
# 5 5 87 71.76185
# 6 6 43 99.19061
# 7 7 14 38.00352
# 8 8 82 77.74452
# 9 9 59 93.47052
# 10 10 51 21.21425
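Applied to the question's own columns, the same approach might look like the sketch below. It assumes dplyr and tidyr are installed and uses two rows copied from the dput() output. Because those values are not rounded, the number is formatted with sprintf() before uniting, otherwise "1006.8" would be pasted as is.

```r
library(dplyr)
library(tidyr)

# Two rows copied from the question's dput() output
f <- tibble(
  `City Name` = c("Barletta -Andria-Trani", "Caserta"),
  `Nominal Difference` = c(1006.8, 909.62),
  `%` = c(40.97, 37.2)
)

result <- f %>%
  mutate(
    `Nominal Difference` = sprintf("%.0f", `Nominal Difference`),  # round before pasting
    `%` = sprintf("(%+.1f%%)", `%`)                                # signed, one decimal
  ) %>%
  unite("Nominal Difference", `Nominal Difference`, `%`, sep = " ")

result$`Nominal Difference`
# [1] "1007 (+41.0%)" "910 (+37.2%)"
```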
Answer 4
Score: 1
library(dplyr)
library(stringr)
library(tidyr)
# df is assumed to be the question's data with snake_case column names (`%` renamed to percent)
df %>%
  mutate(across(percent, ~ str_c("(", .x, "%)"))) %>%
  unite("nominal_difference", c(nominal_difference, percent), sep = " ")
# A tibble: 10 × 5
city_name average_amount nominal_difference real_wage real_difference
<chr> <dbl> <chr> <dbl> <dbl>
1 Barletta 2457 1007 (41%) 27.5 11.3
2 Caserta 2445 910 (37.2%) 27.4 10.2
3 Avellino 2363 1016 (43%) 26.5 11.4
4 Lecce 2342 981 (41.9%) 26.2 11
5 Benevento 2335 1157 (49.6%) 26.1 13
6 Isernia 2334 1078 (46.2%) 26.1 12.1
7 L'Aquila 2324 1010 (43.5%) 26 11.3
8 Catanzaro 2310 1533 (66.3%) 25.9 17.2
9 Campobasso 2259 1106 (49%) 25.3 12.4
10 Enna 2242 922 (41.1%) 25.1 10.3
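The output above shows "(41%)" where the question asks for "(+41.0%)". Pasting a number directly uses its shortest representation and adds no sign. A minimal sketch of the difference, on three of the percentages:

```r
pct <- c(41, 37.2, 49)

# Pasting a number uses its shortest representation, so trailing zeros vanish
pasted <- paste0("(", pct, "%)")
pasted
# [1] "(41%)"   "(37.2%)" "(49%)"

# sprintf() fixes the number of decimals and adds the sign
formatted <- sprintf("(%+.1f%%)", pct)
formatted
# [1] "(+41.0%)" "(+37.2%)" "(+49.0%)"
```

Swapping the str_c() call for the sprintf() call inside across() should then give the format requested in the question.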