How to apply a function to specific columns without losing other columns?

Question

I have a dataset with dates in the first column and my data in the next few columns. I want to apply an arcsine transformation to the data without affecting the date column. I wrapped the transformation in a function and applied it with lapply, but I can't figure out how to get a new data frame of the transformed data without losing the date column. This is the code I have:

    df <- read.csv("2018data.csv")
    fun_asin <- function(x) {
      return(asin(x / 100))
    }
    df <- data.frame(lapply(df[, 2:4], fun_asin))

This returns a new data frame with the transformed data but without the date column. How do I change this to keep the columns that aren't being altered?

Here is the dput for a subset of my dataset:

    df <- structure(list(Date.Time = c("9/1/18 12:00", "9/2/18 0:00", "9/2/18 12:00",
    "9/3/18 0:00", "9/3/18 12:00", "9/4/18 0:00", "9/4/18 12:00",
    "9/5/18 0:00", "9/5/18 12:00", "9/6/18 0:00", "9/6/18 12:00",
    "9/7/18 0:00", "9/7/18 12:00", "9/8/18 0:00", "9/8/18 12:00",
    "9/9/18 0:00", "9/9/18 12:00", "9/10/18 0:00", "9/10/18 12:00",
    "9/11/18 0:00"), Narraguagus.R = c(26.38297872, 29.79214781,
    25.06265664, 29.27400468, 29.23433875, 31.89066059, 31.97115385,
    30.71748879, 32.13429257, 27.20930233, 30.21390374, 28.07017544,
    27.68361582, 29.76878613, 31.65680473, 28.61952862, 30.42168675,
    30.37634409, 24.56896552, 24.56140351), Bluehill.R = c(69.48775056,
    73.01401869, 68.46071044, 70.51886792, 73.29545455, 69.72972973,
    68.95459345, 70.28451001, 65.48076923, 63.41929322, 64.20454545,
    66.23246493, 68.88412017, 73.6196319, 75.06112469, 76.06318348,
    76.05839416, 72.01591512, 69.98556999, 69.828722), Jericho.R = c(4.761904762,
    0, 0, 7.692307692, 0, 0, 0, 0, 0, 0, 0, 5.882352941, 0, 0, 0,
    0, 0, 5.882352941, 0, 3.448275862)), row.names = c(NA, 20L), class = "data.frame")

Answer 1

Score: 2

With base R, you just have to manually re-include the column in your output:

    df_out <- data.frame(df[1], lapply(df[, 2:4], fun_asin))

The same idea, written more generally:

    cols_to_transform <- 2:4
    df_out <- data.frame(
      # columns that are left untouched
      df[setdiff(seq_along(df), cols_to_transform)],
      # columns that get transformed
      lapply(df[cols_to_transform], fun_asin)
    )

With dplyr, you can use across to apply a function to just a subset of columns (here, every column except the first):

    library(dplyr)
    df_out <- mutate(df, across(-1, fun_asin))
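
A related base R idiom, not part of the answer above, is to assign the transformed columns back into a copy of the data frame, which leaves every other column and the column order untouched. The sketch below assumes the first three rows of the question's data and writes fun_asin as a one-liner; df_out and cols_to_transform are only illustrative names.

```r
# First three rows of the question's data, used as a small stand-in.
df <- data.frame(
  Date.Time = c("9/1/18 12:00", "9/2/18 0:00", "9/2/18 12:00"),
  Narraguagus.R = c(26.38297872, 29.79214781, 25.06265664),
  Bluehill.R = c(69.48775056, 73.01401869, 68.46071044),
  Jericho.R = c(4.761904762, 0, 0)
)

# Same transformation as in the question.
fun_asin <- function(x) asin(x / 100)

cols_to_transform <- 2:4

# Work on a copy so the original data frame stays available.
df_out <- df

# Assigning a list of columns into df_out[cols] replaces only those columns.
df_out[cols_to_transform] <- lapply(df_out[cols_to_transform], fun_asin)

str(df_out)
```

Because only the selected columns are overwritten, this also works when the untouched columns are not at the front of the data frame.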

Answer 2

Score: 0

Two more approaches, one with nest and map and one with map_df:

    library(dplyr)  # provides mutate() and select()
    library(tidyr)
    library(purrr)
    df %>%
      nest(data = -1) %>%
      mutate(fun_asin = map(data, ~fun_asin(.x))) %>%
      unnest_wider(fun_asin) %>%
      select(-data)

The map_df version (note that the date column ends up last):

    library(purrr)
    map_df(df[-1], fun_asin) %>%
      cbind(df[1])
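
If keeping the original column order matters, purrr's modify_at is another option that is not mentioned in this answer. It applies a function to the elements selected by position or name and returns an object of the same type as its input. This is a minimal sketch that assumes purrr is installed and uses the first three rows of the question's data.

```r
library(purrr)

# First three rows of the question's data, used as a small stand-in.
df <- data.frame(
  Date.Time = c("9/1/18 12:00", "9/2/18 0:00", "9/2/18 12:00"),
  Narraguagus.R = c(26.38297872, 29.79214781, 25.06265664),
  Bluehill.R = c(69.48775056, 73.01401869, 68.46071044),
  Jericho.R = c(4.761904762, 0, 0)
)

# Same transformation as in the question.
fun_asin <- function(x) asin(x / 100)

# Transform columns 2 to 4 in place; column 1 is passed through unchanged.
df_out <- modify_at(df, 2:4, fun_asin)

str(df_out)
```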

Answer 3

Score: 0

A slightly more flexible approach, using across from dplyr to transform every column that is not a character column:

    library(dplyr)
    newdf <- df %>%
      mutate(across(-where(is.character),
                    fun_asin))
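
The same type-based selection can be sketched in base R, for readers who would rather not load dplyr. This is an illustration and not part of the answer: it selects numeric columns instead of excluding character ones, and is_num and df_out are hypothetical names.

```r
# First three rows of the question's data, used as a small stand-in.
df <- data.frame(
  Date.Time = c("9/1/18 12:00", "9/2/18 0:00", "9/2/18 12:00"),
  Narraguagus.R = c(26.38297872, 29.79214781, 25.06265664),
  Bluehill.R = c(69.48775056, 73.01401869, 68.46071044),
  Jericho.R = c(4.761904762, 0, 0)
)

# Same transformation as in the question.
fun_asin <- function(x) asin(x / 100)

# Logical vector marking which columns are numeric.
is_num <- vapply(df, is.numeric, logical(1))

# Transform only the numeric columns; the date column is left as it is.
df_out <- df
df_out[is_num] <- lapply(df_out[is_num], fun_asin)

str(df_out)
```

Selecting by type rather than by position means the code does not need to change if columns are added or reordered, as long as every numeric column should be transformed.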

huangapple
  • Posted on April 10, 2023, 23:04:04
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75978238.html