Problem using lapply in R with a function that uses constants in the global environment

Question

I'm trying to use lapply to apply a function to a data frame. The function reads some of its inputs from a vector in the global environment (BASELINE_CLASSIFICATION_THRESHOLDS). In essence, it maps numbers to levels (Mild, Moderate, Severe, Extreme):

    BASELINE_CLASSIFICATION_THRESHOLDS <- c(0, 3.5, 6.5, 10.0000001)

    value_to_classification <- function(x) {
      if ((x >= BASELINE_CLASSIFICATION_THRESHOLDS[1]) && (x < BASELINE_CLASSIFICATION_THRESHOLDS[2])) {
        classification <- "Mild"
      } else if ((x >= BASELINE_CLASSIFICATION_THRESHOLDS[2]) && (x < BASELINE_CLASSIFICATION_THRESHOLDS[3])) {
        classification <- "Moderate"
      } else if ((x >= BASELINE_CLASSIFICATION_THRESHOLDS[3]) && (x < round(BASELINE_CLASSIFICATION_THRESHOLDS[4]))) {
        classification <- "Severe"
      } else {
        classification <- "Extreme"
      }
      return(classification)
    }

    df <- data.frame(x = runif(10, min = 0, max = 10),
                     y = runif(10, min = 0, max = 10),
                     z = runif(10, min = 0, max = 10))

But when I try to lapply value_to_classification over the column x, I get a single value and a pair of warnings:

    lapply(df["x"], value_to_classification)
    $x
    [1] "Mild"

    Warning messages:
    1: In (x >= BASELINE_CLASSIFICATION_THRESHOLDS[1]) && (x < BASELINE_CLASSIFICATION_THRESHOLDS[2]) :
      'length(x) = 10 > 1' in coercion to 'logical(1)'
    2: In (x >= BASELINE_CLASSIFICATION_THRESHOLDS[1]) && (x < BASELINE_CLASSIFICATION_THRESHOLDS[2]) :
      'length(x) = 10 > 1' in coercion to 'logical(1)'

On the other hand, if I write

    lapply(df[["x"]], value_to_classification)

it works. What I eventually want to do is to write something like

    df[c("x1", "x2")] <- lapply(df[c("x", "y")], value_to_classification)

Some searching seems to suggest that my syntax is OK, but I'm clearly getting something wrong. What am I doing wrong, and how can I fix this?

Sincerely and with many thanks in advance

Thomas Philips

Answer 1

Score: 1

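The issue is that value_to_classification is not vectorized: if and && expect a single TRUE or FALSE, so value_to_classification(c(1,2,3)) returns only one value instead of three (and in R 4.3.0 and later it stops with an error rather than a warning). lapply(df["x"], ...) iterates over the columns of a one-column data frame, so the function receives the whole column at once. lapply(df[["x"]], ...) iterates over the elements of the vector, so the function is called once per number, which is why that form works.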
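To see what lapply actually hands to the function under each subsetting form, here is a minimal sketch. The three-row data frame and the names arg_lengths_single and arg_lengths_double are made up for illustration and are not part of the question.

```r
# Made-up data frame for illustration only.
df <- data.frame(x = c(1, 5, 9), y = c(2, 4, 8))

# df["x"] is a one-column data frame, i.e. a list holding one vector.
length(df["x"])     # lapply() will make a single call
# df[["x"]] is the numeric vector itself.
length(df[["x"]])   # lapply() will make one call per number

# Record the length of the argument that the applied function sees.
arg_lengths_single <- unlist(lapply(df["x"], length))    # whole column in one call
arg_lengths_double <- unlist(lapply(df[["x"]], length))  # one number per call
```

With single brackets the function is called once with the entire column, so any if/&& test inside it sees a vector longer than one.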

One solution is to vectorize the function:

    vectorized_value_to_classification <- Vectorize(value_to_classification)
    df[c("x1", "x2")] <- lapply(df[c("x", "y")], vectorized_value_to_classification)
    df
              x        y         z       x1       x2
    1  3.233599 5.612147 2.9525939     Mild Moderate
    2  5.453014 3.659298 8.1642952 Moderate Moderate
    3  7.104259 6.333049 7.1706136   Severe Moderate
    4  4.199447 3.277607 8.9458447 Moderate     Mild
    5  9.352140 7.135801 2.6721405   Severe   Severe
    6  7.682951 1.358830 4.2102313   Severe     Mild
    7  6.551999 9.986188 1.9995422   Severe   Severe
    8  7.436272 9.260056 0.1093833   Severe   Severe
    9  5.163593 7.689474 0.2999034 Moderate   Severe
    10 7.500994 4.599129 8.5266752   Severe Moderate
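As an alternative to wrapping the scalar function, the binning could be done with base R's cut(), which is vectorized already. This is only a sketch of a different technique, not what the answer above uses: the name classify_with_cut and the small test data frame are assumptions made for illustration. It also differs from the original function in one respect: values below the first threshold (and NA inputs) come back as NA rather than "Extreme".

```r
BASELINE_CLASSIFICATION_THRESHOLDS <- c(0, 3.5, 6.5, 10.0000001)

# Hypothetical vectorized variant built on cut(); not from the original answer.
classify_with_cut <- function(x, thresholds = BASELINE_CLASSIFICATION_THRESHOLDS) {
  # Mirror the original's round() on the last threshold, then add an open upper bin.
  breaks <- c(thresholds[1:3], round(thresholds[4]), Inf)
  labels <- c("Mild", "Moderate", "Severe", "Extreme")
  # right = FALSE gives left-closed intervals [a, b), matching x >= a && x < b.
  as.character(cut(x, breaks = breaks, labels = labels, right = FALSE))
}

# Made-up values chosen to sit on and around the thresholds.
df <- data.frame(x = c(1, 3.5, 7, 10), y = c(0, 6.49, 6.5, 9.99))
df[c("x1", "x2")] <- lapply(df[c("x", "y")], classify_with_cut)
df
```

Because cut() handles a whole vector in one call, lapply over df[c("x", "y")] works directly, with no per-element loop.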

huangapple
  • Posted on June 15, 2023 at 06:01:45
  • Link to this post: https://go.coder-hub.com/76477861.html