June 15, 2023 06:01:45

Problem using lapply in R with a function that uses constants in the global environment

Question

I'm trying to apply a function that takes some inputs from the global environment held in a vector (BASELINE_CLASSIFICATION_THRESHOLDS) to a dataframe using lapply. In essence it transforms numbers to levels (Mild, Moderate, Severe, Extreme):

BASELINE_CLASSIFICATION_THRESHOLDS <- c(0, 3.5, 6.5, 10.0000001)
value_to_classification <- function(x){
  if((x >= BASELINE_CLASSIFICATION_THRESHOLDS[1]) && (x < BASELINE_CLASSIFICATION_THRESHOLDS[2])){
    classification <- "Mild"
  } else if((x >= BASELINE_CLASSIFICATION_THRESHOLDS[2]) && (x < BASELINE_CLASSIFICATION_THRESHOLDS[3])){
    classification <- "Moderate"
  } else if((x >= BASELINE_CLASSIFICATION_THRESHOLDS[3]) && (x < round(BASELINE_CLASSIFICATION_THRESHOLDS[4]))){
    classification <- "Severe"
  } else {
    classification <- "Extreme"
  }
  return(classification)
}
df <- data.frame(x = runif(10, min = 0, max = 10),
                 y = runif(10, min = 0, max = 10),
                 z = runif(10, min = 0, max = 10))

But when I try to lapply value_to_classification over the x column, I get warnings and only a single value back:

lapply(df["x"], value_to_classification)
$x
[1] "Mild"
Warning messages:
1: In (x >= BASELINE_CLASSIFICATION_THRESHOLDS[1]) && (x < BASELINE_CLASSIFICATION_THRESHOLDS[2]) :
  'length(x) = 10 > 1' in coercion to 'logical(1)'
2: In (x >= BASELINE_CLASSIFICATION_THRESHOLDS[1]) && (x < BASELINE_CLASSIFICATION_THRESHOLDS[2]) :
  'length(x) = 10 > 1' in coercion to 'logical(1)'

On the other hand, if I write

lapply(df[["x"]], value_to_classification)

it works. What I eventually want to do is to write something like

df[c("x1", "x2")] <- lapply(df[c("x", "y")], value_to_classification)

Some searching seems to suggest that my syntax is OK, but I'm clearly getting something wrong. What am I doing wrong, and how can I fix this?

Sincerely and with many thanks in advance

Thomas Philips

Answer 1

Score: 1

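The issue is that value_to_classification is not vectorized: if() and && only work with a single logical value, so value_to_classification(c(1, 2, 3)) returns one value instead of three. lapply(df["x"], ...) iterates over the columns of a data frame and passes the whole column to the function in one call, whereas lapply(df[["x"]], ...) iterates over the numbers in the vector and passes them one at a time, which is why the second form works.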
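To see what lapply hands to the function in each case, here is a minimal sketch that substitutes length() for value_to_classification. The names whole_column and per_element are made up for illustration. Note also that the warning shown in the question comes from an older R; from R 4.3.0 onward, && with an operand longer than one is an error instead.

```r
# Example data frame with the same x column shape as in the question.
df <- data.frame(x = runif(10, min = 0, max = 10))

# df["x"] is a one-column data frame, i.e. a list with one element:
# lapply calls the function once, with the entire column (length 10).
whole_column <- lapply(df["x"], length)

# df[["x"]] is a plain numeric vector:
# lapply calls the function ten times, once per value (length 1 each).
per_element <- lapply(df[["x"]], length)

str(whole_column)
str(per_element)
```

Because the target df[c("x1", "x2")] needs one full-length result per column, the function itself has to accept a whole column and return a vector of the same length.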

One solution is to vectorize the function:

vectorized_value_to_classification <- Vectorize(value_to_classification)
df[c("x1", "x2")] <- lapply(df[c("x", "y")], vectorized_value_to_classification)
df
          x        y         z       x1       x2
1  3.233599 5.612147 2.9525939     Mild Moderate
2  5.453014 3.659298 8.1642952 Moderate Moderate
3  7.104259 6.333049 7.1706136   Severe Moderate
4  4.199447 3.277607 8.9458447 Moderate     Mild
5  9.352140 7.135801 2.6721405   Severe   Severe
6  7.682951 1.358830 4.2102313   Severe     Mild
7  6.551999 9.986188 1.9995422   Severe   Severe
8  7.436272 9.260056 0.1093833   Severe   Severe
9  5.163593 7.689474 0.2999034 Moderate   Severe
10 7.500994 4.599129 8.5266752   Severe Moderate
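Vectorize() still calls the original function once per element behind the scenes. As an alternative that is not part of the answer above, the function could be written to handle a whole vector directly with base R's findInterval(). This is only a sketch: the name classify_values and its thresholds argument are assumptions, and the label mapping is chosen to reproduce the original if/else chain, including "Extreme" for values below the first threshold.

```r
BASELINE_CLASSIFICATION_THRESHOLDS <- c(0, 3.5, 6.5, 10.0000001)

# Hypothetical vectorized variant (the name is an assumption, not from the post).
classify_values <- function(x, thresholds = BASELINE_CLASSIFICATION_THRESHOLDS) {
  # Same cut points as the original chain, which also rounds the last threshold.
  breaks <- c(thresholds[1:3], round(thresholds[4]))
  # findInterval() uses left-closed intervals: it returns 0 below the first
  # break, 1 to 3 for the three bands, and 4 at or above the last break.
  idx <- findInterval(x, breaks)
  # Index 0 and index 4 both fall into the original else branch ("Extreme").
  labels <- c("Extreme", "Mild", "Moderate", "Severe", "Extreme")
  labels[idx + 1]
}

df <- data.frame(x = runif(10, min = 0, max = 10),
                 y = runif(10, min = 0, max = 10),
                 z = runif(10, min = 0, max = 10))

# Each column is classified in a single call, so lapply over columns works.
df[c("x1", "x2")] <- lapply(df[c("x", "y")], classify_values)
df
```

One difference from the original: an NA input yields NA here, whereas the if/else version would stop with an error.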
