January 6, 2020 20:25:39

Applying a custom function to every row in R

Question

I created a function to calculate the rolling mean of a row in a data frame:

library(zoo)  # provides rollmean()
rollmean_circular <- function(x) {t(rollmean(t(cbind(x[9:10],x,x[1:2])),5))}

df <- structure(list(X1 = c(5L, 5L, 9L, 0L, 9L, 10L, 10L, 1L, 0L, 10L
), X2 = c(6L, 8L, 6L, 9L, 7L, 5L, 0L, 7L, 5L, 8L), X3 = c(10L, 
7L, 2L, 1L, 2L, 10L, 2L, 9L, 6L, 4L), X4 = c(6L, 0L, 9L, 1L, 
6L, 8L, 3L, 7L, 8L, 1L), X5 = c(0L, 9L, 8L, 3L, 1L, 8L, 3L, 9L, 
5L, 2L), X6 = c(0L, 10L, 9L, 10L, 3L, 1L, 6L, 0L, 6L, 9L), X7 = c(9L, 
10L, 0L, 10L, 10L, 9L, 0L, 1L, 10L, 2L), X8 = c(2L, 6L, 3L, 7L, 
7L, 9L, 8L, 9L, 1L, 0L), X9 = c(0L, 8L, 8L, 9L, 0L, 5L, 9L, 9L, 
4L, 8L), X10 = c(1L, 4L, 3L, 0L, 1L, 7L, 3L, 6L, 5L, 0L)), class = "data.frame", row.names = c(NA, 
-10L))

   X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1   5  6 10  6  0  0  9  2  0   1
2   5  8  7  0  9 10 10  6  8   4
3   9  6  2  9  8  9  0  3  8   3
4   0  9  1  1  3 10 10  7  9   0
5   9  7  2  6  1  3 10  7  0   1
6  10  5 10  8  8  1  9  9  5   7
7  10  0  2  3  3  6  0  8  9   3
8   1  7  9  7  9  0  1  9  9   6
9   0  5  6  8  5  6 10  1  4   5
10 10  8  4  1  2  9  2  0  8   0

Given a vector, the function prepends its last 2 elements and appends its first 2 elements, then takes a rolling mean of width 5, so there are no NAs at the front or back.
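The wrap-around idea described above can be sketched in base R on a single plain vector. This is only an illustration, not the function from the question: the names padded, k, h and circ_mean are made up here, and x holds the values of the first row of the example data.

```r
# Row 1 of the example data, written as a plain numeric vector.
x <- c(5, 6, 10, 6, 0, 0, 9, 2, 0, 1)
k <- 5             # window width
h <- (k - 1) / 2   # number of elements borrowed from each end

# Wrap the ends around: last h values in front, first h values behind.
padded <- c(tail(x, h), x, head(x, h))

# Mean of each length-k window, giving one value per original element.
circ_mean <- sapply(seq_along(x), function(i) mean(padded[i:(i + k - 1)]))
circ_mean
```

For this vector the values agree with the single-row output shown in the question (4.4, 5.6, 5.4, ...).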

It works perfectly when I apply it to one row of the data frame.

r = df[1,]
rollmean_circular(r)

  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
1  4.4  5.6  5.4  4.4    5  3.4  2.2  2.4  3.4   2.8

However, when I use apply to run this function on every row of my data frame, it returns logical(0).

apply(df,1,rollmean_circular)

logical(0)

What am I missing?

When I apply another function that gives the same output for a single row, it works:

stdize <- function(x, na.rm=T) {(x - min(x, na.rm=T)) / (max(x, na.rm=T) - min(x, na.rm=T))}

stdize(r)

   X1  X2 X3  X4 X5 X6  X7  X8 X9 X10
1 0.5 0.6  1 0.6  0  0 0.9 0.2  0 0.1

apply(df,1,stdize)

    [,1] [,2]      [,3] [,4] [,5]      [,6] [,7]      [,8] [,9] [,10]
X1   0.5  0.5 1.0000000  0.0  0.9 1.0000000  1.0 0.1111111  0.0   1.0
X2   0.6  0.8 0.6666667  0.9  0.7 0.4444444  0.0 0.7777778  0.5   0.8
X3   1.0  0.7 0.2222222  0.1  0.2 1.0000000  0.2 1.0000000  0.6   0.4
X4   0.6  0.0 1.0000000  0.1  0.6 0.7777778  0.3 0.7777778  0.8   0.1
X5   0.0  0.9 0.8888889  0.3  0.1 0.7777778  0.3 1.0000000  0.5   0.2
X6   0.0  1.0 1.0000000  1.0  0.3 0.0000000  0.6 0.0000000  0.6   0.9
X7   0.9  1.0 0.0000000  1.0  1.0 0.8888889  0.0 0.1111111  1.0   0.2
X8   0.2  0.6 0.3333333  0.7  0.7 0.8888889  0.8 1.0000000  0.1   0.0
X9   0.0  0.8 0.8888889  0.9  0.0 0.4444444  0.9 1.0000000  0.4   0.8
X10  0.1  0.4 0.3333333  0.0  0.1 0.6666667  0.3 0.6666667  0.5   0.0

Answer 1

Score: 2

It seems you're confusing vectors and matrices in your function. You could unlist inside the function and transpose the result afterwards.

rollmean_circular <- function(x) zoo::rollmean(unlist(c(x[9:10], x, x[1:2])), 5)

t(apply(df, 1, rollmean_circular))
#       X1  X2  X3  X4  X5  X6  X7  X8  X9 X10
#  [1,] 4.4 5.6 5.4 4.4 5.0 3.4 2.2 2.4 3.4 2.8
#  [2,] 6.4 4.8 5.8 6.8 7.2 7.0 8.6 7.6 6.6 6.2
#  [3,] 5.6 5.8 6.8 6.8 5.6 5.8 5.6 4.6 4.6 5.8
#  [4,] 3.8 2.2 2.8 4.8 5.0 6.2 7.8 7.2 5.2 5.0
#  [5,] 3.8 5.0 5.0 3.8 4.4 5.4 4.2 4.2 5.4 4.8
#  [6,] 7.4 8.0 8.2 6.4 7.2 7.0 6.4 6.2 8.0 7.2
#  [7,] 4.8 3.6 3.6 2.8 2.8 4.0 5.2 5.2 6.0 6.0
#  [8,] 6.4 6.0 6.6 6.4 5.2 5.2 5.6 5.0 5.2 6.4
#  [9,] 4.0 4.8 4.8 6.0 7.0 6.0 5.2 5.2 4.0 3.0
# [10,] 6.0 4.6 5.0 4.8 3.6 2.8 4.2 3.8 4.0 5.2
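One way to see the vector/matrix mix-up is to compare what cbind builds in the two situations. The sketch below uses only base R; x stands in for the plain vector that apply hands to the function for each row, and r for a one-row data frame like df[1,]. Both names and the values (row 1 of the example data) are just for illustration.

```r
x <- c(5, 6, 10, 6, 0, 0, 9, 2, 0, 1)   # what apply() passes: a plain vector
r <- as.data.frame(t(x))                 # a one-row data frame, like df[1, ]

# With vectors, cbind() makes each argument a COLUMN and recycles the
# short ones, so the result has 10 rows and 3 columns.
as_vector <- cbind(x[9:10], x, x[1:2])
dim(as_vector)

# With one-row data frames, cbind() glues the columns side by side,
# so the result has 1 row and 14 columns.
as_row <- cbind(r[9:10], r, r[1:2])
dim(as_row)
```

After the t() in the original function, the vector case leaves only 3 observations per column, which is fewer than the window width of 5. That is presumably why the result comes back empty inside apply, while the one-row data frame case has 14 observations and works.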

This can also be done in base R (with most of the credit to @MattiPastell):

fun <- function(x, n=5) na.omit(filter(c(tail(x, 2), x, head(x, 2)), rep(1 / n, n), sides=2))
t(apply(df, 1, fun))
#       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#  [1,]  4.4  5.6  5.4  4.4  5.0  3.4  2.2  2.4  3.4   2.8
#  [2,]  6.4  4.8  5.8  6.8  7.2  7.0  8.6  7.6  6.6   6.2
#  [3,]  5.6  5.8  6.8  6.8  5.6  5.8  5.6  4.6  4.6   5.8
#  [4,]  3.8  2.2  2.8  4.8  5.0  6.2  7.8  7.2  5.2   5.0
#  [5,]  3.8  5.0  5.0  3.8  4.4  5.4  4.2  4.2  5.4   4.8
#  [6,]  7.4  8.0  8.2  6.4  7.2  7.0  6.4  6.2  8.0   7.2
#  [7,]  4.8  3.6  3.6  2.8  2.8  4.0  5.2  5.2  6.0   6.0
#  [8,]  6.4  6.0  6.6  6.4  5.2  5.2  5.6  5.0  5.2   6.4
#  [9,]  4.0  4.8  4.8  6.0  7.0  6.0  5.2  5.2  4.0   3.0
# [10,]  6.0  4.6  5.0  4.8  3.6  2.8  4.2  3.8  4.0   5.2
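To make the role of na.omit in the base R version a bit more visible, here is the same idea on one vector, step by step. It is a sketch with made-up names (padded, smoothed, result), and it calls stats::filter with its namespace because a bare filter can resolve to a different function when a package such as dplyr is attached.

```r
x <- c(5, 6, 10, 6, 0, 0, 9, 2, 0, 1)    # row 1 of the example data

# Pad circularly, then take a centred moving average of width 5.
padded   <- c(tail(x, 2), x, head(x, 2))
smoothed <- stats::filter(padded, rep(1 / 5, 5), sides = 2)

# The centred window is undefined for the first 2 and last 2 positions,
# so those entries are NA; they are exactly the padding positions.
is.na(smoothed)

# Dropping the NAs leaves one value per original element.
result <- as.numeric(na.omit(smoothed))
result
```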

Answer 2

Score: 0

rollmean automatically works on every column of its input, so this can be done directly, eliminating the apply:

library(zoo)
t(rollmean(t(cbind(df[9:10], df, df[1:2])), 5))

or using stats::filter from base R, which also works on every column:

t(filter(t(df), rep(1, 5)/5, circular = TRUE))

Either of these gives this matrix:

      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]  4.4  5.6  5.4  4.4  5.0  3.4  2.2  2.4  3.4   2.8
 [2,]  6.4  4.8  5.8  6.8  7.2  7.0  8.6  7.6  6.6   6.2
 [3,]  5.6  5.8  6.8  6.8  5.6  5.8  5.6  4.6  4.6   5.8
 [4,]  3.8  2.2  2.8  4.8  5.0  6.2  7.8  7.2  5.2   5.0
 [5,]  3.8  5.0  5.0  3.8  4.4  5.4  4.2  4.2  5.4   4.8
 [6,]  7.4  8.0  8.2  6.4  7.2  7.0  6.4  6.2  8.0   7.2
 [7,]  4.8  3.6  3.6  2.8  2.8  4.0  5.2  5.2  6.0   6.0
 [8,]  6.4  6.0  6.6  6.4  5.2  5.2  5.6  5.0  5.2   6.4
 [9,]  4.0  4.8  4.8  6.0  7.0  6.0  5.2  5.2  4.0   3.0
[10,]  6.0  4.6  5.0  4.8  3.6  2.8  4.2  3.8  4.0   5.2

Depending on the needs of your application you could consider storing these series in columns rather than rows in which case the transposes would not be needed.
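As a rough illustration of the column-wise layout suggested above, the sketch below stores two series as columns of a matrix, so no transpose is needed. The matrix name series and the column labels a and b are invented for this example; the values are rows 1 and 2 of the example data.

```r
# Two series stored as COLUMNS (rows 1 and 2 of the example data).
series <- cbind(
  a = c(5, 6, 10, 6, 0, 0, 9, 2, 0, 1),
  b = c(5, 8, 7, 0, 9, 10, 10, 6, 8, 4)
)

# circular = TRUE wraps each column around its ends, so no manual
# padding is needed and no NAs appear at the edges.
smoothed <- stats::filter(series, rep(1, 5) / 5, circular = TRUE)
smoothed
```

Each column of smoothed then holds the circular rolling mean of the corresponding series, matching rows 1 and 2 of the matrix shown above.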
