Issue outputting data generated from a user-defined function that uses sapply()
Question
I am trying to summarize data generated by looping over a data set with sapply(), and I am not sure why I can't access the data frame that is generated. I'm setting up a Monte Carlo simulation: I have a function that sets up parameters and an estimate, and I want to apply that function a set number of times per data point in a data set. So I call my function through replicate(), and that call sits inside a function that uses sapply(). It appears to work, but I cannot access the data afterwards to describe the resulting distributions of estimates; the data frame is printed, but not stored as an object. That is nice, but I now need to calculate means and confidence intervals from the estimates, and probably plot some.
I'm not super experienced with R, but here is basically what I'm trying to do:
repapply <- function(iter, datapoint){
  estimates <- sapply(datapoint, function(datapoint){
    replicate(n=iter, expr=generate_data(datapoint))
  })
  estimatesdf <- data.frame(estimates)
  return(estimatesdf)
}
#test:
repapply(1000, measure)
Could anyone explain why there isn't any output dataframe? It prints the information below:
X1 X2
1 476.454335, 6.240725, 4.433396, 24.017384, 36.900104 594.890067, 2.310075, 7.210158, 21.379092, 30.256849
X3 X4
1 359.167706, 5.817891, 7.276368, 20.776742, 23.539489 459.848602, 3.826445, 4.319803, 23.774576, 52.130509
X5 X6
1 504.624220, 8.159456, 4.110860, 23.805009, 42.983076 578.252014, 6.749054, 5.880862, 23.312351, 42.320465
X7 X8
1 427.196750, 7.458934, 3.295953, 24.764725, 45.647360 284.724297, 5.234101, 6.481678, 20.159478, 42.160186
X9 X10
1 307.605356, 4.386591, 5.562230, 22.711697, 3.675961 418.109465, 5.618156, 3.135784, 24.503502, 34.891379
...
Answer 1
Score: 1
Welcome to SO!
Short Answer:
You have to assign the result of repapply(1000, measure) to an object. In other words, you have to give it a name. For instance:
df <- repapply(1000, measure)
Why:
When you define objects inside a function, they are local to that function's scope. So when you return estimatesdf, you are really just returning the data.frame it refers to, not the name itself. If the call is not assigned to anything at the top level, R simply prints the returned value and then discards it. Hence, you could even compress the last two lines of your function into return(data.frame(estimates)) and get the same result.
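To make the point concrete, here is a minimal sketch of assigning the returned data frame and then summarizing it. The generate_data() body and the measure vector are hypothetical stand-ins (neither is shown in the question); the sketch assumes each call returns a single numeric estimate.

```r
# Hypothetical stand-in for the asker's generate_data():
# assumed to return ONE numeric estimate per call.
generate_data <- function(datapoint) {
  rnorm(1, mean = datapoint, sd = 1)
}

repapply <- function(iter, datapoint) {
  # Each replicate() call returns a numeric vector of length iter, so
  # sapply() simplifies the result to an iter x length(datapoint) matrix.
  estimates <- sapply(datapoint, function(d) {
    replicate(n = iter, expr = generate_data(d))
  })
  data.frame(estimates)  # the last expression is the return value
}

set.seed(42)
measure <- c(10, 20, 30)        # hypothetical data points

df <- repapply(1000, measure)   # assign the result so it can be reused

# One column per data point, one row per iteration
col_means <- colMeans(df)

# 95% percentile interval for each column (rows: 2.5% and 97.5%)
col_ci <- sapply(df, quantile, probs = c(0.025, 0.975))
```

Once the result is stored in df, it can be passed to colMeans(), quantile(), hist() and so on. If generate_data() returns several values per call, as the printed output in the question seems to suggest, the result will have a different shape and may need reshaping before it can be summarized this way.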
Alternatively:
Note that objects which already exist outside the function are not changed by ordinary assignment inside it: estimatesdf <- ... within repapply() always creates a local variable, even if a variable named estimatesdf is already defined globally (e.g., set equal to 0). To set an outside estimatesdf from within the function, you would need the superassignment operator <<- (or assign()), which is generally discouraged because it hides side effects. Assigning the returned value, as in the short answer, is the cleaner option.
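One point about scoping is worth checking directly, because it is easy to get wrong: in R, a plain <- inside a function does not touch a variable of the same name outside it. The sketch below uses two hypothetical helper functions, set_local() and set_global(), purely for illustration.

```r
estimatesdf <- 0   # variable defined outside any function

set_local <- function() {
  # Plain assignment creates a NEW local variable; the outer one is untouched.
  estimatesdf <- data.frame(x = 1:3)
  invisible(NULL)
}
set_local()
after_local <- estimatesdf   # still 0

set_global <- function() {
  # Superassignment modifies the binding in the enclosing environment.
  estimatesdf <<- data.frame(x = 1:3)
  invisible(NULL)
}
set_global()
after_global <- estimatesdf  # now a data.frame
```

The <<- version works, but it relies on a side effect, so returning the value and assigning it at the call site is usually easier to reason about.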