Retrieve memoised object using readRDS() and hash

Question

Consider the following code:

library(cachem)
library(memoise)

cache.dir <- "/Users/abcd/Desktop/temp_cache/"
cache <- cachem::cache_disk(dir = cache.dir, max_size = 1024^2)

fun <- function (x) {x^2}

fun.memo <- memoise(f = fun, cache = cache)

res.1 <- fun.memo(x = 2)
res.2 <- fun.memo(x = 3)

So far so good. I can compute fun.memo once and retrieve its results later by calling it again.

Now I have the following "problem": I have a lengthy script with several memoised function calls. At the end I just want to further process the output of the last function call, which depends on the output of memoised function calls further up in the script. It would be nice if I could somehow retrieve the memoised objects directly from the .rds files in cache.dir. This would let me skip the lengthy script on top (not for performance reasons, since memoise already takes care of that, but to avoid lengthy code). I am thinking about something like:

setwd(cache.dir)
res.y <- readRDS(paste0(my.hash.1, ".rds"))
res.z <- readRDS(paste0(my.hash.2, ".rds"))

However, I can't generate those hashes in the filenames again:

rlang::hash(x = res.1)
rlang::hash(x = res.2)

rlang::hash(x = fun)
rlang::hash(x = fun.memo)

all yield different hashes. It seems that the hash generated within memoise is not the hash that gets written into the .rds filename.

I know that retrieving the objects like that is sub-optimal since then it is not clear what arguments they resulted from. Still it would be nice to avoid the lengthy code on top. Of course, I could wrap all the preceding code into a function or a script and source() it, but that's not the point here. Any advice?

Answer 1

Score: 1

I think you are somewhat wasting your time, but if you look inside the memoise internals you can see how the keys are determined, and you can hack your way to the key hashes it chooses.
Because in this case there aren't any _additionals, I can boil it down to:


library(cachem)
library(memoise)

cache.dir <- tempdir(check = TRUE)
cache <- cachem::cache_disk(dir = cache.dir, max_size = 1024^2)

fun <- function (x) {x^2}

fun.memo <- memoise(f = fun, cache = cache)

res.1 <- fun.memo(x = 2)

# The cache file is named after the key memoise computed for this call
list.files(path = cache.dir)
# [1] "37513c63752949a0ae8d9befd52c6ad1.rds"   ....

# Rebuild that key by hand: hash of the function (formals and body),
# combined with the evaluated call arguments, then hashed again
rlang::hash(c(
  rlang::hash(list(formals(fun), as.character(body(fun)))),
  list(x = 2)))
# 37513c63752949a0ae8d9befd52c6ad1
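
The snippet above could be wrapped into a small helper. This is only a sketch, assuming memoise 2.0 or later with its default hash, a function without default arguments, and nothing extra passed to memoise(). memo_key is a hypothetical name, not part of memoise. As far as I can tell memoise stores the result of withVisible(), so the cached object is a list and the actual return value sits in its value element:

```r
library(cachem)
library(memoise)

# Hypothetical helper (not a memoise function): rebuilds the cache key the
# same way as the hand-written expression above. Assumes memoise >= 2.0,
# the default hash, no default arguments and no extra args to memoise().
memo_key <- function(f, ...) {
  f_hash <- rlang::hash(list(formals(f), as.character(body(f))))
  rlang::hash(c(f_hash, list(...)))
}

# cache_disk() without `dir` uses a fresh temporary directory
cache <- cachem::cache_disk(max_size = 1024^2)

fun <- function(x) {x^2}
fun.memo <- memoise(f = fun, cache = cache)
res.1 <- fun.memo(x = 2)

# Arguments must be named and typed exactly as memoise saw them
key <- memo_key(fun, x = 2)
key %in% cache$keys()

# The stored entry is the withVisible() list; take its `value`
res.y <- cache$get(key)$value
```

Reading the .rds file with readRDS() should give back that same list rather than the bare result. The key also depends on the argument types, so x = 2 and x = 2L hash differently.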

Please don't do this :D
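
If the goal is only to get at the stored values, a less fragile route might be to ask the cache object for its keys instead of rebuilding hashes. A rough sketch, assuming the keys() and get() methods of a cachem disk cache and, again, that memoise stores a list with a value element. It does not tell you which arguments produced which entry:

```r
library(cachem)
library(memoise)

# cache_disk() without `dir` uses a fresh temporary directory
cache <- cachem::cache_disk(max_size = 1024^2)

fun <- function(x) {x^2}
fun.memo <- memoise(f = fun, cache = cache)

res.1 <- fun.memo(x = 2)
res.2 <- fun.memo(x = 3)

# One key per memoised call; these are the hashes used in the file names
keys <- cache$keys()

# Named list of cached return values, indexed by key
cached <- lapply(stats::setNames(keys, keys), function(k) cache$get(k)$value)

sort(unlist(cached, use.names = FALSE))
```

Pointing cache_disk() at the existing cache.dir in a fresh session should expose the same keys, so the long script on top would not need to run first.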


huangapple
  • Posted on February 16, 2023, 17:23:17
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75470140.html