Why do I get an error reading an RDS file when the file exists and the path is correct?

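Question

It is hard for me to make a reproducible example for this question, because the error happens on a high-performance cluster and I cannot reproduce it locally.

I have this code: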

summaries <- list()

for (a in c("A", "B")) {
  for (b in c("C", "D", "E", "F")) {
    for (c in c("01", "02", "03", "04", "05")) {
      summaries[[a]][[b]][[c]] <- readRDS(paste0("/home/user5556/Proj1/Simulation/", a, "/", b, "/", c, "/summary.Rds"))
    }
  }
}

saveRDS(summaries, "summaries.Rds")

Running this returns the following error:

Error in readRDS(paste0("/home/user5556/Proj1/Simulation/", a, "/", b, "/",  : 
  error reading from connection

However, observe what happens if we replace readRDS with file.exists() in the previous block of code and print the result:

for (a in c("A", "B")) {
  for (b in c("C", "D", "E", "F")) {
    for (c in c("01", "02", "03", "04", "05")) {
      print(file.exists(paste0("/home/user5556/Proj1/Simulation/", a, "/", b, "/", c, "/summary.Rds")))
    }
  }
}

This prints:

[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE

When I type out the file path manually, the file is read successfully:

```R
readRDS("/home/user5556/Proj1/Simulation/A/C/01/summary.Rds")
```

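This reads the RDS file as expected. So what exactly is happening here? The file path cannot be incorrectly specified, because then file.exists() would return FALSE. It also cannot be that I lack permission to view the file, because then I would not be able to read it when typing out the path manually. So what is wrong?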

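Answer 1

Score: 3

Since we can't run the code, this is just a guess, but I suspect one of the files is corrupted somehow. You tested the first one successfully, but the error message doesn't say which file was problematic.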
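
To illustrate why file.exists() returning TRUE does not rule out this guess, here is a minimal sketch that deliberately truncates a valid RDS file. The temporary file and the truncation step are assumptions made purely for the demonstration; nothing here is taken from the cluster in the question.

```r
# Write a valid RDS file to a temporary location
# (a hypothetical stand-in for one of the summary.Rds files)
path <- tempfile(fileext = ".Rds")
saveRDS(rnorm(1000), path)

# Simulate corruption by keeping only the first half of the file's bytes
bytes <- readBin(path, what = "raw", n = file.size(path))
writeBin(bytes[seq_len(length(bytes) %/% 2)], path)

# file.exists() only checks that the path is there, not that the content is intact
exists_ok <- file.exists(path)

# readRDS() has to decode the content, so it fails on the truncated file
read_ok <- tryCatch({
  readRDS(path)
  TRUE
}, error = function(e) FALSE)

print(exists_ok)
print(read_ok)
```

In this sketch the existence check passes while the read fails, which is the same pattern reported in the question.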

To debug it, add debug output to the loop. For example:

summaries <- list()

for (a in c("A", "B")) {
  for (b in c("C", "D", "E", "F")) {
    for (c in c("01", "02", "03", "04", "05")) {
      filename <- paste0("/home/user5556/Proj1/Simulation/", a, "/", b, "/", c, "/summary.Rds")
      cat("Reading ", filename)
      summaries[[a]][[b]][[c]] <- readRDS(filename)
      cat(" successfully!\n")
    }
  }
}

saveRDS(summaries, "summaries.Rds")

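When the error occurs, you will have seen the "Reading" message for the offending file but not the "successfully!" part, so you will know which file caused the problem.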
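
If more than one file might be affected, a variation on the same idea is to catch the error instead of stopping at the first failure. The following is only a sketch: the helper name read_summaries and its base_dir argument are hypothetical, and it assumes that a valid summary.Rds never contains NULL, since NULL is used here to signal a failed read.

```r
# Hypothetical helper: read every summary.Rds under base_dir and record failures
read_summaries <- function(base_dir) {
  summaries <- list()
  failed <- character(0)

  for (a in c("A", "B")) {
    for (b in c("C", "D", "E", "F")) {
      for (c in c("01", "02", "03", "04", "05")) {
        filename <- file.path(base_dir, a, b, c, "summary.Rds")

        # Turn a read error into NULL so the loop keeps going
        result <- tryCatch(
          readRDS(filename),
          error = function(e) {
            message("Failed to read ", filename, ": ", conditionMessage(e))
            NULL
          }
        )

        if (is.null(result)) {
          failed <- c(failed, filename)   # remember the problematic path
        } else {
          summaries[[a]][[b]][[c]] <- result
        }
      }
    }
  }

  list(summaries = summaries, failed = failed)
}
```

Calling read_summaries("/home/user5556/Proj1/Simulation") would then return the successfully read summaries together with the paths that failed, so every problematic file is reported in a single pass.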

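If it turns out to be the first file, then I was wrong, but I'm guessing it won't be. You can try readRDS(filename) outside of the loop and see if the error repeats. I'm guessing it will.

How to fix it isn't clear, but you could test for that particular filename and skip it, redo the simulation, or fix file permissions as @Limey suggested.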
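
Before choosing between these options, it may help to look at each file's size and read permission without reading it. The sketch below uses the base R functions file.exists(), file.size() and file.access(); the helper names check_rds_files and suspicious_files are hypothetical. Note that R's documentation cautions that file.access() can be unreliable on some file systems, which may matter on a cluster with network storage.

```r
# Hypothetical helper: collect basic facts about each path without reading it
check_rds_files <- function(paths) {
  data.frame(
    path = paths,
    exists = file.exists(paths),
    size_bytes = file.size(paths),                 # NA if the file is missing
    readable = file.access(paths, mode = 4) == 0,  # mode 4 tests read permission
    stringsAsFactors = FALSE
  )
}

# Hypothetical helper: keep rows that are missing, empty, or not readable
suspicious_files <- function(report) {
  report[is.na(report$size_bytes) | report$size_bytes == 0 | !report$readable, ]
}

# Build the same 40 paths as the loop in the question
grid <- expand.grid(
  c = c("01", "02", "03", "04", "05"),
  b = c("C", "D", "E", "F"),
  a = c("A", "B"),
  stringsAsFactors = FALSE
)
paths <- file.path("/home/user5556/Proj1/Simulation", grid$a, grid$b, grid$c, "summary.Rds")

report <- check_rds_files(paths)
print(suspicious_files(report))
```

A file of zero bytes, or one much smaller than its siblings, would point toward an interrupted simulation write, while a FALSE in the readable column would point toward the permissions explanation.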

huangapple
  • Published on June 26, 2023, 18:20:15
  • Link to this article: https://go.coder-hub.com/76555758.html