How to sort list.files() output in correct date order?
Question
Using plain list.files() in the working directory returns the file list, but the numeric order is messed up.
f <- list.files(pattern = "\\.nc$")  # pattern is a regular expression, not a glob
f
# [1] "te1971-1.nc"  "te1971-10.nc" "te1971-11.nc" "te1971-12.nc"
# [5] "te1971-2.nc"  "te1971-3.nc"  "te1971-4.nc"  "te1971-5.nc" 
# [9] "te1971-6.nc"  "te1971-7.nc"  "te1971-8.nc"  "te1971-9.nc"
where the number after "-" is the month number.
I tried the following to sort it:
library(gtools)  # provides mixedsort()
i <- 1971        # year, inferred from the output below
myFiles <- paste("te", i, "-", c(1:12), ".nc", sep = "")
mixedsort(myFiles)
It returns the files in order, but reversed:
# [1] "te1971-12.nc" "te1971-11.nc" "te1971-10.nc" "te1971-9.nc" 
# [5] "te1971-8.nc"  "te1971-7.nc"  "te1971-6.nc"  "te1971-5.nc" 
# [9] "te1971-4.nc"  "te1971-3.nc"  "te1971-2.nc"  "te1971-1.nc" 
How do I fix this?
Answer 1
Score: 0
The issue is that the values get alphabetically sorted.
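A minimal illustration of that effect, using made-up month labels rather than the real file names: character strings are compared character by character, so "10" lands before "2", while the same values converted to integers sort by value.

```r
# Hypothetical month labels, stored as character strings
months_chr <- c("1", "10", "11", "12", "2", "3")

# Character sort: compared character by character, so "10" comes before "2"
chr_sorted <- sort(months_chr)
chr_sorted

# Numeric sort: convert to integer first, then sort by value
num_sorted <- sort(as.integer(months_chr))
num_sorted
```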
You could use gsub to capture the year and month as groups (the parenthesized parts of the pattern), append "-1" as the first day of the month to the result, coerce it with as.Date, and order by that.
x[order(as.Date(gsub('.*(\\d{4})-(\\d{1,2}).*', '\\1-\\2-1', x)))]
# [1] "te1971-1.nc"  "te1971-2.nc"  "te1971-3.nc"  "te1971-4.nc"  "te1971-5.nc" 
# [6] "te1971-6.nc"  "te1971-7.nc"  "te1971-8.nc"  "te1971-9.nc"  "te1971-10.nc"
# [11] "te1971-11.nc" "te1971-12.nc"
Data:
x <- c("te1971-1.nc", "te1971-10.nc", "te1971-11.nc", "te1971-12.nc", 
       "te1971-2.nc", "te1971-3.nc", "te1971-4.nc", "te1971-5.nc", "te1971-6.nc", 
       "te1971-7.nc", "te1971-8.nc", "te1971-9.nc")
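If going through Date feels heavier than needed, the same idea can probably be sketched with plain integers: pull the year and month out of each name and order by both. This is only a sketch, and it assumes every name follows the te<year>-<month>.nc pattern of the data above; sort_by_year_month is a hypothetical helper name, not a function from any package.

```r
# Hypothetical helper: assumes names look like "te<4-digit year>-<month>.nc"
sort_by_year_month <- function(files) {
  pattern <- ".*(\\d{4})-(\\d{1,2})\\.nc$"
  # Replace the whole name with one captured group, then convert to integer
  year  <- as.integer(sub(pattern, "\\1", files))
  month <- as.integer(sub(pattern, "\\2", files))
  # Order by year first, then by month within each year
  files[order(year, month)]
}

x <- c("te1971-1.nc", "te1971-10.nc", "te1971-11.nc", "te1971-12.nc",
       "te1971-2.nc", "te1971-3.nc", "te1971-4.nc", "te1971-5.nc", "te1971-6.nc",
       "te1971-7.nc", "te1971-8.nc", "te1971-9.nc")

sorted <- sort_by_year_month(x)
sorted
```

On a real directory this would be called as sort_by_year_month(list.files(pattern = "\\.nc$")). Names that do not match the assumed pattern would be converted to NA with a warning, so they are worth filtering out beforehand.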