Checking if file to be copied already exists in specified directory and if so skip the file and move onto next

Question

I am iterating through a master directory with numerous sub-directories, each containing its own sub-directories. I want to copy every file with the extension .xlsx from the master directory tree to a new directory, so that all the files are collated in a single location. Each file has a unique name, and new files are added daily.

Once a file has been copied to the new directory, I would like the script to prevent it from being overwritten, by comparing the file names found in the master directory with those already in the new directory. For example:

Today the master directory contains test1.xlsx and test2.xlsx, which are copied to the new directory I specified.

Two days later the master directory contains test1.xlsx, test2.xlsx and test3.xlsx. When I run the code again, I would like it to iterate through the master directory and its sub-directories and identify that only test3.xlsx is new, based on a comparison between the files found in the master directory and the files in the directory I copy to.

Apologies, I am new to Stack Overflow and Python, and English is my second language, so I am not sure I have explained this well, but hopefully someone will get the gist.

I have tried the following code, but it keeps overwriting the files in the directory I copy the found .xlsx files to:

import os
import shutil
from os.path import isfile

#count = 0

for root, dirs, files in os.walk('Checklists'):
    for file in files:
        if file.endswith('.xlsx'):
            #print(file)
            if isfile('Checklist'):
                print("File exists")
            else:
                #print(os.path.join(root, file))
                #count +=1
                #print(count)
                #if not os.path.exists(os.path.join('Checklists', file)):
                shutil.copy(os.path.abspath(root + '/' + file), 'Checklist', follow_symlinks=True)

Answer 1

Score: 0

I managed to solve this riddle myself with the following addition:

import os
import shutil

for root, dirs, files in os.walk('Checklists'):
    for file in files:
        if file.endswith('.xlsx'):
            # Check for this particular file inside the destination directory;
            # os.path.exists('Checklist') alone only tests the directory itself.
            if os.path.exists(os.path.join('Checklist', file)):
                print("File exists")
            else:
                shutil.copy(os.path.join(root, file), 'Checklist', follow_symlinks=True)
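For the skip to work file by file, the existence check has to be made against the file's path inside the destination directory, not against the destination directory as a whole. Here is a minimal sketch of that idea wrapped in a function. The name copy_new_xlsx and its two parameters are hypothetical and chosen only for illustration. It also assumes, as the question states, that file names are unique across the sub-directories, and that the destination is not located inside the directory being walked.

```python
import os
import shutil


def copy_new_xlsx(src_root, dest_dir):
    """Copy .xlsx files found under src_root into dest_dir, skipping names already there.

    Returns the list of file names that were copied on this run.
    """
    # Make sure the destination exists and is a directory before copying into it.
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for root, dirs, files in os.walk(src_root):
        for name in files:
            if not name.endswith('.xlsx'):
                continue
            target = os.path.join(dest_dir, name)
            if os.path.exists(target):
                # A file with this name was copied on an earlier run: leave it alone.
                continue
            shutil.copy(os.path.join(root, name), target)
            copied.append(name)
    return copied
```

With the directory names from the question, the call would be copy_new_xlsx('Checklists', 'Checklist'). On a second run, only names that are not yet present in 'Checklist' are copied and returned.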

Thanks to everyone who took the time to look at my question.
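The comparison described in the question can also be done up front, by collecting the file names on both sides and copying only the difference. The sketch below uses pathlib for this. The function names find_new_xlsx and copy_new are hypothetical, and the approach relies on the question's statement that every file name is unique, since files are matched by name only.

```python
from pathlib import Path
import shutil


def find_new_xlsx(src_root, dest_dir):
    """Return {file name: source path} for .xlsx files under src_root not yet in dest_dir."""
    dest = Path(dest_dir)
    # Names already collected in the destination (empty if it does not exist yet).
    existing = {p.name for p in dest.glob('*.xlsx')} if dest.is_dir() else set()
    # rglob walks src_root and all of its sub-directories.
    found = {p.name: p for p in Path(src_root).rglob('*.xlsx') if p.is_file()}
    return {name: path for name, path in found.items() if name not in existing}


def copy_new(src_root, dest_dir):
    """Copy only the files reported by find_new_xlsx; return their names, sorted."""
    new_files = find_new_xlsx(src_root, dest_dir)
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    for name, path in new_files.items():
        shutil.copy(path, Path(dest_dir) / name)
    return sorted(new_files)
```

Keeping the lookup separate from the copy makes it possible to print or log the new names first, for example with sorted(find_new_xlsx('Checklists', 'Checklist')), before anything is written.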

Published by huangapple on 2023-02-14 05:28:08
When reposting, please keep the link to this article: https://go.coder-hub.com/75441348.html