How to extract a subdirectory with all its files using zipfile
Question
Yes, I have read the other posts on this subject, but I am running into a weird problem:
When I extract a certain item from the namelist, it only gives me an empty folder, not the actual files inside.
My zip file has the following hierarchy:
myzip.zip -> FolderA -> FolderB -> FolderC -> FolderIWantA, FolderIWantB, ... FolderIWantN.
So there are a lot of preceding folders I do not wish to extract. I know how to identify the ones I want from the namelist:
import os
import sys
import zipfile

try:
    zip_file_path = sys.argv[1]
except IndexError:
    sys.exit('No zip file provided.')

archive = zipfile.ZipFile(zip_file_path)
for i, file in enumerate(archive.namelist()):
    if os.path.basename(file[:-1]).startswith('ABC-'):  # identify relevant folders
        old_name = os.path.basename(file[:-1])
        new_name = 'new_%d' % i  # create a new name
        archive.extract(file, new_name)
This does extract the folders I want; however, the extracted folders are empty. On top of that, the newly extracted folders contain the preceding folders A, B and C, and I do not know why.
Here's a test zip for your convenience:
import os
import shutil

# Build the path with os.path.join so the script also works outside Windows
prefolders = os.path.join('testzip', 'FolderA', 'FolderB', 'FolderC')
try:
    os.makedirs(prefolders)
except FileExistsError:
    pass

for i in 'ABC':
    try:
        new_folder = 'ABC-Folder%s' % i
        os.mkdir(os.path.join(prefolders, new_folder))
    except FileExistsError:
        pass
    # Put two empty files into each ABC- folder
    for j in range(2):
        file_path = os.path.join(prefolders, new_folder, 'somefile%s.txt' % j)
        with open(file_path, 'w'):
            pass

# Zip the tree, then remove the temporary directory
shutil.make_archive('testzip', 'zip', 'testzip')
shutil.rmtree('testzip')
I thought this would take like 10 minutes and I am losing my mind over this...
Answer 1
Score: 1
You're looking for the basename() to start with ABC-, which means you never find files that don't start with that. The files in your example start with somefile. extract() will only extract the one member that is named, so extracting a directory entry gives you an empty directory. In your case, all of the things that start with ABC- are directories.
To find the files that have a directory somewhere in their path that starts with ABC-, you could use:
if os.path.basename(file) != '' and ('/ABC-' in os.path.dirname(file) or os.path.dirname(file).startswith('ABC-')):
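(Names returned by namelist() normally use forward slashes on every platform, including Windows, so the slash in this check should not need to be changed to a backslash.)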
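The same check can be written as a small helper. This is only a sketch: the function name members_under_prefix and its prefix parameter are made up for illustration, and it assumes the archive uses forward slashes in member names, as zip files normally do.

```python
import zipfile
from pathlib import PurePosixPath


def members_under_prefix(archive, prefix='ABC-'):
    """Return the names of file entries that sit below a directory whose
    name starts with prefix (hypothetical helper, not part of zipfile)."""
    selected = []
    for name in archive.namelist():
        if name.endswith('/'):
            continue  # directory entries carry no file data
        # All path components except the file name itself
        parents = PurePosixPath(name).parts[:-1]
        if any(part.startswith(prefix) for part in parents):
            selected.append(name)
    return selected
```

Calling members_under_prefix(zipfile.ZipFile('testzip.zip')) on the test archive from the question should list the somefile entries rather than the ABC- directory entries.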
This will still extract the file together with all of the parent directories named in file. If you want just the file by itself in new_n, you will need to use read() on the entry and then write the data to the desired destination file.
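One way the read() approach could look is sketched below. The function name extract_subdirs, the dest_root parameter and the new_<n> numbering scheme are assumptions made for this example, not something taken from the question. The sketch skips any entry containing '..' as a basic precaution and does not recreate empty folders.

```python
import os
import zipfile


def extract_subdirs(zip_path, dest_root='.', prefix='ABC-'):
    """Copy every file below an 'ABC-*' folder into dest_root/new_<n>/...,
    dropping the leading folders (hypothetical helper)."""
    new_names = {}  # archive folder path -> new folder name
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # only file entries carry data
            parts = info.filename.split('/')
            if '..' in parts:
                continue  # refuse paths that could escape dest_root
            # Index of the first parent folder whose name starts with prefix
            idx = next((k for k, p in enumerate(parts[:-1])
                        if p.startswith(prefix)), None)
            if idx is None:
                continue  # file is not below a wanted folder
            folder_key = '/'.join(parts[:idx + 1])
            new_name = new_names.setdefault(folder_key,
                                            'new_%d' % len(new_names))
            # Keep only the path below the wanted folder
            target = os.path.join(dest_root, new_name, *parts[idx + 1:])
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with open(target, 'wb') as out:
                out.write(archive.read(info))
    return new_names
```

With the test archive from the question, extract_subdirs('testzip.zip') should create new_0, new_1 and new_2 in the current directory, each holding its somefile0.txt and somefile1.txt without FolderA, FolderB or FolderC above them.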