How to modify JSON files inside a zip file using a shell script, Ant script, Java code, or Python code? Which of these is efficient and easy?

Question

I need to modify the JSON content of JSON files that are inside a zip file.

The hierarchy of the zip file is:

abc.zip

  • folder1
  • dbconfig
    • config1.xml
    • config2.xml
  • documentsfolder
    • jsonobjects.zip
      • JSONfolder
        • all the json files which contains json objects

I need to change the JSON objects to a format different from the existing one.

The existing JSON object inside each JSON file is:

{
 "title":"xyz", 
 "type":"object",
 "properties":{ "id":"123", "name":"xyz"}
}

The new content that should replace it inside the JSON files is:

{
  "name":"xyz",
  "type":"JSON",
  "schema":{ // entire content of the existing JSON should be there in schema
    "title":"xyz",
    "type":"object",
    "properties":{ "id":"123", "name":"xyz"}
  },
  "owner":"jack"
}

Which approach (shell script, Python script, or Java code) is the simplest and most efficient for this task?

Answer 1

Score: 1

Since the files are tiny (below 1 MB), the simple Python script below is enough; it runs very quickly on data this small.

I used zipfile.ZIP_DEFLATED compression, which is the most common for zip files. Use zipfile.ZIP_STORED instead if you need the files stored uncompressed, or zipfile.ZIP_BZIP2 or zipfile.ZIP_LZMA for other compression algorithms. The compression type only needs to be set for the output (processed) zip; the compression of the input zip is detected automatically. zipfile, json and io are standard Python modules, so nothing needs to be installed for them.
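As a rough illustration of the compression choice, here is a minimal sketch that packs the same payload with ZIP_STORED and ZIP_DEFLATED and then reads it back without specifying any compression. The helper name pack, the entry name sample.json and the repetitive payload are made up for the example.

```python
import io
import zipfile

def pack(payload: bytes, compression: int) -> bytes:
    """Write one entry into an in-memory zip using the given compression."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, mode='w', compression=compression) as zf:
        zf.writestr('sample.json', payload)
    return buf.getvalue()

# Hypothetical, highly repetitive payload so that the size difference is visible.
payload = b'{"title": "xyz", "type": "object"}\n' * 200

stored = pack(payload, zipfile.ZIP_STORED)
deflated = pack(payload, zipfile.ZIP_DEFLATED)

# Reading needs no compression argument: the method is recorded per entry.
with zipfile.ZipFile(io.BytesIO(deflated)) as zf:
    info = zf.getinfo('sample.json')
    roundtrip = zf.read('sample.json')

print(len(stored), len(deflated), info.compress_type == zipfile.ZIP_DEFLATED)
```

The stored archive is slightly larger than the payload itself, while the deflated one is much smaller for data this repetitive; real JSON files will compress less dramatically.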

fin.zip and fout.zip are example names for the input and output zip files. The zip cannot be patched in place: the JSON files change size, and the entries may be compressed, so everything has to be repacked (and recompressed) into a new archive. The output file name may be the same as the input name, because the script reads the whole input into memory before it writes anything; the input file is then overwritten. In that case, back up the zips until you are sure they were transformed without mistakes.
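If the input file is to be overwritten, the backup step can be scripted as well. The following is only a sketch under assumptions: replace_with_backup and the .bak suffix are hypothetical names, and the new content is written to a temporary file in the same folder before it is swapped in with os.replace.

```python
import os
import shutil
import tempfile

def replace_with_backup(path, new_bytes, backup_suffix='.bak'):
    """Copy `path` to a backup file, then swap `new_bytes` in under `path`."""
    backup_path = path + backup_suffix
    shutil.copy2(path, backup_path)  # keep the original until the result is verified
    folder = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same folder so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=folder, suffix='.tmp')
    try:
        with os.fdopen(fd, 'wb') as tmp:
            tmp.write(new_bytes)
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)  # do not leave a half-written file behind
        raise
    return backup_path
```

With the script from this answer, the call would look like replace_with_backup('fin.zip', res), where res is the value returned by ProcessZip().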

The lines that build the new jdata object are the ones to change for your task: jdata is decoded from JSON, modified, and then encoded back to JSON. Note that every JSON file and every zip file in the whole hierarchy is processed; to limit that, extend the condition elif fname.endswith('.json').
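One possible way to narrow that condition is a small predicate on the entry name. This is a sketch, not part of the original script: should_wrap is a hypothetical helper, and the default folder name JSONfolder is taken from the hierarchy in the question.

```python
import posixpath

def should_wrap(fname, folder='JSONfolder'):
    """Return True for .json entries that sit somewhere below `folder`.

    `fname` is an entry name as returned by ZipFile.namelist(), which
    uses '/' as the separator. `folder` is an assumed default taken from
    the hierarchy in the question.
    """
    parts = posixpath.normpath(fname).split('/')
    return fname.lower().endswith('.json') and folder in parts[:-1]

for name in ('JSONfolder/a.json', 'dbconfig/settings.json', 'JSONfolder/readme.txt'):
    print(name, should_wrap(name))
```

The branch in the script would then read elif should_wrap(fname):. Entry names are relative to the zip currently being processed, so JSONfolder/... is what the function sees while it handles jsonobjects.zip.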

It looks like you have a zip nested inside another zip. That is why the logic lives in a separate ProcessZip() function: it calls itself recursively on nested zips, so it handles any nesting depth.
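The same recursive idea can be used to inspect an archive before or after processing. The sketch below only lists entries, descending into nested zips; list_entries, make_zip and the '!/' separator between an archive and its contents are arbitrary choices for the example, and the sample archive mimics the layout from the question.

```python
import io
import zipfile

def list_entries(zip_bytes, prefix=''):
    """Yield the path of every file entry, descending into nested .zip entries."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for name in zf.namelist():
            if name.endswith('/'):
                continue  # directory entry, nothing to read
            path = prefix + name
            if name.lower().endswith('.zip'):
                # Recurse: a nested archive is just the bytes of another zip.
                yield from list_entries(zf.read(name), prefix=path + '!/')
            else:
                yield path

def make_zip(files):
    """Build an in-memory zip from a {name: bytes} mapping (demo helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, mode='w', compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()

# Hypothetical sample data shaped like the hierarchy in the question.
inner = make_zip({'JSONfolder/a.json': b'{"title": "xyz"}'})
outer = make_zip({
    'dbconfig/config1.xml': b'<config/>',
    'documentsfolder/jsonobjects.zip': inner,
})

for path in list_entries(outer):
    print(path)
# prints:
# dbconfig/config1.xml
# documentsfolder/jsonobjects.zip!/JSONfolder/a.json
```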

Update: I've added an example of XML-to-JSON conversion. It requires the xmltodict module, installed with python -m pip install xmltodict. XML can be mapped to JSON in more than one way (for example, JSON has no attributes), so you may need to adjust the converted content. Also note that after conversion I change the entry's extension from .xml to .json.

import io
import json
import zipfile

# Third-party dependency: python -m pip install xmltodict
import xmltodict


def ProcessZip(file_data):
    """Return the bytes of a new zip built from the zip bytes in file_data.

    Nested .zip entries are processed recursively, .json entries are wrapped,
    and .xml entries are converted to wrapped .json entries.
    """
    res = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(file_data), mode='r') as fin, \
         zipfile.ZipFile(res, mode='w', compression=zipfile.ZIP_DEFLATED) as fout:
        for fname in fin.namelist():
            data = fin.read(fname)
            if fname.endswith('.zip'):
                # Nested archive: repack it with the same rules
                data = ProcessZip(data)
            elif fname.endswith('.json'):
                # utf-8-sig also accepts files that start with a BOM
                jdata = json.loads(data.decode('utf-8-sig'))
                # Check that we don't modify an already modified file
                assert 'schema' not in jdata, 'JSON file "%s" already modified inside zip!' % fname
                # Modify JSON content here
                jdata = {
                    'name': 'xyz',
                    'type': 'JSON',
                    'schema': jdata,
                    'owner': 'jack',
                }
                data = json.dumps(jdata, indent=4).encode('utf-8')
            elif fname.endswith('.xml'):
                # Convert XML to a dict, then wrap it like the JSON files
                jdata = xmltodict.parse(data)
                jdata = {
                    'name': 'xyz',
                    'type': 'JSON',
                    'schema': jdata,
                    'owner': 'jack',
                }
                data = json.dumps(jdata, indent=4).encode('utf-8')
                # The entry now holds JSON, so rename it from .xml to .json
                fname = fname[:fname.rfind('.')] + '.json'
            # All other entries are copied unchanged
            fout.writestr(fname, data)
    return res.getvalue()


# Read the whole input first, so fout.zip may even be the same file as fin.zip
with open('fin.zip', 'rb') as fin:
    res = ProcessZip(fin.read())
with open('fout.zip', 'wb') as fout:
    fout.write(res)

huangapple
  • Posted on September 9, 2020 at 22:55:52
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63814305.html