2023-07-18 07:01:46

How to make a nested dict (json) from a schema database xml?

Question

Here is my input file.xml:

<?xml version="1.0" encoding="UTF-8" ?>
<project name="so_project" id="Project-9999">
    <schema name="database1">
        <table name="table1">
            <column name="foo" type="int"/>
            <column name="bar" type="string"/>
            <column name="details_resolution" type="array[object]">
                <column name="timestamp" type="timestamp"/>
                <column name="user_id" type="string"/>
                <column name="user_name" type="string"/>
            </column>
            <column name="details_closure" type="array[object]">
                <column name="timestamp" type="timestamp"/>
                <column name="auto_closure" type="bool"/>
            </column>
        </table>
    </schema>
    <schema name="database2">
        <table name="table1">
            <column name="foo" type="int"/>
            <column name="bar" type="string"/>
            <column name="details" type="array[object]">
                <column name="timestamp" type="timestamp"/>
                <column name="value" type="float"/>
            </column>
        </table>
    </schema>
</project>

... and I'm trying to build this classic nested dict:

{
    "database1": {
        "table1": {
            "foo": "int",
            "bar": "string",
            "details_resolution": {
                "timestamp": "timestamp",
                "user_id": "string",
                "user_name": "string"
            },
            "details_closure": {
                "timestamp": "timestamp",
                "auto_closure": "bool"
            }
        }
    },
    "database2": {
        "table1": {
            "foo": "int",
            "bar": "string",
            "details": {
                "timestamp": "timestamp",
                "value": "float"
            }
        }
    }
}

PS: Each database can have more than one table.

I tried some AI-generated code, but none of it gave me the expected result.
I'm sorry that I can't show my attempts!

So, any help would be greatly appreciated.

Answer 1

Score: 1

You can use xml.etree.ElementTree:

import xml.etree.ElementTree as ET

def parse_column(column_elem):
    # A column with child columns becomes a nested dict; a leaf column maps to its type
    children = column_elem.findall('column')
    if children:
        return {child.get('name'): parse_column(child) for child in children}
    return column_elem.get('type')

def parse_table(table_elem):
    table_data = {}
    # findall('column') returns direct children only; parse_column handles deeper levels
    for column_elem in table_elem.findall('column'):
        table_data[column_elem.get('name')] = parse_column(column_elem)
    return {table_elem.get('name'): table_data}

def parse_schema(schema_elem):
    schema_data = {}
    # A schema may hold several tables; each one adds a key
    for table_elem in schema_elem.findall('table'):
        schema_data.update(parse_table(table_elem))
    return {schema_elem.get('name'): schema_data}

def parse_xml(xml_content):
    root = ET.fromstring(xml_content)
    project_data = {}
    # Start at the schema elements, so the project level is skipped
    for schema_elem in root.findall('schema'):
        project_data.update(parse_schema(schema_elem))
    return project_data

# Read XML file
with open('file.xml', 'r') as f:
    xml_content = f.read()

# Parse XML and generate nested dictionary
nested_dict = parse_xml(xml_content)
print(nested_dict)
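
Since the question title asks for JSON, the resulting dictionary can be serialized with the standard json module. The following is a minimal sketch, assuming a dict shaped like the expected output in the question; the sample value is hand-written for illustration and is not produced by the parser above.

```python
import json

# Hand-written sample with the same shape as the expected output
# (an assumption for illustration, not the result of parsing file.xml)
nested_dict = {
    "database2": {
        "table1": {
            "foo": "int",
            "bar": "string",
            "details": {"timestamp": "timestamp", "value": "float"},
        }
    }
}

# indent=4 pretty-prints the structure; dicts keep insertion order,
# so keys are written in the order they were added
json_text = json.dumps(nested_dict, indent=4)
print(json_text)
```

json.dumps returns a string; to write straight to a file, json.dump(nested_dict, f, indent=4) with an open file object f does the same job.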

Answer 2

Score: 0

Solution using BeautifulSoup:

from bs4 import BeautifulSoup

# The "xml" parser requires the lxml package to be installed
with open("your_file.xml", "r") as f_in:
    soup = BeautifulSoup(f_in.read(), "xml")

def parse_columns(t):
    out = {}
    # recursive=False limits the search to direct children of t
    for c in t.find_all("column", recursive=False):
        if c.find("column"):
            # The column has nested columns: recurse to build a nested dict
            out[c["name"]] = parse_columns(c)
        else:
            out[c["name"]] = c["type"]
    return out

def parse_schema(sch):
    out = {}
    for t in sch.select("table"):
        out[t["name"]] = parse_columns(t)
    return out

out = {}
for sch in soup.select("schema"):
    out[sch["name"]] = parse_schema(sch)
print(out)

Prints (reformatted here for readability):

{
    "database1": {
        "table1": {
            "foo": "int",
            "bar": "string",
            "details_resolution": {
                "timestamp": "timestamp",
                "user_id": "string",
                "user_name": "string",
            },
            "details_closure": {"timestamp": "timestamp", "auto_closure": "bool"},
        }
    },
    "database2": {
        "table1": {
            "foo": "int",
            "bar": "string",
            "details": {"timestamp": "timestamp", "value": "float"},
        }
    },
}

Answer 3

Score: 0

In XSLT 3.0:

  <xsl:output method="json" indent="yes" />

  <xsl:template match="/">
    <xsl:map>
      <xsl:apply-templates select="*/schema"/>
    </xsl:map>
  </xsl:template>

  <xsl:template match="*[*]">
    <xsl:map-entry key="string(@name)">
      <xsl:map>
        <xsl:apply-templates select="*"/>
      </xsl:map>
    </xsl:map-entry>
  </xsl:template>

  <xsl:template match="*">
    <xsl:map-entry key="string(@name)" select="string(@type)"/>
  </xsl:template>

See https://xsltfiddle.liberty-development.net/bdvWh3 for the full stylesheet including boilerplate.

Explanation:

The first template rule matches the document, creates the outermost map, and processes the schema elements, skipping the project level.
The second template rule matches elements that have one or more children; it creates a map entry for the container element with the @name attribute as the key, and generates another map as the content by applying template rules to the children recursively.
The third template rule matches elements with no children; it creates a map entry for that element, with @name as the key and @type as the corresponding value.
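
For comparison, the same three rules can be sketched in Python with xml.etree.ElementTree. This is an illustrative translation rather than part of the XSLT answer; the function names to_map and project_to_map and the shortened inline sample document are hypothetical. The sample gives the schema two tables to cover the case mentioned in the question's PS.

```python
import xml.etree.ElementTree as ET

def to_map(elem):
    # Rules 2 and 3: an element with children becomes a nested map keyed by
    # each child's name attribute; a childless element yields its type attribute.
    if len(elem):
        return {child.get("name"): to_map(child) for child in elem}
    return elem.get("type")

def project_to_map(root):
    # Rule 1: start at the schema elements, skipping the project level.
    return {schema.get("name"): to_map(schema) for schema in root.findall("schema")}

# Shortened sample document (hypothetical, modeled on the question's file.xml)
SAMPLE = """<project name="so_project" id="Project-9999">
    <schema name="database2">
        <table name="table1">
            <column name="foo" type="int"/>
            <column name="details" type="array[object]">
                <column name="timestamp" type="timestamp"/>
                <column name="value" type="float"/>
            </column>
        </table>
        <table name="table2">
            <column name="bar" type="string"/>
        </table>
    </schema>
</project>"""

result = project_to_map(ET.fromstring(SAMPLE))
print(result)
```

Like the stylesheet, this sketch does not check element names, so it relies on every element below the schema level carrying a name attribute. One difference to keep in mind: a childless element without a type attribute maps to None here, whereas string(@type) in XSLT would give an empty string.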
