How to make a nested dict (json) from a schema database xml?

Question

Here is my input, file.xml:

    <?xml version="1.0" encoding="UTF-8" ?>
    <project name="so_project" id="Project-9999">
      <schema name="database1">
        <table name="table1">
          <column name="foo" type="int"/>
          <column name="bar" type="string"/>
          <column name="details_resolution" type="array[object]">
            <column name="timestamp" type="timestamp"/>
            <column name="user_id" type="string"/>
            <column name="user_name" type="string"/>
          </column>
          <column name="details_closure" type="array[object]">
            <column name="timestamp" type="timestamp"/>
            <column name="auto_closure" type="bool"/>
          </column>
        </table>
      </schema>
      <schema name="database2">
        <table name="table1">
          <column name="foo" type="int"/>
          <column name="bar" type="string"/>
          <column name="details" type="array[object]">
            <column name="timestamp" type="timestamp"/>
            <column name="value" type="float"/>
          </column>
        </table>
      </schema>
    </project>

...and I'm trying to build this classic nested dict:

    {
      "database1": {
        "table1": {
          "foo": "int",
          "bar": "string",
          "details_resolution": {
            "timestamp": "timestamp",
            "user_id": "string",
            "user_name": "string"
          },
          "details_closure": {
            "timestamp": "timestamp",
            "auto_closure": "bool"
          }
        }
      },
      "database2": {
        "table1": {
          "foo": "int",
          "bar": "string",
          "details": {
            "timestamp": "timestamp",
            "value": "float"
          }
        }
      }
    }

PS: Each database may have more than one table.

I tried some AI-generated code, but none of it gave me the expected result.
I'm sorry that I can't show my attempts!

So, any help would be greatly appreciated.

Answer 1

Score: 1

You can use xml.etree.ElementTree:

    import xml.etree.ElementTree as ET

    def parse_column(column_elem):
        column_data = {}
        column_data['name'] = column_elem.get('name')
        column_data['type'] = column_elem.get('type')
        return column_data

    def parse_table(table_elem):
        table_data = {}
        table_name = table_elem.get('name')
        for column_elem in table_elem.findall('column'):
            column_data = parse_column(column_elem)
            table_data[column_data['name']] = column_data['type']
        return {table_name: table_data}

    def parse_schema(schema_elem):
        schema_data = {}
        schema_name = schema_elem.get('name')
        for table_elem in schema_elem.findall('table'):
            table_data = parse_table(table_elem)
            schema_data.update(table_data)
        return {schema_name: schema_data}

    def parse_xml(xml_content):
        root = ET.fromstring(xml_content)
        project_data = {}
        for schema_elem in root.findall('schema'):
            schema_data = parse_schema(schema_elem)
            project_data.update(schema_data)
        return project_data

    # Read XML file
    with open('file.xml', 'r') as f:
        xml_content = f.read()

    # Parse XML and generate nested dictionary
    nested_dict = parse_xml(xml_content)
    print(nested_dict)
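
One thing to watch in the code above: findall('column') only visits the direct children of a table, and each column is stored with its type, so a column such as details_resolution would map to "array[object]" rather than to a nested dict as the question asks. A minimal sketch of a recursive variant follows. The names columns_to_dict, project_to_dict and SAMPLE_XML are illustrative and not part of the answer, and the sample is a trimmed copy of the question's file inlined so the snippet runs on its own.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's file.xml, inlined so the sketch is self-contained.
SAMPLE_XML = """<project name="so_project" id="Project-9999">
  <schema name="database1">
    <table name="table1">
      <column name="foo" type="int"/>
      <column name="details_resolution" type="array[object]">
        <column name="timestamp" type="timestamp"/>
        <column name="user_id" type="string"/>
      </column>
    </table>
  </schema>
</project>"""


def columns_to_dict(parent):
    """Map each direct <column> child to its type, or to a nested dict."""
    result = {}
    for column in parent.findall("column"):
        if column.find("column") is not None:
            # The column has nested columns: recurse instead of storing its type.
            result[column.get("name")] = columns_to_dict(column)
        else:
            result[column.get("name")] = column.get("type")
    return result


def project_to_dict(root):
    """Build {schema: {table: {column: type-or-nested-dict}}} from <project>."""
    return {
        schema.get("name"): {
            table.get("name"): columns_to_dict(table)
            for table in schema.findall("table")
        }
        for schema in root.findall("schema")
    }


nested = project_to_dict(ET.fromstring(SAMPLE_XML))
print(nested)
```

To run this against the real file, ET.parse('file.xml').getroot() could be passed to project_to_dict instead of the inlined sample.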

Answer 2

Score: 0

Solution using BeautifulSoup:

    from bs4 import BeautifulSoup

    with open("your_file.xml", "r") as f_in:
        soup = BeautifulSoup(f_in.read(), "xml")

    def parse_columns(t):
        out = {}
        for c in t.find_all("column", recursive=False):
            if c.find("column"):
                out[c["name"]] = parse_columns(c)
            else:
                out[c["name"]] = c["type"]
        return out

    def parse_schema(sch):
        out = {}
        for t in sch.select("table"):
            out[t["name"]] = parse_columns(t)
        return out

    out = {}
    for sch in soup.select("schema"):
        out[sch["name"]] = parse_schema(sch)

    print(out)

Prints:

    {
        "database1": {
            "table1": {
                "foo": "int",
                "bar": "string",
                "details_resolution": {
                    "timestamp": "timestamp",
                    "user_id": "string",
                    "user_name": "string",
                },
                "details_closure": {"timestamp": "timestamp", "auto_closure": "bool"},
            }
        },
        "database2": {
            "table1": {
                "foo": "int",
                "bar": "string",
                "details": {"timestamp": "timestamp", "value": "float"},
            }
        },
    }
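
Since the question's title mentions JSON, it may help to note that print() on a dict shows Python's own repr, not JSON text. A small sketch of serializing the result with the standard json module follows. The dict literal here is a hand-written stand-in for the parser's output, not something the answer itself produces.

```python
import json

# Hand-written stand-in for the nested dict built by the parsing code.
out = {
    "database2": {
        "table1": {
            "foo": "int",
            "details": {"timestamp": "timestamp", "value": "float"},
        }
    }
}

# indent=2 pretty-prints; the result is valid JSON with double-quoted keys.
json_text = json.dumps(out, indent=2)
print(json_text)
```

json.dump(out, some_file, indent=2) would write the same text to an open file instead of returning a string.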

Answer 3

Score: 0

In XSLT 3.0:

    <xsl:output method="json" indent="yes" />

    <xsl:template match="/">
      <xsl:map>
        <xsl:apply-templates select="*/schema"/>
      </xsl:map>
    </xsl:template>

    <xsl:template match="*[*]">
      <xsl:map-entry key="string(@name)">
        <xsl:map>
          <xsl:apply-templates select="*"/>
        </xsl:map>
      </xsl:map-entry>
    </xsl:template>

    <xsl:template match="*">
      <xsl:map-entry key="string(@name)" select="string(@type)"/>
    </xsl:template>

See https://xsltfiddle.liberty-development.net/bdvWh3 for the full stylesheet including boilerplate.

Explanation:

  • The first template rule matches the document node, creates the outermost map, and processes the schema elements, skipping the project level.

  • The second template rule matches elements that have one or more children; it creates a map entry for the container element with the @name attribute as the key, and generates another map as the content by applying template rules to the children recursively.

  • The third template rule matches elements with no children; it creates a map entry for that element, with @name as the key and @type as the corresponding value.
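
For readers more at home in Python, the three rules above can be mirrored roughly as follows. This is only a sketch of the same idea using xml.etree.ElementTree, not a translation endorsed by the answer. The function name element_to_entry and the inlined SAMPLE_XML (a trimmed copy of the question's database2 schema) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's input, inlined so the sketch runs on its own.
SAMPLE_XML = """<project name="so_project" id="Project-9999">
  <schema name="database2">
    <table name="table1">
      <column name="foo" type="int"/>
      <column name="details" type="array[object]">
        <column name="timestamp" type="timestamp"/>
        <column name="value" type="float"/>
      </column>
    </table>
  </schema>
</project>"""


def element_to_entry(elem):
    """Return a (key, value) pair for one element, whatever its tag name."""
    if len(elem):
        # Like the second rule: the element has children, so its value is a
        # nested map built by processing each child recursively.
        return elem.get("name"), dict(element_to_entry(child) for child in elem)
    # Like the third rule: a leaf element maps @name to @type.
    return elem.get("name"), elem.get("type")


root = ET.fromstring(SAMPLE_XML)  # the <project> element
# Like the first rule: build the outer map from the schema elements only,
# which skips the project level.
result = dict(element_to_entry(schema) for schema in root.findall("schema"))
print(result)
```

As in the stylesheet, the decision depends only on whether an element has children, so an element without children is treated as a leaf; in this sketch an empty table would map to None (its absent type attribute) rather than to an empty dict.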

Posted by huangapple on July 18, 2023, 07:01:46
Original link: https://go.coder-hub.com/76708573.html