February 27, 2023, 13:51:04

Create pandas DataFrame using nested dictionaries and a list: dict:{dict:{dict:[list]}}

Question

data = {
   "etherA": {
      "vlanY": {
         "local": ['mac01', 'mac02'],
         "external": ['mac03', 'mac02']
      }
   },
   "etherB": {
      "vlanZ": {
         "local": ['mac06', 'mac09'],
         "external": ['mac01', 'mac02', 'mac03']
      }
   }
}

import pandas as pd

# Loop through the nested dictionary and flatten it into one row per MAC address.
# Rows are collected in a list because DataFrame.append was removed in pandas 2.0.
rows = []
for interface, nested_dict in data.items():
    for vlan, dyn_dict in nested_dict.items():
        for dyn, mac_list in dyn_dict.items():
            for mac in mac_list:
                rows.append({'interface': interface, 'vlan': vlan, 'dyn': dyn, 'mac-address': mac})

# Build the DataFrame once, with the desired column order
df = pd.DataFrame(rows, columns=['interface', 'vlan', 'dyn', 'mac-address'])

# Print the resulting DataFrame
print(df)

This code builds the desired DataFrame from the nested dictionary. It still uses nested for loops, but it constructs the DataFrame once instead of appending to it row by row.

English:

I have a series of nested dicts with a list as the deepest value.

data = {
   "etherA": {
      "vlanY": {
         "local": ['mac01', 'mac02'],
         "external": ['mac03', 'mac02']
      }
   },
   "etherB": {
      "vlanZ": {
         "local": ['mac06', 'mac09'],
         "external": ['mac01', 'mac02', 'mac03']
      }
   }
}

To load the dict into a DataFrame, I create the column headers, then loop through the dict and append one row at a time to the end of the DataFrame.

import pandas as pd

df = pd.DataFrame.from_dict({
   'interface': [],
   'vlan': [],
   'dyn': [],
   'mac-address': []
})

for a in data:
   for b in data[a]:
      for c in data[a][b]:
         for d in data[a][b][c]:
            df.loc[len(df)] = [a, b, c, d]

Final output:


print(df)

  interface   vlan       dyn mac-address
0    etherA  vlanY     local       mac01
1    etherA  vlanY     local       mac02
2    etherA  vlanY  external       mac03
3    etherA  vlanY  external       mac02
4    etherB  vlanZ     local       mac06
5    etherB  vlanZ     local       mac09
6    etherB  vlanZ  external       mac01
7    etherB  vlanZ  external       mac02
8    etherB  vlanZ  external       mac03

The for loops ultimately do what I need, but is there a pandas method for getting the data from the dict into the DataFrame?

I've read through numerous other posts and tried their answers and suggestions. Most deal with a single nested dictionary, and none deal with a list nested three dictionaries deep. A few of the suggested questions match what I was trying to achieve, and the answer there was to loop through the data to flatten it before appending it to the DataFrame, so that may be the best course.

Answer 1

Score: 1

Another way to do this is:

import pandas as pd

data = {
   "etherA": {
      "vlanY": {
         "local": ['mac01', 'mac02'],
         "external": ['mac03', 'mac02']
      }
   },
   "etherB": {
      "vlanZ": {
         "local": ['mac06', 'mac09'],
         "external": ['mac01', 'mac02', 'mac03']
      }
   }
}

df = pd.DataFrame([
    {'interface': interface, 'vlan': vlan, 'dyn': dyn, 'mac-address': mac}
    for interface, vlan_dict in data.items()
    for vlan, dyn_dict in vlan_dict.items()
    for dyn, mac_list in dyn_dict.items()
    for mac in mac_list
])

which gives

  interface   vlan       dyn mac-address
0    etherA  vlanY     local       mac01
1    etherA  vlanY     local       mac02
2    etherA  vlanY  external       mac03
3    etherA  vlanY  external       mac02
4    etherB  vlanZ     local       mac06
5    etherB  vlanZ     local       mac09
6    etherB  vlanZ  external       mac01
7    etherB  vlanZ  external       mac02
8    etherB  vlanZ  external       mac03
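
If the nesting depth is not fixed at three levels, the same flattening idea could be written recursively. The sketch below is only an illustration: iter_rows is a hypothetical helper name, not a pandas function, and it assumes every branch of the dict has the same depth so that each row has the same number of fields.

```python
from typing import Any, Iterator

import pandas as pd

data = {
    "etherA": {
        "vlanY": {
            "local": ['mac01', 'mac02'],
            "external": ['mac03', 'mac02'],
        }
    },
    "etherB": {
        "vlanZ": {
            "local": ['mac06', 'mac09'],
            "external": ['mac01', 'mac02', 'mac03'],
        }
    },
}


def iter_rows(node: Any, path: tuple = ()) -> Iterator[tuple]:
    """Hypothetical helper: yield the dict keys along each path plus one leaf value."""
    if isinstance(node, dict):
        # Descend one level, remembering the key that was followed
        for key, child in node.items():
            yield from iter_rows(child, path + (key,))
    elif isinstance(node, list):
        # A list at the bottom produces one row per element
        for item in node:
            yield path + (item,)
    else:
        # A bare scalar leaf produces a single row
        yield path + (node,)


df = pd.DataFrame(
    list(iter_rows(data)),
    columns=['interface', 'vlan', 'dyn', 'mac-address'],
)
print(df)
```

The column list still has to match the actual depth of the data; with a different depth, the number of column names would need to change accordingly.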

Answer 2

Score: 0

I suggest creating a list of tuples first:

L = [(a, b, c, d) for a in data
                  for b in data[a]
                  for c in data[a][b]
                  for d in data[a][b][c]]
df = pd.DataFrame(L, columns=['interface', 'vlan', 'dyn', 'mac-address'])

Or:

L = [(a, b, c, d) for a, d1 in data.items()
                  for b, d2 in d1.items()
                  for c, d3 in d2.items()
                  for d in d3]
df = pd.DataFrame(L, columns=['interface', 'vlan', 'dyn', 'mac-address'])

print(df)

  interface   vlan       dyn mac-address
0    etherA  vlanY     local       mac01
1    etherA  vlanY     local       mac02
2    etherA  vlanY  external       mac03
3    etherA  vlanY  external       mac02
4    etherB  vlanZ     local       mac06
5    etherB  vlanZ     local       mac09
6    etherB  vlanZ  external       mac01
7    etherB  vlanZ  external       mac02
8    etherB  vlanZ  external       mac03
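
A variation on the tuple idea, for readers who would like pandas to do the last level of flattening: key each MAC list by its (interface, vlan, dyn) tuple, let pandas turn those tuples into a MultiIndex, and then call Series.explode. This is a sketch under the assumption that every leaf is a list; the variable names are illustrative.

```python
import pandas as pd

data = {
    "etherA": {
        "vlanY": {
            "local": ['mac01', 'mac02'],
            "external": ['mac03', 'mac02'],
        }
    },
    "etherB": {
        "vlanZ": {
            "local": ['mac06', 'mac09'],
            "external": ['mac01', 'mac02', 'mac03'],
        }
    },
}

# One entry per leaf list, keyed by the path of dict keys leading to it
lists_by_path = {
    (interface, vlan, dyn): macs
    for interface, vlans in data.items()
    for vlan, dyns in vlans.items()
    for dyn, macs in dyns.items()
}

# Tuple keys become a three-level MultiIndex; each value is still a list
series = pd.Series(lists_by_path)

df = (
    series.explode()                              # one row per MAC address
    .rename_axis(['interface', 'vlan', 'dyn'])    # name the index levels
    .reset_index(name='mac-address')              # index levels become columns
)
print(df)
```

Only the three dict levels are walked in Python here; the per-list loop is replaced by explode.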

Answer 3

Score: 0

First, flatten the nested dictionary using pd.json_normalize; then build a list of lists and turn it into a DataFrame.

import pandas as pd

data = {
    "etherA": {
        "vlanY": {
            "local": ['mac01', 'mac02'],
            "external": ['mac03', 'mac02']
        }
    },
    "etherB": {
        "vlanZ": {
            "local": ['mac06', 'mac09'],
            "external": ['mac01', 'mac02', 'mac03']
        }
    }
}

# One-row frame whose column names look like "etherA_vlanY_local"
df = pd.json_normalize(data, sep='_')
flatten_dict = df.to_dict(orient='records')[0]
res = []
for k, v in flatten_dict.items():
    for i in v:
        # Splitting on "_" assumes none of the dict keys contain an underscore
        res.append(k.split("_") + [i])
res_df = pd.DataFrame(res, columns=["interface", "vlan", "dyn", "mac-address"])
print(res_df)
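
The Python loop over the flattened record could probably be replaced by pandas methods as well. The following is a minimal sketch, not a definitive recipe: it takes the single row produced by pd.json_normalize as a Series, expands the lists with Series.explode, and splits the joined key path with Series.str.split. The "|" separator is an assumption chosen because none of the keys in this data contain it.

```python
import pandas as pd

data = {
    "etherA": {
        "vlanY": {
            "local": ['mac01', 'mac02'],
            "external": ['mac03', 'mac02'],
        }
    },
    "etherB": {
        "vlanZ": {
            "local": ['mac06', 'mac09'],
            "external": ['mac01', 'mac02', 'mac03'],
        }
    },
}

# One row; column names such as "etherA|vlanY|local"; each cell holds a list
flat = pd.json_normalize(data, sep="|")

# Take that row as a Series indexed by path, then emit one entry per MAC
exploded = (
    flat.iloc[0]
    .explode()
    .rename_axis("path")
    .reset_index(name="mac-address")
)

# Split the joined path back into its three components
parts = exploded["path"].str.split("|", expand=True)
parts.columns = ["interface", "vlan", "dyn"]

res_df = parts.join(exploded["mac-address"])
print(res_df)
```

As with the underscore version, this breaks if a key ever contains the separator character, so the separator should be picked with the real data in mind.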
