Create a pandas DataFrame using nested dictionaries and a list: dict:{dict:{dict:[list]}}

Question

I have a series of nested dicts with a list as the deepest value.

    data = {
        "etherA": {
            "vlanY": {
                "local": ['mac01', 'mac02'],
                "external": ['mac03', 'mac02']
            }
        },
        "etherB": {
            "vlanZ": {
                "local": ['mac06', 'mac09'],
                "external": ['mac01', 'mac02', 'mac03']
            }
        }
    }

To load the dict into a DataFrame, I create the column headers, then loop through the dict and append each row as a list to the end of the DataFrame.

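    import pandas as pd

    # Start with an empty DataFrame that only defines the column headers
    df = pd.DataFrame.from_dict({
        'interface': [],
        'vlan': [],
        'dyn': [],
        'mac-address': []
    })

    # Walk every level of the nested dict and append one row per MAC address
    for a in data:
        for b in data[a]:
            for c in data[a][b]:
                for d in data[a][b][c]:
                    df.loc[len(df)] = [a, b, c, d]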

Final output:

    print(df)
      interface   vlan       dyn mac-address
    0    etherA  vlanY     local       mac01
    1    etherA  vlanY     local       mac02
    2    etherA  vlanY  external       mac03
    3    etherA  vlanY  external       mac02
    4    etherB  vlanZ     local       mac06
    5    etherB  vlanZ     local       mac09
    6    etherB  vlanZ  external       mac01
    7    etherB  vlanZ  external       mac02
    8    etherB  vlanZ  external       mac03

The "for loops" ultimately do what I need them to, but is there a pandas method for getting the data from the dict into the DataFrame?

I've read through numerous other posts and tried their answers and suggestions. Most deal with a single nested dictionary, and none deal with a list nested three dictionaries deep. A few of the suggested questions describe what I am trying to achieve, and the answer there was to loop through the data to flatten it before appending it to the DataFrame, so that may be the best course.

Answer 1

Score: 1

Another way to do this is:

    import pandas as pd

    data = {
        "etherA": {
            "vlanY": {
                "local": ['mac01', 'mac02'],
                "external": ['mac03', 'mac02']
            }
        },
        "etherB": {
            "vlanZ": {
                "local": ['mac06', 'mac09'],
                "external": ['mac01', 'mac02', 'mac03']
            }
        }
    }

    # Build one dict per row, then create the DataFrame in a single call
    df = pd.DataFrame([
        {'interface': interface, 'vlan': vlan, 'dyn': dyn, 'mac-address': mac}
        for interface, vlan_dict in data.items()
        for vlan, dyn_dict in vlan_dict.items()
        for dyn, mac_list in dyn_dict.items()
        for mac in mac_list
    ])

which gives

      interface   vlan       dyn mac-address
    0    etherA  vlanY     local       mac01
    1    etherA  vlanY     local       mac02
    2    etherA  vlanY  external       mac03
    3    etherA  vlanY  external       mac02
    4    etherB  vlanZ     local       mac06
    5    etherB  vlanZ     local       mac09
    6    etherB  vlanZ  external       mac01
    7    etherB  vlanZ  external       mac02
    8    etherB  vlanZ  external       mac03
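
If a more pandas-native route is preferred, one possible variant (an addition here, not part of the original answer) is to key a Series by (interface, vlan, dyn) tuples and let `Series.explode` expand each list into rows. This is only a sketch: it assumes the nesting is exactly three levels deep and that every leaf is a list.

```python
import pandas as pd

data = {
    "etherA": {
        "vlanY": {
            "local": ['mac01', 'mac02'],
            "external": ['mac03', 'mac02']
        }
    },
    "etherB": {
        "vlanZ": {
            "local": ['mac06', 'mac09'],
            "external": ['mac01', 'mac02', 'mac03']
        }
    }
}

# Key each MAC list by its (interface, vlan, dyn) path.
# Tuple keys become a three-level MultiIndex on the Series.
lists = pd.Series({
    (interface, vlan, dyn): macs
    for interface, vlan_dict in data.items()
    for vlan, dyn_dict in vlan_dict.items()
    for dyn, macs in dyn_dict.items()
})

df = (
    lists.explode()                             # one row per list element
    .rename_axis(["interface", "vlan", "dyn"])  # name the index levels
    .reset_index(name="mac-address")            # index levels become columns
)
print(df)
```

The dictionary still has to be walked once to build the keys, so this mainly moves the innermost loop into pandas rather than removing the traversal.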

Answer 2

Score: 0

I suggest creating a list of tuples first:

    L = [(a, b, c, d) for a in data
                      for b in data[a]
                      for c in data[a][b]
                      for d in data[a][b][c]]

    df = pd.DataFrame(L, columns=['interface', 'vlan', 'dyn', 'mac-address'])

Or:

    # d1, d2 and macs hold the nested levels, so d is only ever a MAC address
    L = [(a, b, c, d) for a, d1 in data.items()
                      for b, d2 in d1.items()
                      for c, macs in d2.items()
                      for d in macs]

    df = pd.DataFrame(L, columns=['interface', 'vlan', 'dyn', 'mac-address'])
    print(df)
      interface   vlan       dyn mac-address
    0    etherA  vlanY     local       mac01
    1    etherA  vlanY     local       mac02
    2    etherA  vlanY  external       mac03
    3    etherA  vlanY  external       mac02
    4    etherB  vlanZ     local       mac06
    5    etherB  vlanZ     local       mac09
    6    etherB  vlanZ  external       mac01
    7    etherB  vlanZ  external       mac02
    8    etherB  vlanZ  external       mac03
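
Both comprehensions hard-code three levels of nesting. If the depth might vary, the same idea could be sketched as a small recursive generator. `iter_rows` is a hypothetical helper name introduced here, not something from pandas, and it assumes every leaf is a list and every branch has the same depth.

```python
import pandas as pd

data = {
    "etherA": {
        "vlanY": {
            "local": ['mac01', 'mac02'],
            "external": ['mac03', 'mac02']
        }
    },
    "etherB": {
        "vlanZ": {
            "local": ['mac06', 'mac09'],
            "external": ['mac01', 'mac02', 'mac03']
        }
    }
}


def iter_rows(node, path=()):
    """Yield one tuple per leaf item: the dict keys leading to it, then the item."""
    if isinstance(node, dict):
        # Descend one level, remembering the key that was followed.
        for key, child in node.items():
            yield from iter_rows(child, path + (key,))
    else:
        # Leaf: assumed to be a list of values.
        for item in node:
            yield path + (item,)


rows = list(iter_rows(data))
df = pd.DataFrame(rows, columns=['interface', 'vlan', 'dyn', 'mac-address'])
print(df)
```

The column list still has to match the actual depth of the data; with uneven nesting the tuples would have different lengths and the column names would no longer line up.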

Answer 3

Score: 0

First, flatten the nested dictionary with pd.json_normalize; then build a list of lists and turn it into a DataFrame.

    import pandas as pd

    data = {
        "etherA": {
            "vlanY": {
                "local": ['mac01', 'mac02'],
                "external": ['mac03', 'mac02']
            }
        },
        "etherB": {
            "vlanZ": {
                "local": ['mac06', 'mac09'],
                "external": ['mac01', 'mac02', 'mac03']
            }
        }
    }

    # One row whose column names look like "etherA_vlanY_local"
    df = pd.json_normalize(data, sep='_')
    flatten_dict = df.to_dict(orient='records')[0]

    # Split each column name back into its keys and pair it with every MAC
    # (this relies on no key containing "_" itself)
    res = []
    for k, v in flatten_dict.items():
        for i in v:
            res.append(k.split("_") + [i])

    res_df = pd.DataFrame(res, columns=["interface", "vlan", "dyn", "mac-address"])
    print(res_df)
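
The remaining loop in this answer can also be handed to pandas. The following is a variant sketch rather than part of the original answer: it takes the single flattened row as a Series, uses `Series.explode` to expand the lists, and splits the path with `Series.str.split`. The "/" separator is an arbitrary choice made here, and the sketch assumes no key contains it.

```python
import pandas as pd

data = {
    "etherA": {
        "vlanY": {
            "local": ['mac01', 'mac02'],
            "external": ['mac03', 'mac02']
        }
    },
    "etherB": {
        "vlanZ": {
            "local": ['mac06', 'mac09'],
            "external": ['mac01', 'mac02', 'mac03']
        }
    }
}

# One row; column names are "interface/vlan/dyn" paths, cells are the MAC lists.
flat = pd.json_normalize(data, sep="/")

# Take that row as a Series, expand each list into rows, and keep the path as a column.
long = (
    flat.iloc[0]
    .explode()
    .rename_axis("path")
    .reset_index(name="mac-address")
)

# Split the path back into its three components.
parts = long["path"].str.split("/", expand=True)
parts.columns = ["interface", "vlan", "dyn"]

res_df = pd.concat([parts, long["mac-address"]], axis=1)
print(res_df)
```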

huangapple
  • Published on February 27, 2023, 13:51:04
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75577131.html