循环以筛选字典中的NaN值

huangapple go评论98阅读模式
英文:

Loop for filtering NaN's in dictionaries

问题

我打开了我的字典,我想写一个循环,以便只获得"favorite color"列等于NaN的行作为输出。

我的代码到目前为止:

  1. # 导入模块
  2. import openpyxl as op
  3. import pandas as pd
  4. import numpy as np
  5. import xlsxwriter
  6. from openpyxl import Workbook, load_workbook
  7. # 定义文件路径
  8. my_file_path = r'C:\Users\machukovich\Desktop\stack.xlsx'
  9. # 将文件加载到数据框字典中
  10. my_dict = pd.read_excel(my_file_path, sheet_name=None, skiprows=2)

my_dict输出:

  1. my_dict = {'Sheet_1': Name Surname Concatenation ID_ Grade_ favourite color
  2. 1 Delilah Gonzalez Delilah Gonzalez NaN NaN NaN
  3. 2 Christina Rodwell Christina Rodwell 100.0 3.0 black
  4. 3 Ziggy Stardust Ziggy Stardust 40.0 7.0 red ,
  5. 'Sheet_2': Name Surname Concatenation ID_ Grade_ favourite color
  6. 0 Lucy Diamonds Lucy Diamonds 22.0 9.0 brown
  7. 1 Grace Kelly Grace Kelly 50.0 7.0 white
  8. 2 Uma Thurman Uma Thurman 105.0 7.0 purple
  9. 3 Lola King Lola King NaN NaN NaN ,
  10. 'Sheet_3': Name Surname Concatenation ID_ Grade_ favourite color
  11. 0 Eleanor Rigby Eleanor Rigby 104.0 6.0 blue
  12. 1 Barbara Ann Barbara Ann 168.0 8.0 pink
  13. 2 Polly Cracker Polly Cracker 450.0 7.0 black
  14. 3 Little Joe Little Joe NaN NaN NaN }

我想要的输出:

  1. my_dict = {'Sheet_1': Name Surname Concatenation ID_ Grade_ favourite color
  2. 1 Delilah Gonzalez Delilah Gonzalez NaN NaN NaN
  3. 'Sheet_2': Name Surname Concatenation ID_ Grade_ favourite color
  4. 3 Lola King Lola King NaN NaN NaN
  5. 'Sheet_3': Name Surname Concatenation ID_ Grade_ favourite color
  6. 3 Little Joe Little Joe NaN NaN NaN

最后,我想将"desired output"写入一个新的Excel文件(在单独的工作表中)。请为我提供指导。我是Python新手。

英文:

I have opened my dictionary and I would like to write a loop so to obtain as an output only those rows for which the ''favorite color'' column equals to NaN.

  • My code so far:

<!-- begin snippet: js hide: false console: true babel: false -->

<!-- language: lang-html -->

  1. # Importing modules
  2. import openpyxl as op
  3. import pandas as pd
  4. import numpy as np
  5. import xlsxwriter
  6. from openpyxl import Workbook, load_workbook
  7. # Defining the file path
  8. my_file_path = r&#39;C:\Users\machukovich\Desktop\stack.xlsx&#39;
  9. # Loading the file into a dictionary of Dataframes
  10. my_dict = pd.read_excel(my_file_path, sheet_name=None, skiprows=2)

<!-- end snippet -->

  • my_dict output:

<!-- begin snippet: js hide: false console: true babel: false -->

<!-- language: lang-html -->

  1. my_dict = {&#39;Sheet_1&#39;: Name Surname Concatenation ID_ Grade_ favourite color
  2. 1 Delilah Gonzalez Delilah Gonzalez NaN NaN NaN
  3. 2 Christina Rodwell Christina Rodwell 100.0 3.0 black
  4. 3 Ziggy Stardust Ziggy Stardust 40.0 7.0 red ,
  5. &#39;Sheet_2&#39;: Name Surname Concatenation ID_ Grade_ favourite color \
  6. 0 Lucy Diamonds Lucy Diamonds 22.0 9.0 brown
  7. 1 Grace Kelly Grace Kelly 50.0 7.0 white
  8. 2 Uma Thurman Uma Thurman 105.0 7.0 purple
  9. 3 Lola King Lola King NaN NaN NaN ,
  10. &#39;Sheet_3&#39;: Name Surname Concatenation ID_ Grade_ favourite color \
  11. 0 Eleanor Rigby Eleanor Rigby 104.0 6.0 blue
  12. 1 Barbara Ann Barbara Ann 168.0 8.0 pink
  13. 2 Polly Cracker Polly Cracker 450.0 7.0 black
  14. 3 Little Joe Little Joe NaN NaN NaN }

<!-- end snippet -->

  • My desired output:

<!-- begin snippet: js hide: false console: true babel: false -->

<!-- language: lang-html -->

  1. my_dict = {&#39;Sheet_1&#39;: Name Surname Concatenation ID_ Grade_ favourite color
  2. 1 Delilah Gonzalez Delilah Gonzalez NaN NaN NaN
  3. &#39;Sheet_2&#39;: Name Surname Concatenation ID_ Grade_ favourite color \
  4. 3 Lola King Lola King NaN NaN NaN
  5. &#39;Sheet_3&#39;: Name Surname Concatenation ID_ Grade_ favourite color \
  6. 3 Little Joe Little Joe NaN NaN NaN

<!-- end snippet -->

And, finally I would like to write the desired output to a new excel file (in separate sheets).
Please, enlighten me. I am new to python.

答案1

得分: 1

这是您提供的代码的翻译:

  1. 我会这样做
  2. 使用 pd.ExcelWriter("output.xlsx", engine="xlsxwriter") 作为 writer:
  3. 对于 sn, df in my_dict.items():
  4. (df.loc[df["favourite color"].isnull()] # 我们使用布尔索引
  5. .to_excel(writer, sheet_name=sn, index=False)) # 是否使用 startrow, startcol ?
  6. # 这是可选的
  7. 对于 ws in writer.sheets:
  8. writer.sheets[ws].autofit() # xlsxwriter 3.0.6+

输出(只有 Sheet_1):

循环以筛选字典中的NaN值

更新:

如果您想先更新 my_dict,可以使用以下方式:

  1. 对于 sn, df in my_dict.items():
  2. my_dict[sn] = df.loc[df["favourite color"].isnull()]

输出:

  1. print(my_dict)
  2. {'Sheet_1': Name Surname Concatenation ID_ Grade_ favourite color
  3. 0 Delilah Gonzalez Delilah Gonzalez NaN NaN NaN,
  4. 'Sheet_2': Name Surname Concatenation ID_ Grade_ favourite color
  5. 0 Lola King Lola King NaN NaN NaN,
  6. 'Sheet_3': Name Surname Concatenation ID_ Grade_ favourite color
  7. 0 Little Joe Little Joe NaN NaN NaN}

然后(如果需要),您可以循环遍历每个筛选后的 df 将其存储在电子表格中:

  1. 使用 pd.ExcelWriter("output.xlsx", engine="xlsxwriter") 作为 writer:
  2. 对于 sn, df in my_dict.items():
  3. df.to_excel(writer, sheet_name=sn, index=False)
  1. 请注意,我已经翻译了您提供的代码和相关注释,但没有翻译其他内容。
  2. <details>
  3. <summary>英文:</summary>
  4. I would do it this way :
  5. with pd.ExcelWriter(&quot;output.xlsx&quot;, engine=&quot;xlsxwriter&quot;) as writer:
  6. for sn, df in my_dict.items():
  7. (df.loc[df[&quot;favourite color&quot;].isnull()] # we use boolean indexing
  8. .to_excel(writer, sheet_name=sn, index=False)) # with startrow, starcol ?
  9. #this is optional
  10. for ws in writer.sheets:
  11. writer.sheets[ws].autofit() # xlsxwriter 3.0.6+
  12. Output (*only* `Sheet_1`):
  13. [![enter image description here][1]][1]
  14. ***Update :***
  15. If you want to update `my_dict` first, you can use this :
  16. for sn, df in my_dict.items():
  17. my_dict[sn] = df.loc[df[&quot;favourite color&quot;].isnull()]
  18. Output :
  19. print(my_dict)
  20. {&#39;Sheet_1&#39;: Name Surname Concatenation ID_ Grade_ favourite color
  21. 0 Delilah Gonzalez Delilah Gonzalez NaN NaN NaN,
  22. &#39;Sheet_2&#39;: Name Surname Concatenation ID_ Grade_ favourite color
  23. 0 Lola King Lola King NaN NaN NaN,
  24. &#39;Sheet_3&#39;: Name Surname Concatenation ID_ Grade_ favourite color
  25. 0 Little Joe Little Joe NaN NaN NaN}
  26. Then (*if needed*) you can loop through each filtered `df` to store it in a spreadsheet :
  27. with pd.ExcelWriter(&quot;output.xlsx&quot;, engine=&quot;xlsxwriter&quot;) as writer:
  28. for sn, df in my_dict.items():
  29. df.to_excel(writer, sheet_name=sn, index=False)
  30. [1]: https://i.stack.imgur.com/iUPhm.png
  31. </details>

huangapple
  • 本文由 发表于 2023年6月15日 02:53:51
  • 转载请务必保留本文链接:https://go.coder-hub.com/76476734.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定