How to merge 3 hashes?

Question

I have been trying to get some information from an HTML table into a hash. With the code below, I extract the party names and party types from the table and merge them into a single hash. Now I need to merge in another hash with the party addresses. I am able to get the addresses, but the table structure is a bit unusual, so I'm not sure how to merge each address with the party it belongs to.

    require 'nokogiri'

    html = ' <table class="detailRecordTable"><tbody><tr>
    <td width="3%" class="detailSeperator" style="width:3%;"></td>
    <td width="30%" class="detailSeperator" style="width:30%;text-align:left">
    SMALL , DANIEL, Appellant  &nbsp </td> <td width="20%" class="detailSeperator" style="width:20%;font-weight: normal"> represented by   
    </td>
    <td width="47%" class="detailSeperator" style="width:47%;text-align:left">
    KELLY , MARK EDWARD
    , Attorney for Appellant
    </td>
    </tr>
    <tr>
    <td width="3%" class="detailData" style="width:3%;text-align:right">
    </td>
    <td width="30%" class="detailData">
    </td> <td width="20%" class="detailData">
    </td><td width="47%" class="detailData">
    134 N WATER STREET<br>
    LIBERTY,
    MO
    64068<br> <br>
    <p></p>
    </td>
    </tr>
    <tr>
    <td width="3%" class="detailData"> </td>
    <td width="30%" class="detailData"> </td>
    <td width="20%" class="detailData"> </td>
    <td width="47%" class="detailData"></td>
    </tr>
    <tr>
    <td class="detailSeperator" style="width:3%;text-align:right"></td>
    <td class="detailSeperator" style="width:30%;text-align:left"></td>
    <td class="detailSeperator" style="width:20%;font-weight: normal">co-counsel</td>
    <td class="detailSeperator" style="width:47%;text-align:left">
    PITTMAN , KRISTI LANAE , Co-Counsel for Appellant</td>
    </tr>
    <tr>
    <td width="3%" class="detailData"> </td>
    <td width="30%" class="detailData"> </td>
    <td width="20%" class="detailData"> </td>
    <td width="47%" class="detailData">
    134 NORTH WATER STREET<br>
    LIBERTY,
    MO
    64068<br> <br>
    </td>
    </tr>
    <tr>
    <td width="3%" class="detailSeperator" style="width:3%;"> 
    </td>
    <td width="30%" class="detailSeperator" style="width:30%;text-align:left">
    RED SIMPSON, INC.
    , Respondent   
    </td>
    <td width="20%" class="detailSeperator" style="width:20%;font-weight: normal"> represented by   
    </td>
    <td width="47%" class="detailSeperator" style="width:47%;text-align:left">
    GREENWALD , DOUGLAS MARK
    , Attorney for Respondent
    </td>
    </tr>
    <tr>
    <td width="3%" class="detailData" style="width:3%;text-align:right">
    </td>
    <td width="30%" class="detailData">
    </td>
    <td width="20%" class="detailData">
    </td>
    <td width="47%" class="detailData">
    10 EAST CAMBRIDGE CIRCLE DRIVE<br>
    KANSAS CITY,
    KS
    66103<br><br>
    <p></p>
    </td>
    </tr>
    <tr>
    <td width="3%" class="detailData"> </td>
    <td width="30%" class="detailData"> </td>
    <td width="20%" class="detailData"> </td>
    <td width="47%" class="detailData"></td>
    </tr>
    <tr>
    <td class="detailSeperator" style="width:3%;text-align:right"></td>
    <td class="detailSeperator" style="width:30%;text-align:left"></td>
    <td class="detailSeperator" style="width:20%;font-weight: normal">co-counsel</td>
    <td class="detailSeperator" style="width:47%;text-align:left">
    BENJAMIN, SAMANTHA NICOLE
    , Co-Counsel for Respondent</td>
    </tr>
    <tr>
    <td width="3%" class="detailData"> </td>
    <td width="30%" class="detailData"> </td>
    <td width="20%" class="detailData"> </td>
    <td width="47%" class="detailData">
    MCANANY VAN CLEVE AND PHILLIPS<br>
    10 E CAMBRIDGE CIRCLE DR<br>
    STE 300<br>
    KANSAS CITY,
    KS
    66103<br>
    <b>Business: </b>
    (913)
    573-3319 <br> <br>
    </td>
    </tr>
    </tbody></table>'

    doc = Nokogiri::HTML(html)
    rows = doc.xpath("//table[@class='detailRecordTable']//tr")
    # address2 = doc.css('td:nth-of-type(4)').text.strip
    # puts address2
    @party_names = []
    @party_types = []
    @party_des = []
    rows.each do |row|
      nodes = row.css('.detailSeperator:nth-of-type(2), .detailSeperator:nth-of-type(4)')
      nodes.each do |node|
        name = node.text.strip.gsub("\n", '').gsub("\t", '')
        parts = name.split(',')
        name = if parts.length == 3
                 "#{parts[0]}, #{parts[1]}"
               else
                 parts[0]
               end
        party_type = parts[-1].strip if parts && parts.length >= 2
        addr = ("#{parts[0]}, #{parts[1]}" if parts.length == 2)
        @party_names << name
        @party_types << party_type
        @party_des << addr
      end
      address = row.css('td:nth-of-type(2),td:nth-of-type(4)')
      address.each do |node|
        addr = node.text.strip.gsub("\n", '').gsub("\t", '')
        parts = addr.split(',')
        addr = ("#{parts[0]}, #{parts[1]}" if parts.length == 2)
        @party_des << addr
      end
    end
    @party_names.compact!
    @party_names.reject(&:empty?)
    @party_types.compact!
    @party_des.compact!
    @party_names_and_types = @party_names.zip(@party_types).map { |name, type| { part_name: name, party_type: type } }

The output I currently have looks like this:

    {:part_name=>"SMALL, DANIEL", :party_type=>"Appellant &nbsp"}
    {:part_name=>"KELLY, MARK EDWARD", :party_type=>"Attorney for Appellant"}
    {:part_name=>"PITTMAN, KRISTI LANAE", :party_type=>"Co-Counsel for Appellant"}
    {:part_name=>"RED SIMPSON, INC.", :party_type=>"Respondent "}
    {:part_name=>"GREENWALD, DOUGLASMARK", :party_type=>"Attorney for Respondent"}
    {:part_name=>"BENJAMIN, SAMANTHA NICOLE", :party_type=>"Co-Counsel for Respondent"}

I am able to get the party addresses, but how can I merge them with @party_names_and_types so that the output looks like this?

    {:part_name=>"SMALL, DANIEL", :party_type=>"Appellant &nbsp"}
    {:part_name=>"KELLY, MARK EDWARD", :party_type=>"Attorney for Appellant", :party_address => "134 N WATER STREETLIBERTY,MO 64068"}
    {:part_name=>"PITTMAN, KRISTI LANAE", :party_type=>"Co-Counsel for Appellant", :party_address => "134 N WATER STREETLIBERTY,MO 64068"}
    {:part_name=>"RED SIMPSON, INC.", :party_type=>"Respondent "}
    {:part_name=>"GREENWALD, DOUGLASMARK", :party_type=>"Attorney for Respondent", :party_address => " 10 EAST CAMBRIDGE CIRCLE DRIVE KANSAS CITY,KS 66103"}
    {:part_name=>"BENJAMIN, SAMANTHA NICOLE", :party_type=>"Co-Counsel for Respondent", :party_address => " MCANANY VAN CLEVE AND PHILLIPS 10 E CAMBRIDGE CIRCLE DR STE 300 KANSAS CITY,KS 66103", :party_des => "Business:(913) 573-3319"}
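
A side note on the merge step itself: if the three arrays stay aligned by index, which means keeping a `nil` placeholder for a party without an address instead of calling `compact!`, then `zip` followed by `Hash#compact` may be enough. This is only a minimal sketch with shortened, made-up sample data; the variable names `names`, `types`, `addresses` and `parties` are hypothetical, and it does not replace the extraction code above.

```ruby
# Hypothetical, index-aligned inputs: one entry per party,
# with nil where a party has no address.
names     = ["SMALL, DANIEL", "KELLY, MARK EDWARD", "RED SIMPSON, INC."]
types     = ["Appellant", "Attorney for Appellant", "Respondent"]
addresses = [nil, "134 N WATER STREET LIBERTY, MO 64068", nil]

# zip pairs the arrays position by position;
# Hash#compact then drops the keys whose value is nil.
parties = names.zip(types, addresses).map do |name, type, address|
  { part_name: name, party_type: type, party_address: address }.compact
end

parties.each { |party| p party }
```

The alignment is the fragile part: as soon as one array is compacted on its own, positions shift and addresses land on the wrong party.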

Answer 1

Score: 0

You were right about the table structure being "a bit unusual".
I wouldn't say the logic you implemented is wrong, but I wouldn't use it for this table, because the associated values (such as a party name and its address) sit in different rows.

Here is the code I wrote to get the expected output you described:

    require 'nokogiri'

    # html = 'your provided html code...'
    doc = Nokogiri::HTML(html)
    rows = doc.xpath("//table[@class='detailRecordTable']//tr")
    @party_names_and_types = []
    start = 0
    step = 5 # the table is processed in groups of five rows

    # Trim the text, drop literal "&nbsp;" entities and tabs, turn newlines into spaces.
    def format_text(text)
      text.strip.gsub("&nbsp;", "").gsub("\n", ' ').gsub("\t", '')
    end

    # Split "LAST, FIRST, Type" (or "NAME, Type") into a name and a party type.
    def get_party_name_and_type(text)
      parts = text.split(',')
      name = parts.length == 3 ? "#{parts[0]}, #{parts[1]}" : parts[0]
      party_type = format_text(parts[-1].strip) if parts && parts.length >= 2
      { party_name: name, party_type: party_type }
    end

    while start < rows.count
      data_rows = rows.slice(start, step)
      [0, 3].each do |row_num|
        if row_num == 0
          # Row 0 names the party (column 1) and the attorney (column 3);
          # row 1 holds the attorney's address in column 3.
          [1, 3].each do |col_num|
            party_details = get_party_name_and_type(
              format_text(data_rows[row_num].children.filter("td")[col_num].text)
            )
            address = data_rows[row_num + 1].children.filter("td")[3].text if col_num == 3
            party_details[:party_address] = format_text(address) unless address.nil? || address.empty?
            @party_names_and_types << party_details
          end
        else
          # Row 3 names the co-counsel (column 3); row 4 holds the address.
          party_details = get_party_name_and_type(
            format_text(data_rows[3].children.filter("td")[3].text)
          )
          address = data_rows[row_num + 1].children.filter("td")[3].text
          party_details[:party_address] = format_text(address) unless address.nil? || address.empty?
          @party_names_and_types << party_details
        end
      end
      start += step
    end

    puts "======@party_names_and_types======"
    puts @party_names_and_types

Output:

    ======@party_names_and_types======
    {:party_name=>"SMALL , DANIEL", :party_type=>"Appellant  &nbsp"}
    {:party_name=>"KELLY , MARK EDWARD ", :party_type=>"Attorney for Appellant", :party_address=>"134 N WATER STREETLIBERTY, MO 64068"}
    {:party_name=>"PITTMAN , KRISTI LANAE ", :party_type=>"Co-Counsel for Appellant", :party_address=>"134 NORTH WATER STREETLIBERTY, MO 64068"}
    {:party_name=>"RED SIMPSON, INC. ", :party_type=>"Respondent   "}
    {:party_name=>"GREENWALD , DOUGLAS MARK ", :party_type=>"Attorney for Respondent", :party_address=>"10 EAST CAMBRIDGE CIRCLE DRIVEKANSAS CITY, KS 66103"}
    {:party_name=>"BENJAMIN, SAMANTHA NICOLE ", :party_type=>"Co-Counsel for Respondent", :party_address=>"MCANANY VAN CLEVE AND PHILLIPS10 E CAMBRIDGE CIRCLE DRSTE 300KANSAS CITY, KS 66103 Business:(913) 573-3319"}
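
The output above still carries stray spaces before commas, trailing non-breaking spaces, and the literal `&nbsp` text. If that matters, a small cleanup helper could be applied to each value. This is a sketch only: `clean_text` is a hypothetical name, not part of the code above, and it targets whitespace alone, so it cannot restore a separator between address lines that are already fused.

```ruby
# Hypothetical helper: turn the literal "&nbsp" text and every run of
# whitespace (including U+00A0 non-breaking spaces) into single spaces.
def clean_text(text)
  text.gsub(/&nbsp;?/, " ")        # literal entity text left in the source
      .gsub(/[[:space:]]+/, " ")   # [[:space:]] also matches U+00A0, unlike \s
      .strip
end

p clean_text("Respondent\u00A0\u00A0\u00A0")
p clean_text("134 N WATER STREET\nLIBERTY,\nMO\n64068")
```

The POSIX bracket class is used here because Ruby's `\s` and `String#strip` only handle ASCII whitespace, which is why the non-breaking spaces survive in the output shown above.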

I'll update the answer later to explain the logic.
Hope this helps.
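
In the meantime, the logic can be read from the code: the rows are walked in groups of five, where the first row names the party and the attorney, the second holds the attorney's address, the fourth names the co-counsel, and the fifth holds the co-counsel's address. A variation that does not depend on a fixed group size is to attach each address to the party named most recently. The sketch below illustrates that idea on plain Ruby data rather than on Nokogiri nodes; the `{ kind:, cells: }` row shape and the `attach_addresses` name are assumptions made for the example, with `:separator` standing in for the `detailSeperator` rows and `:data` for the `detailData` rows.

```ruby
# Each row is modelled as { kind:, cells: }, where cells holds the text of
# the four <td> elements. This shape is an assumption made for the sketch.
def attach_addresses(rows)
  parties = []
  rows.each do |row|
    cells = row[:cells].map(&:strip)
    if row[:kind] == :separator
      # Columns 2 and 4 (indexes 1 and 3) may each name a party.
      [cells[1], cells[3]].reject(&:empty?).each do |text|
        # Everything after the last comma is the party type.
        name, _comma, type = text.rpartition(",")
        parties << { party_name: name.strip, party_type: type.strip }
      end
    elsif !cells[3].empty? && parties.any?
      # A data row's fourth column is the address of the party named last.
      parties.last[:party_address] = cells[3]
    end
  end
  parties
end

rows = [
  { kind: :separator, cells: ["", "SMALL, DANIEL, Appellant", "represented by", "KELLY, MARK EDWARD, Attorney for Appellant"] },
  { kind: :data,      cells: ["", "", "", "134 N WATER STREET LIBERTY, MO 64068"] },
  { kind: :data,      cells: ["", "", "", ""] },
  { kind: :separator, cells: ["", "", "co-counsel", "PITTMAN, KRISTI LANAE, Co-Counsel for Appellant"] },
  { kind: :data,      cells: ["", "", "", "134 NORTH WATER STREET LIBERTY, MO 64068"] }
]

result = attach_addresses(rows)
result.each { |party| p party }
```

Using `rpartition(",")` avoids counting comma-separated parts, so a name such as "RED SIMPSON, INC." keeps its internal comma without a special case.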

huangapple
  • Posted on February 6, 2023 at 16:53:20
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75359128.html