How to merge 3 hashes?
Question
I can already extract the address information, but how do I merge it with @party_names_and_types to get the following output:
{:part_name=>"SMALL, DANIEL", :party_type=>"Appellant  "}
{:part_name=>"KELLY, MARK EDWARD", :party_type=>"Attorney for Appellant", :party_address => "134 N WATER STREET LIBERTY, MO 64068"}
{:part_name=>"PITTMAN, KRISTI LANAE", :party_type=>"Co-Counsel for Appellant", :party_address => "134 N WATER STREET LIBERTY, MO 64068"}
{:part_name=>"RED SIMPSON, INC.", :party_type=>"Respondent "}
{:part_name=>"GREENWALD, DOUGLASMARK", :party_type=>"Attorney for Respondent", :party_address => "10 EAST CAMBRIDGE CIRCLE DRIVE KANSAS CITY, KS 66103"}
{:part_name=>"BENJAMIN, SAMANTHA NICOLE", :party_type=>"Co-Counsel for Respondent", :party_address => "MCANANY VAN CLEVE AND PHILLIPS 10 E CAMBRIDGE CIRCLE DR STE 300 KANSAS CITY, KS 66103", :party_des => "Business:(913) 573-3319"}
You can merge the address information into @party_names_and_types with the following code:
@party_names_and_types.each_with_index do |party, index|
party[:party_address] = @party_des[index] unless @party_des[index].nil?
end
This iterates over the @party_names_and_types array and adds the address to each entry whenever one is available.
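One caveat with the index-based merge: it assumes @party_des holds exactly one entry per party, in the same order. In the question's code @party_des is compacted, so parties without an address have no slot and the addresses can shift onto the wrong entries. A minimal sketch of merging by key instead, using made-up sample data and a hypothetical `addresses_by_name` lookup that is not part of the original code:

```ruby
# Sample data standing in for @party_names_and_types (values are illustrative).
parties = [
  { part_name: "SMALL, DANIEL", party_type: "Appellant" },
  { part_name: "KELLY, MARK EDWARD", party_type: "Attorney for Appellant" },
  { part_name: "RED SIMPSON, INC.", party_type: "Respondent" }
]

# Hypothetical lookup: address keyed by party name, built while scraping.
addresses_by_name = {
  "KELLY, MARK EDWARD" => "134 N WATER STREET LIBERTY, MO 64068"
}

# Merge by name, so parties without an address are left unchanged.
merged = parties.map do |party|
  address = addresses_by_name[party[:part_name]]
  address ? party.merge(party_address: address) : party
end

merged.each { |party| p party }
```

Keying by name only works when names are unique; if they are not, pairing each address with its party at extraction time is likely the safer route.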
I have been trying to get some information from an HTML table into a hash. I have a table like the one below; I'm extracting the party names and types and merging them into a single hash. Now I need to merge in another hash with the party addresses. I am able to get the addresses, but the table structure is a bit unusual, so I'm not sure how to match each address to the party it belongs to.
require 'nokogiri'
html = ' <table class="detailRecordTable"><tbody><tr>
<td width="3%" class="detailSeperator" style="width:3%;"></td>
<td width="30%" class="detailSeperator" style="width:30%;text-align:left">
SMALL , DANIEL, Appellant&nbsp;&nbsp;&nbsp </td> <td width="20%" class="detailSeperator" style="width:20%;font-weight: normal"> represented by&nbsp;&nbsp;&nbsp;
</td>
<td width="47%" class="detailSeperator" style="width:47%;text-align:left">
KELLY , MARK EDWARD
, Attorney for Appellant
</td>
</tr>
<tr>
<td width="3%" class="detailData" style="width:3%;text-align:right">
</td>
<td width="30%" class="detailData">
</td> <td width="20%" class="detailData">
</td><td width="47%" class="detailData">
134 N WATER STREET<br>
LIBERTY,
MO
64068<br> <br>
<p></p>
</td>
</tr>
<tr>
<td width="3%" class="detailData">&nbsp;</td>
<td width="30%" class="detailData">&nbsp;</td>
<td width="20%" class="detailData">&nbsp;</td>
<td width="47%" class="detailData"></td>
</tr>
<tr>
<td class="detailSeperator" style="width:3%;text-align:right"></td>
<td class="detailSeperator" style="width:30%;text-align:left"></td>
<td class="detailSeperator" style="width:20%;font-weight: normal">co-counsel</td>
<td class="detailSeperator" style="width:47%;text-align:left">
PITTMAN , KRISTI LANAE , Co-Counsel for Appellant</td>
</tr>
<tr>
<td width="3%" class="detailData">&nbsp;</td>
<td width="30%" class="detailData">&nbsp;</td>
<td width="20%" class="detailData">&nbsp;</td>
<td width="47%" class="detailData">
134 NORTH WATER STREET<br>
LIBERTY,
MO
64068<br> <br>
</td>
</tr>
<tr>
<td width="3%" class="detailSeperator" style="width:3%;">&nbsp;
</td>
<td width="30%" class="detailSeperator" style="width:30%;text-align:left">
RED SIMPSON, INC.
, Respondent&nbsp;&nbsp;&nbsp;
</td>
<td width="20%" class="detailSeperator" style="width:20%;font-weight: normal"> represented by&nbsp;&nbsp;&nbsp;
</td>
<td width="47%" class="detailSeperator" style="width:47%;text-align:left">
GREENWALD , DOUGLAS MARK
, Attorney for Respondent
</td>
</tr>
<tr>
<td width="3%" class="detailData" style="width:3%;text-align:right">
</td>
<td width="30%" class="detailData">
</td>
<td width="20%" class="detailData">
</td>
<td width="47%" class="detailData">
10 EAST CAMBRIDGE CIRCLE DRIVE<br>
KANSAS CITY,
KS
66103<br><br>
<p></p>
</td>
</tr>
<tr>
<td width="3%" class="detailData">&nbsp;</td>
<td width="30%" class="detailData">&nbsp;</td>
<td width="20%" class="detailData">&nbsp;</td>
<td width="47%" class="detailData"></td>
</tr>
<tr>
<td class="detailSeperator" style="width:3%;text-align:right"></td>
<td class="detailSeperator" style="width:30%;text-align:left"></td>
<td class="detailSeperator" style="width:20%;font-weight: normal">co-counsel</td>
<td class="detailSeperator" style="width:47%;text-align:left">
BENJAMIN, SAMANTHA NICOLE
, Co-Counsel for Respondent</td>
</tr>
<tr>
<td width="3%" class="detailData">&nbsp;</td>
<td width="30%" class="detailData">&nbsp;</td>
<td width="20%" class="detailData">&nbsp;</td>
<td width="47%" class="detailData">
MCANANY VAN CLEVE AND PHILLIPS<br>
10 E CAMBRIDGE CIRCLE DR<br>
STE 300<br>
KANSAS CITY,
KS
66103<br>
<b>Business: </b>
(913)
573-3319 <br> <br>
</td>
</tr>
</tbody></table>'
doc = Nokogiri::HTML(html)
rows = doc.xpath("//table[@class='detailRecordTable']//tr")
# address2 = doc.css('td:nth-of-type(4)').text.strip
# puts address2
@party_names = []
@party_types = []
@party_des = []
rows.each do |row|
nodes = row.css('.detailSeperator:nth-of-type(2), .detailSeperator:nth-of-type(4)')
nodes.each do |node|
name = node.text.strip.gsub("\n", '').gsub("\t", '')
parts = name.split(',')
name = if parts.length == 3
"#{parts[0]}, #{parts[1]}"
else
parts[0]
end
party_type = parts[-1].strip if parts && parts.length >= 2
addr = ("#{parts[0]}, #{parts[1]}" if parts.length == 2)
@party_names << name
@party_types << party_type
@party_des << addr
end
address = row.css('td:nth-of-type(2),td:nth-of-type(4)')
address.each do |node|
addr = node.text.strip.gsub("\n", '').gsub("\t", '')
parts = addr.split(',')
addr = ("#{parts[0]}, #{parts[1]}" if parts.length == 2)
@party_des << addr
end
end
@party_names.compact!
@party_names.reject!(&:empty?) # reject! mutates the array; plain reject would discard its result
@party_types.compact!
@party_des.compact!
@party_names_and_types = @party_names.zip(@party_types).map { |name, type| { part_name: name, party_type: type } }
The output I currently have looks like this:
{:part_name=>"SMALL, DANIEL", :party_type=>"Appellant &nbsp"}
{:part_name=>"KELLY, MARK EDWARD", :party_type=>"Attorney for Appellant"}
{:part_name=>"PITTMAN, KRISTI LANAE", :party_type=>"Co-Counsel for Appellant"}
{:part_name=>"RED SIMPSON, INC.", :party_type=>"Respondent "}
{:part_name=>"GREENWALD, DOUGLASMARK", :party_type=>"Attorney for Respondent"}
{:part_name=>"BENJAMIN, SAMANTHA NICOLE", :party_type=>"Co-Counsel for Respondent"}
I am able to get the party address, but how can I merge it with @party_names_and_types so that the output looks like this?
{:part_name=>"SMALL, DANIEL", :party_type=>"Appellant &nbsp"}
{:part_name=>"KELLY, MARK EDWARD", :party_type=>"Attorney for Appellant", :party_address => "134 N WATER STREETLIBERTY,MO 64068"}
{:part_name=>"PITTMAN, KRISTI LANAE", :party_type=>"Co-Counsel for Appellant",:party_address => "134 N WATER STREETLIBERTY,MO 64068"}
{:part_name=>"RED SIMPSON, INC.", :party_type=>"Respondent "}
{:part_name=>"GREENWALD, DOUGLASMARK", :party_type=>"Attorney for Respondent", :party_address => " 10 EAST CAMBRIDGE CIRCLE DRIVE KANSAS CITY,KS 66103"}
{:part_name=>"BENJAMIN, SAMANTHA NICOLE", :party_type=>"Co-Counsel for Respondent", :party_address => " MCANANY VAN CLEVE AND PHILLIPS 10 E CAMBRIDGE CIRCLE DR STE 300 KANSAS CITY,KS 66103", :party_des => "Business:(913) 573-3319"}
Answer 1
Score: 0
You were right about the table structure being "a bit unusual".
I wouldn't say the logic you implemented is wrong, but I wouldn't use it for this table, because associated values (such as a party's name and address) sit in different rows.
Here is the code I wrote to get the expected output you described:
require 'nokogiri'
# html = 'your provided html code...'
doc = Nokogiri::HTML(html)
rows = doc.xpath("//table[@class='detailRecordTable']//tr")
@party_names_and_types = []
start = 0
step = 5
def format_text(text)
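# Assumption: the first pattern targets the "&nbsp;" entity; a literal space here would delete every space.
text.strip.gsub("&nbsp;", "").gsub("\n", ' ').gsub("\t", '')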
end
def get_party_name_and_type(text)
parts = text.split(',')
name = parts.length == 3 ? "#{parts[0]}, #{parts[1]}" : parts[0]
party_type = format_text(parts[-1].strip) if parts && parts.length >= 2
{ party_name: name, party_type: party_type }
end
while start < rows.count
data_rows = rows.slice(start, step)
[0, 3].each do |row_num|
if row_num == 0
[1, 3].each do |col_num|
party_details = get_party_name_and_type(
format_text(data_rows[row_num].children.filter("td")[col_num].text)
)
address = data_rows[row_num+1].children.filter("td")[3].text if col_num == 3
party_details[:party_address] = format_text(address) unless address.nil? || address.empty?
@party_names_and_types << party_details
end
else
party_details = get_party_name_and_type(
format_text(data_rows[3].children.filter("td")[3].text)
)
address = data_rows[row_num+1].children.filter("td")[3].text
party_details[:party_address] = format_text(address) unless address.nil? || address.empty?
@party_names_and_types << party_details
end
end
start += step
end
puts "======@party_names_and_types======"
puts @party_names_and_types
Output:
======@party_names_and_types======
{:party_name=>"SMALL , DANIEL", :party_type=>"Appellant  &nbsp"}
{:party_name=>"KELLY , MARK EDWARD ", :party_type=>"Attorney for Appellant", :party_address=>"134 N WATER STREETLIBERTY, MO 64068"}
{:party_name=>"PITTMAN , KRISTI LANAE ", :party_type=>"Co-Counsel for Appellant", :party_address=>"134 NORTH WATER STREETLIBERTY, MO 64068"}
{:party_name=>"RED SIMPSON, INC. ", :party_type=>"Respondent   "}
{:party_name=>"GREENWALD , DOUGLAS MARK ", :party_type=>"Attorney for Respondent", :party_address=>"10 EAST CAMBRIDGE CIRCLE DRIVEKANSAS CITY, KS 66103"}
{:party_name=>"BENJAMIN, SAMANTHA NICOLE ", :party_type=>"Co-Counsel for Respondent", :party_address=>"MCANANY VAN CLEVE AND PHILLIPS10 E CAMBRIDGE CIRCLE DRSTE 300KANSAS CITY, KS 66103 Business:(913) 573-3319"}
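The values above still carry padding such as non-breaking spaces. One possible clean-up, sketched here as a hypothetical `squish` helper that is not part of the code above, collapses every whitespace run into a single space. In Ruby, `[[:space:]]` matches U+00A0 in UTF-8 strings, whereas `String#strip` only removes ASCII whitespace.

```ruby
# Hypothetical helper: collapse any run of whitespace (including U+00A0
# non-breaking spaces) into one space, then trim both ends.
def squish(text)
  text.gsub(/[[:space:]]+/, ' ').strip
end

p squish("Appellant\u00A0\u00A0\u00A0")   # → "Appellant"
p squish("KELLY ,\n\tMARK  EDWARD ")      # → "KELLY , MARK EDWARD"
```

Applying something like this to each cell's text before splitting on commas would keep the names and party types free of stray padding.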
I'll update the answer to explain the logic in some time.
Hope this helps.
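As far as the logic can be read from the code, the table repeats in blocks of five rows: row 0 holds the party (column 1) and the attorney (column 3), row 1 the attorney's address, row 3 the co-counsel, and row 4 the co-counsel's address. A minimal sketch of that pairing on plain arrays, without Nokogiri; the `rows` data and the names `name_and_type`, `with_address`, and `build_parties` are made up for illustration:

```ruby
# Each inner array is one table row reduced to its four cell texts.
# Hand-written stand-in for the parsed HTML.
rows = [
  ["", "SMALL, DANIEL, Appellant", "represented by", "KELLY, MARK EDWARD, Attorney for Appellant"],
  ["", "", "", "134 N WATER STREET LIBERTY, MO 64068"],
  ["", "", "", ""],
  ["", "", "co-counsel", "PITTMAN, KRISTI LANAE, Co-Counsel for Appellant"],
  ["", "", "", "134 NORTH WATER STREET LIBERTY, MO 64068"]
]

# Split "LAST, FIRST, Type" on the last comma into name and type.
def name_and_type(text)
  name, _, type = text.rpartition(',')
  { party_name: name.strip, party_type: type.strip }
end

# Attach an address only when the cell is present and not blank.
def with_address(party, address)
  address.to_s.strip.empty? ? party : party.merge(party_address: address.strip)
end

def build_parties(rows, block_size = 5)
  rows.each_slice(block_size).flat_map do |block|
    # Row 0: the party itself (column 1) and the attorney (column 3);
    # the attorney's address sits in row 1, column 3.
    parties = [
      name_and_type(block[0][1]),
      with_address(name_and_type(block[0][3]), block.dig(1, 3))
    ]
    # Row 3: co-counsel, treated as optional here; address in row 4, column 3.
    co_counsel = block.dig(3, 3).to_s
    unless co_counsel.strip.empty?
      parties << with_address(name_and_type(co_counsel), block.dig(4, 3))
    end
    parties
  end
end

build_parties(rows).each { |party| p party }
```

Because each address is attached while its party is being built, no separate address array has to be aligned by index afterwards. The sketch still depends on the five-row block layout, so a table with a different row pattern would need a different grouping rule.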