PHP preg_match_all extract id and name, where id in tag is optional
Question
I have the following code:
<?php
$html = '<div>
<div class="block">
<div class="id">10</div>
<div class="name">first element</div>
</div>
<div class="block">
<div class="name">second element</div>
</div>
<div class="block">
<div class="id">30</div>
<div class="name">third element</div>
</div>
</div>';
preg_match_all('/<div class="block">[\s]+<div class="id">(.*?)<\/div>[\s]+<div class="name">(.*?)<\/div>[\s]+<\/div>/ms', $html, $matches);
print_r($matches);
I want to get an array with id and name, but the second block doesn't have an id, so my preg_match_all call skips it. How can I build the array without skipping that block, and print something like [ ... [id => 0 // or null, name => 'second element'] ...]?
Answer 1
Score: 1
Use DOMDocument to solve this task; there are many good reasons not to parse HTML with regular expressions.
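If a regular expression is still required for some reason, one possible sketch is to wrap the id div in an optional non-capturing group, so that blocks without an id still match. This is not part of the DOM approach described here, and it assumes the markup is exactly as regular as in the question (same attribute order, only whitespace between tags). The variable names `$pattern` and `$rows` are illustrative.

```php
<?php
$html = '<div>
<div class="block">
<div class="id">10</div>
<div class="name">first element</div>
</div>
<div class="block">
<div class="name">second element</div>
</div>
<div class="block">
<div class="id">30</div>
<div class="name">third element</div>
</div>
</div>';

// (?: ... )? makes the whole id div optional; group 1 is empty when it is absent.
$pattern = '/<div class="block">\s+(?:<div class="id">(.*?)<\/div>\s+)?<div class="name">(.*?)<\/div>\s+<\/div>/s';

// PREG_SET_ORDER yields one array per matched block instead of one array per group.
preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

$rows = [];
foreach ($matches as $m) {
    $rows[] = [
        // An unmatched optional group is reported as an empty string here.
        'id'   => $m[1] !== '' ? $m[1] : null,
        'name' => $m[2],
    ];
}

print_r($rows);
```

This kind of pattern tends to break as soon as the markup gains extra attributes or nesting, which is why the DOM-based approach is the more robust choice.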
Assuming your HTML code is stored in the $html variable, create an instance of DOMDocument, load the HTML code, and initialize DOMXPath:
$dom = new DOMDocument();
// Collect libxml parse warnings internally instead of emitting PHP warnings
libxml_use_internal_errors(true);
$dom->loadHTML($html, LIBXML_NOBLANKS);
$dom->formatOutput = true;
$xpath = new DOMXPath($dom);
Use DOMXPath to search for all <div> nodes with class "name" and prepare an empty array for the results:
$nodes = $xpath->query('//div[@class="name"]');
$result = array();
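Note that the predicate `@class="name"` compares the whole attribute value, so it only matches when the class attribute is exactly "name". If the real markup may carry several classes on one element, a common XPath 1.0 idiom is to match a single class token instead. The sketch below uses made-up sample markup (the classes "highlighted" and "nickname" are not from the question) to show the difference:

```php
<?php
$dom = new DOMDocument();
libxml_use_internal_errors(true);
// Hypothetical sample: one div with two classes, one exact match, one near miss.
$dom->loadHTML(
    '<div>'
    . '<div class="name highlighted">a</div>'
    . '<div class="name">b</div>'
    . '<div class="nickname">c</div>'
    . '</div>'
);
$xpath = new DOMXPath($dom);

// Exact comparison of the whole class attribute.
$exact = $xpath->query('//div[@class="name"]');

// Token match: pad the normalized class list with spaces and look for " name ".
$token = $xpath->query(
    '//div[contains(concat(" ", normalize-space(@class), " "), " name ")]'
);

echo $exact->length, "\n"; // 1
echo $token->length, "\n"; // 2
```

For the HTML in the question both queries return the same three nodes, so the simpler form is sufficient there.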
For each node found, run an additional query to find the optional node with class "id", then add a record to the results array:
foreach ($nodes as $node) {
    $id = $xpath->query('div[@class="id"]', $node->parentNode);
    $result[] = array(
        'id' => $id->count() ? $id->item(0)->nodeValue : null,
        'name' => $node->nodeValue
    );
}
print_r($result);
This is the result:
Array
(
    [0] => Array
        (
            [id] => 10
            [name] => first element
        )

    [1] => Array
        (
            [id] =>
            [name] => second element
        )

    [2] => Array
        (
            [id] => 30
            [name] => third element
        )

)
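The steps above can also be collected into one self-contained script. The following is a minimal sketch rather than a definitive implementation: the helper name `extractBlocks` is hypothetical, and unlike the loop above it iterates over the "block" divs themselves, on the assumption that each block should produce one record even if its name is missing as well.

```php
<?php
/**
 * Hypothetical helper: returns one ['id' => ..., 'name' => ...] record
 * per <div class="block">, using null for a missing child div.
 */
function extractBlocks(string $html): array
{
    $dom = new DOMDocument();

    // Silence libxml parse warnings while loading, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $rows = [];

    foreach ($xpath->query('//div[@class="block"]') as $block) {
        // Relative queries; item(0) is null when the child div is absent.
        $id   = $xpath->query('./div[@class="id"]', $block)->item(0);
        $name = $xpath->query('./div[@class="name"]', $block)->item(0);

        $rows[] = [
            'id'   => $id !== null ? trim($id->textContent) : null,
            'name' => $name !== null ? trim($name->textContent) : null,
        ];
    }

    return $rows;
}

$html = '<div>
<div class="block">
<div class="id">10</div>
<div class="name">first element</div>
</div>
<div class="block">
<div class="name">second element</div>
</div>
<div class="block">
<div class="id">30</div>
<div class="name">third element</div>
</div>
</div>';

print_r(extractBlocks($html));
```

Iterating per block keeps the id and name lookups symmetrical, and the values come back as strings, so cast the id with `(int)` if a number is needed.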