PHP preg_match_all: extract id and name when the id tag is optional

Question

I have the following code:

<?php
$html = '<div>
    <div class="block">
        <div class="id">10</div>
        <div class="name">first element</div>
    </div>
    <div class="block">
        <div class="name">second element</div>
    </div>
    <div class="block">
        <div class="id">30</div>
        <div class="name">third element</div>
    </div>
</div>';

preg_match_all('/<div class="block">[\s]+<div class="id">(.*?)<\/div>[\s]+<div class="name">(.*?)<\/div>[\s]+<\/div>/ms', $html, $matches);

print_r($matches);

I want to get an array of id and name pairs, but the second block has no id, so my pattern skips it. How can I build the array without skipping that block and print something like [ ... [id => 0 // or null, name => 'second element'] ... ]?

Answer 1

Score: 1

Use DOMDocument for this task; there are many good reasons not to parse HTML with regular expressions.

Assuming your HTML is stored in the $html variable, create a DOMDocument instance, load the HTML, and initialize a DOMXPath object:

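$dom = new DOMDocument();
// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
// LIBXML_NOBLANKS drops blank (whitespace-only) nodes while parsing.
$dom->loadHTML($html, LIBXML_NOBLANKS);
$dom->formatOutput = true;
$xpath = new DOMXPath($dom);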

Use DOMXPath to search for all <div> nodes with class "name" and prepare an empty array for the results:

$nodes = $xpath->query('//div[@class="name"]');
$result = array();
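
As an aside, the comparison @class="name" only matches when the class attribute is exactly "name". If the real markup carries additional classes, a common XPath 1.0 idiom is to match on a class token instead. The following is a minimal sketch under that assumption; the one-line markup and the extra class "highlighted" are hypothetical and do not come from the question.

```php
<?php
// Hypothetical markup: the name <div> carries an extra class.
$sample = '<div class="block"><div class="name highlighted">second element</div></div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($sample);
$xp = new DOMXPath($doc);

// Exact comparison: finds nothing, because @class is "name highlighted".
$exact = $xp->query('//div[@class="name"]');

// Token comparison: pad the normalized class list with spaces and look for " name ".
$token = $xp->query('//div[contains(concat(" ", normalize-space(@class), " "), " name ")]');

echo $exact->length, ' ', $token->length, PHP_EOL; // prints "0 1"
```

For the markup in the question the exact comparison is sufficient, so the rest of this answer keeps it.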

For each node found, run an additional query to find the optional node with class "id", then add a record to the results array:

foreach ($nodes as $node) {
    // Look for a sibling <div class="id"> under the same parent block.
    $id = $xpath->query('div[@class="id"]', $node->parentNode);

    $result[] = array(
        // Use null when the block has no id.
        'id' => $id->count() ? $id->item(0)->nodeValue : null,
        'name' => $node->nodeValue
    );
}

print_r($result);

This is the result:

Array
(
    [0] => Array
        (
            [id] => 10
            [name] => first element
        )

    [1] => Array
        (
            [id] => 
            [name] => second element
        )

    [2] => Array
        (
            [id] => 30
            [name] => third element
        )

)
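
If a regular expression is still required, for example because the markup is known to be exactly as regular as the sample, one possible approach is to wrap the id <div> in an optional non-capturing group so that blocks without it still match. This is only a sketch against the sample HTML from the question, not a general HTML parser; the named groups id and name and the variable $pattern are choices made for this illustration.

```php
<?php
// Sample markup from the question; the second block has no id.
$html = '<div>
    <div class="block">
        <div class="id">10</div>
        <div class="name">first element</div>
    </div>
    <div class="block">
        <div class="name">second element</div>
    </div>
    <div class="block">
        <div class="id">30</div>
        <div class="name">third element</div>
    </div>
</div>';

// The id <div> sits inside (?: ... )?, so it may be absent.
// "~" is used as the delimiter so "/" needs no escaping.
$pattern = '~<div class="block">\s*'
         . '(?:<div class="id">(?<id>.*?)</div>\s*)?'
         . '<div class="name">(?<name>.*?)</div>\s*'
         . '</div>~s';

// PREG_SET_ORDER yields one array per matched block.
preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

$result = [];
foreach ($matches as $m) {
    // An unmatched id group is reported as an empty string; map it to null.
    // Note that an empty <div class="id"></div> would also become null.
    $id = $m['id'] ?? '';
    $result[] = [
        'id'   => $id !== '' ? $id : null,
        'name' => $m['name'],
    ];
}

print_r($result);
```

This pattern depends on the exact attribute order, quoting, and nesting shown above, which is why the DOMDocument version is the more robust choice for markup that may vary.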

huangapple
  • Posted on February 18, 2023, 20:43:54
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75493425.html