PHP Simple HTML DOM Parser: accessing custom attributes

2022-01-13 00:00:00 attributes dom php

I want to access a custom attribute that I added to some elements in an HTML file. Here is an example using the attribute littleBox="someValue":

<div id="someId" littleBox="someValue">inner text</div>

The following does not work:

foreach($html->find('div') as $element){
 echo $element;
 if(isset($element->type)){
 echo $element->littleBox;
   }
}

I saw a post about a similar problem, but for some reason I couldn't replicate its solution. Here is what I tried:

function retrieveValue($str){
    if (stripos($str, 'littleBox')){ // check if element has it
        $var = preg_split("/littleBox=\"/", $str);
        //echo $var[1];
        $var1 = preg_split("/\"/", $var[1]);
        echo $var1[0];
    }
    else
        return false;
}

Whenever I call the retrieveValue() function, nothing happens. Is $element (in the first PHP example above) not a string? I don't know whether I missed something, but it's not returning anything.

Here is the full script:

<?php
require("../../simplehtmldom/simple_html_dom.php");

if (isset($_POST['submit'])){

$html = file_get_html($_POST['webURL']);

// Find all images 
foreach($html->find('div') as $element){
    echo $element;
   if(isset($element->type)!= false){
    echo retrieveValue($element);
   }
}
}


function retrieveValue($str){
    if (stripos($str, 'littleBox')){ // check if element has it
        $var = preg_split("/littleBox=\"/", $str);
        //echo $var[1];
        $var1 = preg_split("/\"/", $var[1]);
        return $var1[0];
    }
    else
        return false;
}

?>

<form method="post">
Website URL<input type="text" name="webURL">
<br />
<input type="submit" name="submit">
</form>

Recommended answer

Have you tried:

$html->getElementById("someId")->getAttribute('littleBox');

You could also use SimpleXML:

$html = '<div id="someId" littleBox="someValue">inner text</div>';
$dom = new DOMDocument;
$dom->loadXML($html);
$div = simplexml_import_dom($dom);
echo $div->attributes()->littleBox;
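
If you need the attribute from every div rather than from a single known id, the same built-in DOM classes can do the filtering for you. Note that the loop in the question tests isset($element->type), which as far as I can tell checks for a type attribute and not for littleBox, so the branch is probably never reached. The following is only a sketch: the markup, the wrapping root element, and the $values array are made up for illustration, and it assumes the input is well-formed XML (loadXML preserves the attribute's capitalisation).

```php
<?php
// Hypothetical sample markup, wrapped in one root element so it is well-formed XML.
$markup = '<root>'
    . '<div id="a" littleBox="first">one</div>'
    . '<div id="b">two</div>'
    . '<div id="c" littleBox="third">three</div>'
    . '</root>';

$dom = new DOMDocument();
$dom->loadXML($markup);

// Select only the div elements that actually carry the custom attribute.
$xpath = new DOMXPath($dom);
$values = [];
foreach ($xpath->query('//div[@littleBox]') as $div) {
    // Key the result by the element id, value is the custom attribute.
    $values[$div->getAttribute('id')] = $div->getAttribute('littleBox');
}

print_r($values);
```

The XPath predicate [@littleBox] skips elements without the attribute, so no separate isset() check is needed inside the loop.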

I would advise against using regex to parse HTML, but shouldn't this part look like this:

$str = $html->getElementById("someId")->outertext;
$var = preg_split('/littleBox="/', $str);
$var1 = preg_split('/"/',$var[1]);
echo $var1[0];
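
If you do stay with a regex, a single preg_match with a capture group may be easier to reason about than two preg_split calls, and it avoids the stripos() pitfall where a match at position 0 is treated as false. This is a sketch that swaps in preg_match for the original preg_split approach; the function name extractLittleBox is hypothetical, and it assumes the value is always wrapped in double quotes.

```php
<?php
// Hypothetical helper: returns the littleBox value, or null when the attribute is absent.
function extractLittleBox(string $outerHtml): ?string
{
    // Capture everything between the double quotes that follow littleBox=
    if (preg_match('/\blittleBox\s*=\s*"([^"]*)"/i', $outerHtml, $m) === 1) {
        return $m[1];
    }
    return null;
}

$found = extractLittleBox('<div id="someId" littleBox="someValue">inner text</div>');
$missing = extractLittleBox('<div id="other">inner text</div>');

var_dump($found, $missing);
```

Returning null instead of echoing inside the function lets the caller decide what to do when the attribute is missing.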

See also this answer: https://stackoverflow.com/a/8851091/1059001
