html - Return each div with a certain class name in PHP
OK, so I have a page with images on it that I'm looking to scrape, returning the following information for each image:
- Base Image URL ("website.com/imagepage")
- Image URL ("website.com/image.png")
- Image QUOTE if it has one ("Wow, nice image")
I have it working to return ONE image, but I need it to return all of them (there are about 5).
This is what I have at the moment:
function getMostRecentScreenshot($url) {
    $content = file_get_contents($url);
    $first_step = explode('<div class="imageWall5Floaters">' , $content );
    $second_step = explode('<div style="clear: left;"></div>' , $first_step[1] );
    return $second_step[0];
}
This is what it returns
<div class="floatHelp">
<a href="websiteurl.com/imagepage" onclick="return OnScreenshotClicked(9384938);" class="profile_media_item modalContentLink " data-desired-aspect="1.77777777778">
<div style="background-image: url('website.com/image');" class="imgWallItem " id="imgWallItem_757249198">
<div style="position: relative;">
<input type="checkbox" style="position: absolute; display: none;" name="screenshots[9384938]" class="screenshot_checkbox" id="screenshot_checkbox_9384938" />
</div>
<div class="imgWallHover" id="imgWallHover9384938">
<div class="imgWallHoverBottom">
<div class="imgWallHoverDescription ">
<q class="ellipsis">Quote about the image</q>
</div>
</div>
</div>
</div>
</a>
The five images have different IDs (the 9384938 part).
How would I get the information needed from what it returns?
I have another function at the moment that returns the data for one of the images (kind of), but it's basically just the exact same thing with code between the explode, which is very messy.
Answer
Solution:
You could use PHP's DOMDocument class with this function:
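function getDataFromHTML($html) {
    $result = []; // always return an array, even when nothing matches
    $doc = new DOMDocument();
    // Scraped HTML is rarely well-formed, so collect parser warnings instead of printing them
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    foreach ($doc->getElementsByTagName('a') as $a) {
        if (strpos($a->getAttribute('class'), 'profile_media_item') !== false) {
            $row = [];
            $row['baseURL'] = $a->getAttribute('href');
            // Only the first div inside the link is inspected for the background-image URL
            foreach ($a->getElementsByTagName('div') as $div) {
                preg_match("~(?<=url\(['\"]).*?(?=['\"])~",
                    $div->getAttribute('style'), $attr);
                $row['imageURL'] = reset($attr);
                // The quote is optional: take the first <q> if there is one
                foreach ($a->getElementsByTagName('q') as $q) {
                    $row['quote'] = $q->textContent;
                    break;
                }
                break;
            }
            $result[] = $row;
        }
    }
    return $result;
}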
Call it as:
$result = getDataFromHTML($html);
Output for the sample data is:
array (
array (
'baseURL' => 'websiteurl.com/imagepage',
'imageURL' => 'website.com/image',
'quote' => 'Quote about the image'
)
)
The outer array would have more such entries if run on an HTML string that contains several of those DOM structures.
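As a variation, the same three fields could probably also be pulled out with DOMXPath queries instead of nested loops. This is only a minimal sketch, assuming the page markup matches the sample in the question. The function name getScreenshotsWithXPath and the choice to fill missing fields with null are my own assumptions, not part of the answer above.

```php
<?php
// Sketch only: getScreenshotsWithXPath() is a hypothetical helper name.
function getScreenshotsWithXPath(string $html): array
{
    $result = [];
    $doc = new DOMDocument();
    // Collect parser warnings for imperfect HTML instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match <a> elements whose class list contains the token "profile_media_item".
    $links = $xpath->query(
        '//a[contains(concat(" ", normalize-space(@class), " "), " profile_media_item ")]'
    );

    foreach ($links as $a) {
        // Assumption: missing fields are reported as null rather than omitted.
        $row = ['baseURL' => $a->getAttribute('href'), 'imageURL' => null, 'quote' => null];

        // First descendant div whose style attribute mentions url(...).
        $div = $xpath->query('.//div[contains(@style, "url(")]', $a)->item(0);
        if ($div !== null
            && preg_match('~url\(\s*[\'"]?(.*?)[\'"]?\s*\)~', $div->getAttribute('style'), $m)) {
            $row['imageURL'] = $m[1];
        }

        // The quote is optional, so only set it when a <q> element exists.
        $q = $xpath->query('.//q', $a)->item(0);
        if ($q !== null) {
            $row['quote'] = trim($q->textContent);
        }

        $result[] = $row;
    }
    return $result;
}
```

It could be called on the whole page, for example getScreenshotsWithXPath(file_get_contents($url)), so the explode() step from the question would not be needed. An image without a quote would then come back with 'quote' set to null.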