preg_match_all: match links in a specific directory/path using PHP
I am trying to get all links from a page, but I would like to match only a specific directory/path. I am currently using the code below:
preg_match_all("/a[\s]+[^>]*?href[\s]?=[\s\"\']+(.*?)[\"\']+.*?>([^<]+|.*?)?<\/a>/is", $sPageContent, $aResults);
The code above gets all the links on the page, but I need a solution that returns only links from a specific directory. I would like to match directories/paths containing /music/music/.
For example, I have these links:
https://www.example.co.uk/music/music/397/adoramus-te/
https://www.example.co.uk/music/music/3113/obsesi/
https://www.example.co.uk/music/music/2707/the-piano/
https://www.example.co.uk/music/music/2677/irreemplazable/
https://www.example.co.uk/music/music/25981/lo/
https://www.example.co.uk/music/top/1243/core/
https://www.example.co.uk/music/top/12/late/
https://www.example.co.uk/music/top/13/new/
From the links above, I want to get all links that look like these:
https://www.example.co.uk/music/music/397/adoramus-te/
https://www.example.co.uk/music/music/3113/obsesi/
https://www.example.co.uk/music/music/2707/the-piano/
https://www.example.co.uk/music/music/2677/irreemplazable/
https://www.example.co.uk/music/music/25981/lo/
and ignore all other links.
Answer
Solution:
You could use, for example, DOMDocument to load the HTML and then get all the anchors from it.
Then use a pattern that matches /music/music/ starting at the first forward slash after https:// and the host name:
^https?://[^/]+/music/music/\S+$
Explanation
^
Start of string
https?://
Match the protocol with an optional s
[^/]+
Match 1+ times any char except /
/music/music/
Match literally
\S+
Match 1+ times a non-whitespace char
$
End of string
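To see how the pattern behaves on its own, here is a minimal sketch that runs it over the links from the question. It assumes the links are already collected in a plain array (the variable names $links, $pattern and $matches are only illustrative) and uses preg_grep() to keep the matching entries.

```php
<?php
// Sample links copied from the question.
$links = [
    'https://www.example.co.uk/music/music/397/adoramus-te/',
    'https://www.example.co.uk/music/music/3113/obsesi/',
    'https://www.example.co.uk/music/music/2707/the-piano/',
    'https://www.example.co.uk/music/music/2677/irreemplazable/',
    'https://www.example.co.uk/music/music/25981/lo/',
    'https://www.example.co.uk/music/top/1243/core/',
    'https://www.example.co.uk/music/top/12/late/',
    'https://www.example.co.uk/music/top/13/new/',
];

// Same pattern as above; ~ is the delimiter, so the slashes need no escaping.
$pattern = '~^https?://[^/]+/music/music/\S+$~';

// preg_grep() returns the array entries that match (original keys are preserved).
$matches = preg_grep($pattern, $links);

foreach ($matches as $link) {
    echo $link . PHP_EOL;
}
```

Because [^/]+ cannot cross a forward slash, it can only consume the host name, so /music/music/ has to be the first two path segments. The /music/top/ links are therefore rejected.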
Example code
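// $data holds the HTML of the page (e.g. $sPageContent from the question).
libxml_use_internal_errors(true); // collect warnings from malformed real-world HTML instead of printing them
$dom = new DOMDocument();
$dom->loadHTML($data);
$anchors = $dom->getElementsByTagName("a");
foreach ($anchors as $anchor) {
    $url = $anchor->getAttribute("href");
    // Keep only links whose path starts with /music/music/
    if (preg_match('~^https?://[^/]+/music/music/\S+$~', $url)) {
        echo $url . PHP_EOL;
    }
}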
Output
https://www.example.co.uk/music/music/397/adoramus-te/
https://www.example.co.uk/music/music/3113/obsesi/
https://www.example.co.uk/music/music/2707/the-piano/
https://www.example.co.uk/music/music/2677/irreemplazable/
https://www.example.co.uk/music/music/25981/lo/
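The pattern above expects absolute URLs. If the page might also contain root-relative links such as /music/music/397/adoramus-te/, one possible variation is to test only the path component. The sketch below is an assumption, not part of the original answer: the helper name isMusicLink is hypothetical, and unlike the original pattern it does not check the scheme or host.

```php
<?php
// Hypothetical helper (not from the original answer): checks only the path
// component, so absolute and root-relative hrefs are treated the same way.
function isMusicLink(string $href): bool
{
    $path = parse_url($href, PHP_URL_PATH);

    // parse_url() returns false for seriously malformed URLs
    // and null when the URL has no path component.
    if (!is_string($path)) {
        return false;
    }

    // Require at least one non-whitespace character after /music/music/.
    return preg_match('~^/music/music/\S+$~', $path) === 1;
}

var_dump(isMusicLink('https://www.example.co.uk/music/music/397/adoramus-te/'));
var_dump(isMusicLink('/music/music/3113/obsesi/'));
var_dump(isMusicLink('https://www.example.co.uk/music/top/12/late/'));
```

In the DOMDocument loop, the preg_match() condition could then be swapped for a call to this helper. Whether that is appropriate depends on whether links to other hosts should also be accepted.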