Resolving Relative URLs
When you scrape URLs from a website, you will come across relative URLs such as
/path/to/page, ../path/to/page, ?param=value, #anchor and the like. This package
makes it easy to resolve them to absolute URLs, using the URL of the page on
which they were found.
// The URL of the page on which the links were found.
$documentUrl = Url::parse('https://www.example.com/foo/bar/baz');
$relativeLinks = [
    '/path/to/page',
    '../path/to/page',
    '?param=value',
    '#anchor',
];
// Resolve each relative link against the document URL.
$absoluteLinks = array_map(function ($relativeLink) use ($documentUrl) {
    return $documentUrl->resolve($relativeLink)->toString();
}, $relativeLinks);
var_dump($absoluteLinks);
Output
array(4) {
  [0]=>
  string(36) "https://www.example.com/path/to/page"
  [1]=>
  string(40) "https://www.example.com/foo/path/to/page"
  [2]=>
  string(47) "https://www.example.com/foo/bar/baz?param=value"
  [3]=>
  string(42) "https://www.example.com/foo/bar/baz#anchor"
}
If you pass an absolute URL to resolve(), it will just return that absolute
URL.
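The absolute case can be sketched the same way as the example above. This is a minimal illustration, not taken from the package documentation. It assumes the package is installed via Composer, that vendor/autoload.php is the autoloader path, and that the Url class lives in the Crwlr\Url namespace. The target URL is made up for the example.

```php
<?php

// Assumption: Composer autoloader at this path.
require __DIR__ . '/vendor/autoload.php';

// Assumption: the Url class is in the Crwlr\Url namespace.
use Crwlr\Url\Url;

$documentUrl = Url::parse('https://www.example.com/foo/bar/baz');

// An absolute link found on the page (hypothetical example value).
$absoluteLink = 'https://www.example.org/other/page';

// Per the description above, resolve() should return the absolute
// URL as is, instead of combining it with the document URL.
$resolved = $documentUrl->resolve($absoluteLink)->toString();

var_dump($resolved);
```

Because absolute links pass through as they are, a scraper can hand every link it finds on a page to resolve() without first checking whether the link is relative.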