PHP: parse HTML from a URL

A PHP script that fetches a page of a site by its URL and extracts the required data by HTML tag.

An example of a script that crawls site pages and displays the data contained in meta tags or in page blocks. You can also extract links, images, and other data.
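As a minimal sketch of extracting links and images, the same DOMDocument approach can collect attribute values from any tag. The helper name extract_attributes and the sample markup are assumptions made for illustration; they are not part of the original script.

```php
<?php
// Hypothetical helper: collect one attribute (e.g. href, src) from every element with the given tag.
function extract_attributes(string $html, string $tag, string $attribute): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally, since real-world markup is rarely valid.
    libxml_use_internal_errors(true);
    $doc->loadHTML('<?xml encoding="utf-8" ?>' . $html);
    libxml_clear_errors();

    $values = [];
    foreach ($doc->getElementsByTagName($tag) as $node) {
        $value = $node->getAttribute($attribute);
        if ($value !== '') {
            $values[] = $value; // skip elements that lack the attribute
        }
    }
    return $values;
}

// Sample markup standing in for a downloaded page (assumption).
$sample = '<html><body><a href="/post/1">One</a><a href="/post/2">Two</a>'
        . '<img src="/img/a.png"></body></html>';

$links  = extract_attributes($sample, 'a', 'href');
$images = extract_attributes($sample, 'img', 'src');

print_r($links);
print_r($images);
```

In a real run, the $sample string would be replaced by the HTML returned from the page request.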

Specifically, this example includes lines of code that save the image referenced in the og:image meta tag.

To crawl a page, send a request to a URL like this:

https://mysite.com/parsing.php?url=https://sitedonor.com/post/123
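If the donor address carries its own query string, the `&` and `?` characters in it would be read as part of the request to parsing.php. A small sketch of building the link with `urlencode` avoids that. The helper name build_parse_link and the extra query parameters are assumptions for illustration. PHP decodes the value automatically when it fills `$_GET['url']`.

```php
<?php
// Hypothetical helper: build the request link with the target URL safely encoded.
function build_parse_link(string $scriptUrl, string $targetUrl): string
{
    return $scriptUrl . '?url=' . urlencode($targetUrl);
}

// The query string on the donor URL is an assumed example.
$target = 'https://sitedonor.com/post/123?page=2&lang=en';
$link   = build_parse_link('https://mysite.com/parsing.php', $target);

echo $link, PHP_EOL;
```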

parsing.php

<?php
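// Read the target URL from the query string and accept only http(s) addresses.
$myurl = isset($_GET['url']) ? $_GET['url'] : '';
if (!preg_match('#^https?://#i', $myurl)) {
    exit('Please pass a valid http(s) URL in the "url" parameter.');
}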
header("Content-Type: text/html; charset=utf-8");
function file_get_contents_curl($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}

$html = file_get_contents_curl($myurl);

//parsing begins here:
$doc = new DOMDocument();
@$doc->loadHTML('<?xml encoding="utf-8" ?>' . $html);
$nodes = $doc->getElementsByTagName('title');

//get and display what you need:
// Fall back to an empty string when the page has no <title> element.
$title = $nodes->length > 0 ? $nodes->item(0)->nodeValue : '';

// Default to empty strings so that missing tags do not cause undefined-variable notices.
$description = $keywords = $descriptions = $fsurl = '';

$metas = $doc->getElementsByTagName('meta');

for ($i = 0; $i < $metas->length; $i++)
{
    $meta = $metas->item($i);
    if ($meta->getAttribute('name') == 'description')
        $description = $meta->getAttribute('content');

    if ($meta->getAttribute('name') == 'keywords')
        $keywords = $meta->getAttribute('content');

    // Open Graph tags use the "property" attribute instead of "name".
    if ($meta->getAttribute('property') == 'og:image')
        $descriptions = $meta->getAttribute('content'); // image URL

    if ($meta->getAttribute('property') == 'og:url')
        $fsurl = $meta->getAttribute('content');
}

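// Escape the extracted values, since they come from an external page.
echo 'Title: ' . htmlspecialchars($title) . '<br/><br/>';
echo 'Description: ' . htmlspecialchars($description) . '<br/><br/>';
echo 'Keywords: ' . htmlspecialchars($keywords) . '<br/><br/>';
echo 'Images: ' . htmlspecialchars($descriptions) . '<br/><br/>';
echo 'LINK Post: ' . htmlspecialchars($fsurl) . '<br/><br/>';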
$date = new DateTime();
echo $date->getTimestamp();

$path = './img/'. $date->getTimestamp() .'.png'; // local path for the saved image
// Save the image only if an og:image URL was found.
if (!empty($descriptions)) {
    file_put_contents($path, file_get_contents($descriptions));
}
?> 
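The script above always saves the image with a .png extension, whatever its real type. One possible refinement, shown here only as a sketch, takes the extension from the image URL and falls back to .png when it is missing or unexpected. The helper name image_save_path, the list of allowed extensions, and the sample URL are assumptions.

```php
<?php
// Hypothetical helper: choose a local file name for a remote image.
function image_save_path(string $imageUrl, string $dir, int $timestamp): string
{
    $allowed = ['png', 'jpg', 'jpeg', 'gif', 'webp']; // assumed whitelist
    // Take only the path part so a query string such as ?v=3 is ignored.
    $urlPath = (string) parse_url($imageUrl, PHP_URL_PATH);
    $ext = strtolower(pathinfo($urlPath, PATHINFO_EXTENSION));
    if (!in_array($ext, $allowed, true)) {
        $ext = 'png'; // same extension the original script uses
    }
    return rtrim($dir, '/') . '/' . $timestamp . '.' . $ext;
}

echo image_save_path('https://sitedonor.com/uploads/photo.JPG?v=3', './img/', 1700000000), PHP_EOL;
```

The returned path could then be passed to file_put_contents in place of the fixed $path value.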

 
