PHP parse html from url

A PHP script that scans the pages of a site by their address and selects the necessary data by HTML tags.

An example of a script that goes through site pages and displays the data contained in meta tags or in page blocks. The same approach can extract links, photos and other data.
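Extracting links and photos works the same way as reading meta tags: load the markup into DOMDocument and collect the attributes of the tags you need. Below is a minimal sketch of that idea. The function name extract_links_and_images and the sample markup are made up for illustration and are not part of the original script.

```php
<?php
// Hypothetical helper (not part of parsing.php): collect link targets and
// image sources from an HTML string.
function extract_links_and_images(string $html): array
{
    $doc = new DOMDocument();
    // The XML prolog forces UTF-8; "@" silences warnings about imperfect markup.
    @$doc->loadHTML('<?xml encoding="utf-8" ?>' . $html);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = trim($a->getAttribute('href'));
        if ($href !== '') {
            $links[] = $href;
        }
    }

    $images = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = trim($img->getAttribute('src'));
        if ($src !== '') {
            $images[] = $src;
        }
    }

    // Drop duplicates and reindex so both lists are plain 0..n arrays.
    return [
        'links'  => array_values(array_unique($links)),
        'images' => array_values(array_unique($images)),
    ];
}

// Sample markup, assumed for the demonstration.
$sample = '<html><body><a href="/post/1">One</a><a href="/post/1">Again</a>'
        . '<img src="/img/a.png"></body></html>';
print_r(extract_links_and_images($sample));
```

The addresses are returned exactly as they appear in the page, so relative ones such as /post/1 still have to be joined with the site address before they can be downloaded.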

In particular, this example includes the lines of code that download and save the image whose address is found in the og:image meta tag.

To send a crawl request, open a link like this:

https://mysite.com/parsing.php?url=https://sitedonor.com/post/123
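A link written this way works for simple addresses, but if the donor address has its own query string, the "?" and "&" in it get mixed up with the parameters of parsing.php. A small sketch of how the link could be built safely is shown here. The function name build_crawl_link and the sample donor address are assumptions for illustration.

```php
<?php
// Hypothetical helper: build the request address for parsing.php.
function build_crawl_link(string $endpoint, string $target): string
{
    // http_build_query() percent-encodes the value, so "?" and "&" inside
    // $target are not mistaken for parameters of parsing.php itself.
    return $endpoint . '?' . http_build_query(['url' => $target]);
}

echo build_crawl_link(
    'https://mysite.com/parsing.php',
    'https://sitedonor.com/post/123?page=2&sort=new'
);
// prints https://mysite.com/parsing.php?url=https%3A%2F%2Fsitedonor.com%2Fpost%2F123%3Fpage%3D2%26sort%3Dnew
```

PHP decodes the value automatically when it fills $_GET, so the script receives the donor address in its original form.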

parsing.php

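<?php
header("Content-Type: text/html; charset=utf-8");

// The page to parse is passed in the "url" query parameter.
$myurl = $_GET['url'] ?? '';
if (!preg_match('#^https?://#i', $myurl)) {
    http_response_code(400);
    exit('Pass an http(s) address in the "url" parameter.');
}

// Download a page with cURL; returns the body, or false on failure.
function file_get_contents_curl($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}

$html = file_get_contents_curl($myurl);
if ($html === false || $html === '') {
    exit('Could not download the page.');
}

//parsing begins here:
$doc = new DOMDocument();
// "@" hides the warnings libxml raises on imperfect real-world HTML;
// the XML prolog makes DOMDocument read the markup as UTF-8.
@$doc->loadHTML('<?xml encoding="utf-8" ?>' . $html);
$nodes = $doc->getElementsByTagName('title');

//get and display what you need:
$title = $nodes->length > 0 ? $nodes->item(0)->nodeValue : '';

// Defaults, in case the page lacks some of the meta tags.
$description = $keywords = $image = $fsurl = '';

$metas = $doc->getElementsByTagName('meta');

// One pass over all <meta> tags: name="..." and property="og:..." alike.
for ($i = 0; $i < $metas->length; $i++)
{
    $meta = $metas->item($i);
    if ($meta->getAttribute('name') == 'description')
        $description = $meta->getAttribute('content');

    if ($meta->getAttribute('name') == 'keywords')
        $keywords = $meta->getAttribute('content');

    if ($meta->getAttribute('property') == 'og:image')
        $image = $meta->getAttribute('content');

    if ($meta->getAttribute('property') == 'og:url')
        $fsurl = $meta->getAttribute('content');
}

// Escape everything taken from the remote page before printing it.
echo 'Title: ' . htmlspecialchars($title) . '<br/><br/>';
echo 'Description: ' . htmlspecialchars($description) . '<br/><br/>';
echo 'Keywords: ' . htmlspecialchars($keywords) . '<br/><br/>';
echo 'Images: ' . htmlspecialchars($image) . '<br/><br/>';
echo 'LINK Post: ' . htmlspecialchars($fsurl) . '<br/><br/>';
$date = new DateTime();
echo $date->getTimestamp();

// Save the og:image picture, if one was found.
// The ./img/ directory must already exist and be writable,
// and allow_url_fopen must be enabled for file_get_contents() to read a URL.
if (preg_match('#^https?://#i', $image)) {
    $path = './img/' . $date->getTimestamp() . '.png';
    $bytes = file_get_contents($image);
    if ($bytes !== false) {
        file_put_contents($path, $bytes);
    }
}
?>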
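The script always saves the picture with a .png extension, although og:image often points to a JPEG or WebP file. One possible refinement is to look at the downloaded bytes and pick the extension from the detected type. This is only a sketch: the function name image_extension_from_bytes is made up, the list of types is an assumption, and it relies on the fileinfo extension being enabled.

```php
<?php
// Hypothetical helper: map downloaded bytes to a file extension,
// or return null when the data is not one of the listed image types.
function image_extension_from_bytes(string $bytes): ?string
{
    // Assumed set of accepted types; extend it as needed.
    $known = [
        'image/png'  => 'png',
        'image/jpeg' => 'jpg',
        'image/gif'  => 'gif',
        'image/webp' => 'webp',
    ];

    // finfo inspects the content itself, not the URL or file name.
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $mime = $finfo->buffer($bytes);
    if ($mime === false) {
        return null;
    }

    return $known[$mime] ?? null;
}

// A 1x1 PNG, embedded so the example runs without network access.
$png = base64_decode(
    'iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNkYPhfDwAChwGA60e6kgAAAABJRU5ErkJggg=='
);
var_dump(image_extension_from_bytes($png));
var_dump(image_extension_from_bytes('<html><body>not an image</body></html>'));
```

In parsing.php the result could replace the fixed '.png' when building $path, and a null result could be used to skip saving altogether.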

 


