Commit

Merge pull request #5 from eclipxe13/development
Downloaders with fixed data (Version 4.0.0)
eclipxe13 authored Jul 1, 2023
2 parents e48f17d + 3b11457 commit 9c932a8
Showing 15 changed files with 380 additions and 79 deletions.
4 changes: 2 additions & 2 deletions .phive/phars.xml
@@ -1,7 +1,7 @@
<?xml version="1.0" encoding="UTF-8"?>
<phive xmlns="https://phar.io/phive">
- <phar name="php-cs-fixer" version="^3.16.0" installed="3.16.0" location="./tools/php-cs-fixer" copy="false"/>
+ <phar name="php-cs-fixer" version="^3.20.0" installed="3.20.0" location="./tools/php-cs-fixer" copy="false"/>
<phar name="phpcs" version="^3.7.2" installed="3.7.2" location="./tools/phpcs" copy="false"/>
<phar name="phpcbf" version="^3.7.2" installed="3.7.2" location="./tools/phpcbf" copy="false"/>
- <phar name="phpstan" version="^1.10.15" installed="1.10.15" location="./tools/phpstan" copy="false"/>
+ <phar name="phpstan" version="^1.10.22" installed="1.10.22" location="./tools/phpstan" copy="false"/>
</phive>
10 changes: 9 additions & 1 deletion README.md
@@ -64,11 +64,19 @@ If you only want to download the source file from SEPOMEX, check script file `sc
/**
* @var string $destinationFile is the path where the destination file will be located.
*/
- $downloader = new \Eclipxe\SepomexPhp\Downloader\Downloader();
+ $downloader = new \Eclipxe\SepomexPhp\Downloader\SymfonyDownloader();
printf("Download from %s to %s\n", $downloader::LINK, $destinationFile);
$downloader->downloadTo($destinationFile);
```

You can use your own downloader; just implement the `DownloaderInterface` interface.

The project provides the following implementations:

- `SymfonyDownloader`: Uses *Symfony Browser Kit* to perform the download (recommended).
- `GuzzleDownloader`: Uses *Guzzle* and fixed form data to perform the download.
- `PhpStreamsDownloader`: Uses plain PHP functions and fixed form data to perform the download.
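
As a minimal sketch of the custom downloader option described above, the following class copies a file that was obtained by other means. The namespace `App\Sepomex` and the class `LocalFileDownloader` are hypothetical. The interface declared here is only a stand-in that mirrors the `downloadTo()` signature visible in this commit, so the snippet runs on its own; in a real project you would import `Eclipxe\SepomexPhp\Downloader\DownloaderInterface` instead.

```php
<?php

declare(strict_types=1);

namespace App\Sepomex; // hypothetical application namespace

use RuntimeException;

// Stand-in for Eclipxe\SepomexPhp\Downloader\DownloaderInterface (assumption:
// only the downloadTo() method shown in this commit is required).
interface DownloaderInterface
{
    /** @throws RuntimeException */
    public function downloadTo(string $destinationFile): void;
}

/** Hypothetical downloader that copies a previously obtained source file. */
final class LocalFileDownloader implements DownloaderInterface
{
    public function __construct(private readonly string $sourceFile)
    {
    }

    public function downloadTo(string $destinationFile): void
    {
        if (! is_file($this->sourceFile)) {
            throw new RuntimeException("Source file {$this->sourceFile} does not exist");
        }
        if (! copy($this->sourceFile, $destinationFile)) {
            throw new RuntimeException("Unable to copy to $destinationFile");
        }
    }
}
```

A class like this could be handy in tests or offline environments, since it never contacts the SEPOMEX site.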

If you want to import the source file from SEPOMEX into your own SQLite3 database, check `create-sqlite-from-raw.php`.

```php
13 changes: 10 additions & 3 deletions composer.json
@@ -30,12 +30,19 @@
"php": ">=8.1",
"ext-pdo": "*",
"ext-iconv": "*",
- "ext-zip": "*",
- "symfony/browser-kit": "^6.2"
+ "ext-zip": "*"
},
"require-dev": {
"phpunit/phpunit": "^9.5",
- "dms/phpunit-arraysubset-asserts": "^0.4.0"
+ "dms/phpunit-arraysubset-asserts": "^0.4.0",
"guzzlehttp/guzzle": "^7.7",
"symfony/browser-kit": "^6.3",
"symfony/http-client": "^6.3",
"symfony/mime": "^6.3"
},
"suggest": {
"guzzlehttp/guzzle": "Perform the download process using Guzzle",
"symfony/browser-kit": "Perform the download process using Symfony Browser Kit (recommended)"
},
"scripts": {
"dev:build": ["@dev:fix-style", "@dev:test"],
17 changes: 17 additions & 0 deletions docs/CHANGELOG.md
@@ -2,6 +2,23 @@

**This document was originally written in Spanish**

## Version 4.0.0 2023-06-30

The project has minor changes to the download process, but they break backward compatibility.
See the upgrade guide in [CHANGES_VERSION_3.0_TO_4.0.md](CHANGES_VERSION_3.0_TO_4.0.md).

This update assumes that the form data can be reused indefinitely. Therefore,
a tool as complete as *Symfony Browser Kit* is not strictly necessary, and the download
can be performed with a plain HTTP client.

### Changes for users

- The `Downloader` class was renamed to `SymfonyDownloader`.
- Added `GuzzleDownloader`, which also downloads the public resource.
- Added `PhpStreamsDownloader`, which also downloads the public resource without external dependencies.
- Installing `symfony/browser-kit` is no longer mandatory.
- The project now suggests `symfony/browser-kit` and `guzzlehttp/guzzle`, making their installation optional.

## Version 3.0.0 2023-05-13

The project changed drastically. The best advice is to re-implement your use of the library against this new version.
37 changes: 37 additions & 0 deletions docs/CHANGES_VERSION_3.0_TO_4.0.md
@@ -0,0 +1,37 @@
# Main changes from version 3.0 to 4.0

This update assumes that the form data can be reused indefinitely. Therefore,
a tool as complete as *Symfony Browser Kit* is not strictly necessary, and the download
can be performed with a plain HTTP client.


## The `Downloader` class was renamed to `SymfonyDownloader`

Replace the class import in your project, for example:

```diff
- use Eclipxe\SepomexPhp\Downloader\Downloader;
+ use Eclipxe\SepomexPhp\Downloader\SymfonyDownloader as Downloader;
```

The `SymfonyDownloader` class behaves like a browser: it opens the resource page,
selects the desired option, and submits the form. For that reason, it is the option
most likely to keep working in the long term.

The `symfony/browser-kit` dependency was also moved to a suggestion,
because any implementation can be used.
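
Because the HTTP libraries are now optional, an application may want to choose a downloader according to what is installed. The sketch below is one hedged way to do that: `pickDownloaderClass()` is a hypothetical helper, and the check for `Symfony\Component\BrowserKit\HttpBrowser` assumes that `SymfonyDownloader` relies on that class, which this commit view does not show.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: returns the class name of the first downloader whose
 * optional dependency appears to be installed.
 */
function pickDownloaderClass(): string
{
    // Assumption: SymfonyDownloader needs symfony/browser-kit's HttpBrowser.
    if (class_exists('Symfony\Component\BrowserKit\HttpBrowser')) {
        return 'Eclipxe\SepomexPhp\Downloader\SymfonyDownloader';
    }
    // GuzzleDownloader type-hints GuzzleHttp\ClientInterface and builds a Client.
    if (class_exists('GuzzleHttp\Client')) {
        return 'Eclipxe\SepomexPhp\Downloader\GuzzleDownloader';
    }
    // Fallback that needs no external packages.
    return 'Eclipxe\SepomexPhp\Downloader\PhpStreamsDownloader';
}

$downloaderClass = pickDownloaderClass();
printf("Selected downloader: %s\n", $downloaderClass);
// With the library installed, the instance would be: new $downloaderClass()
```

The order mirrors the recommendation in the README: the browser-based downloader first, then the fixed-data alternatives.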

## `GuzzleDownloader` was added

The `GuzzleDownloader` class downloads the public resource using
[Guzzle](https://github.com/guzzle/guzzle).

Unlike the `SymfonyDownloader` class, it does not parse the form and uses fixed data.

## `PhpStreamsDownloader` was added

The `PhpStreamsDownloader` class downloads the public resource using only
PHP functions. It needs no external dependency, but it might fail
in some restricted environments.

Unlike the `SymfonyDownloader` class, it does not parse the form and uses fixed data.
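
To illustrate what "plain PHP functions and fixed data" can look like, here is a hedged sketch that prepares a form POST with a stream context. It is not the source of `PhpStreamsDownloader`, which this page does not show. The helper `buildPostContext()` is hypothetical, and the form data is only a shortened subset of the fields used by the project.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: builds a stream context that sends $formData as a
 * URL-encoded POST body.
 *
 * @param array<string, string> $formData
 * @return resource
 */
function buildPostContext(array $formData)
{
    $body = http_build_query($formData, '', '&');

    return stream_context_create([
        'http' => [
            'method' => 'POST',
            'header' => "Content-Type: application/x-www-form-urlencoded\r\n"
                . 'Content-Length: ' . strlen($body) . "\r\n",
            'content' => $body,
            // keep the response body available even on HTTP error codes
            'ignore_errors' => true,
        ],
    ]);
}

// Shortened form data; the real request also carries the ASP.NET state fields.
$formData = ['cboEdo' => '00', 'rblTipo' => 'txt', 'btnDescarga.x' => '10', 'btnDescarga.y' => '10'];
$context = buildPostContext($formData);

// The request itself would be a single call (not executed in this sketch):
// $zipContents = file_get_contents($url, false, $context);
```

This approach depends on `allow_url_fopen` being enabled, which is one reason it may fail in restricted environments.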
4 changes: 2 additions & 2 deletions scripts/create-sqlite-from-raw.php
@@ -6,7 +6,7 @@

declare(strict_types=1);

- use Eclipxe\SepomexPhp\Downloader\Downloader;
+ use Eclipxe\SepomexPhp\Downloader\SymfonyDownloader;
use Eclipxe\SepomexPhp\Importer\PdoImporter;

require_once __DIR__ . '/../vendor/autoload.php';
@@ -25,7 +25,7 @@

// raw file
if (! file_exists($rawFile)) {
- $downloader = new Downloader();
+ $downloader = new SymfonyDownloader();
printf("File %s does not exist, will be downloaded from %s\n", $rawFile, $downloader::LINK);
$downloader->downloadTo($rawFile);
}
4 changes: 2 additions & 2 deletions scripts/download.php
@@ -7,7 +7,7 @@

declare(strict_types=1);

- use Eclipxe\SepomexPhp\Downloader\Downloader;
+ use Eclipxe\SepomexPhp\Downloader\SymfonyDownloader;

require_once __DIR__ . '/../vendor/autoload.php';

@@ -32,7 +32,7 @@
if ('' === $destinationFile) {
$destinationFile = $defaultDestinationFile;
}
- $downloader = new Downloader();
+ $downloader = new SymfonyDownloader();
echo sprintf("Download from %s to %s\n", $downloader::LINK, $destinationFile);

try {
2 changes: 1 addition & 1 deletion sonar-project.properties
@@ -4,7 +4,7 @@ sonar.sourceEncoding=UTF-8
sonar.language=php
sonar.sources=src
sonar.tests=tests
- sonar.exclusions=vendor/,tools/,build/,tests/_files/
+ sonar.test.exclusions=tests/_files/**/*
sonar.working.directory=build/.scannerwork
sonar.php.tests.reportPath=build/sonar-junit.xml
sonar.php.coverage.reportPaths=build/sonar-coverage.xml
68 changes: 0 additions & 68 deletions src/Downloader/Downloader.php

This file was deleted.

2 changes: 2 additions & 0 deletions src/Downloader/DownloaderInterface.php
@@ -8,6 +8,8 @@

interface DownloaderInterface
{
public const LINK = 'https://www.correosdemexico.gob.mx/SSLServicios/ConsultaCP/CodigoPostal_Exportar.aspx';

/**
* @throws RuntimeException
*/
83 changes: 83 additions & 0 deletions src/Downloader/DownloaderTrait.php
@@ -0,0 +1,83 @@
<?php

declare(strict_types=1);

namespace Eclipxe\SepomexPhp\Downloader;

use RuntimeException;
use ZipArchive;

/**
* @internal
*/
trait DownloaderTrait
{
/**
* Returns well-known, fixed form data that can be used to perform the POST request
*
* @return array<string, string>
*/
private function fixedFormData(): array
{
return [
'__EVENTTARGET' => '',
'__EVENTARGUMENT' => '',
'__LASTFOCUS' => '',
'__VIEWSTATE' => '/wEPDwUINzcwOTQyOTgPZBYCAgEPZBYCAgEPZBYGAgMPDxYCHgRUZXh0BTjDmmx0aW1hIEFjdHVh'
. 'bGl6YWNpw7NuIGRlIEluZm9ybWFjacOzbjogSnVuaW8gMjkgZGUgMjAyM2RkAgcPEA8WBh4NRGF0'
. 'YVRleHRGaWVsZAUDRWRvHg5EYXRhVmFsdWVGaWVsZAUFSWRFZG8eC18hRGF0YUJvdW5kZ2QQFSEj'
. 'LS0tLS0tLS0tLSBUICBvICBkICBvICBzIC0tLS0tLS0tLS0OQWd1YXNjYWxpZW50ZXMPQmFqYSBD'
. 'YWxpZm9ybmlhE0JhamEgQ2FsaWZvcm5pYSBTdXIIQ2FtcGVjaGUUQ29haHVpbGEgZGUgWmFyYWdv'
. 'emEGQ29saW1hB0NoaWFwYXMJQ2hpaHVhaHVhEUNpdWRhZCBkZSBNw6l4aWNvB0R1cmFuZ28KR3Vh'
. 'bmFqdWF0bwhHdWVycmVybwdIaWRhbGdvB0phbGlzY28HTcOpeGljbxRNaWNob2Fjw6FuIGRlIE9j'
. 'YW1wbwdNb3JlbG9zB05heWFyaXQLTnVldm8gTGXDs24GT2F4YWNhBlB1ZWJsYQpRdWVyw6l0YXJv'
. 'DFF1aW50YW5hIFJvbxBTYW4gTHVpcyBQb3Rvc8OtB1NpbmFsb2EGU29ub3JhB1RhYmFzY28KVGFt'
. 'YXVsaXBhcwhUbGF4Y2FsYR9WZXJhY3J1eiBkZSBJZ25hY2lvIGRlIGxhIExsYXZlCFl1Y2F0w6Fu'
. 'CVphY2F0ZWNhcxUhAjAwAjAxAjAyAjAzAjA0AjA1AjA2AjA3AjA4AjA5AjEwAjExAjEyAjEzAjE0'
. 'AjE1AjE2AjE3AjE4AjE5AjIwAjIxAjIyAjIzAjI0AjI1AjI2AjI3AjI4AjI5AjMwAjMxAjMyFCsD'
. 'IWdnZ2dnZ2dnZ2dnZ2dnZ2dnZ2dnZ2dnZ2dnZ2dnZ2dnZ2RkAh0PPCsACwBkGAEFHl9fQ29udHJv'
. 'bHNSZXF1aXJlUG9zdEJhY2tLZXlfXxYBBQtidG5EZXNjYXJnYXo4r2owexoHvaYUs8ZA4j6MDfNz',
'__VIEWSTATEGENERATOR' => 'BE1A6D2E',
'__EVENTVALIDATION' => '/wEWKAK0hrLiBALG/OLvBgLWk4iCCgLWk4SCCgLWk4CCCgLWk7yCCgLWk7iCCgLWk7SCCgLWk7CC'
. 'CgLWk6yCCgLWk+iBCgLWk+SBCgLJk4iCCgLJk4SCCgLJk4CCCgLJk7yCCgLJk7iCCgLJk7SCCgLJ'
. 'k7CCCgLJk6yCCgLJk+iBCgLJk+SBCgLIk4iCCgLIk4SCCgLIk4CCCgLIk7yCCgLIk7iCCgLIk7SC'
. 'CgLIk7CCCgLIk6yCCgLIk+iBCgLIk+SBCgLLk4iCCgLLk4SCCgLLk4CCCgLL+uTWBALa4Za4AgK+'
. 'qOyRAQLI56b6CwL1/KjtBZ0Iwsb2glbyqEbKgPFJYu0SWNmk',
'cboEdo' => '00',
'rblTipo' => 'txt',
'btnDescarga.x' => '10',
'btnDescarga.y' => '10',
];
}

private function extractFirstFileTo(string $zipPath, string $pattern, string $destinationPath): void
{
$zipArchive = new ZipArchive();
if (true !== $zipArchive->open($zipPath, ZipArchive::RDONLY)) {
throw new RuntimeException('Cannot open downloaded data');
}
$selectedName = null;
for ($i = 0; $i < $zipArchive->numFiles; $i++) {
$currentName = (string) $zipArchive->getNameIndex($i);
if (! fnmatch($pattern, $currentName)) {
continue;
}
// stop at the first entry that matches, as the method name promises
$selectedName = $currentName;
break;
}
if (null === $selectedName) {
throw new RuntimeException(
sprintf('Cannot find a text file that matches "%s" inside the downloaded data', $pattern)
);
}
if (false === $destinationStream = fopen($destinationPath, 'w')) {
throw new RuntimeException("Unable to open or create $destinationPath");
}
if (false === $sourceStream = $zipArchive->getStream($selectedName)) {
throw new RuntimeException("Unable to open stream from source $selectedName");
}
if (false === stream_copy_to_stream($sourceStream, $destinationStream)) {
throw new RuntimeException("Unable to write contents on $destinationPath");
}
$zipArchive->close();
}
}
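
The entry selection performed by `extractFirstFileTo()` can be tried in isolation. The following sketch, which assumes `ext-zip` is available, builds a small archive with made-up entry names and picks the first entry whose name matches a `fnmatch()` pattern. The function `firstMatchingEntry()` is a hypothetical helper, not part of the project.

```php
<?php

declare(strict_types=1);

/** Hypothetical helper: name of the first archive entry matching $pattern, or null. */
function firstMatchingEntry(ZipArchive $zip, string $pattern): ?string
{
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $name = (string) $zip->getNameIndex($i);
        if (fnmatch($pattern, $name)) {
            return $name;
        }
    }
    return null;
}

$zipPath = tempnam(sys_get_temp_dir(), 'zip');
if (false === $zipPath) {
    throw new RuntimeException('Unable to create a temporary file');
}

// Build a sample archive; entry names and contents are invented for the demo.
$zip = new ZipArchive();
$zip->open($zipPath, ZipArchive::OVERWRITE);
$zip->addFromString('notes.md', 'ignored');
$zip->addFromString('first.txt', "line of sample data\n");
$zip->addFromString('second.txt', 'another match');
$zip->close();

// Reopen read-only and select the first *.txt entry.
$zip->open($zipPath, ZipArchive::RDONLY);
$selected = firstMatchingEntry($zip, '*.txt');
$contents = null === $selected ? null : $zip->getFromName($selected);
$zip->close();
unlink($zipPath);

printf("Selected entry: %s\n", $selected ?? '(none)');
```

Returning from the loop as soon as a name matches keeps the behavior aligned with the "first file" wording when an archive contains several matching entries.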
45 changes: 45 additions & 0 deletions src/Downloader/GuzzleDownloader.php
@@ -0,0 +1,45 @@
<?php

declare(strict_types=1);

namespace Eclipxe\SepomexPhp\Downloader;

use GuzzleHttp\Client;
use GuzzleHttp\ClientInterface;
use GuzzleHttp\RequestOptions;
use RuntimeException;

final class GuzzleDownloader implements DownloaderInterface
{
use DownloaderTrait;

public function __construct(
public readonly ClientInterface $client = new Client(),
) {
}

public function downloadTo(string $destinationFile): void
{
if (false === $zipTempFile = tempnam('', '')) {
throw new RuntimeException('Unable to create a temporary file');
}

try {
$response = $this->client->request('POST', self::LINK, [
RequestOptions::FORM_PARAMS => $this->fixedFormData(),
RequestOptions::SINK => $zipTempFile,
]);
if (200 !== $response->getStatusCode()) {
throw new RuntimeException(
sprintf('Received a non 200 HTTP Status Code (code: %s)', $response->getStatusCode())
);
}
if (! str_contains($response->getHeaderLine('Content-Disposition'), 'attachment')) {
throw new RuntimeException(
sprintf('Unexpected response content disposition: %s', $response->getHeaderLine('Content-Disposition'))
);
}

$this->extractFirstFileTo($zipTempFile, '*.txt', $destinationFile);
} finally {
// remove the temporary file even when the request or the extraction fails
unlink($zipTempFile);
}
}
}