New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for PSR-18 HTTP clients #826
base: master
Are you sure you want to change the base?
Add support for PSR-18 HTTP clients #826
Conversation
f8d385f
to
e5dce13
Compare
e5dce13
to
bd54a17
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed up to Add SimplePie::set_http_client(), add tests
src/HTTP/Psr18Client.php
Outdated
$permanentUrl = $requestedUrl; | ||
} | ||
|
||
$request = $request->withUri($this->uriFactory->createUri($requestedUrl)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-absolute URLs are allowed these days: https://httpwg.org/specs/rfc9110.html#field.location
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what's the problem. This is part of the UriFactory and Request class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that the location
header could have a non-absolute url (without host). But this is not a problem.
$this->uriFactory->createUri()
creates a new Uri instance from the non-absolute url. This uri is set to $request->withUri($uri)
. This will take care of the missing host:
This method MUST update the Host header of the returned request by
* default if the URI contains a host component. If the URI does not
* contain a host component, any pre-existing Host header MUST be carried
* over to the returned request.
See https://github.com/php-fig/http-message/blob/2.0/src/RequestInterface.php#L99-L129
$requestedUrl = $response->getHeaderLine('Location'); | ||
|
||
if ($statusCode === 301) { | ||
$permanentUrl = $requestedUrl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we can do anything about being passed a client that already handles redirects (like selfoss does).
We might not have other choice than adding support for each specific one. For example:
if (class_exists(\GuzzleHttp\RedirectMiddleware::class) && $response instanceof \GuzzleHttp\Psr7\Response && $response->hasHeader(\GuzzleHttp\RedirectMiddleware::HISTORY_HEADER)) {
$permanentUrl = self::getGuzzleHistoryPermanentUrl($url, $response);
}
/**
* Get permanent URL of the response.
* RedirectMiddleware will need to be enabled for this to work.
*
* @param string $url requested URL, to use as a fallback
* @param GuzzleHttp\Psr7\Response $response response to inspect
*
* @return string URL we should change the future requests to
*/
public static function getGuzzleHistoryPermanentUrl(string $url, \GuzzleHttp\Psr7\Response $response) {
// Sequence of fetched URLs
$urlStack = array_merge([$url], $response->getHeader(\GuzzleHttp\RedirectMiddleware::HISTORY_HEADER));
// Let’s consider the first URL a permanent redirect since it is our start point.
$statusStack = array_merge([301], $response->getHeader(\GuzzleHttp\RedirectMiddleware::STATUS_HISTORY_HEADER));
// Follow the chain util first non-permanent redirect, since non-permanent
// redirect could invalidate permanent ones later in the chain.
$index = 0;
while (($statusStack[$i] === 301 || $statusStack[$i] === 308) && $index < count($urlStack)) {
$index++;
}
return $urlStack[$index - 1];
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a PSR-18 client handles the redirects they will return a PSR-7 Response
with a 200
(or other then 301||308
) status code. So we have no chance to get the final requested url. But this is not a problem because File
will be deprecated and removed. I would not put any work into this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but we need to get the permanent URL for features like fossar/selfoss#1178.
Unless we want to deprecate SimplePie::$permanent_url
and expose the Response
in SimplePie
so that apps could extract the permanent URL themselves.
Or we could instruct consumers to not pass clients with RedirectMiddleware
in the docs but that is error-prone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we should deprecate also SimplePie::$permanent_url
while deprecating File
, and we should expose Psr\Http\Message\ResponseInterface
instead. The extraction of the permanent URL would be a detail of the provided PSR-18 client. This has to be discussed in future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, looking at the Guzzle docs, their redirect middleware will apparently not be used when using Guzzle as PSR-18 client:
This option has no effect when making requests using
GuzzleHttp\Client::sendRequest()
. In order to stay compliant with PSR-18 any redirect response is returned as is.
That means the PSR-18 client will not be aware that the requests are connected when we handle redirects ourselves and there will be no way to extract the permanent URL from it.
$filepath = dirname(__FILE__, 3) . '/data/this-file-does-not-exist'; | ||
|
||
$this->expectException(HttpException::class); | ||
$this->expectExceptionMessage('file_get_contents('.$filepath.'): Failed to open stream: No such file or directory'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if we should match error messages like this. I do not think they are guaranteed to be stable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm checking the error message to be sure that the test failed for the expected reason. As an alternative we could throw a new FileNotFoundException
.
src/HTTP/Psr18Client.php
Outdated
try { | ||
$raw = file_get_contents($path); | ||
} catch (Throwable $th) { | ||
throw new HttpException($th->getMessage(), $th->getCode(), $th); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
file_get_contents
should not throw an exception by itself. I presume this is to catch it if app throws in the handler passed to set_error_handler
? Perhaps add a comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
file_get_contents() can generate Errors/Throwables. This is tested here: https://github.com/simplepie/simplepie/pull/777/files#diff-71f5ab2331097a56ec44e0ee126d69e4ef859f22a076b0f0055ebbd3a6303e86
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AIUI, E_WARNING
will not be converted into throwable unless the application defines a custom error handler with set_error_handler
, that throws.
That tests catches the throw
below:
if ($raw === false) {
throw new HttpException('file_get_contents() could not read the file', 1);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the user has a custom error handler we can catch the error and throw it as a HttpException. The tests expect the message from the PHP Error as file_get_contents(/path/to/file): Failed to open stream: No such file or directory
.
If the user has no custom error handler, $raw will be false and we throw a HttpException with the message file_get_contents() could not read the file
. (Saying this I'm realizing that we have no tests for this second case.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a is_readable()
check in #848.
src/HTTP/Psr18Client.php
Outdated
try { | ||
$raw = file_get_contents($path); | ||
} catch (Throwable $th) { | ||
throw new HttpException($th->getMessage(), $th->getCode(), $th); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either way, the error code will probably not be meaningful for HttpException
.
839643d
to
1a218b6
Compare
src/Sanitize.php
Outdated
private function get_http_client(): Client | ||
{ | ||
if ($this->http_client === null) { | ||
$this->http_client = new FileClient( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like SimplePie
will no longer share input data. Perhaps we should have separate internal set_http_client
method that will continue accepting it from SimplePie
and a public one set_http_factories
for consumers that want to use Sanitize
or Locator
outside the context of SimplePie
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sanitize
has already set_http_client()
for PSR classes and the (deprecated) pass_file_data()
. If one has not called set_http_client()
, the http client will be created on the deprecated file data.
ea32232
to
223e6ed
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have finished reordering the commits and noted some more stuff I have noticed.
UriFactoryInterface $uri_factory | ||
): void { | ||
$this->http_client_injected = true; | ||
$this->http_client = new Psr18Client($http_client, $request_factory, $uri_factory); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have seen some libraries to use Psr18ClientDiscovery
. While I dislike this on principle, it is certainly convenient.
} | ||
|
||
/** | ||
* Return the string representation as a URI reference. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should just use {@inheritdoc}
here for ApiGen.
Weirdly, Symfony started removing it symfony/symfony#47390. Perhaps because they no longer seem to have generated API docs – quick search only found https://symfony.com/blog/new-symfony-api-documentation which returns 404.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a fan of {@inheritdoc}
either.
CHANGELOG.md
Outdated
- New method `SimplePie\Locator::set_http_client()` for providing PSR-18 HTTP client and PSR-17 factories by @Art4 in [#777](https://github.com/simplepie/simplepie/pull/777) | ||
- New method `SimplePie\Sanitize::set_http_client()` for providing PSR-18 HTTP client and PSR-17 factories by @Art4 in [#777](https://github.com/simplepie/simplepie/pull/777) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In #815, we marked the last two as internal. Do we want to promote them to API? I think that would only make sense if we expect consumers to instantiate these classes directly, rather than only having SimplePie
create them internally.
For 2.0, I would like to minimize the API surface marking everything final and private, and only opening methods that have good use cases. So we should probably decide if we want to allow consumers to create Locator
and Sanitize
instances, and make set_http_client()
methods on them internal, so that we do not need to deprecate them again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The last two was set internal
because they accept an also internal SimplePie\HTTP\Client
. #777 changes this to PSR-18 client and PSR-17 factories.
I agree with you that we should minimize the API surface but atm it is allowed to create Locator
and Sanitize
instances and we are referencing Sanitize::set_http_client()
as an alternative for the deprecated Sanitize::pass_file_data()
and maybe should do the same with Locator::set_http_client()
.
Alternative we could complete deprecate the instantiation of Locator and Sanitize for users. This could be made and discussed in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deprecating the the instantiation of Locator
and Sanitize
for consumers in a separate PR sounds good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #849.
src/Content/Type/Sniffer.php
Outdated
*/ | ||
public function __construct(File $file) | ||
public function __construct(/* File */ $file) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though this is no change for PHP variance checker, extending the type that the argument accepts is still a breaking change since subclasses might not expect it.
And there are apparently some 😿 https://github.com/search?q=%2Fextends%20SimplePie.%2BSniffer%2F&type=code
Edit: I am using File::fromResponse
here as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing the type hint is not a breaking change because we have not released 1.9.0 yet. (refs)
Also the type hint in the docblock was wrong, see https://github.com/simplepie/simplepie/blob/1.8.0/src/Content/Type/Sniffer.php#L71-L79
src/SimplePie.php
Outdated
$locate = $this->registry->create(Locator::class, [&$file, $this->timeout, $this->useragent, $this->max_checked_feeds, $this->force_fsockopen, $this->curl_options]); | ||
$locate->set_http_client($this->get_http_client()); | ||
$locate = $this->registry->create(Locator::class, [ | ||
(! $file instanceof File) ? File::fromResponse($file) : $file, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be nice to move the instanceof
check into the method. Edit: Done here.
tests/Integration/SimplePieTest.php
Outdated
$requestFactory->method('createRequest')->willReturn($request); | ||
|
||
$body = $this->createMock(StreamInterface::class); | ||
$body->method('getContents')->willReturnCallback(function () { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will need to be changed to __toString
. Edit: Done.
223e6ed
to
7ce4db4
Compare
7ce4db4
to
1abf153
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you could resolve the git conflicts I could review this PR.
I think most of the notes are addressed in other issues/PRs.
@jtojnar May I take over this PR? I could resolve the git conflicts and will review the changes there. |
…ent-support" There are correctness issues as well as disorganized commit history. The fixed changes are re-applied in the commits that follow.
Providers are meant for data parametrization. This one basically created a completely different test situation and what we are going to test next will be even more wild. Let’s dissolve the provider into individual test cases.
We are going to allow using HTTP Clients that will set this to a different `Response` subclasses than just `File`. It was already marked private with PHPDoc, let’s enforce it. Hopefully, this will produce a more scrutable error message in case someone wants to access `File` properties on this.
This will allow using PSR-18 HTTP Clients. Since those return a different `Response` subclass than `File`, we need to convert `Response` to `File` to pass it to `Locator`, `Sniffer` & co.
Allows passing PSR implementations to Sanitize instead of internal HTTP Client. TODO: Possibly make the set_http_client method internal again.
Allows passing PSR implementations to Locator instead of internal HTTP Client. TODO: Possibly make the set_http_client method internal again.
1abf153
to
0427439
Compare
I will try moving this forward during the week. |
This is #777 rebased and commits squashed/reordered for easier reviewing and cleaner Git history.
Similarly to how #815 is rebased #774.
Full diff: https://github.com/Art4/simplepie/compare/c7330e69af805877bf599bf8c48e7082d44d3b84..jtojnar:simplepie:add-psr18-http-client-support-rebased