simplepie/feed-locator

Feed Locator

Feed Locator is a modern PHP implementation of Mark Pilgrim's rssfinder.py, which was born from his desire for an ultra-liberal RSS locator. rssfinder.py was also the model for the SimplePie_Locator class in SimplePie “OG”.

Features

Process

  1. At every step, feeds are minimally verified to make sure they are really feeds.
  2. If the URI points to a feed, it is simply returned; otherwise the page is downloaded and the real fun begins.
  3. First, it looks for feeds pointed to by <link> tags in the header of the page. (This is standard autodiscovery.)
  4. Next, it looks for <a> links to feeds on the same hostname, where the URIs contain atom, feed, rdf, rss, or xml.
  5. Next, it looks for <a> links to feeds on a subdomain, where the URIs contain atom, feed, rdf, rss, or xml.
  6. Finally, it looks for <a> links to feeds on a different domain, where the URIs contain atom, feed, rdf, rss, or xml.
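
To illustrate steps 4 through 6, here is a minimal sketch of how a link might be bucketed by its relationship to the page's hostname. The function names and return labels are hypothetical and are not part of the Feed Locator API. The subdomain test is deliberately naive: it only checks whether the link's host ends with the page's host.

```php
<?php

declare(strict_types=1);

// Hint words taken from steps 4-6. The constant name is an assumption.
const FEED_HINTS = ['atom', 'feed', 'rdf', 'rss', 'xml'];

/**
 * Returns true when the URI contains one of the hint words.
 */
function looksLikeFeedUri(string $uri): bool
{
    $uri = strtolower($uri);
    foreach (FEED_HINTS as $hint) {
        if (strpos($uri, $hint) !== false) {
            return true;
        }
    }

    return false;
}

/**
 * Classifies a link relative to the page it was found on.
 * Returns 'same-host', 'subdomain', 'external', or null when the
 * link does not look like a feed at all.
 */
function classifyLink(string $pageUri, string $linkUri): ?string
{
    if (!looksLikeFeedUri($linkUri)) {
        return null;
    }

    $pageHost = strtolower((string) parse_url($pageUri, PHP_URL_HOST));
    $linkHost = strtolower((string) parse_url($linkUri, PHP_URL_HOST));

    // Relative links have no host, so they stay on the page's host.
    if ($linkHost === '' || $linkHost === $pageHost) {
        return 'same-host';
    }

    // Naive subdomain test: the link host ends with ".<page host>".
    $suffix = '.' . $pageHost;
    if (substr($linkHost, -strlen($suffix)) === $suffix) {
        return 'subdomain';
    }

    return 'external';
}
```

A candidate found this way would still need the minimal verification from step 1 before being reported as a feed.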

Discover Feeds!

  • Returns all the feeds it can possibly find!
  • Can be configured to only perform standard autodiscovery.
  • Can be configured to stop after discovering the first feed.
  • Can be configured to favor one language for a feed over another (e.g., XML vs. JSON).
  • Can be configured to favor some formats for feeds over others (e.g., Atom vs. RSS vs. RDF vs. JSONFeed).
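
This README does not show the configuration API for format preferences, so the following is only a sketch of the idea, applied to rows shaped like the documented results (URI, format, media type). `sortByPreferredFormat()` is a hypothetical helper, and the `jsonfeed` label is an assumed format name.

```php
<?php

declare(strict_types=1);

/**
 * Orders results so that preferred formats come first.
 * Hypothetical helper; not a Feed Locator method.
 *
 * @param array<int, array{0: string, 1: string, 2: string}> $results
 * @param string[] $preferred Formats in descending order of preference.
 */
function sortByPreferredFormat(array $results, array $preferred): array
{
    $rank = array_flip($preferred); // format => position in the list

    usort($results, static function (array $a, array $b) use ($rank): int {
        // Formats missing from the list sort after every listed format.
        $ra = $rank[$a[1]] ?? PHP_INT_MAX;
        $rb = $rank[$b[1]] ?? PHP_INT_MAX;

        return $ra <=> $rb;
    });

    return $results;
}

$results = [
    ['https://example.com/a.rss', 'rss', 'application/rss+xml'],
    ['https://example.com/b.atom', 'atom', 'application/atom+xml'],
    ['https://example.com/c.json', 'jsonfeed', 'application/json'],
];

$sorted = sortByPreferredFormat($results, ['atom', 'jsonfeed']);
```

Note that `usort()` only keeps the original order of equally ranked rows on PHP 8.0 and later.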

Nice Things

  • Supports domain names without typing the http:// or https://.
  • Returns a list of results; each contains the feed URI, the format of the feed, and its server media type.
  • Will provide a CLI tool which accepts an input URI and can return a list of feeds.
  • Will support offline/local mode where you can parse a local file, and receive "best-guess" matches.
  • Will support caching the results so that the next request for a URI will return the cached results instead of making live queries.
  • Supports automatic retries, with exponential back-off + jitter.
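
For readers unfamiliar with the retry strategy named above, here is a minimal sketch of exponential back-off with "full jitter". It is not the code behind `Retry::defaultHandler()`; the function name, base delay, and cap are illustrative assumptions.

```php
<?php

declare(strict_types=1);

/**
 * Computes a retry delay in milliseconds: a random value between 0 and
 * min(cap, base * 2^attempt). All names and defaults are assumptions.
 */
function backoffDelayMs(int $attempt, int $baseMs = 100, int $capMs = 10000): int
{
    // Clamp the exponent so the multiplication cannot overflow.
    $exponent = min(max($attempt, 0), 20);
    $ceiling  = min($capMs, $baseMs * (2 ** $exponent));

    // The random component spreads out clients that fail at the same time.
    return random_int(0, $ceiling);
}

// Delays for the first few attempts; the values vary from run to run.
foreach ([0, 1, 2, 3] as $attempt) {
    printf("attempt %d: wait up to %d ms, chose %d ms\n",
        $attempt,
        min(10000, 100 * (2 ** $attempt)),
        backoffDelayMs($attempt)
    );
}
```

A function like this could serve as the delay callable for Guzzle's `Middleware::retry()`, which expects the delay in milliseconds.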

Standards-Compliant

  • Code formatting is all PSR-1, PSR-2, and PSR-12-compliant.
  • Supports standardized PSR-3 loggers like Monolog.
  • Supports standardized PSR-4 autoloading for classes and directory structure.
  • Code comments and docblocks are compatible with PSR-5/PSR-19.
  • Caching will be compatible with PSR-6/PSR-16.
  • We leverage PSR-7 for message handling, and typehint against PSR-7 interface types.

Development Status

Pre-1.0 code here. No SemVer backwards-compatibility guaranteed from commit to commit at this stage.

Most of the important bits are working. Still tweaking the user-facing APIs. Need to refactor a few spots for DRY and just general efficiency. Need to write automated tests. Need to tune the log-levels. Still continues to perform some work after we already have what we need. See the path to 1.0.

We support PSR-7, but for making the actual requests, we (presently) have a hard dependency on Guzzle 6 Async Pools for purposes of speed and efficiency. Other possible adapters could be accepted once I create a pluggable framework for them.

Example Usage

use Bramus\Monolog\Formatter\ColoredLineFormatter;
use FeedLocator\FeedLocator;
use FeedLocator\Http\DefaultConfig;
use FeedLocator\Http\Retry;
use GuzzleHttp\Client;
use Kevinrob\GuzzleCache\CacheMiddleware;
use Kevinrob\GuzzleCache\Storage\FlysystemStorage;
use Kevinrob\GuzzleCache\Strategy\PrivateCacheStrategy;
use League\Flysystem\Adapter\Local; # Needed for `new Local(...)` below; assumes Flysystem 1.x
use Monolog\Handler\ErrorLogHandler;
use Monolog\Logger;
use Psr\Log\LogLevel;

# Define our logger
$logger  = new Logger('FeedLocator');
$handler = new ErrorLogHandler(
    ErrorLogHandler::OPERATING_SYSTEM,
    LogLevel::DEBUG,
    true,
    false
);
$handler->setFormatter(new ColoredLineFormatter());
$logger->pushHandler($handler);

# Define our HTTP cacher
$tmpDir = sprintf('%s/FeedLocator', sys_get_temp_dir());
$logger->debug(sprintf('Cache directory: %s', $tmpDir));
$cacheMiddleware = new CacheMiddleware(
    new PrivateCacheStrategy(
        new FlysystemStorage(
            new Local($tmpDir)
        )
    )
);

# Discover the feeds linked from Apple's RSS page.
$locator = new FeedLocator('apple.com/rss');

# Use the default configuration, but tweak a few values.
$options = DefaultConfig::clientOptions($logger);
$options['connect_timeout'] = 10.0;
$options['timeout']         = 10.0;
$options['handler']         = DefaultConfig::handlerStack(
    $logger,
    Retry::defaultHandler($logger),
    $cacheMiddleware
);

# Set the logger and Guzzle client to use
$locator->setLogger($logger);
$locator->setGuzzleClient(new Client($options));

# Run, using Guzzle Promises
$pool = $locator->run();
$pool->wait();

# Get the results as an array (from an ArrayIterator)
$results = $locator->getResults()->getArrayCopy();
\print_r($results);

Output:

Array
(
    [0] => Array
        (
            [0] => https://www.apple.com/newsroom/rss-feed.rss
            [1] => atom
            [2] => application/rss+xml;charset=utf-8
        )

    [1] => Array
        (
            [0] => http://ax.itunes.apple.com/WebObjects/MZStoreServices.woa/ws/RSS/topsongs/limit=10/xml
            [1] => atom
            [2] => application/xml
        )

    [2] => Array
        (
            [0] => http://ax.itunes.apple.com/WebObjects/MZStoreServices.woa/ws/RSS/topalbums/limit=10/xml
            [1] => atom
            [2] => application/xml
        )

    [3] => Array
        (
            [0] => http://ax.itunes.apple.com/WebObjects/MZStoreServices.woa/ws/RSS/toppaidapplications/limit=10/xml
            [1] => atom
            [2] => application/xml
        )

    [4] => Array
        (
            [0] => https://developer.apple.com/news/rss/news.rss
            [1] => rss
            [2] => application/rss+xml; charset=UTF-8
        )

    # ...snip...
)
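
Each result row is a positional array of feed URI, format, and media type, so it can be unpacked with array destructuring. The sketch below groups feeds by format, using two rows copied from the output above as sample data. The grouping and the `uri`/`type` keys are illustrative and not something the library provides.

```php
<?php

declare(strict_types=1);

// Sample rows copied from the output above: [URI, format, media type].
$results = [
    ['https://www.apple.com/newsroom/rss-feed.rss', 'atom', 'application/rss+xml;charset=utf-8'],
    ['https://developer.apple.com/news/rss/news.rss', 'rss', 'application/rss+xml; charset=UTF-8'],
];

$byFormat = [];
foreach ($results as [$uri, $format, $mediaType]) {
    // Drop parameters such as "charset" to get the bare media type.
    $bareType = strtolower(trim(explode(';', $mediaType)[0]));

    $byFormat[$format][] = ['uri' => $uri, 'type' => $bareType];
}

foreach ($byFormat as $format => $feeds) {
    printf("%s: %d feed(s)\n", $format, count($feeds));
}
```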

Coding Standards

PSR-1/2/5/12/19 are a solid foundation, but are not an entire coding style by themselves. We automate a large part of our style requirements using PHP CS Fixer and PHP CodeSniffer. (The things that we cannot yet automate are documented in the SimplePie NG Coding Standards.)

These can be applied/fixed automatically by running the (lightweight) linter:

make lint

Additionally, in our quest to write excellent code, we use a variety of tools to help us catch issues with what we've written, including:

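| Type            | Tools                                                           |
|-----------------|-----------------------------------------------------------------|
| Linting Tools   | PHP CS Fixer, PHP CodeSniffer                                   |
| QA Tools        | PDepend, PHPLOC, PHP Copy/Paste Detector, PHP Code Analyzer     |
| Static Analysis | Phan, PHPStan, Psalm, PHP Dependency Analysis                   |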

These reports can be generated by running the (heavyweight) analyzer:

make analyze

Please Support or Sponsor Development

The SimplePie project is a labor of love. Development of the next generation of SimplePie was started in June 2017 because it's a project I love, and I believe our community would benefit from this tool.

If you use SimplePie — especially to make money — it would be swell if you could kick down a few bucks. As the project grows, and we start leveraging more services and architecture, it would be great if it didn't all need to come out of my pocket.

You can also sponsor the development of a particular feature. If there's a feature that you want to see implemented, and I believe it's the right fit for the SimplePie project, you can sponsor the development of the feature to get it prioritized.

Your contributions are greatly and sincerely appreciated. See the Sponsor button along the top of the page for more information.