Some entries don't load metadata properly #14

Open
Rakambda opened this issue Sep 13, 2023 · 6 comments

@Rakambda

Some feed entries occasionally fail to display properly and are skipped by the plugin.

Example: https://www.reddit.com/user/neo3dofficial/submitted/.rss?sort=new
At some point this error occurs (NO METADATA is a log line I added in the isValid method):

[Wed, 13 Sep 2023 18:55:16 +0200] [error] --- NO METADATA
[Wed, 13 Sep 2023 18:55:16 +0200] [error] --- RedditImage\Exception\InvalidContentException:   submitted by   <a href="https://www.reddit.com/user/neo3dofficial"> /u/neo3dofficial </a>   to   <a href="https://www.reddit.com/r/DigitalArt/"> r/DigitalArt </a> <br> <span><a href="https://i.redd.it/d88vk9o3ehgb1.jpg">[link]</a></span>   <span><a href="https://www.reddit.com/r/DigitalArt/comments/15jo38m/chroma_abstract_wallpaper_pack/">[comments]</a></span> in /app/www/extensions/xExtension-RedditImage/Content.php:29
Stack trace:
#0 /app/www/extensions/xExtension-RedditImage/Processor/BeforeInsertProcessor.php(51): RedditImage\Content->__construct()
#1 /app/www/lib/Minz/ExtensionManager.php(338): RedditImage\Processor\BeforeInsertProcessor->process()
#2 /app/www/lib/Minz/ExtensionManager.php(309): Minz_ExtensionManager::callOneToOne()
#3 /app/www/app/Controllers/feedController.php(483): Minz_ExtensionManager::callHook()
#4 /app/www/app/Controllers/feedController.php(653): FreshRSS_feed_Controller::actualizeFeed()
#5 /app/www/lib/Minz/Dispatcher.php(119): FreshRSS_feed_Controller->actualizeAction()
#6 /app/www/lib/Minz/Dispatcher.php(46): Minz_Dispatcher->launchAction()
#7 /app/www/lib/Minz/FrontController.php(58): Minz_Dispatcher->run()
#8 /app/www/p/i/index.php(57): Minz_FrontController->run()
#9 {main}

If I make the metadata regex a bit more tolerant with #(?P<metadata>\s+submitted.*</span>)#, the error is gone.
This allows the entry to be processed by the transformers and the image to be inlined. Previously, because of the error, the entry never went through the processors.
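
To make the effect of the tolerant pattern concrete, here is a minimal sketch that applies it to the entry content from the log above. It uses Python's `re` rather than PHP's PCRE (the `(?P<metadata>...)` named-group syntax is the same in both, but the engines are not identical), and the helper name `extract_metadata` is hypothetical, not part of the extension.

```python
import re

# The tolerant pattern quoted above, without PHP's `#` delimiters.
METADATA_PATTERN = re.compile(r"(?P<metadata>\s+submitted.*</span>)")

# Entry content copied from the first log excerpt (whitespace as it appears in the log).
content = (
    '  submitted by   <a href="https://www.reddit.com/user/neo3dofficial"> /u/neo3dofficial </a>'
    '   to   <a href="https://www.reddit.com/r/DigitalArt/"> r/DigitalArt </a> <br>'
    ' <span><a href="https://i.redd.it/d88vk9o3ehgb1.jpg">[link]</a></span>'
    '   <span><a href="https://www.reddit.com/r/DigitalArt/comments/15jo38m/'
    'chroma_abstract_wallpaper_pack/">[comments]</a></span>'
)


def extract_metadata(text):
    """Return the matched metadata block, or None when the pattern does not match.

    Hypothetical helper for illustration only; the extension's own check
    lives in its PHP code, which is not shown in this thread.
    """
    match = METADATA_PATTERN.search(text)
    return match.group("metadata") if match else None


metadata = extract_metadata(content)
print(metadata is not None)
```

Note that `\s+` requires at least one whitespace character before `submitted`, so content that starts directly with the word would not match this pattern.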

However, this doesn't seem to handle all cases.
Example (NSFW): https://www.reddit.com/user/throwmeaway896/submitted/.rss?sort=new
With this feed, even with the modified regex, some entries still fail to match (though the image had already been added by the BeforeInsertProcessor):

[Wed, 13 Sep 2023 19:01:09 +0200] [error] --- NO METADATA
[Wed, 13 Sep 2023 19:01:09 +0200] [error] --- RedditImage\Exception\InvalidContentException: <div class="reddit-image figure"><!--xExtension-RedditImage/1.1.1 | RedditImage\Processor\BeforeInsertProcessor | RedditImage\Transformer\Agnostic\ImageTransformer--><img src="https://i.redd.it/rtuzzy7h9nnb1.jpg" class="reddit-image"></div>
  submitted by   <a href="https://www.reddit.com/user/throwmeaway896"> /u/throwmeaway896 </a>   to   <a href="https://www.reddit.com/r/phgonewild/"> r/phgonewild </a> <br> <span><a href="https://i.redd.it/rtuzzy7h9nnb1.jpg">[link]</a></span>   <span><a href="https://www.reddit.com/r/phgonewild/comments/16fy4tb/what_if_nasa_kama_mo_ako_now/">[comments]</a></span> in /app/www/extensions/xExtension-RedditImage/Content.php:29
Stack trace:
#0 /app/www/extensions/xExtension-RedditImage/Processor/BeforeDisplayProcessor.php(43): RedditImage\Content->__construct()
#1 /app/www/lib/Minz/ExtensionManager.php(338): RedditImage\Processor\BeforeDisplayProcessor->process()
#2 /app/www/lib/Minz/ExtensionManager.php(309): Minz_ExtensionManager::callOneToOne()
#3 /app/www/app/views/index/normal.phtml(34): Minz_ExtensionManager::callHook()
#4 /app/www/lib/Minz/View.php(88): include('...')
#5 /app/www/lib/Minz/View.php(110): Minz_View->includeFile()
#6 /app/www/app/layout/layout.phtml(69): Minz_View->render()
#7 /app/www/lib/Minz/View.php(88): include('...')
#8 /app/www/lib/Minz/View.php(101): Minz_View->includeFile()
#9 /app/www/lib/Minz/View.php(68): Minz_View->buildLayout()
#10 /app/www/lib/Minz/Dispatcher.php(56): Minz_View->build()
#11 /app/www/lib/Minz/FrontController.php(58): Minz_Dispatcher->run()
#12 /app/www/p/i/index.php(57): Minz_FrontController->run()
#13 {main}

I have to say I don't really understand this one: according to an online checker, the regex does match: https://www.phpliveregex.com/p/JSm
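
The online check can be reproduced offline. The sketch below, again in Python's `re` rather than PHP's PCRE, runs the tolerant pattern against the content from the second log excerpt, including the line break after the image block. It also tries one HYPOTHETICAL variant in which a line break sits between `submitted` and the first `</span>`. That variant is an assumption for illustration only; nothing in this thread establishes that the stored content looks like that.

```python
import re

PATTERN = r"(?P<metadata>\s+submitted.*</span>)"

# Image block prepended by the BeforeInsertProcessor, copied from the second log excerpt.
image_block = (
    r'<div class="reddit-image figure"><!--xExtension-RedditImage/1.1.1 | '
    r'RedditImage\Processor\BeforeInsertProcessor | '
    r'RedditImage\Transformer\Agnostic\ImageTransformer-->'
    r'<img src="https://i.redd.it/rtuzzy7h9nnb1.jpg" class="reddit-image"></div>'
)
metadata_block = (
    '  submitted by   <a href="https://www.reddit.com/user/throwmeaway896"> /u/throwmeaway896 </a>'
    '   to   <a href="https://www.reddit.com/r/phgonewild/"> r/phgonewild </a> <br>'
    ' <span><a href="https://i.redd.it/rtuzzy7h9nnb1.jpg">[link]</a></span>'
    '   <span><a href="https://www.reddit.com/r/phgonewild/comments/16fy4tb/'
    'what_if_nasa_kama_mo_ako_now/">[comments]</a></span>'
)

# Content as it appears in the log: image block, line break, metadata block.
logged = image_block + "\n" + metadata_block

# HYPOTHETICAL variant (not taken from the log): a line break after <br>,
# so that no </span> follows "submitted" on the same line.
variant = logged.replace("<br> ", "<br>\n")


def matches(text, flags=0):
    """Report whether the tolerant pattern is found anywhere in the text."""
    return re.search(PATTERN, text, flags) is not None


print(matches(logged))              # logged content, default flags
print(matches(variant))             # hypothetical variant, default flags
print(matches(variant, re.DOTALL))  # hypothetical variant, "." also matches newlines
```

The logged content matches, which agrees with the online checker. The hypothetical variant only matches when `.` is allowed to cross line breaks (`re.DOTALL` here, the `s` modifier in PCRE), so comparing the exact bytes of the stored entry with the logged text may be a useful next step.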

@aledeg
Owner

aledeg commented Sep 19, 2023

Thank you for the report. I'll look into that.

@aledeg
Owner

aledeg commented Sep 20, 2023

I tried to download the XML file from the RSS feed, but for some reason my wget command no longer works. Do you happen to have either a working wget/curl command or the files? The former is preferred, since it would let me reproduce the issue myself.
Thank you

@couchoud-t

Here you go. I just renamed them to .log, since GitHub doesn't accept .xml files.

neo3dofficial.log
throwmeaway896.log - NSFW

@aledeg
Owner

aledeg commented Sep 20, 2023

Thank you. How did you download them? I'm interested because I'm hitting 403 errors when using wget.
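
For anyone trying to script the download, here is a minimal sketch. It ASSUMES the 403 is related to the client's default User-Agent, which this thread does not confirm, so treat it as something to try and not as a known fix. The `USER_AGENT` string and the function names are hypothetical.

```python
import urllib.request

FEED_URL = "https://www.reddit.com/user/neo3dofficial/submitted/.rss?sort=new"

# ASSUMPTION: a descriptive, non-default User-Agent. The value is made up for
# this sketch; whether it avoids the 403 has not been verified in this thread.
USER_AGENT = "feed-debug-script/0.1 (manual reproduction of issue #14)"


def build_request(url, user_agent=USER_AGENT):
    """Build a GET request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def download(url, destination):
    """Fetch the feed and write the raw bytes to a file; returns the byte count.

    Raises urllib.error.HTTPError if the server still answers with an error
    status such as 403.
    """
    with urllib.request.urlopen(build_request(url), timeout=30) as response:
        data = response.read()
    with open(destination, "wb") as handle:
        handle.write(data)
    return len(data)
```

Calling `download(FEED_URL, "neo3dofficial.xml")` would attempt the fetch. It is left out of the block so that nothing touches the network on import.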

@couchoud-t

I just opened them in the browser and saved the page 😄

@aledeg
Owner

aledeg commented Sep 20, 2023

You're lucky. When I do that, it enters an endless loop of downloading Atom files.
I need to figure out what this is about.
