Sorry fox, I’m not at home and can’t really dig into this as much as I usually would. My guess is that this is some niche website returning its response in a non-standard way, because reasons.
Describe the problem you’re having:
[22:12:01/30699] article processed
[22:12:01/30699] guid 1,t3_91fa88 / SHA1:4f92c07a8a4ae127eb6896a81467b0e42d6de3d0
[22:12:01/30699] orig date: 1532417236
[22:12:01/30699] date 1532417236 [2018/07/24 07:27:16]
[22:12:01/30699] title elfbac - runtime intent-level ABI-granular memory protection for Linux
[22:12:01/30699] link https://www.reddit.com/r/netsec/comments/91fa88/elfbac_runtime_intentlevel_abigranular_memory/
[22:12:01/30699] author /u/wademealing
[22:12:01/30699] num_comments: 0
[22:12:01/30699] looking for tags…
[22:12:01/30699] tags found: netsec
[22:12:01/30699] done collecting data.
[22:12:01/30699] article hash: 43e8bc37359ace5179881c159041baccdd1a938b [stored=]
[22:12:01/30699] hash differs, applying plugin filters:
[22:12:01/30699] … Af_Comics
[22:12:01/30699] === 0.0000 (sec)
[22:12:01/30699] … Af_Fsckportal
[22:12:01/30699] === 0.0001 (sec)
[22:12:01/30699] … Af_RedditImgur
PHP Fatal error: Uncaught andreskrey\Readability\ParseException: Invalid or incomplete HTML. in /opt/tt-rss/vendor/andreskrey/Readability/Readability.php:142
Stack trace:
#0 /opt/tt-rss/plugins/af_redditimgur/init.php(527): andreskrey\Readability\Readability->parse('\r\nreadability(Array, ‘http://elfbac.o…’, Object(DOMDocument), Object(DOMXPath))
#2 /opt/tt-rss/classes/rssutils.php(754): Af_RedditImgur->hook_article_filter(Array)
#3 /opt/tt-rss/update.php(415): RSSUtils::update_rss_feed('241')
#4 {main}
thrown in /opt/tt-rss/vendor/andreskrey/Readability/Readability.php on line 142
If possible include steps to reproduce the problem:
The error is thrown every time tt-rss tries to update the r/netsec feed (“Technical Information Security Content & Discussion”) at the moment, either automatically or via the feed debugger.
tt-rss version (including git commit id):
Tiny Tiny RSS v17.12 (a2d1fa5)
Platform (i.e. Linux distro, PHP, PostgreSQL, etc) versions:
Ubuntu 18.04
lighttpd (unsure on version, pi-hole switched this on me and I haven’t gotten around to fixing it)
php 7.2.7-0ubuntu0.18.04.2
postgres 9.6.8 (thought I was on 10 something, but this was the select version() output)
Please provide any additional information below:
Obviously the feed could change before you can check it, so here’s a pastebin of the feed containing the troublesome entry, in case you don’t get a crack at it in time: https://pastebin.com/0qp0fPdi
It seems like the fatal error may be breaking tt-rss’ ability to check feeds in general until the troublesome entry is cleared. I’m going to turn off af_redditimgur for now. I can see in journalctl that other sites/entries have triggered this before, but those are too far back for me to get many details.
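To illustrate the failure mode I mean (one filter throwing and taking the rest of the update down with it), here is a minimal sketch. It is Python, not the PHP that tt-rss is written in, and `apply_filters`, `ParseError`, `add_tag` and `broken_readability` are hypothetical names made up for this example. It only shows the general idea of catching a per-article parse failure so the remaining filters still run; it is not how tt-rss or af_redditimgur actually work.

```python
import logging

logger = logging.getLogger("feed_update")


class ParseError(Exception):
    """Hypothetical stand-in for a parser exception such as a ParseException."""


def apply_filters(article, filters):
    """Run each filter in turn; a filter that raises ParseError is skipped.

    Returns the (possibly modified) article and the names of filters that failed.
    """
    failed = []
    for f in filters:
        try:
            article = f(article)
        except ParseError as exc:
            # Log the failure and keep the article as it was before this filter.
            logger.warning("filter %s failed: %s", f.__name__, exc)
            failed.append(f.__name__)
    return article, failed


def add_tag(article):
    # Hypothetical well-behaved filter.
    return {**article, "tags": article.get("tags", []) + ["netsec"]}


def broken_readability(article):
    # Hypothetical filter that chokes on the linked page.
    raise ParseError("Invalid or incomplete HTML.")


article = {"title": "elfbac", "link": "http://elfbac.org/"}
result, failed = apply_filters(article, [broken_readability, add_tag])
```

With something along these lines, the bad entry would be stored without the extracted content, and the update run would carry on instead of dying on an uncaught exception.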
Edit:
Meant to save you a click. The submission just points to “http://elfbac.org/”
Edit 2:
Got a few minutes that I used to dig. Fetching elfbac.org with wget redirects to http://elfbac.org/SViZZ (the path probably changes per user), which then redirects BACK to elfbac.org, and that page is just a frame embedding the actual content (“ELFbac: runtime intent-level ABI-granular memory protection for Linux”). Sorry, I’m an inferior user unfamiliar with curl from the command prompt. Not sure exactly why readability chokes on this nonsense, but I’m not particularly surprised that it does, either.
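If the frame-only page really is what trips the parser (an unverified guess on my part), a check for such pages could look roughly like the sketch below. It is Python using only the standard library; `FrameSniffer` and `looks_frame_only` are hypothetical names, and the sample markup and the `http://example.org/real-page` URL are invented stand-ins, not what elfbac.org actually serves.

```python
from html.parser import HTMLParser


class FrameSniffer(HTMLParser):
    """Collects frame/iframe sources and counts visible text characters."""

    # Text inside these elements is not counted as readable content.
    SKIP = {"script", "style", "title", "noframes"}

    def __init__(self):
        super().__init__()
        self.frame_sources = []
        self.text_chars = 0
        self._skip_depth = 0  # > 0 while inside one of the SKIP elements

    def handle_starttag(self, tag, attrs):
        if tag in ("frame", "iframe"):
            src = dict(attrs).get("src")
            if src:
                self.frame_sources.append(src)
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.text_chars += len(data.strip())


def looks_frame_only(html, max_text=0):
    """Return (is_frame_only, frame_sources) for the given HTML string."""
    sniffer = FrameSniffer()
    sniffer.feed(html)
    sniffer.close()
    frame_only = bool(sniffer.frame_sources) and sniffer.text_chars <= max_text
    return frame_only, sniffer.frame_sources


# Invented sample of a frame wrapper page.
wrapper = (
    '<html><head><title>ELFbac</title></head>'
    '<frameset rows="100%"><frame src="http://example.org/real-page"></frameset>'
    '</html>'
)
ordinary = "<html><body><p>Hello world</p></body></html>"

wrapper_result = looks_frame_only(wrapper)
ordinary_result = looks_frame_only(ordinary)
```

A caller could then skip content extraction for a wrapper page, or retry against the first frame source instead.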