Sorry fox, I’m not at home and can’t really dig into this as much as I usually would. Guessing this is some niche website outputting their response in a non-standard way because reasons.
Describe the problem you’re having:
[22:12:01/30699] article processed
[22:12:01/30699] guid 1,t3_91fa88 / SHA1:4f92c07a8a4ae127eb6896a81467b0e42d6de3d0
[22:12:01/30699] orig date: 1532417236
[22:12:01/30699] date 1532417236 [2018/07/24 07:27:16]
[22:12:01/30699] title elfbac - runtime intent-level ABI-granular memory protection for Linux
[22:12:01/30699] link https://www.reddit.com/r/netsec/comments/91fa88/elfbac_runtime_intentlevel_abigranular_memory/
[22:12:01/30699] author /u/wademealing
[22:12:01/30699] num_comments: 0
[22:12:01/30699] looking for tags…
[22:12:01/30699] tags found: netsec
[22:12:01/30699] done collecting data.
[22:12:01/30699] article hash: 43e8bc37359ace5179881c159041baccdd1a938b [stored=]
[22:12:01/30699] hash differs, applying plugin filters:
[22:12:01/30699] … Af_Comics
[22:12:01/30699] === 0.0000 (sec)
[22:12:01/30699] … Af_Fsckportal
[22:12:01/30699] === 0.0001 (sec)
[22:12:01/30699] … Af_RedditImgur
PHP Fatal error: Uncaught andreskrey\Readability\ParseException: Invalid or incomplete HTML. in /opt/tt-rss/vendor/andreskrey/Readability/Readability.php:142
#0 /opt/tt-rss/plugins/af_redditimgur/init.php(527): andreskrey\Readability\Readability->parse('\r\nreadability(Array, 'http://elfbac.o…', Object(DOMDocument), Object(DOMXPath))
#2 /opt/tt-rss/classes/rssutils.php(754): Af_RedditImgur->hook_article_filter(Array)
#3 /opt/tt-rss/update.php(415): RSSUtils::update_rss_feed('241')
thrown in /opt/tt-rss/vendor/andreskrey/Readability/Readability.php on line 142
If possible include steps to reproduce the problem:
The exception gets thrown every time tt-rss tries to parse https://www.reddit.com/r/netsec/.rss at the moment, whether the update runs automatically or via the feed debugger.
tt-rss version (including git commit id):
Tiny Tiny RSS v17.12 (a2d1fa5)
Platform (i.e. Linux distro, PHP, PostgreSQL, etc) versions:
lighttpd (unsure on version, pi-hole switched this on me and I haven’t gotten around to fixing it)
postgres 9.6.8 (thought I was on 10 something, but this was the select version() output)
Please provide any additional information below:
Obviously the feed could change before you can check it, so here’s a pastebin of the feed containing the troublesome entry, if you don’t get a crack at it: https://pastebin.com/0qp0fPdi
Since the exception is uncaught and ends in a PHP fatal error, it seems like it may be generally breaking tt-rss’ ability to check feeds until the troublesome entry is cleared. Going to turn off af_redditimgur for now. I can see in journalctl that other sites/entries have triggered the same error, but it’s too far back for me to get many details about that.
Meant to save you a click. The submission just points to “http://elfbac.org/”
Got a few minutes that I used to dig. Fetching elfbac.org with wget redirects to http://elfbac.org/SViZZ (the path probably changes per user), which then redirects BACK to elfbac.org, which is just a frame that embeds http://www.cs.dartmouth.edu/~sergey/elfbac/. Sorry, I’m an inferior user unfamiliar with curl from the command prompt. Not sure exactly why readability chokes on this nonsense, but I’m not particularly surprised that it does, either.
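For anyone who wants to repeat the redirect check with curl, here is a rough sketch. The function names `filter_hops` and `show_redirects` are made up for this post, and I have not verified what elfbac.org returns to curl, so treat it as a starting point only. It sends HEAD requests, follows redirects, and keeps just the status line and `Location` header of each hop.

```shell
# Keep only the status line and Location header of each hop.
# Reads raw HTTP response headers on stdin (hypothetical helper).
filter_hops() {
  tr -d '\r' | grep -iE '^(HTTP/|location:)'
}

# Follow redirects and print the status/Location lines for every hop.
# -s: no progress output, -I: HEAD request (headers only),
# -L: follow redirects, --max-redirs: give up if the site loops.
show_redirects() {
  curl -sIL --max-redirs 10 "$1" | filter_hops
}

# Usage (needs network access):
#   show_redirects http://elfbac.org/
```

This only shows HTTP-level redirects. The frame that embeds the Dartmouth page lives in the HTML body, so it will not appear here, and a site that bounces visitors through a per-user path may behave differently for a HEAD request or for a client that does not keep cookies.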