Readability using feed's article URL instead of 301'ed or 302'ed URL

PLEASE READ THIS BEFORE POSTING: Read before posting / reporting bugs

I think I did.

Describe the problem you’re having:

Article content that af_readability retrieves for a Feedburner feed references images with relative URLs. These are resolved against the Feedburner URL instead of the article URL, so the resulting img sources are wrong.

If possible include steps to reproduce the problem:

Subscribing to this feed, scinexx | Das Wissensmagazin, and activating af_readability on it shows the problem.

tt-rss version (including git commit id):

latest git (I just git pull’ed a few minutes ago)

Platform (i.e. Linux distro, PHP, PostgreSQL, etc) versions:

Ubuntu 16.04.3, PHP 7, PostgreSQL 9.5

Please provide any additional information below:

Expected behaviour: When af_readability retrieves article content, it should use the final URL after any 301 or 302 redirects as the base URL for completing relative URLs, not the feed’s article URL. I don’t know whether this would be the formally correct behaviour, but it’s the one I’d expect TT-RSS to show.

This would be easy enough to do with curl, because it reports the last URL as the effective URL (CURLINFO_EFFECTIVE_URL), which could be stored in a global along with the other values (content type, etc.).
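To illustrate why the base URL matters, here is a minimal Python sketch (not tt-rss code, which is PHP). It resolves the same relative image path against two different base URLs. Both URLs and the helper name `absolutize` are made up for the example; the first stands in for a Feedburner-style article link from the feed, the second for the URL reached after the redirects.

```python
from urllib.parse import urljoin


def absolutize(src: str, base_url: str) -> str:
    """Resolve a possibly relative image source against a base URL."""
    return urljoin(base_url, src)


# Hypothetical URLs, for illustration only.
feed_url = "http://feedproxy.example.com/~r/somefeed/~3/abc123/"   # link found in the feed
effective_url = "http://www.example.org/articles/2017/story.html"  # URL after 301/302 redirects

# Resolving against the feed's article URL points at the proxy host (broken image).
print(absolutize("images/photo.jpg", feed_url))
# http://feedproxy.example.com/~r/somefeed/~3/abc123/images/photo.jpg

# Resolving against the effective URL points at the real site.
print(absolutize("images/photo.jpg", effective_url))
# http://www.example.org/articles/2017/images/photo.jpg
```

The only thing that changes between the two calls is the base URL, which is why remembering the effective URL at fetch time is enough to fix the image sources.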

Not sure about non-curl use.

But @fox gets to decide…

this sounds like a terrible hack tbh, much like readability is in itself

nobody is going to stop you from making your own af_readability with this but i’m definitely not piling anymore hacks on top of hacks on top of this shitpile library, so i guess the answer is no

e: i might reconsider in case someone bothers to make a clean PR

This? https://git.tt-rss.org/git/tt-rss/issues/28

Untested, but it should work.

thanks, looks ok (haven’t tested either tho)

Thanks a lot, @JustAMacUser, I thought it would be a simple change. Just pulled it, now some new feed items have to come in.

Thanks, @fox, for merging this PR. I consider it less a hack than an un-hacking of this awful Feedburner setup. It’s less hacky now that it uses the actual URLs :wink:

It has been a pleasure. :slight_smile:

Or I just force refetch. And it works, great.

2017-12-13\\3101:11

that’s an interesting date format

That’s the result of the date() format string “D, Y-m-d\ \ \ \tH:i” (spaces inserted because the preview interprets backslashes as escape characters).

Honestly, I can’t remember ever having changed it, but that doesn’t count for much, I guess, as I have been using TT-RSS since … well, when did they close Google Reader? I used Feedly for half a year before setting up TT-RSS, so that’s about four years now.

My best guess is that during some upgrade the “\t” (a literal “t” to delimit date and time) became “\ \t”, and after another upgrade of something it became “\ \ \ \t”. It doesn’t need to have been a TT-RSS update; maybe it was PostgreSQL or something like that. When upgrading the release from 14.04 to 16.04 I accidentally messed up PostgreSQL (9.3 → 9.5) and had to manually rewind the PostgreSQL upgrade. I had uninstalled 9.3 before upgrading the cluster to 9.5 and had to reinstall 9.3, or parts of it. You learn a little bit with each mistake you make :blush:.
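For anyone wondering where the “31” in that timestamp comes from: in PHP’s date(), a backslash escapes the next character, and an unescaped “t” prints the number of days in the month. With four backslashes, each pair collapses into one literal backslash and the “t” is left unescaped. The Python sketch below imitates only that small subset of the format characters; the function `php_like_date` is a hypothetical stand-in, not PHP’s implementation, and the leading “D, ” is left out because it does not appear in the timestamp above.

```python
import calendar
from datetime import datetime


def php_like_date(fmt: str, dt: datetime) -> str:
    """Imitate a small subset of PHP date() format characters (illustrative only)."""
    handlers = {
        "Y": lambda d: f"{d.year:04d}",
        "m": lambda d: f"{d.month:02d}",
        "d": lambda d: f"{d.day:02d}",
        "H": lambda d: f"{d.hour:02d}",
        "i": lambda d: f"{d.minute:02d}",
        # "t" is the number of days in the given month
        "t": lambda d: str(calendar.monthrange(d.year, d.month)[1]),
    }
    out = []
    chars = iter(fmt)
    for ch in chars:
        if ch == "\\":
            # A backslash makes the next character literal.
            out.append(next(chars, ""))
        elif ch in handlers:
            out.append(handlers[ch](dt))
        else:
            out.append(ch)
    return "".join(out)


when = datetime(2017, 12, 13, 1, 11)

# One backslash: "t" is escaped and printed literally.
print(php_like_date(r"Y-m-d\tH:i", when))     # 2017-12-13t01:11

# Four backslashes: two literal backslashes, then "t" expands to 31 (days in December).
print(php_like_date(r"Y-m-d\\\\tH:i", when))  # 2017-12-13\\3101:11
```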