Almost all articles marked unread after update

I just updated from commit https://git.tt-rss.org/fox/tt-rss/commit/a1ffc116196e023491ff2c3c7b24f48924ea4fd1 to https://git.tt-rss.org/fox/tt-rss/commit/cd1f3cb8cc5fc6e3679fb778ee23f35d179b0a1c

Afterwards, all subscriptions for GitHub commits, Reddit topics (and several misc forum topics) were marked completely unread (about 2700 articles…).

Debugging the feeds has never revealed any issues.

For some blogs among those feeds, the article time is reset to the exact time tt-rss fetched the feed and marked all articles unread (not the actual article time). This is not the case for all GitHub and Reddit subscriptions right now (which previously didn’t exhibit this behavior anyway, so I’m guessing this is only due to some change between the commits I linked above).

In that sense, it’s very well possible I’m describing two separate issues here, which have the same outcome.

I have had several occasions where some subscriptions were “reset” in terms of read articles, but never on this scale. This was, however, limited to some blogs I run. The RSS guids and lastBuildDates were never changed by the blog itself, so it was definitely something in tt-rss in those cases, too.

I’m running tt-rss on PHP 7.3.6 and MariaDB, provided by a hosting company. There were no event log entries. If I can give more specific info I’m not aware of, please let me know.

article GUID storage format has been changed in https://git.tt-rss.org/fox/tt-rss/commit/cd1f3cb8cc5fc6e3679fb778ee23f35d179b0a1c although previous format is also supported for backwards compatibility.

so, existing entries marked unread or newly appearing duplicates? do you have mark unread on update enabled perhaps?

Either existing entries, or the complete feed is reset? Either way, there are NO duplicates.

if there are no duplicates, it’s probably mark unread on update. is it enabled for these feeds? in the feed editor.

there’s no way this could happen on feed update.

Yes, you’re right, mark unread on update is set for most of these feeds.

The thing is, however, if this is supposed to correspond to the lastBuildDates in the feed, then I’m sure they aren’t updated, for two reasons:

  1. I manage a set of these feeds myself
  2. I have another tt-rss instance, with some of the same feeds, but not yet updated as described above. This issue did not present itself there.

it’s not and it doesn’t.

this is a case of partially my bad (because i didn’t think about it and forgot that this option exists) and partially working as intended. tt-rss calculates a normalized hash of an entire article and considers article updated if it changes.

unfortunately, for people with this option enabled, this also includes hashed article GUID that changed, which changed the hash, you get the idea.
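To illustrate the mechanism described above, here is a minimal sketch in Python. It is not tt-rss code: the field names, the `content_hash` helper, the SHA-1 choice, and the `v2:` GUID prefix are all assumptions made up for the example. It only shows why a change in how the GUID is stored changes a hash computed over the whole article, even when the article itself is untouched.

```python
import hashlib


def content_hash(fields: dict) -> str:
    """Hash all fields in a stable key order (hypothetical stand-in, not tt-rss's algorithm)."""
    canonical = "\n".join(f"{key}={fields[key]}" for key in sorted(fields))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def content_hash_without_guid(fields: dict) -> str:
    """Same hash, but with the GUID left out of the input."""
    return content_hash({k: v for k, v in fields.items() if k != "guid"})


article = {
    "title": "A title",
    "link": "https://some.thing/article-first",
    "content": "The entire article text is here",
    "guid": "e032052af3258ee58b06a3f3de9d43cd",
}

# Same article, but the stored GUID is written in a different format.
# The "v2:" prefix is invented purely for illustration.
migrated = dict(article, guid="v2:" + article["guid"])

# The full hash differs, so the article would look "updated".
print(content_hash(article) != content_hash(migrated))

# With the GUID excluded, the hash is stable across the format change.
print(content_hash_without_guid(article) == content_hash_without_guid(migrated))
```

Excluding the GUID from the hashed input is the trade-off discussed in the next posts: it avoids this problem on future format changes, but switching to the new hash input is itself a one-time change for every stored article.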

just mark everything older than one day (or those specific feeds) as read and forget the whole thing.

e: i can exclude GUIDs from this hash calculation which would prevent this from happening next time this format changes but you’ll get another set of unread articles. is it worth it? :slight_smile:

Haha, that’s clear. Thanks for explaining.

I’m hoping to still find out the sub-issue with the blogs I’m managing, though, since they (at seemingly random times, without an updated or new article present) still exhibited the same behavior (all articles unread), but with the extra peculiar detail that the article timestamp was set to that of the fetching action, not the actual timestamp of the article.

e: i can exclude GUIDs from this hash calculation which would prevent this from happening next time this format changes but you’ll get another set of unread articles. is it worth it? :slight_smile:

Well, I’m already writing a mini-plugin to allow hotkeys for instant unsubscribe and hopefully instant mark-all-as-read, so it gives me a good chance to clean up old subscriptions anyway.

there’s a bunch of fields that are taken into account and if any one changes you’ll get your pseudo unread article, even if the change is (almost-) invisible.

could be updated timestamp or number of comments, etc. this option is simply not very useful for some feeds.
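When trying to work out which field is responsible, it can help to save two fetches of the same feed entry and compare them field by field. The sketch below is a hypothetical helper in Python, not part of tt-rss, and the field names (`num_comments`, `updated`) are assumptions chosen to match the examples mentioned above.

```python
def changed_fields(before: dict, after: dict) -> list:
    """Return the sorted names of fields whose values differ between two fetches."""
    keys = set(before) | set(after)
    return sorted(k for k in keys if before.get(k) != after.get(k))


first_fetch = {
    "title": "A title",
    "content": "The entire article text is here",
    "num_comments": 3,
    "updated": "2020-05-16T09:52:42+02:00",
}

# Second fetch of the same entry: only the comment count moved.
second_fetch = dict(first_fetch, num_comments=4)

print(changed_fields(first_fetch, second_fetch))  # ['num_comments']
```

A change like this is invisible when reading the article, yet it is enough to alter any hash computed over all fields.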

https://git.tt-rss.org/fox/tt-rss/commit/3a142cbf58b927a7e0d7ac07922e19ee73440eef

since I broke this once today, might as well do this.

e: woops https://git.tt-rss.org/fox/tt-rss/commit/06d2c65193b8679a6a7afbcd342ced07c53a4268

The feed really doesn’t have many fields for tt-rss to take into account anyway. Here’s an anonymized excerpt of the feed:

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Title of Blog</title>
    <link>https://some.thing</link>
    <description>Desc</description>
    <lastBuildDate>Sun, 17 May 2020 08:50:24 +0200</lastBuildDate>
    <item>
      <title>A title </title>
      <link>https://some.thing/article-first</link>
      <description>The entire article text is here</description>
      <pubDate>Sa, 16 May 2020 09:52:42 +0200</pubDate>
      <guid isPermaLink="false">e032052af3258ee58b06a3f3de9d43cd</guid>
    </item>
    <item>
      <title>2nd article </title>
      <link>https://some.thing/article-second</link>
      <description>The entire article text is here</description>
      <pubDate>Sa, 18 Apr 2020 12:06:36 +0200</pubDate>
      <guid isPermaLink="false">61e12be95c389ea9e10eeb6a279b2ed8</guid>
    </item>
  </channel>
</rss>
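
One detail in this excerpt may be worth checking for the timestamp sub-issue. RSS 2.0 dates follow RFC 822, whose day names are three letters (`Sat`), while the item `pubDate` values here use `Sa`. Whether tt-rss rejects such a date and falls back to the fetch time is an assumption, not something confirmed in this thread, and the `Sa` may simply be a side effect of anonymizing the excerpt. The Python sketch below uses a hypothetical `day_name_is_valid` helper to flag such values before a feed is published.

```python
import re

# Day names permitted by RFC 822
RFC822_DAYS = {"Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"}


def day_name_is_valid(pub_date: str) -> bool:
    """True if the date has no leading day name, or a three-letter RFC 822 one."""
    match = re.match(r"\s*([A-Za-z]+),", pub_date)
    return match is None or match.group(1) in RFC822_DAYS


for value in (
    "Sun, 17 May 2020 08:50:24 +0200",  # lastBuildDate from the excerpt
    "Sa, 16 May 2020 09:52:42 +0200",   # pubDate from the excerpt
):
    print(value, "->", "ok" if day_name_is_valid(value) else "non-standard day name")
```

Note that some date parsers are lenient and skip an unknown day name, so a feed like this can behave differently from one reader to the next.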

e: All right, I’ve updated. Let’s see how this goes at the next cron job.

…yeah, my blogs that have the feed structure as posted above just got all their articles reset to unread again (I updated to https://git.tt-rss.org/fox/tt-rss/commit/06d2c65193b8679a6a7afbcd342ced07c53a4268 about an hour ago, so the cron job has run twice since then).

anything you had mark unread on update enabled for would’ve done this after this last update; in this case it’s not related to feed data.

Ok, cool. I’ll keep an eye on it and report back if this happens again. Thanks!

I just updated from commit c8cc845d5b1c64ea259667c01a9591a04e0e4e98 to ac17ded854557e77840bf99ec48e736a2586f7e4 and again 889 articles got their read status reset.

They again consist of various sources: Reddit topics, forum topics, GitHub RSS feeds, and blogs (including the one I described above)…

search before posting, there were guid changes some time ago

I know that, it’s been discussed here. But those changes were not between c8cc845d5b1c64ea259667c01a9591a04e0e4e98 and ac17ded854557e77840bf99ec48e736a2586f7e4.

hmmmm

ah right, enabling or disabling plugins causes a rehash, which is what you probably did

this kind of stuff is why I wouldn’t recommend this option :slight_smile:

Ouch! :face_with_raised_eyebrow: That is indeed what I did, to test for the “Prefs: TypeError: App.isCombinedMode is not a function” issue, and what I had to do again just now, right after cleaning up the mess… :dizzy_face:

Imho this is unexpected behavior, what’s the reason behind it?

plugins might change article content so when you enable something this lets the plugin do its thing transparently
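As a rough illustration of how toggling a plugin could invalidate every stored hash at once, here is a Python sketch in which the list of enabled plugins is part of the hashed input. This is only one way such behavior could arise: whether tt-rss folds the plugin list into the hash or triggers the rehash by other means is an assumption, and `content_hash` is a made-up helper.

```python
import hashlib


def content_hash(fields: dict, enabled_plugins: list) -> str:
    """Hash the article fields together with the sorted list of enabled plugins (hypothetical)."""
    plugin_part = ",".join(sorted(enabled_plugins))
    field_part = "\n".join(f"{key}={fields[key]}" for key in sorted(fields))
    return hashlib.sha1(f"{plugin_part}\n{field_part}".encode("utf-8")).hexdigest()


article = {
    "title": "A title",
    "content": "The entire article text is here",
}

before = content_hash(article, [])
after = content_hash(article, ["af_readability"])

# The article is byte-for-byte identical, but the hash no longer matches,
# so with "mark unread on update" enabled it would be flagged as updated.
print(before != after)
```

Under this model every article in every feed gets a new hash the moment the plugin set changes, which matches the symptom of hundreds of articles turning unread at once.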

this was the reason, idk if it’s a worthy one

Here’s my 2ct.

I’m trying to think of a use case where the end user would want to re-read a whole bunch of articles, just because a plugin has now handled the respective articles differently. I can’t think of a reason t.b.h.

I think it would be the responsibility of the end user to first mark articles unread if they wanted those articles to be re-read after some plugin went over them (or stopped doing so).

The option to “Mark updated articles as unread” to me reads as an update of the source, so the article itself, not what tt-rss does with it.

An RSS feed doesn’t provide the entire article in the feed, only an excerpt. The user enables af_readability to replace the excerpt with the full article.

Every user is different. It makes sense that, when a plugin that modifies article content is enabled, TT-RSS would consider the article content changed.

But only because that’s how you’re thinking. I’m not saying it’s not a valid opinion, but I am saying it’s just as valid as: “I think it would be the responsibility of TT-RSS to mark articles unread after some plugin went over them.”

Internally TT-RSS does a lot of work to figure out what the source being modified is, but plugins can intercept and change that at multiple points. Given that users don’t often toggle plugins, the current approach makes sense. The alternative makes sense, too. There’s no right or wrong here, just the way TT-RSS does it.