Tiny Tiny RSS: Community

Suggested patch(es) to make 'HOOK_FETCH_FEED' more friendly


#1

I was writing a plugin w/ ‘HOOK_FETCH_FEED’ and my feed always came in as “[Unknown]” w/o a site url, even though I was supplying a valid XML feed w/ all the details. I found the place in feeds.php where the subscription happens and I have the following suggested patch (below). Once I subscribed with the patch in place, my subscriptions came in with title and site url just fine.

Caveat: not extensively tested yet; though I have subscribed to 1 xml feed, 1 html feed where I chose 1 sub-feed, and 3 gocomics for the af_comics plugin.

--- feeds.php-dist	2019-03-07 08:21:29.385836900 -0600
+++ feeds.php	2019-03-07 08:18:56.894413500 -0600
@@ -1208,10 +1208,21 @@
 		} else {
 			$sth = $pdo->prepare(
 				"INSERT INTO ttrss_feeds
-					(owner_uid,feed_url,title,cat_id, auth_login,auth_pass,update_method,auth_pass_encrypted)
-				VALUES (?, ?, ?, ?, ?, ?, 0, false)");
+					(owner_uid,feed_url,title,cat_id,site_url,auth_login,auth_pass,update_method,auth_pass_encrypted)
+				VALUES (?, ?, ?, ?, ?, ?, ?, 0, false)");
 
-			$sth->execute([$_SESSION['uid'], $url, "[Unknown]", $cat_id, (string)$auth_login, (string)$auth_pass]);
+			$title = "[Unknown]";
+			$site_url = '';
+
+			$rss = new FeedParser($contents);
+			$rss->init();
+
+			if (!$rss->error()) {
+				$title = $rss->get_title();
+				$site_url = $rss->get_link();
+			}
+
+			$sth->execute([$_SESSION['uid'], $url, $title, $cat_id, $site_url, (string)$auth_login, (string)$auth_pass]);
 
 			$sth = $pdo->prepare("SELECT id FROM ttrss_feeds WHERE feed_url = ?
 					AND owner_uid = ?");

And patch af_comics to return basic information:

--- init.php-dist	2019-02-28 11:09:48.380812800 -0600
+++ init.php	2019-03-07 08:53:38.135080100 -0600
@@ -160,8 +160,11 @@
 		if ($auth_login || $auth_pass)
 			return $contents;
 
-		if (preg_match('#^https?://www\.gocomics\.com/([-a-z0-9]+)$#i', $url))
-			return '<?xml version="1.0" encoding="utf-8"?>'; // Get is_html() to return false.
+		if (preg_match('#^https?://www\.gocomics\.com/([-a-z0-9]+)$#i', $url, $matches))
+			return "<?xml version='1.0' encoding='utf-8'?>
+<feed xmlns='http://www.w3.org/2005/Atom'><title>" . ucfirst($matches[1]) . " Daily Strip</title>
+<generator uri='http://tt-rss.org/'>Tiny Tiny RSS/19.2 (dfd3a4e)</generator>
+<link href='$matches[0]' /></feed>"; // Make is_html() return false and supply basic info to subscribe with.
 
 		return $contents;
 	}

Thoughts?


#2

BTW, if this is desirable then I can submit a PR if you give me a repo.


#3

I remember adding a separate hook to get basic feed info, is it really necessary to duplicate that?


#4

No, certainly not. I just didn’t run into that in any of the examples. Thanks for the pointer.


#5

How about this patch? It may not get the comic title exactly right (ucfirst-es the comic name from url) but is better than “[Unknown]” and it correctly sets the site url.

Q: why a separate hook/call? That causes another screen refresh which looks a bit jumpy. Doesn’t it make sense that “subscribe_feed” would know the title and url to better setup the feed from the start?

--- init.php-dist	2019-02-28 11:09:48.380812800 -0600
+++ init.php	2019-03-07 15:06:30.112127200 -0600
@@ -14,6 +14,7 @@
 		$this->host = $host;
 
 		$host->add_hook($host::HOOK_FETCH_FEED, $this);
+		$host->add_hook($host::HOOK_FEED_BASIC_INFO, $this);
 		$host->add_hook($host::HOOK_SUBSCRIBE_FEED, $this);
 		$host->add_hook($host::HOOK_ARTICLE_FILTER, $this);
 		$host->add_hook($host::HOOK_PREFS_TAB, $this);
@@ -166,6 +167,16 @@
 		return $contents;
 	}
 
+	function hook_feed_basic_info($basic_info, $fetch_url, $owner_uid, $feed, $auth_login, $auth_pass) {
+		if ($auth_login || $auth_pass)
+			return $basic_info;
+
+		if (preg_match('#^https?://www\.gocomics\.com/([-a-z0-9]+)$#i', $fetch_url, $matches))
+			$basic_info = array('title' => ucfirst($matches[1]) . ' Daily Strip', 'site_url' => $matches[0]);
+
+		return $basic_info;
+	}
+
 	function api_version() {
 		return 2;
 	}

#7

that’s how this was originally requested i guess so that’s how this ended up being implemented

this way info is filled on subscribe, before an actual feed update happens, so you won’t see [Unknown] stuck for quite some time until your cronjob runs or whatever


#8

Ok, I’m sure there are lots of edge cases; that makes sense.

What about the af_comics patch? AFIK, the title and url remains Unknown and blank forever (for the geocomics part).


#9

When I added the GoComics code to af_comics it was because they removed RSS feeds entirely. The HOOK_FEED_BASIC_INFO didn’t exist then and adding the extra code to make the title, etc. work seemed unnecessary.

I think your code could be added now that we have the newer hook, but I would leave the title as just the comic name (i.e. without " Daily Strip") because some comics are not daily and that nomenclature seems more like personal preference.


#10

Yeah, I thought about the daily strip nomenclature too. How about ’ [gocomics]’? Or like you say, just nothing. Or even a quick html pull to get the real title.

Thx.


#11

That’s the part where it seems like we’re adding code just to do something a single time when the user can just click and change it if they need to.

I mean, for the most part it will work fine. You get into these odd comics (e.g. [email protected]) where ucfirst() won’t work, or a user may prefer a different name (e.g. Adam At Home) but those are even further edge cases. Fox gets to decide, but I just don’t think it’s necessary.


#12

i think just using base name without “daily strip” would work best, it seems like such a minor thing anyway and easily changed by user if needed so doing all this fetching and parsing html because of it doesn’t seem worth it

in order to file a PR i’ll need your gogs account name so i can adjust its repo limits


#13

Ok, just setup rodneys_mission.


#14

you should be able to fork stuff now