Suggested patch(es) to make 'HOOK_FETCH_FEED' more friendly

I was writing a plugin w/ ‘HOOK_FETCH_FEED’ and my feed always came in as “[Unknown]” w/o a site url, even though I was supplying a valid XML feed w/ all the details. I found the place in feeds.php where the subscription happens and I have the following suggested patch (below). Once I subscribed with the patch in place, my subscriptions came in with title and site url just fine.

Caveat: not extensively tested yet; though I have subscribed to 1 xml feed, 1 html feed where I chose 1 sub-feed, and 3 gocomics for the af_comics plugin.

--- feeds.php-dist	2019-03-07 08:21:29.385836900 -0600
+++ feeds.php	2019-03-07 08:18:56.894413500 -0600
@@ -1208,10 +1208,21 @@
 		} else {
 			$sth = $pdo->prepare(
 				"INSERT INTO ttrss_feeds
-					(owner_uid,feed_url,title,cat_id, auth_login,auth_pass,update_method,auth_pass_encrypted)
-				VALUES (?, ?, ?, ?, ?, ?, 0, false)");
+					(owner_uid,feed_url,title,cat_id,site_url,auth_login,auth_pass,update_method,auth_pass_encrypted)
+				VALUES (?, ?, ?, ?, ?, ?, ?, 0, false)");
 
-			$sth->execute([$_SESSION['uid'], $url, "[Unknown]", $cat_id, (string)$auth_login, (string)$auth_pass]);
+			$title = "[Unknown]";
+			$site_url = '';
+
+			$rss = new FeedParser($contents);
+			$rss->init();
+
+			if (!$rss->error()) {
+				$title = $rss->get_title();
+				$site_url = $rss->get_link();
+			}
+
+			$sth->execute([$_SESSION['uid'], $url, $title, $cat_id, $site_url, (string)$auth_login, (string)$auth_pass]);
 
 			$sth = $pdo->prepare("SELECT id FROM ttrss_feeds WHERE feed_url = ?
 					AND owner_uid = ?");

And patch af_comics to return basic information:

--- init.php-dist	2019-02-28 11:09:48.380812800 -0600
+++ init.php	2019-03-07 08:53:38.135080100 -0600
@@ -160,8 +160,11 @@
 		if ($auth_login || $auth_pass)
 			return $contents;
 
-		if (preg_match('#^https?://www\.gocomics\.com/([-a-z0-9]+)$#i', $url))
-			return '<?xml version="1.0" encoding="utf-8"?>'; // Get is_html() to return false.
+		if (preg_match('#^https?://www\.gocomics\.com/([-a-z0-9]+)$#i', $url, $matches))
+			return "<?xml version='1.0' encoding='utf-8'?>
+<feed xmlns='http://www.w3.org/2005/Atom'><title>" . ucfirst($matches[1]) . " Daily Strip</title>
+<generator uri='http://tt-rss.org/'>Tiny Tiny RSS/19.2 (dfd3a4e)</generator>
+<link href='$matches[0]' /></feed>"; // Make is_html() return false and supply basic info to subscribe with.
 
 		return $contents;
 	}

Thoughts?

BTW, if this is desirable then I can submit a PR if you give me a repo.

I remember adding a separate hook to get basic feed info, is it really necessary to duplicate that?

No, certainly not. I just didn’t run into that in any of the examples. Thanks for the pointer.

How about this patch? It may not get the comic title exactly right (ucfirst-es the comic name from url) but is better than “[Unknown]” and it correctly sets the site url.

Q: why a separate hook/call? That causes another screen refresh which looks a bit jumpy. Doesn’t it make sense that “subscribe_feed” would know the title and url to better setup the feed from the start?

--- init.php-dist	2019-02-28 11:09:48.380812800 -0600
+++ init.php	2019-03-07 15:06:30.112127200 -0600
@@ -14,6 +14,7 @@
 		$this->host = $host;
 
 		$host->add_hook($host::HOOK_FETCH_FEED, $this);
+		$host->add_hook($host::HOOK_FEED_BASIC_INFO, $this);
 		$host->add_hook($host::HOOK_SUBSCRIBE_FEED, $this);
 		$host->add_hook($host::HOOK_ARTICLE_FILTER, $this);
 		$host->add_hook($host::HOOK_PREFS_TAB, $this);
@@ -166,6 +167,16 @@
 		return $contents;
 	}
 
+	function hook_feed_basic_info($basic_info, $fetch_url, $owner_uid, $feed, $auth_login, $auth_pass) {
+		if ($auth_login || $auth_pass)
+			return $basic_info;
+
+		if (preg_match('#^https?://www\.gocomics\.com/([-a-z0-9]+)$#i', $fetch_url, $matches))
+			$basic_info = array('title' => ucfirst($matches[1]) . ' Daily Strip', 'site_url' => $matches[0]);
+
+		return $basic_info;
+	}
+
 	function api_version() {
 		return 2;
 	}

https://discourse.tt-rss.org/t/hook-for-basic-feed-info/550

that’s how this was originally requested i guess so that’s how this ended up being implemented

this way info is filled on subscribe, before an actual feed update happens, so you won’t see [Unknown] stuck for quite some time until your cronjob runs or whatever

Ok, I’m sure there are lots of edge cases; that makes sense.

What about the af_comics patch? AFIK, the title and url remains Unknown and blank forever (for the geocomics part).

When I added the GoComics code to af_comics it was because they removed RSS feeds entirely. The HOOK_FEED_BASIC_INFO didn’t exist then and adding the extra code to make the title, etc. work seemed unnecessary.

I think your code could be added now that we have the newer hook, but I would leave the title as just the comic name (i.e. without " Daily Strip") because some comics are not daily and that nomenclature seems more like personal preference.

Yeah, I thought about the daily strip nomenclature too. How about ’ [gocomics]’? Or like you say, just nothing. Or even a quick html pull to get the real title.

Thx.

That’s the part where it seems like we’re adding code just to do something a single time when the user can just click and change it if they need to.

I mean, for the most part it will work fine. You get into these odd comics (e.g. Adam@Home) where ucfirst() won’t work, or a user may prefer a different name (e.g. Adam At Home) but those are even further edge cases. Fox gets to decide, but I just don’t think it’s necessary.

i think just using base name without “daily strip” would work best, it seems like such a minor thing anyway and easily changed by user if needed so doing all this fetching and parsing html because of it doesn’t seem worth it

in order to file a PR i’ll need your gogs account name so i can adjust its repo limits

Ok, just setup rodneys_mission.

you should be able to fork stuff now

Hi you could help me in implementing the function hook_fetch_feed, from what I read it would solve my problem that I have with some feeds behind cloudflare.
I configured the pluginhost.php file like this (const HOOK_FETCH_FEED = 22; // * 1) but I don’t see any improvements, probably some concept escapes me.

@Diokobol,

Why are you posting in a 3 month old thread about an issue when you have already posted what is basically the same inquiry in another, more appropriate thread?

This forum is not here to teach you how to code in PHP. You’re welcome to ask for assistance, but if you know absolutely nothing about coding this isn’t the place to start learning. Please stay on topic and don’t post the same thing in multiple threads.

I humbly apologize. :frowning: