Should escape/preserve < and > within <code>



Text of the form <...> inside <code> and <pre> in an article is treated like a regular HTML tag and apparently stripped out if it is not a valid tag. It should be escaped and preserved instead.

The first article there has a couple of code blocks. Loading the feed URL directly in the browser renders correctly in Firefox and also in but the same article is missing a bunch of code between < and > when viewed through TT-RSS.

tt-rss git (af13f3009c59c3db338b719b09335a472383d11c)

Ubuntu 16.04, PHP 7.0.22-0ubuntu0.16.04.1, PostgreSQL 9.5.8

All HTML special characters must be encoded. If a website serves literal <, >, and & characters inside any HTML tags (including pre and code), then clients are correct to interpret them as HTML code. These characters should be encoded as &lt;, &gt;, and &amp; (e.g. PHP has the htmlspecialchars() function for this purpose).
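
As a rough illustration of the point above (in Python rather than PHP, and not TT-RSS code), the sketch below feeds an unescaped and an escaped code snippet to a generic HTML parser. The snippet text and the helper names `TextCollector` and `parse` are made up for the example; `html.escape` plays the role that htmlspecialchars() plays in PHP.

```python
from html import escape
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Records start tags and text the way a generic HTML client sees them."""

    def __init__(self):
        super().__init__()  # by default, &lt; and friends are decoded in text
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        self.text.append(data)


def parse(fragment):
    """Return (list of start tags, concatenated text) for an HTML fragment."""
    collector = TextCollector()
    collector.feed(fragment)
    collector.close()
    return collector.tags, "".join(collector.text)


snippet = "std::vector<int> v;"  # hypothetical article content

# Literal < and >: the parser sees <int> as a tag, so it is lost from the text.
raw_tags, raw_text = parse("<code>" + snippet + "</code>")

# Escaped first: only <code> is a tag, and the text survives intact.
safe_tags, safe_text = parse("<code>" + escape(snippet) + "</code>")

print(raw_tags, repr(raw_text))
print(safe_tags, repr(safe_text))
```

With the unescaped input the parser reports an extra `int` tag and the text loses `<int>`; with the escaped input the text comes back unchanged. What a given reader then does with an unknown tag (drop it, keep it, sanitize it) is up to that reader.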


Escaped characters can still cause trouble. A recent example I could pull up on zero notice:

Specifically, &quot;&amp;lt;Kobayashi-san chi no maid dragon&amp;gt;&quot; and &quot;&amp;lt;Nichijou&amp;gt;&quot; will present a blank title in TT-RSS, at the time of this post.

It’s late enough and I’m tired enough that I could be wrong about whether they’re escaping properly. Also, I used code tags because the Discourse preview is interpreting quot; to be helpful to me, I guess.
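
For what it's worth, here is a small Python sketch of how many decoding passes it takes before one of those titles turns into something that looks like a tag. It assumes the string quoted above is the raw text as it appears in the feed, which may not be the case, and it says nothing about what TT-RSS actually does internally.

```python
from html import unescape

# Assumption: this is the title exactly as it appears in the raw feed source.
feed_title = "&quot;&amp;lt;Nichijou&amp;gt;&quot;"

# One decoding pass (roughly what an XML parser does with the entities).
once = unescape(feed_title)

# A second pass (what happens if the result is then treated as HTML).
twice = unescape(once)

print(once)   # still escaped: safe to show as text
print(twice)  # now contains a literal <Nichijou>, which looks like a tag
```

If the title goes through two decoding passes and is then parsed as HTML, `<Nichijou>` would be read as an unknown tag, which would be consistent with a blank title, but that is a guess.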


rss content is parsed as a valid xml document, i’m not sure if html special snowflake stuff is applicable

even if it is, i’m not going to add any special processing for content instead of feeding it to DOMDocument, because it’s a security nightmare, so this behavior is not going to change

tldr: no