HTML tag <img> in filters: how to create a filter?

I want to create a filter that finds articles with lots of images.
I want to assign the label img to such articles.

I’ve tried the match expressions (<img\s.*){5,} and (&lt;img\s.*){5,}
The filter does not work with either of them.

For now I use the expression (\bimg\s.*){5,}
It is not quite accurate, of course.
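
As a rough illustration of why, here is a small sketch in plain PHP, outside tt-rss. The sample strings are made up, and how tt-rss itself applies the pattern to article content is not shown here. The pattern fires on text that has no image tags at all, and it misses tags that are spread over several lines:

```php
<?php
// Hypothetical sample strings, not taken from a real feed.
$pattern = '/(\bimg\s.*){5,}/';

// Five real image tags on one line: the pattern matches, as intended.
$five_tags = str_repeat('<img src="a.png"> ', 5);

// No image tags, only the word "img" five times: the pattern matches too.
$plain_text = 'img one img two img three img four img five';

// Five image tags, one per line: "." does not cross newlines, so no match.
$multi_line = str_repeat("<img src=\"a.png\">\n", 5);

$r_tags  = preg_match($pattern, $five_tags);   // 1
$r_plain = preg_match($pattern, $plain_text);  // 1 (false positive)
$r_multi = preg_match($pattern, $multi_line);  // 0 (false negative)

echo "$r_tags $r_plain $r_multi\n";
```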

Is there another solution?

Tiny Tiny RSS v19.2 (f38a89a), Server: Ubuntu 18.4, MySQL, Client: Win10 Chrome. search_sphinx disabled.


See also:
https://discourse.tt-rss.org/t/html-in-filters-not-possible-any-more/766

you can’t use html markup in filters, it’s a known limitation.

e: you did find that previous thread about it, why make another one?

My question is:

I want to assign the label img to articles with lots of images.

well there’s one obvious solution: make a custom plugin.

…server-side custom plugin.

is there one?

Untested but should probably do what you want (copied partly from auto_assign_labels).

(Save as ./plugins.local/auto_image_labels/init.php and don’t forget to activate it in the preferences)

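<?php
class Auto_Image_Labels extends Plugin {
    private $host;

    function about() {
        return array(1.0,
            "Assign labels to articles with more than x images",
            "fox / aeritir");
    }

    function init($host) {
        $this->host = $host;
        $host->add_hook($host::HOOK_ARTICLE_FILTER, $this);
    }

    function hook_article_filter($article) {
        $max_images = 5;      // label articles with more than this many images
        $my_label = 'img5';   // caption of the label to assign

        $doc = new DOMDocument();

        // loadHTML() returns false on failure, so test its result rather than $doc
        // (the object itself is always truthy). @ silences warnings about sloppy
        // markup. Empty content is skipped because loadHTML() rejects it.
        if (!empty($article["content"]) && @$doc->loadHTML($article["content"])) {
            $xpath = new DOMXPath($doc);
            $images = $xpath->query('//img');

            // ->length works on every PHP version; count() on a DOMNodeList
            // needs PHP 7.2 or later.
            if ($images->length > $max_images) {
                // Untested: tt-rss may expect a label array here rather than a
                // bare caption; compare with auto_assign_labels.
                array_push($article["labels"], $my_label);
            }
        }

        return $article;
    }

    function api_version() {
        return 2;
    }
}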

smart! interesting. thanks!

why import to xml?
can I just use preg_match on $article["content"]?

Nothing prevents you from doing so, but most plugins use the DOMDocument/XPath approach because the article is often modified as well, and you want to make sure the HTML still validates afterwards.
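
For comparison, a minimal sketch of both options in plain PHP, outside tt-rss. The helper names count_images_regex and count_images_dom and the sample markup are made up for this example. Note that counting needs preg_match_all rather than preg_match, since preg_match stops at the first match and only reports 0 or 1.

```php
<?php
// Hypothetical helpers for illustration; neither is part of tt-rss.

// Regex approach: count opening <img tags directly in the raw markup.
function count_images_regex($html) {
    // preg_match_all() returns the number of full matches (false on error).
    return (int) preg_match_all('/<img\b/i', $html);
}

// DOM approach: parse the markup first, then count img elements.
function count_images_dom($html) {
    if ($html === '') {
        return 0; // loadHTML() does not accept an empty string
    }
    $doc = new DOMDocument();
    if (!@$doc->loadHTML($html)) {
        return 0; // parsing failed
    }
    return $doc->getElementsByTagName('img')->length;
}

$html = '<p><img src="a.png"><img src="b.png"></p><!-- <img src="c.png"> -->';

echo count_images_regex($html), "\n"; // 3: the commented-out tag is counted too
echo count_images_dom($html), "\n";   // 2: the parser treats it as a comment
```

For a rough "lots of images" label the regex count is probably good enough; the DOM version mainly pays off when the plugin also rewrites the article.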