Make smarter decisions with Large Language Models

Information overload


Filtering out that noise is relatively easy with tools such as Large Language Models (LLMs). Instead of manually scanning countless sources every morning, you can use LLMs to filter news articles based on explicitly described interests and professional needs.
In this blog, I will show you a solution that filters noise from an IT-related RSS news feed, where I essentially let the LLM do the decision making: is an article worth reading or not?

Please note: my evaluation is not so much about the specific application (filtering noise from an IT news feed), but mainly about the concept: using an LLM for a non-critical decision. This concept can easily be translated to many other domains, and I hope it inspires you to apply it to your personal or business needs.

The script

Let's start with the interesting part: the program itself. I wrote the script in Python, and it looks like this:

#!/usr/bin/env python3
from datetime import datetime, timedelta
from email.utils import parsedate_to_datetime
from openai import OpenAI
from os import getenv
import json
import requests
import subprocess
import xml.etree.ElementTree as ET


FEED_URL = "https://www.security.nl/rss/headlines.xml"
MODEL = "Claude-Haiku-3.5"
with open("system_prompt.txt", "r") as f:
    SYSTEM_PROMPT = f.read()


def get_feed():
    """Get RSS feed"""
    response = requests.get(FEED_URL)
    response.raise_for_status()
    return response.content


def parse(feed_content: bytes):
    """Parse RSS feed"""
    root = ET.fromstring(feed_content)
    channel = root.find("channel")
    title = channel.findtext("title", default="(untitled)")
    link = channel.findtext("link", default="")
    description = channel.findtext("description", default="")
    pub_date = channel.findtext("pubDate")
    last_build = channel.findtext("lastBuildDate")
    return {
        "channel": {
            "title": title,
            "link": link,
            "description": description,
            "pubDate": pub_date,
            "lastBuildDate": last_build,
        },
        "items": [
            {
                "title": item.findtext("title", default="(no title)").strip(),
                "link": item.findtext("link", default="").strip(),
                "description": item.findtext("description", default="").strip(),
                "pubDate": item.findtext("pubDate", default="").strip(),
            }
            for item in channel.findall("item")
        ],
    }


def extract_recent_items(feed, days_back: int = 0):
    """Return items published within days_back days of today."""
    cutoff_date = datetime.now().date() - timedelta(days=days_back)
    recent_items = []

    for item in feed["items"]:
        item_date = parsedate_to_datetime(item["pubDate"]).date()
        if item_date >= cutoff_date:
            recent_items.append(item)

    return recent_items


def evaluate(items):
    """Evaluate all items"""
    api_key = getenv("POE_API_KEY")
    client = OpenAI(api_key=api_key, base_url=getenv("POE_BASEURL"))
    evaluations = []

    for it in items:
        title = it["title"]
        completion = client.chat.completions.create(
            model=MODEL,
            messages=[
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": title},
            ],
        )
        out = completion.choices[0].message.content
        data = json.loads(out)
        data["title"] = title
        data["link"] = it["link"]
        evaluations.append(data)

    return evaluations


def notify(results):
    """Notify about all results that are noteworthy"""
    for r in results:
        if r["decision"] == "yes":
            message = (
                f"=== {r['title']} ===\n\n"
                f"{r['reasoning']}\n\n"
                f"{r['link']}\n"
                "==========="
            )
            command = ["/usr/local/bin/telegram.py", message]
            subprocess.run(command, check=True, text=True, capture_output=True)


def main():
    feed_content = get_feed()
    items = parse(feed_content)
    recent_items = extract_recent_items(items)
    results = evaluate(recent_items)
    notify(results)


if __name__ == "__main__":
    main()

The workflow of the script is as follows:

  1. Get an example RSS news feed; I use www.security.nl, a Dutch site that publishes IT-related news.
  2. Parse the feed to extract all news items.
  3. Filter only today's items.
  4. Evaluate the items with an LLM to determine whether they are important.
  5. Send a notification about the news article if the LLM determines that it is interesting.

Each step has its own function, and it is the main() function that calls all these functions.

Runtime!

Retrieve news items

The script starts by retrieving the RSS feed using the get_feed() function. This sends an HTTP request to the RSS endpoint of the site I am using and returns the response content.

The feed content is then passed to parse(), which extracts all feed items. A single news item (an entry in the items["items"] list) looks like this, for example:

{
  "title": "Ziekenhuis ontslaat medewerkers wegens ongeoorloofd inzien patiëntendossiers",
  "link": "https://www.security.nl/posting/908552/Ziekenhuis+ontslaat+medewerkers+wegens+ongeoorloofd+inzien+pati%C3%ABntendossiers?channel=rss",
  "description": "Het Albert Schweitzer Ziekenhuis heeft twee medewerkers ontslagen die elfhonderd patiëntendossiers ongeoorloofd inzagen. Het ...",
  "pubDate": "Thu, 09 Oct 2025 17:10:37 +0200"
}

To filter only today's items, the feed goes through extract_recent_items(), which by default only returns today's items. I do this so that I can schedule the script daily without seeing duplicate articles. If you want to expand this further, you can keep track of the status (for example, in a database) to record which items have already been processed. This allows you to run the script more than once a day.
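As a sketch of that idea, you could persist the links of processed items in a small local JSON file instead of a full database. The file name and helper functions below are my own additions, not part of the original script:

```python
import json
from pathlib import Path

SEEN_FILE = Path("seen_items.json")  # hypothetical state file


def load_seen() -> set:
    """Load the set of already-processed item links, if the file exists."""
    if SEEN_FILE.exists():
        return set(json.loads(SEEN_FILE.read_text()))
    return set()


def filter_new(items: list, seen: set) -> list:
    """Keep only items whose link has not been processed before."""
    return [it for it in items if it["link"] not in seen]


def save_seen(items: list, seen: set) -> None:
    """Record the links of the items we just processed."""
    seen.update(it["link"] for it in items)
    SEEN_FILE.write_text(json.dumps(sorted(seen)))
```

With this in place, the script could run hourly: load the seen set, filter the recent items through filter_new(), and save the updated set after notifying.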

Model selection, system prompt, and sending the message

With today's items filtered, an LLM can evaluate whether each article is worth reading, based on a system prompt (which I show below). The filtering is done in the evaluate() function. I use www.poe.com to interact with LLMs. Poe is a Quora company that recently introduced an OpenAI-compatible API. This allows me to use the OpenAI Python SDK to talk to LLMs via Poe.

As you can see at the beginning of the code, I am using Claude Haiku 3.5. This choice is based on cost and energy efficiency. Claude Haiku is a smaller model with fewer parameters. It was therefore cheaper to train, and inference is also cheaper, in terms of both energy and money. The trade-off is capability, but since we don't need enormous complexity here, this is more than sufficient.

The system prompt used for filtering by the LLM is read from a local file (system_prompt.txt). Below is an English translation of the prompt (I implemented it myself in Dutch because I use a Dutch news source and wanted my results in Dutch as well):

You are an IT news filter for a Linux consultant. Assess news headlines for relevance to Cloud consultancy work.

CONSULTANT PROFILE:
- Cloud consultant
- Works with: Ansible, Terraform, HashiCorp Vault, Python, AI/LLMs, Linux distributions, Kubernetes
- Has limited time - only let critical items through

FILTER CRITERIA:
YES (let through):
- Security issues in the above technologies or Linux
- Breaking changes/major releases of the above tools
- Major cloud/infrastructure outages
- Enterprise Linux regulations/compliance
- Zero-days, CVEs, critical patches

NO (filter out):
- Consumer tech, gaming, mobile apps
- Marketing announcements, tutorials, opinions
- Minor updates, conferences, startup news
- Regional news that doesn't have major impact on the IT industry
- Non-Dutch national news that doesn't have major impact on the IT industry

WHEN IN DOUBT: Filter out (be conservative)

OUTPUT:
Always return JSON with this structure:
{
  "decision": "yes" | "no",
  "reasoning": "brief explanation why relevant/not relevant for Linux consultant",
  "other": "optional additional context if relevant"
}

No additional text is possible outside this structure. Do not provide explanations outside the JSON.

Focus on: Would this directly impact the consultant's work or clients?

This prompt is also largely AI-generated. All I did was add some specific technologies that I often work with, and add the rule that the model should never give explanations outside the JSON, which it started doing while I was testing the prompt. For example, the raw output of the LLM for a single news item looks like this:

{
  "decision": "no",
  "reasoning": "Geen directe technische of IT infrastructuur relevantie voor Linux consultancy werk",
  "other": "Privacy incident, meer geschikt voor HR of compliance afdelingen"
}

In this format, downstream code can easily act based on the decisions made by the LLM.
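The script currently calls json.loads() directly on the model's reply, which raises an exception if the model ever wraps its answer in code fences or stray text again. A more defensive parser could extract the JSON object from the reply and fall back to a conservative "no", mirroring the "when in doubt, filter out" rule from the system prompt. This helper is a sketch of my own, not part of the original script:

```python
import json
import re


def parse_decision(raw: str) -> dict:
    """Parse the model's reply, tolerating code fences or stray text
    around the JSON object. Falls back to a conservative 'no'."""
    # Grab the first {...} block in the reply, if any.
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match:
        try:
            return json.loads(match.group(0))
        except json.JSONDecodeError:
            pass
    # WHEN IN DOUBT: filter out, mirroring the system prompt.
    return {"decision": "no", "reasoning": "unparseable model output"}
```

In evaluate(), `data = json.loads(out)` could then become `data = parse_decision(out)`, so one malformed reply no longer aborts the whole run.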

Notifications

Finally, to actually receive notifications about interesting articles, the filtered items go through the notify() function. This uses a local tool that I wrote myself to send notifications via Telegram. I won't discuss that tool here, but I include the call to it so the script is complete. You can, of course, replace it with any other notification method.
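If you don't have a local notification tool, the Telegram Bot API can also be called directly over HTTP. The sketch below assumes you have created a bot and stored its token and your chat ID in environment variables (TELEGRAM_TOKEN and TELEGRAM_CHAT_ID are my names, not part of the original script):

```python
from os import getenv
import requests


def build_message(result: dict) -> str:
    """Format one evaluated item the way notify() does."""
    return (
        f"=== {result['title']} ===\n\n"
        f"{result['reasoning']}\n\n"
        f"{result['link']}\n"
        "==========="
    )


def send_telegram(text: str) -> None:
    """Send a message through the Telegram Bot API's sendMessage method."""
    token = getenv("TELEGRAM_TOKEN")
    chat_id = getenv("TELEGRAM_CHAT_ID")
    resp = requests.post(
        f"https://api.telegram.org/bot{token}/sendMessage",
        json={"chat_id": chat_id, "text": text},
        timeout=10,
    )
    resp.raise_for_status()
```

Swapping the subprocess call in notify() for send_telegram(build_message(r)) would remove the dependency on the external script entirely.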

Translate this concept to other applications

It is not difficult to see how these techniques translate into business applications. Retailers use LLMs (and other forms of Natural Language Processing) for inventory management decisions, analysis of customer reviews, and more. In the legal sector, LLMs are used for contract analysis and legal research. Trading and hedge funds even use LLMs to support trading strategies and risk assessments.

Conclusion

The above is one way to use LLMs for non-critical decisions. By letting an LLM make decisions based on user input (the system prompt), you save a lot of time and effort and filter out noise from a news feed. It is a practical example, but the underlying ideas are easily applicable in many other domains.

Are you curious about how LLMs, or AI in general, can help with your personal or business needs? Or has this inspired you to implement LLMs for a personal or professional use case? Please feel free to contact us, we would love to hear from you.
