How to create custom RSS feeds with WordPress

WordPress has many alternate built-in feeds: per category, per tag, per author, per search-keyword. But in some cases, you want feeds built with some more advanced logic. Let’s look at the available options.

WordPress advanced built-in feeds

You can create feeds for “unions” or “intersections” of tags: just use a URL like /tag/foo,bar/feed/ (all articles tagged with foo or bar) or /tag/foo+bar/feed/ (all articles tagged with foo and bar).

You can also have feeds excluding a category, although that requires you to know the category identifier (and hardcode it in the URL like this: /?feed=rss2&cat=-123 where 123 is the category id that you want to exclude).
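To make those URL patterns concrete, here is a small shell helper that assembles them. It is only a sketch: the feed_url function, its any/all/not-cat keywords and the example.com address are placeholders of mine, not something WordPress provides.

```shell
# feed_url BASE KIND ARGS...: print the URL of a built-in WordPress feed.
# Hypothetical helper; KIND is "any" (tags joined with ","),
# "all" (tags joined with "+") or "not-cat" (exclude one category id).
feed_url() {
    base=${1%/}          # drop a trailing slash from the site URL
    kind=$2
    shift 2
    case $kind in
        any) sep=, ;;
        all) sep=+ ;;
        not-cat)
            printf '%s/?feed=rss2&cat=-%s\n' "$base" "$1"
            return
            ;;
        *)
            echo "unknown kind: $kind" >&2
            return 1
            ;;
    esac
    tags=$1
    shift
    for t in "$@"; do
        tags=$tags$sep$t
    done
    printf '%s/tag/%s/feed/\n' "$base" "$tags"
}

feed_url https://example.com any foo bar     # .../tag/foo,bar/feed/
feed_url https://example.com all foo bar     # .../tag/foo+bar/feed/
feed_url https://example.com not-cat 123     # .../?feed=rss2&cat=-123
```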

But there’s no simple way to have a feed that excludes articles with a given tag. The best solution I found involves creating a custom feed. I’ll show you a variation of it below.

Creating a custom feed

  1. First off, install the Feed Wrangler plugin. It will take care of registering our custom feeds with WordPress.
  2. Go to “Settings > Feed Wrangler” in your WordPress administrative interface and create a new feed, let’s call it “myfeed”.
  3. You should now create a “feed-myfeed.php” file and put it in your current theme’s directory. The initial content of that file should be this:
    <?php
    include('wp-includes/feed-rss2.php');
    ?>
  4. At this point, you already have a new feed that you can access at /feed/myfeed/ (or /?feed=myfeed). It’s a complete feed like the main one.

Now, we’re going to look at ways to customize this feed. We’re going to do this by changing/overriding the default query that feed-rss2.php’s loop will use.

A feed excluding articles with a tag

If you want to create a feed that excludes the tag “foo”, you could use this:

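<?php
global $wp_query;
// Look up the tag to exclude by its slug (false if it does not exist).
$tag = get_term_by("slug", "foo", "post_tag");
if ($tag) {
        // Keep the parameters of the current feed request and add the exclusion.
        $args = array_merge(
                $wp_query->query,
                array('tag__not_in' => array($tag->term_id))
        );
        query_posts($args);
}
include('wp-includes/feed-rss2.php');
?>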

That was relatively easy, thanks to the “tag__not_in” query parameter. Now you can further customize the feed by adding supplementary query parameters to the $args array. The documentation of query_posts details the various parameters that you can use.

A feed excluding articles with a custom field (meta-data)

I went further because I did not want to use a tag to exclude some posts: that tag would have been public even if it was only meaningful to me. So I decided to use a custom field to mark the posts to exclude from my new feed. I named the field “no_syndication” and I always give it the value “1”.

This time it’s not so easy because we have no query parameter that can be used to exclude posts based on custom fields. We’re going to use the “post__not_in” parameter that can be used to exclude a list of posts. But we must first generate the list of posts that we want to exclude. Here we go:

<?php
global $wp_query;
// Collect the IDs of all published posts flagged with the custom field.
$excluded = array();
$args_excluded = array(
    'numberposts'     => -1,
    'meta_key'        => 'no_syndication',
    'meta_value'      => 1,
    'post_type'       => 'post',
    'post_status'     => 'publish'
);
foreach (get_posts($args_excluded) as $item) {
        $excluded[] = $item->ID;
}
// Re-run the feed query with those posts excluded.
$args = array_merge(
        $wp_query->query,
        array('post__not_in' => $excluded)
);
query_posts($args);
include('wp-includes/feed-rss2.php');
?>

A feed with modified content

You might want to add a footer to the articles that are syndicated. I use the Ozh’ Better Feed plugin for this but it applies to all your feeds.

You could do that sort of transformation only in your customized feed by using the WordPress filter named the_content_feed.

Here’s a simple example:

<?php
function myfeed_add_footer($content) {
        return $content . "<hr/>My footer here";
}
add_filter('the_content_feed', 'myfeed_add_footer');
include('wp-includes/feed-rss2.php');
?>

I’ll stop here but obviously you have lots of options and many ways to tweak all the snippets above. They have been tested with WordPress 3.0.4.

Note that in all those examples, I took care not to duplicate the code from feed-rss2.php; instead I used include() to execute it. That way my custom feeds will automatically benefit from all the future enhancements and fixes made by the WordPress developers.

But if you have to modify the XML structure of your custom feeds, you can paste the content of feed-rss2.php in your file and change it like you want…

Assembling bits of history with git: take two

Following my previous article, I got some interesting comments introducing me to git-filter-branch (a new command derived from cogito’s cg-admin-rewritehist). This command is really designed to rewrite history and lets you make many more kinds of changes… it enabled me to fix the dates/authors/committers/logs of all the commits that were created with git_load_dirs. It can also be used to add one or more “parent commits” to any commit.

In parallel, I discovered a problem with the git repository that I had created: the tags no longer pointed to commits on my master branch. This is because git rebase doesn’t update tags while rewriting history.

This led me to redo everything from scratch. This time I used git-filter-branch instead. The man page even gives an example of how to link two branches together as if one was the predecessor of the other. Here’s how you can do it: let’s bind together “old” and “new”… the resulting branch will be “new-rewritten”.

$ git rev-parse old
0975870bb1631379f2da798fa78736a4fe32960a
$ git checkout new
$ git-filter-branch --tag-name-filter cat --parent-filter \
"sed -e 's/^$/-p 0975870bb1631379f2da798fa78736a4fe32960a/'" \
new-rewritten
[...]
Rewritten history saved to the new-rewritten branch

Short explanation: the only commit without a parent commit (thus matching the empty regex “^$”) is the root commit and this one is changed to have a parent (-p) which is the last commit of the branch “old”.
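To see the substitution at work outside of git, you can feed the same sed expression by hand. This is a minimal sketch: parent_filter is a wrapper name I made up, and the second parent id is a dummy value. It relies on the parent list arriving on standard input as an empty line for a root commit and as “-p <id>” otherwise, as the git-filter-branch man page describes.

```shell
# The commit id printed by "git rev-parse old" in the session above.
graft=0975870bb1631379f2da798fa78736a4fe32960a

# Same sed expression as the --parent-filter argument (hypothetical wrapper).
parent_filter() {
    sed -e "s/^\$/-p $graft/"
}

# Root commit: the parent list is an empty line, so it is replaced
# by the new parent.
echo "" | parent_filter

# Ordinary commit: the parent list is not empty and passes through
# unchanged. The id below is a dummy used only for this demo.
echo "-p 1111111111111111111111111111111111111111" | parent_filter
```

The first command prints “-p” followed by the graft id; the second prints its input untouched.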

At the end, you remove all the temporary branches, keep only what’s needed and repack everything to save space:


$ git branch -D old new
$ git prune
$ git repack -a -d

Assembling bits of history with git

The dpkg team has a nice history of changing VCS over time. At the beginning, Ian Jackson simply uploaded new tarballs, then CVS was used for a few years, then Arch, and Subversion has been used up to now. When the Subversion repository was created, the Arch history was not integrated because the conversion tools somehow didn’t work.

Now we’re likely to move over to git for various reasons and we wanted to get back the various bits of history stored in the different VCS. Unfortunately we lost the Arch repository. So we have disjoint bits of history and we want to put them all in a single nice git branch… git comes with git-cvsimport, git-archimport and git-svnimport, so converting CVS/SVN/Arch repositories is relatively easy. But you end up with several repositories and several branches.

Git comes with a nice feature called “git rebase” which is able to replay history over another branch, but for this to work you need to have a common ancestor in the branch used for the rebase. That’s not the case… so let’s try to create that common ancestor! Extracting the first tree from the newest branch and committing it on top of the oldest branch will give that common ancestor because two identical trees will have the same identifier. Using git_load_dirs you can easily load a tree in your git repository, and “git archive” will let you extract the first tree too.
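If you want to convince yourself that identical trees really get the same identifier, here is a rough sketch. The tree_id helper is a name of my own; it stages the content of a directory in a throwaway repository and asks “git write-tree” for the resulting tree id.

```shell
# tree_id DIR: print the id git gives to the tree holding DIR's content.
# Hypothetical helper for illustration; it uses a throwaway repository
# so that nothing is written to your real one.
tree_id() {
    repo=$(mktemp -d) || return 1
    git init -q "$repo"
    (
        cd "$1" &&
        git --git-dir="$repo/.git" --work-tree=. add -A . &&
        git --git-dir="$repo/.git" --work-tree=. write-tree
    )
    status=$?
    rm -rf "$repo"
    return $status
}
```

Running tree_id on two directories with the same file names, modes and contents should print the same id twice, whatever history each copy came from.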

In short, let’s see how I attach the “master” branch of my “git-svn” repository to the “master” branch of my “git-cvs” repository:

$ cd git-svn
$ git-rev-list --all | tail -1
0d6ec86c5d05f7e60a484c68d37fb5fc31146c40
$ git-archive --prefix=dpkg-1.13.11/ 0d6ec86c5d05f7e60a484c68d37fb5fc31146c40 | (cd /tmp && tar xf -)
$ cd ../git-cvs
$ git checkout master
$ git_load_dirs -L"Fake commit to link SVN to older CVS history" /tmp/dpkg-1.13.11
[...]
$ git fetch ../git-svn master:svn
$ git checkout svn
$ git rebase master

That’s it, your svn branch now contains the old cvs history. Repeat as many times as necessary…

More fun with Linux and serial ports on slow hardware

This is a never-ending story for me. The first time I had problems with Linux’s handling of serial UARTs dates back to 2005 (see my previous blog post on buffer overruns). At that time I could improve the situation by applying two patches (kernel-preempt and low latency).

One year later, I have a situation where the buffer overruns are again easily reproducible at a slower baud rate (54 kbaud). Arguably there’s more than one serial application running this time, and it looks like the load generated by other processes (mainly watching digital I/O) makes the system less reliable in its handling of serial ports.

This time I follow the advice to try out the 2.6 kernel because many “real-time improvements” (coming from the -rt branch, check its wiki) and “embedded improvements” (coming from linux-tiny) have been merged.

So I tried the stock kernel, with very bad results: they are better than with the stock 2.4 but worse than with the patched 2.4.
So I decided to try the -rt patch on 2.6, but this patch doesn’t work on my CPU card and my bug report didn’t lead to any fix (nobody responded even though I tried hard to include the necessary information and was ready to try whatever I would have been asked to).

At the same time I explain my problem on the linux-kernel mailing list.
The discussion doesn’t answer my questions but still brings two ideas to try out. In the end, with two simple tweaks to the stock 2.6 kernel (mainly configuring the UART to send the interrupt as soon as the first character arrives, instead of waiting for 8 chars to accumulate in the FIFO) I have been able to get something better than the patched 2.4. And it turns out my first choices for the 2.6 kernel configuration were not very wise, so the comparison between stock 2.4 and stock 2.6 above doesn’t mean much.

Unfortunately, even if it is better than the patched Linux 2.4, it still doesn’t give good results in some conditions. So my primary question remains: is there a way to patch my kernel so that it will handle the serial-related tasks (servicing interrupts from the UART mainly) as its primary job? I don’t mind if such a change negatively impacts the speed of the system, as long as I can make sure that my serial exchanges are reliable.

And by reliable I mean of course no buffer overruns, but there’s a second similar problem that has been discovered: when using software handshake, the system can send out between 10 and 25 characters after the partner has sent its XOFF. Sending up to 6 characters after the XOFF is OK; more is asking for trouble because the UART on the other side will probably encounter a buffer overrun…

If you have any idea on how to resolve my problems, by any means, let me know.

Serial overrun on Linux

Working with serial lines can sometimes give you big headaches. I have an embedded PC based on a 386 SX 40 processor. This PC doesn’t do much but it has programs using the serial line intensively. Things didn’t work as well as expected so I looked carefully at what was going on… the beast was losing bytes! My suspicion was promptly confirmed by the /proc/tty/driver/serial entry. If you have “oe:X” (where X is a positive number) there, it means that one of your UARTs detected overrun errors.
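Here is a quick way to pull those counters out of the file, as a sketch only. The overruns function is a name I made up, and I’m assuming each serial line is described on one row with space-separated fields, the overrun counter appearing as a field starting with “oe:”.

```shell
# overruns FILE: print "<line>: <count>" for every serial line whose
# overrun counter (oe) is greater than zero. Hypothetical helper.
overruns() {
    awk '{
        for (i = 2; i <= NF; i++)
            if ($i ~ /^oe:/) {
                n = $i
                sub(/^oe:/, "", n)      # keep only the number
                if (n + 0 > 0) print $1, n
            }
    }' "$1"
}
```

Typical use would be “overruns /proc/tty/driver/serial”, which may require root. Running it before and after a test transfer shows whether new overruns occurred.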

So what’s an overrun? An overrun happens when the UART receives data while its FIFO buffer is full. Why is the FIFO full? Because Linux didn’t service the serial interrupt quickly enough. Why is Linux so slow? Linux is not a real-time OS and doesn’t guarantee any response time to interrupts, so it’s not that Linux is slow, but my PC really is… What happens is that network interrupts are serviced before serial interrupts. Furthermore, IDE disk interrupts can take too long as well. The worst case is of course when you’re servicing a disk interrupt, then have to service a network interrupt, and only after that can you service the serial interrupt, which in fact arrived right after the beginning of the disk interrupt.
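To get a feeling for how little time the kernel has, here is a back-of-the-envelope calculation. It is a sketch under assumptions of mine: a 16550A-class UART with a 16-byte FIFO and 8N1 framing, i.e. 10 bits on the wire per character. The slack_us name is made up.

```shell
# slack_us BAUD CHARS: microseconds needed to receive CHARS characters
# at BAUD, assuming 10 bits on the wire per character (8N1 framing).
slack_us() {
    echo $(( $2 * 10 * 1000000 / $1 ))
}

slack_us 9600 16      # prints 16666
slack_us 115200 16    # prints 1388
```

Under those assumptions an empty FIFO fills up in about 16.7 ms at 9600 baud but in only about 1.4 ms at 115200 baud, so any chain of interrupt handlers running longer than that loses data.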

So fixing serial overruns is a rather complex problem since it’s really a kernel-related problem. Googling on the subject, I found several ideas to explore:

  • configure the IDE disk to use DMA with hdparm -d 1 /dev/hda: using DMA shortens the time during which IRQs are masked by the kernel (in my case it doesn’t work since I’m using a DiskOnModule, which does not support DMA)
  • make disk IRQ interruptible with hdparm -u 1 /dev/hda
  • use irqtune to re-prioritize the IRQs on the interrupt controller. This software is no longer maintained and it doesn’t work out of the box on kernel 2.4.x.

Using hdparm -u wasn’t enough to solve my problem… so I continued to look for a solution and I found one! I recompiled my kernel with the low latency patch and the preemptible kernel patch. Those are usually used for multimedia applications where you need good responsiveness in order to deliver content in real time, but the fact is that they do work for my purpose too!

My serial overruns are completely gone at 9600 baud. However, I can still get some when running at 115200 baud. Moreover, I can trigger serial overruns by running find / -type f | grep -v /proc/ | xargs md5sum in the background… I can’t work miracles with this slow processor… if you have more ideas to further improve the situation, I’m willing to try!