Design the perfect Atom feed for WordPress

This article describes how to quickly convert the default Atom 0.3 feed that WordPress v1.5.2 produces to the Atom 1.0 Specification. It also provides some brief info about enhancing the Atom feed with RSS extension modules.

The goals are:

  1. To convert the deprecated default Atom 0.3 feed to one that conforms to the Atom 1.0 Syndication Format.
  2. To add or correct some of the feed’s elements so that our blog produces a full-featured feed.

Prerequisites

This small howto assumes that you have at least some knowledge of the basics of programming.

You should also have the Atom Syndication Format web page open in your browser, so that you can refer to it.

Furthermore, make sure you read the excellent article "Moving from Atom 0.3 to 1.0" by Rakaz. It describes the changes you need to make to the names of the feed elements and entry elements in order to conform to the 1.0 specification.

Wherever we need to insert a function that adds the time or date to the feed, we will use the international time standard, UTC, instead of the local time. This time is also known as GMT or simply as "Zulu" Time. You will not need to modify any of your blog options. This is just an informational notice.

The Hack

Open the wp-atom.php file in your favorite text editor and let’s start editing it. Whenever I refer to line numbers, I refer strictly to the wp-atom.php file that ships with WordPress v1.5.2.

Special Note: Always keep backups of the original files.

The XML namespaces

First of all, we need to declare the namespaces for all the elements we are going to use in our feed. Beginning from line 15, substitute this part:

<feed version="0.3"
  xmlns="http://purl.org/atom/ns#"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xml:lang="<?php echo get_option('rss_language'); ?>"
  >

with this one:

<feed
  xmlns="http://www.w3.org/2005/Atom"
  xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xml:lang="<?php echo get_option('rss_language'); ?>"
  >

Generally, namespaces define which elements can be used in our feed. The first one is the new Atom 1.0 XML namespace and it’s the only one we need to declare in order to use the elements shown in the Atom Syndication Format web page.

Apart from that, as it is clearly stated in the "extensibility section" of the above document, we can use elements from other namespaces, including RSS 1.0 and RSS 2.0 modules.

We add the Creative Commons RSS Module XML namespace so that we can include elements defined in this namespace in our Atom feed. This module lets us add a Creative Commons license to our feed and/or to our feed entries.

We should keep the Dublin Core Element Set v1.1 namespace (xmlns:dc…), because WordPress v1.5.2, by default, adds the post’s categories to the Atom feed using this RDF schema.
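
To check that the prefixes declared on the feed element really resolve to the namespaces described above, a small standalone PHP sketch can help. It assumes that PHP's DOM extension is available. The $feed string is a hand-written stand-in for the opening of the output of wp-atom.php, with xml:lang fixed to "en" for illustration; it is not something WordPress provides.

```php
<?php
// Hand-written stand-in for the opening of the generated feed (an assumption
// for illustration; a real feed would be fetched from the blog's Atom URL).
$feed = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<feed
  xmlns="http://www.w3.org/2005/Atom"
  xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xml:lang="en"
  >
</feed>
XML;

$doc = new DOMDocument();
$doc->loadXML($feed);
$root = $doc->documentElement;

// Namespace of the feed element itself (the default, unprefixed namespace).
$atomNs = $root->namespaceURI;

// Namespaces bound to the two extension prefixes.
$ccNs = $root->lookupNamespaceURI('creativeCommons');
$dcNs = $root->lookupNamespaceURI('dc');

echo "atom: $atomNs\n";
echo "creativeCommons: $ccNs\n";
echo "dc: $dcNs\n";
```

If a prefix is used in the feed without being declared, loadXML() reports a namespace error, which is a quick hint that a declaration was lost while editing.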

Title, Link, Subtitle and ID feed elements

Next, we need to define some critical properties for our feed. These include the feed title, the proper link to the feed itself, some alternative links to our content, the feed’s description and the feed’s ID. The id element is mandatory both for the feed and for each of the feed’s entries.

Beginning from line 20, substitute this part:

<title><?php bloginfo_rss('name') ?></title>
<link rel="alternate" type="text/html" href="<?php bloginfo_rss('home') ?>" />
<tagline><?php bloginfo_rss("description") ?></tagline>

with this one:

<id><?php bloginfo_rss('atom_url') ?></id>
<title><?php bloginfo_rss('name') ?></title>
<link rel="self" type="application/atom+xml" href="<?php bloginfo_rss('atom_url') ?>" />
<link rel="alternate" type="application/rss+xml" href="<?php bloginfo_rss('rss2_url') ?>" />
<link rel="alternate" type="text/html" hreflang="<?php echo get_option('rss_language'); ?>" href="<?php bloginfo_rss('home') ?>" />
<subtitle type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
		<strong><?php bloginfo_rss('description') ?></strong><br /><br />
		Insert more <strong>info</strong> about your blog here.
	</div>
</subtitle>

I am sure you have noticed that the tagline element has been renamed to subtitle in the Atom 1.0 specification. I’ve already told you to read Rakaz’ article…

The "Updated" feed element

The modified feed element from the 0.3 specification has been renamed to updated in the 1.0 spec. Substitute line 23:

<modified><?php echo mysql2date('Y-m-d\TH:i:s\Z', get_lastpostmodified('GMT'), false); ?></modified>

with this one:

<updated><?php echo mysql2date('Y-m-d\TH:i:s\Z', get_lastpostmodified('gmt'), false); ?></updated>

This is the time at which any of your posts was last edited.
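
The backslashes in the format string matter. The following is a minimal sketch in plain PHP that uses gmdate() as a stand-in for mysql2date(), on the assumption that both follow the format characters of PHP's date() function. It shows why \T and \Z must be escaped: unescaped, T is replaced by a timezone abbreviation and Z by the timezone offset in seconds, and the result is no longer a valid Atom timestamp.

```php
<?php
// gmdate() stands in for mysql2date() here (assumption: both follow the
// format characters of PHP's date() function).
$timestamp = 0; // the Unix epoch, used only to get a predictable result

// Escaped: \T and \Z are printed literally, giving an RFC 3339 timestamp.
$escaped = gmdate('Y-m-d\TH:i:s\Z', $timestamp);

// Unescaped: T and Z are interpreted as format characters instead.
$unescaped = gmdate('Y-m-dTH:i:sZ', $timestamp);

echo $escaped, "\n";   // 1970-01-01T00:00:00Z
echo $unescaped, "\n"; // not a valid RFC 3339 timestamp
```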

Rights, License and Generator feed elements

These elements add copyright and license information to our feed. Edit their contents according to your own license and copyright details; what follows is only an example. The copyright year is taken from the time you last published a post (in GMT, as mentioned earlier).

Beginning from line 24, substitute this part:

<copyright>Copyright <?php echo mysql2date('Y', get_lastpostdate('blog'), 0); ?></copyright>
<generator url="http://wordpress.org/" version="<?php bloginfo_rss('version'); ?>">WordPress</generator>

with this one:

<rights>Copyright <?php echo mysql2date('Y', get_lastpostdate('gmt'), false); ?> <?php bloginfo_rss('name') ?></rights>
<creativeCommons:license>http://creativecommons.org/licenses/by-nc-sa/2.5/</creativeCommons:license>
<generator uri="http://wordpress.org/" version="<?php bloginfo_rss('version'); ?>">WordPress</generator>

The line that contains the license info can be excluded if your feed is not published under a license.

Icon, Logo and Author feed elements

With the author element you can define the author of the feed. You can also use a custom logo and a small icon for your feed. I am not really sure what kind of images are supported as icons. Normally, a 16×16 image should be suitable, but it also depends on the feed reader. For example, I use Liferea and as a feed icon it always fetches the web site’s favicon. If you have more info on this please drop me a line.

Right after the previous feed elements you can append these. Be sure you do not leave any blank lines.

<author>
	<name><?php bloginfo_rss('name') ?></name>
	<uri><?php bloginfo_rss('home'); ?></uri>
	<email>JohnDoe@example.com</email>
</author>
<icon>/images/myicon.ico</icon>
<logo>/images/mylogo.png</logo>

I would strongly advise removing the email element.

Entry elements

Now that we have finished editing the feed elements, we should make the same kind of modifications to the entry elements. I won’t go into much detail here because it would be pointless to repeat the same things. Some notes follow in the next section of this article.

Beginning from line 28, substitute this part:

<entry>
	<author>
		<name><?php the_author() ?></name>
	</author>
	<title type="text/html" mode="escaped"><![CDATA[<?php the_title_rss() ?>]]></title>
	<link rel="alternate" type="text/html" href="<?php permalink_single_rss() ?>" />
	<id><?php the_guid(); ?></id>
	<modified><?php echo get_post_time('Y-m-d\TH:i:s\Z', true); ?></modified>
	<issued><?php echo get_post_time('Y-m-d\TH:i:s\Z', true); ?></issued>
	<?php the_category_rss('rdf') ?>
	<summary type="text/plain" mode="escaped"><![CDATA[<?php the_excerpt_rss(); ?>]]></summary>
<?php if ( !get_settings('rss_use_excerpt') ) : ?>
	<content type="<?php bloginfo('html_type'); ?>" mode="escaped" xml:base="<?php permalink_single_rss() ?>"><![CDATA[<?php the_content('', 0, '') ?>]]></content>
<?php endif; ?>

with this one:

<entry>
	<id><?php the_guid(); ?></id>
	<title type="html"><?php the_title_rss() ?></title>
	<link rel="alternate" type="text/html" hreflang="<?php echo get_option('rss_language'); ?>" href="<?php permalink_single_rss() ?>" />
	<link rel="related" type="application/rss+xml" href="<?php echo comments_rss(); ?>" />
	<author>
		<name><?php the_author() ?></name>
		<uri><?php the_author_url(); ?></uri>
	</author>
	<updated><?php echo mysql2date('Y-m-d\TH:i:s\Z', $post->post_modified_gmt, false); ?></updated>
	<published><?php echo get_post_time('Y-m-d\TH:i:s\Z', true); ?></published>
	<?php the_category_rss('rdf') ?>
	<rights>Copyright <?php echo get_post_time('Y', true); ?> <?php the_author() ?></rights>
	<creativeCommons:license>http://creativecommons.org/licenses/by-nc-sa/2.5/</creativeCommons:license>
	<summary type="xhtml">
		<div xmlns="http://www.w3.org/1999/xhtml">
			<?php the_excerpt_rss(); ?>
			<br /><br />
			(<a href="<?php comments_link(); ?>">Comments</a>)
		</div>
	</summary>
<?php if ( !get_settings('rss_use_excerpt') ) : ?>
	<content type="xhtml">
		<div xmlns="http://www.w3.org/1999/xhtml">
			<?php the_content('', 0, '') ?>
			<br /><br />
			(<a href="<?php comments_link(); ?>">Comments</a>)
		</div>
	</content>
<?php endif; ?>

If you want to include the author’s email address, add the following line inside the author entry element:

<email><?php the_author_email(); ?></email>

But I wouldn’t recommend it.

Also, make sure you edit or remove the creativeCommons:license and the rights entry elements according to your copyright and license info. The above code adds the post’s author as the copyright holder and a Creative Commons (BY-NC-SA) 2.5 license as an example.

The above code adds a hyperlink to the post’s comments inside the post’s summary or its full content, whichever is shown. Feel free to add your own code or remove it.

Finally, the post’s comments RSS 2.0 feed has been added as a related link to the entry.
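
A note on the title element in the code above: with type="html", Atom expects any markup inside the element to be escaped, so that the feed itself stays well-formed XML. Whether the_title_rss() already does this for every possible title is not verified in this article, so the following is only a standalone sketch of the escaping rule. atom_html_text() is a hypothetical helper written for this example, not a WordPress function.

```php
<?php
// Hypothetical helper (not part of WordPress): wraps an HTML fragment in an
// Atom text construct of type "html", escaping the markup for XML.
function atom_html_text(string $element, string $html): string
{
    $escaped = htmlspecialchars($html, ENT_QUOTES | ENT_XML1, 'UTF-8');
    return sprintf('<%1$s type="html">%2$s</%1$s>', $element, $escaped);
}

$title = atom_html_text('title', 'A <em>quick</em> AWstats guide');
echo $title, "\n";
// <title type="html">A &lt;em&gt;quick&lt;/em&gt; AWstats guide</title>
```

With type="xhtml", as used for the summary and content elements, the markup is not escaped; it is wrapped in a div that declares the XHTML namespace instead.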

Some notes about the entry elements

There is one element that did not exist in the feed section: the published entry element. This makes a total of two entry elements that contain a timestamp (date and time), updated and published. By default, WordPress puts the date and time at which a post was published into both of these elements. Usually, this is not what one would want.

The above code puts the date and time the post was last modified into the updated entry element and the date and time the post was published into the published entry element. If this is not what you want, substitute the contents of the updated entry element with those of the published entry element. This is the WordPress default.

Another thing you should know is that you can add more than one author entry element, and that you can add one or more contributor entry elements, if there are any. The syntax of the contributor entry element is the same as that of the author element.

Further Enhancements

There are some RSS 2.0 modules you can use in an Atom feed. A list of them exists here. Some of them are very interesting and maybe you would like to implement them in your feed. Keep in mind that you have to declare each module’s namespace as we did for the Creative Commons License module.

Validate your Atom feed

You can check if your Atom 1.0 feed is valid at the following address:
http://validator.w3.org/feed/check.cgi

A list of the RSS namespaces the validator recognizes can be found at the following address:
http://feedvalidator.org/docs/howto/declare_namespaces.html
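
Before submitting the feed to the online validator, a rough local check can catch missing mandatory elements. The sketch below assumes PHP's DOM extension; missing_atom_elements() is a hypothetical helper and the sample feed is hand-written for illustration. It only checks that the feed and each entry carry the id, title and updated elements that Atom 1.0 requires, so it is no substitute for the validator.

```php
<?php
// Hypothetical helper: lists mandatory Atom 1.0 elements (id, title, updated)
// that are missing from the feed or from any of its entries.
function missing_atom_elements(string $xml): array
{
    libxml_use_internal_errors(true); // collect parse errors silently
    $doc = new DOMDocument();
    if (!$doc->loadXML($xml)) {
        return ['feed is not well-formed XML'];
    }

    $xpath = new DOMXPath($doc);
    $xpath->registerNamespace('a', 'http://www.w3.org/2005/Atom');

    $required = ['id', 'title', 'updated'];
    $missing = [];

    // Check the feed-level elements.
    foreach ($required as $name) {
        if ($xpath->query("/a:feed/a:$name")->length === 0) {
            $missing[] = "feed/$name";
        }
    }

    // Check the same elements inside every entry.
    $position = 0;
    foreach ($xpath->query('/a:feed/a:entry') as $entry) {
        $position++;
        foreach ($required as $name) {
            if ($xpath->query("a:$name", $entry)->length === 0) {
                $missing[] = "entry[$position]/$name";
            }
        }
    }
    return $missing;
}

// Hand-written sample: the single entry lacks its updated element.
$sample = <<<'XML'
<feed xmlns="http://www.w3.org/2005/Atom">
  <id>http://example.com/feed/atom/</id>
  <title>Example</title>
  <updated>2005-12-09T10:00:00Z</updated>
  <entry>
    <id>http://example.com/?p=1</id>
    <title type="html">First post</title>
  </entry>
</feed>
XML;

$report = missing_atom_elements($sample);
echo implode("\n", $report), "\n"; // entry[1]/updated
```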

Further Reading

These have been my sources of information for this article and I strongly recommend them for reading:

  1. Atom Syndication Format
  2. Atom Syndication Format (in depth)
  3. "Moving from Atom 0.3 to 1.0" by Rakaz
  4. List of RSS 2.0 modules that can be used in an Atom feed
  5. Relevant "ticket" at the WordPress Trac
  6. WordPress Template Tags at the Codex
  7. Atom and RSS feed Validator

Final Notes

The code that I provide in this article substituting the original default WordPress code is under the GPL. The rest of the article’s license info is stated below.

This is not an official fix, but rather an approach. If you find any typos, please report them ASAP.

Design the perfect Atom feed for WordPress by George Notaras, unless otherwise expressly stated, is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright © 2005 - Some Rights Reserved

12 responses on “Design the perfect Atom feed for WordPress”

  1. George

    George –

    This is off the Atom topic and to do with your RSS feed.

    What do you think the problem was that caused your 4th of december post to have this title:

    A quick AWstats guide

    in Google’s blog search?

    You can see it with this search [GNot: search link removed]

    Interesting problem.

    – George

  2. George Notaras (post author)

    I have noticed that too. This problem with the google blogsearch exists because I had set the title entry element to type "xhtml" and so I had to declare the div tag’s XML namespace, like I have done with the summary and the content elements.

    In the atom 1.0 specification it is clearly stated that the title element can be set to type "xhtml", but google does not seem to support it. Anyway, the code I provide above sets the entry’s title element to type "html", which should not produce such a problem.

    What you see on google are the results of my experiments with the Atom feed. I guess that setting the entry’s title element to type "xhtml" was a bad moment. But, that’s life ;-)

  3. George Notaras (post author)

    The code above was edited.
    I removed the "xhtml" type from the rights element too, just to be on the safe side.

    George, thanks for your feedback.

    UPDATE [December 9th, 2005]
    In the above code only the following elements are set to type "xhtml":
    subtitle : subelement of the feed element
    summary : subelement of the entry element
    content : subelement of the entry element

    This should not produce any problems with Google or with any other feed reader or engine or bot. At least, I haven’t encountered any.

    The problem George indicated in the previous comment does not exist any more. The Google blogsearch lists the titles of my posts correctly.

  4. George Notaras (post author)

    Tomjack: Hello, thanks for your good words, but let’s try to keep comments on-topic and things in-tact. You may use my email address if your comment is just a “hello” message. Thanks in advance for your understanding :-)

  5. Boris

    Thanks for this little tutorial, I’m just on my way to get my WordPress Atom Feed to 1.0. But I have a problem to get my time stamp (in <updated>) valid.

    If I use it as comes with Your code, the Feed Validator at W3C declares the date-time expression as invalid according to RFC-3339.

    The given code mysql2date('Y-m-dTH:i:sZ' ... produces this date-time expression in my feed:

    2006-03-27CEST21:03:107200

    I tried to get around with the help of the validator’s help (ehm…) in some ways, but I’m rather confused now… Maybe You have some idea?

  6. George Notaras (post author)

    Hi Boris,
    You are right. The references of the form mysql2date('Y-m-dTH:i:sZ' ... should be:

    mysql2date('Y-m-d\TH:i:s\Z' ...

    T and Z need to be escaped, otherwise they are translated to the current timezone and the difference from UTC time in seconds respectively. These are not needed because we use the UTC time in the timestamp.

    I wrote this article in WP 1.5.2 and I am pretty sure it showed the backslashes correctly inside the pre HTML tags. Now I need to put 2 backslashes, so that one is shown in the article. WP really pisses me off sometimes… or this was wrong from the beginning and neither I nor any other reader had noticed it. ;-)

    Thanks for pointing this out!

  7. Boris

    Thanks George, it works now!
    While trying myself I escaped the ‘T’ and somehow lost the ‘H’… grrr…
    Now WP 2.0.2 seems to be completely valid.

  8. Ahmad Gharbeia

    In modifications starting from line 20 (of wp-atom.php v.1.5.2):


    <link rel="alternate" type="text/html" hreflang="<?php echo get_option('rss_language'); ?>" href="<?php bloginfo_rss('home') ?>" />

    If you replace type="text/html" with the function call type="<?php bloginfo('html_type'); ?>", you make sure that the blog’s actual content type is always used, even if it ever gets changed, instead of hard-coding it.

    (posting again with escaped code, was stripped)
