Articles for Tag: atom

NetNewsWire Drops Microformats Support

:(

Word from NetNewsWire developer Brent Simmons is that the next update to the great OS X feed reader will drop microformats support. Quoting the post:

My favorite thing was deleting the microformats parsing code. Any CFXML-based code should be nuked. (I have no inside knowledge about CFXML being deprecated or anything — but it should be.)

It also means a slight performance boost when opening news items and web pages, as NetNewsWire now doesn’t look for embedded microformats. (It still looks for feeds in web pages, though, of course.)

[Image: NetNewsWire Microformats UI]

Much of software development is a balancing act, and I can’t fault Brent for the choice here, but I also can’t help but feel a bit down about this loss.

I’ll happily admit that the large majority of feed items I see do not contain microformats, but for those that do include events, contact info or related info, this was a great feature. It’s important to note, too, that with RSS / Atom feeds being used for so much more than blog posts, there are some really great use cases for this functionality [a feed of the latest users added to an application being one example that comes to mind].

Some time ago, before this support had been added to NNW, I wrote some simple scripts to do microformat detection and extraction from feed items. Perhaps it’s time to resurrect that old code.

Microformats Hit 2, Entering Maturity

I didn’t want to let today go by without a post acknowledging the 2nd birthday of Microformats.org and the related community.

Though the first year was huge for microformats, the second has seen additional growth in all areas, from format maturity, to the size of the community and the number of sites using the various markup constructs, to greater support from application vendors.

Here’s a recap of a few recent news items and tidbits in case you missed them.

Feeds For All With hAtom -- Part 2: The Code

Recently I introduced the idea of adding an Atom feed to any document you want by using hAtom along with a local ‘proxy’ script to generate feeds for pages on your site that otherwise wouldn’t have them. The post seemed well received, but it didn’t feel complete to me without some code to allow people to quickly try it for themselves. So here’s the inevitable follow-up with an example PHP5 script to show how you can make the hAtom to Atom conversion transparent to a site visitor and add feeds to static pages or pages that otherwise don’t have a more typical Atom feed.

This certainly isn’t a robust script: it doesn’t account for the use of Tidy to manage non-valid or non-XHTML documents, multiple domains or different request methods. It works for my purposes and I left it at that. But the core idea is simple enough that it shouldn’t take long to rewrite it for your needs, or in your language of choice if it’s different than mine.

The PHP Proxy Script


<?php
ini_set('display_errors', '0');
header('Content-type: application/atom+xml');
//
// configuration
$h2axsl = '/home/user/app/hAtom2Atom.xsl'; // path to hatom2atom.xsl
$domain = 'example.com'; // domain with the hatom content
$permalink_stub = 'hatom2atom.php'; // public path to this file
//
// parse request: strip the leading "/", the stub and the "/" that follows it
$requested = $_SERVER['REQUEST_URI'];
$requested = substr_replace($requested, '', 0, strlen($permalink_stub) + 2);
$requested = urldecode($requested); // deal with encoded #
$docurl = 'http://' . $domain . '/' . $requested;
//
// grab file contents (cURL if available, otherwise the URL fopen wrapper)
if (function_exists('curl_init')) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $docurl);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
    $file_contents = curl_exec($ch);
    curl_close($ch);
} else {
    $file_contents = file_get_contents($docurl);
}
//
// pass the file through the transform and send it out
$xsl = new DOMDocument();
$xsl->load($h2axsl);
$inputdom = new DOMDocument();
// skip the transform if the fetch failed or the document is not well-formed XML
if (is_string($file_contents) && $file_contents !== '' && $inputdom->loadXML($file_contents)) {
    $proc = new XSLTProcessor();
    $proc->importStylesheet($xsl);
    $proc->setParameter('', 'source-uri', $docurl);
    $newdom = $proc->transformToDoc($inputdom);
    print $newdom->saveXML();
}
?>

The script can be broken down into the following parts:

  1. Configuration of paths for support files and requests.
  2. Extracting the location of the desired XHTML document that contains the hAtom content. This script is written so that requests made to hatom2atom.php/some/page would output the Atom feed for the document at http://example.com/some/page.
  3. Grabbing the contents of that XHTML document. In my case I’m making an HTTP request to make sure the file is properly built/parsed, though if your files are static without includes you may want to use local file access methods instead.
  4. Performing the XSL transform and outputting the results.

You can copy and paste the above script, or download it as a .zip.
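Step 2 of the breakdown above, turning the request into the address of the source document, is the part most likely to need adjusting, so here is a minimal sketch of it pulled out into a function. The name hatom_source_url and its parameters are hypothetical and not part of the original script, and the sketch assumes the proxy sits at the root of the site, as the script above does.

```php
<?php
// Hypothetical helper (not in the original script): maps a proxy request URI
// such as "/hatom2atom.php/some/page" to the URL of the hAtom source document.
function hatom_source_url($requestUri, $stub, $domain)
{
    // The request is expected to start with "/<stub>/".
    $prefix = '/' . $stub . '/';
    if (strncmp($requestUri, $prefix, strlen($prefix)) !== 0) {
        throw new InvalidArgumentException('Request does not start with ' . $prefix);
    }
    // Drop the prefix, then decode so an encoded "#" (%23) becomes a fragment again.
    $path = urldecode((string) substr($requestUri, strlen($prefix)));
    return 'http://' . $domain . '/' . $path;
}

echo hatom_source_url('/hatom2atom.php/some/page', 'hatom2atom.php', 'example.com'), "\n";
// prints http://example.com/some/page
```

Unlike the fixed-length substr_replace call in the script, this version refuses requests that don’t begin with the expected stub rather than silently trimming the wrong characters.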

The XSL Transform

All the magic really happens via XSL and the trusty hAtom2Atom transform. So download both hAtom2Atom.xsl and uri.xsl and put them in the same directory somewhere on your server. When you do, update the following line in the configuration portion of the script to point to the actual location of hAtom2Atom.xsl:

$h2axsl = '/home/user/app/hAtom2Atom.xsl'; // path to hatom2atom.xsl

Usage & The hAtom Document

Now that the proxy script is in place you should expose the feed to your visitors by including it in the head of the document or as links in the content. For example, if you have some hAtom content in http://example.com/page_with_hatom.html you can add the following link element in the head of that document:

<link rel="alternate" href="/hatom2atom.php/page_with_hatom.html" type="application/atom+xml" title="This Page's Atom feed" />

Any visitor with a browser that picks up on the link element will then identify the page as having a feed. If you want to be more specific about the location of the feed on the page, or you’re dealing with a document with multiple hAtom feeds, you can specify the fragment ID for the root of the feed like so:

<link rel="alternate" href="/hatom2atom.php/page_with_hatom.html%23some_fragment" type="application/atom+xml" title="This Page's Atom feed" />

Notice that I URL-encoded the # (as %23) when using a fragment ID. Many user agents do not send the fragment ID as part of the request, so encoding it makes sure the information reaches the proxy, and the script decodes it from there.
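If you generate these link elements from a template rather than by hand, the encoding can be done for you. The following is only a sketch: hatom_feed_href is a hypothetical helper, and it assumes the proxy is reachable at /hatom2atom.php as in the examples above.

```php
<?php
// Hypothetical helper (not in the original script): builds the href for the
// <link rel="alternate"> element, encoding the fragment so it survives the request.
function hatom_feed_href($page, $fragment = null, $stub = 'hatom2atom.php')
{
    $href = '/' . $stub . '/' . ltrim($page, '/');
    if ($fragment !== null && $fragment !== '') {
        // rawurlencode() turns "#" into "%23"; the proxy's urldecode() reverses it.
        $href .= rawurlencode('#' . $fragment);
    }
    return $href;
}

echo hatom_feed_href('page_with_hatom.html', 'some_fragment'), "\n";
// prints /hatom2atom.php/page_with_hatom.html%23some_fragment
```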

Prettying up the Request

There are lots of ways to manage requests on your server, from MVC-like dispatchers to mod_rewrite to nothing at all. For this script I just added the following rewrite rule to the rules I already had, which pretties the URL up a bit and makes it a bit more universal and future-proof.

RewriteRule ^hatom2atom/(.*)$ hatom2atom.php/$1 [L]

But remember: if you do change the location of the proxy script, change the configuration line or the URL extraction in the script to match.
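As a rough illustration of that adjustment, and assuming that an internal mod_rewrite rewrite leaves $_SERVER['REQUEST_URI'] set to the URL the visitor originally requested (check this on your own server), the stub would change to match the prettier path. The request value below is a stand-in for illustration only.

```php
<?php
// Assumption: REQUEST_URI still holds the visitor-facing path after the rewrite,
// so the stub must match "hatom2atom" rather than "hatom2atom.php".
$permalink_stub = 'hatom2atom'; // was 'hatom2atom.php'

// Stand-in for $_SERVER['REQUEST_URI'] on a rewritten request.
$requested = '/hatom2atom/some/page';

// Same extraction as the proxy script: drop "/", the stub and the following "/".
$requested = substr_replace($requested, '', 0, strlen($permalink_stub) + 2);

echo $requested, "\n"; // prints some/page
```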

The Results

With the script installed and the feed linked up for users to find, you now have Atom feeds for your readers to subscribe to, generated from hAtom markup in a way that’s totally transparent to them. You can see this method in action in pages like:

Feeds For All With hAtom

I previously had tackled the issue of subscribing to documents with embedded hAtom content by writing a script for NetNewsWire that used its ability to run special script subscriptions on the “client” side.

While the script works great, and I’ve got a number of feeds I watch from other sites this way, as a publisher I still longed for a solution that was more “feed”-like, more universal and less technical.

This afternoon I got one big leap closer to a solution I’m happy with.

[Image: Detected feed as seen in Camino]

Instead of offloading the work of parsing the HTML document containing the hAtom content to a client-side application, or relying on a 3rd-party proxy that I have no control over and that may not be expecting a ton of regular traffic, I’ve set up a script on my own server to act as a proxy and turn any hAtom content found at the specified address into more useful Atom content [1]. Through the magic of the link element I can pass the new feed URL off as you would with any other Atom feed, and the whole process is seamless to the user.

This even works for adding feeds to hand edited pages [yes, people still do that!] or pages that otherwise don’t have a database to draw on and build multiple feeds from. You can see it in action now for the version history hAtom feeds here and here and I’ll soon be implementing it on all the non-blog pages on ChunkySoup.net.

For the curious, the screen shot is of the feed detection code soon to be added to Camino.

EDIT: I’ve just uploaded the changes to all of the individual pages on ChunkySoup.net. Look for the feed titled ‘This Page’s Atom Feed’ on pages like this to watch them for changes.

[1] Using the usual suspect: hAtom2Atom.xsl