Thursday, December 28, 2006

The Corner's RSS feed is broken

Here's an interesting fact: the RSS feed of The Corner, the classic right-wing blog, mangles HTML entities.

Here's the content of a recent post:

I didn#39;t realize that the Scott Johnson from the 2002 World Net Daily article who wrote to the State Department about Arafat#39;s responsibility fo... . . .

Normally, if for some reason you are worried that a single quotation mark (or 'apostrophe', as you may know it) isn't allowed at a particular point in HTML or XML markup, you may encode it using the following string of characters:

&#39;

The Corner's blog appears to encode it as follows:

#39;

That will not actually work. Without the leading ampersand it is not an entity at all, so it just shows up as a literal '#39;'.
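To make the difference concrete, here is a minimal Python sketch using the standard library's html module. The two sample strings are taken from the post above; the variable names are only for illustration.

```python
import html

correct = "didn&#39;t"  # ampersand present: a valid numeric character reference
broken = "didn#39;t"    # ampersand missing, as seen in the feed

# html.unescape only decodes sequences that start with '&'
decoded_correct = html.unescape(correct)
decoded_broken = html.unescape(broken)

print(decoded_correct)  # didn't
print(decoded_broken)   # didn#39;t
```

A feed reader does roughly the same thing, which is why the broken form shows up verbatim.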

Whatever content management system the National Review uses, it looks totally custom. Some have noticed the strange format of its permalinks, which in fact appear to be a base64-encoded MD5 or SHA-1 hash. Base64-encoding a hash that is already written out as hexadecimal text adds no new information and only makes the string longer. It also completely throws off the math in the linked article. Maybe they meant to encode the binary hash but accidentally left hexadecimal mode flipped on.
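The length penalty is easy to check. The sketch below assumes MD5 and a made-up input string (the real hash input is unknown), and compares base64 of the raw digest bytes with base64 of the hexadecimal text.

```python
import base64
import hashlib

# Hypothetical input; what the site actually hashes is not known.
digest = hashlib.md5(b"example post id")

raw = digest.digest()          # 16 raw bytes
hex_text = digest.hexdigest()  # the same hash as 32 hex characters

# Base64 of the raw bytes versus base64 of the hex text
b64_of_raw = base64.b64encode(raw).decode("ascii")
b64_of_hex = base64.b64encode(hex_text.encode("ascii")).decode("ascii")

print(len(hex_text))    # 32
print(len(b64_of_raw))  # 24
print(len(b64_of_hex))  # 44
```

So base64 over the binary digest would be shorter than plain hex, while base64 over the hex text is the longest of the three.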

The query parameter in this URL, on the other hand: http://nrd.nationalreview.com/?q=MjAwNjEyMzE= decodes to '20061231', an obvious representation of a date. Maybe it base64-encodes all primary keys? And uses (inefficient) fixed-length char strings as primary keys everywhere?
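Decoding that parameter takes only a couple of lines of Python. Reading the result as a YYYYMMDD date is an assumption based on how it looks, not something the site documents.

```python
import base64
from datetime import datetime

token = "MjAwNjEyMzE="  # the q= value from the URL above

# Base64-decode to bytes, then read the bytes as ASCII text
decoded = base64.b64decode(token).decode("ascii")

# Assumed format: four-digit year, two-digit month, two-digit day
parsed = datetime.strptime(decoded, "%Y%m%d").date()

print(decoded)  # 20061231
print(parsed)   # 2006-12-31
```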
