Monday, April 09, 2007

Related links and RSS engine updates

Recently the related story links shown for each story were returning far too many false positives.

The algorithm we came up with was weighted too heavily towards chronological overlap, that is, what people were linking to at the same time. That approach was fraught with error: our spider can't be everywhere at once, so too often stories came up as similar through sheer coincidence.

Today I reworked the related links algorithm so that it uses a much more robust distance statistic. The related links should now be much more accurate. Hopefully.
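For readers curious what a distance statistic of this kind can look like, here is a minimal sketch. It is an illustration only, not the code running on the site: the post does not say which statistic is used, so Jaccard distance is an assumption, and the names `jaccard_distance`, `related_stories`, `linkers_by_story` and the `max_distance` threshold are all hypothetical. The idea is to compare stories by who links to them rather than by when the links were seen.

```python
def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two sets (1.0 if both are empty)."""
    union = a | b
    if not union:
        return 1.0
    return 1.0 - len(a & b) / len(union)


def related_stories(target, linkers_by_story, max_distance=0.7):
    """Rank other stories by how closely their linker sets match the target's.

    linkers_by_story maps a story id to the set of sources linking to it.
    max_distance is an assumed cutoff; anything farther away is dropped.
    """
    target_linkers = linkers_by_story[target]
    scored = [
        (jaccard_distance(target_linkers, linkers), story)
        for story, linkers in linkers_by_story.items()
        if story != target
    ]
    # Closest stories first; ties are broken by story id.
    return [story for dist, story in sorted(scored) if dist <= max_distance]


# Made-up example data: which blogs link to which story.
linkers_by_story = {
    "story-a": {"blog1", "blog2", "blog3"},
    "story-b": {"blog2", "blog3", "blog4"},
    "story-c": {"blog9"},
}
print(related_stories("story-a", linkers_by_story))  # ['story-b']
```

Because the score depends only on which sources link to each story, two stories that merely happened to be crawled at the same moment no longer look related unless they actually share linkers.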

Internally, we've been working a lot on our RSS and Atom engine, which will improve many aspects of the site once it goes to production.

With just a few more features we will be able to go past our undeclared beta stage, and there will be much rejoicing.
