Thursday, January 01, 2009

More tweaks released today

Today I pushed out some more tweaks to the backend engine that you might not notice.

Blogrevolution tells what's popular by counting up the links that appear on the front pages of the bloggers in its database. Formerly we didn't count links if they pointed to the same site. No self-linking! We've expanded this so that links from a blog to any page on the same domain name are ignored, rather than just links to the same site or subdomain. The reason for this change is that we were seeing a lot of promotional links within the same "family" of sites, such as msnbc blogs or blogs under the firedoglake aegis.
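
For the curious, here is a minimal Python sketch of that rule. It is an illustration, not the site's actual code: the names `base_domain` and `count_links` and the example URLs are made up, and "same domain" is approximated by comparing the last two labels of the hostname, which would be wrong for suffixes like .co.uk.

```python
from collections import Counter
from urllib.parse import urlparse


def base_domain(url):
    """Naive base domain: the last two labels of the hostname (an assumption)."""
    host = (urlparse(url).hostname or "").lower()
    return ".".join(host.split(".")[-2:])


def count_links(front_pages):
    """front_pages maps a blog URL to the list of links found on its front page."""
    counts = Counter()
    for blog_url, links in front_pages.items():
        blog_domain = base_domain(blog_url)
        for link in links:
            # Skip links that stay inside the blog's own family of sites.
            if base_domain(link) != blog_domain:
                counts[link] += 1
    return counts


# Hypothetical data for illustration only.
front_pages = {
    "http://firstread.msnbc.example/": [
        "http://www.msnbc.example/story/1",  # same domain, ignored
        "http://news.example.org/a",
    ],
    "http://someblog.example.net/": ["http://news.example.org/a"],
}
popular = count_links(front_pages)
```

A production version would probably consult a public suffix list rather than counting hostname labels.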

The second backend change is that the quotes we display for each discussion are now tested for similarity to the article description, and rejected if they are too similar. I am considering tightening the rules for rejecting quotes from different blogs that are too similar to each other.
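
One way such a test could look, sketched in Python. The similarity measure (Jaccard overlap of word sets), the 0.6 threshold, and the function names are all assumptions chosen for illustration; the post does not say which measure the site uses.

```python
import re


def word_set(text):
    """Lowercased set of the words in text."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard(a, b):
    """Share of distinct words the two texts have in common, from 0.0 to 1.0."""
    wa, wb = word_set(a), word_set(b)
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def keep_quote(quote, description, threshold=0.6):
    """Reject a quote that mostly repeats the article description."""
    return jaccard(quote, description) < threshold


# Hypothetical texts for illustration only.
description = "Senate passes the budget bill after a long debate"
echo = "Senate passes the budget bill after a long debate today"
comment = "I think this vote will come back to haunt them in November"

print(keep_quote(echo, description))     # the echo repeats the description
print(keep_quote(comment, description))  # the comment adds something new
```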

A more obvious change is the "In earlier" edition section at the bottom of the page. I think that this fills out the page nicely.

Also the "related info" links underneath each story that show similar stories have been retitled "more like this". Hopefully this communicates the intent of the related links section better.

That's all for now! I'll be working on the site some more soon.

Friday, December 26, 2008

Happy Christmas and layout changes

I made some minor layout changes: "supporting links" is now called "discussions"! I can tell you're excited.

Few people seemed to understand what a "supporting link" was, and I was never quite happy with the term anyway. The idea had been that, since blogrevolution builds its page based on the number of recent links each article gets from bloggers, the links back to the blog posts that link to each article should be called "supporting links".

Hopefully "discussions" will communicate that intent better.

I'm going to play around with layout changes that could be a bit more substantial, sometime soon.

Happy Christmas holiday.

Wednesday, October 29, 2008

Bumpy road ahead

For the next few days, 100% of the news is going to be about either McCain, Obama, or both, so news clustering isn't going to be as effective.

Also, around election night people will link to early election results and voting problems in small-town newspaper sites that have poor SEO. Blogrevolution has an easier time parsing sites that have good SEO, so the results may be interesting.

Monday, October 27, 2008

Minor new features: news and tour

... if you're reading this, you probably came from a new sidebar that shows recent site news on the front page!

The site news sidebar will show a few recent posts from the official blog (i.e., this). If, out of some fit of laziness, I don't update the official blog for a while, then the sidebar will automatically hide itself, and will return just as automatically when there is a new post.
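
The hide-and-return behavior can be sketched in a few lines of Python. This is only an illustration under assumed values: the 30-day cutoff, the limit of three posts, and the name `sidebar_posts` are all made up.

```python
from datetime import datetime, timedelta


def sidebar_posts(posts, now, max_age=timedelta(days=30), limit=3):
    """Return the posts to show, or an empty list to hide the sidebar.

    posts is a list of (published, title) tuples.
    """
    recent = sorted(posts, key=lambda post: post[0], reverse=True)[:limit]
    # Hide the sidebar when even the newest post is too old.
    if not recent or now - recent[0][0] > max_age:
        return []
    return recent


posts = [
    (datetime(2008, 10, 25), "New favicon"),
    (datetime(2008, 10, 27), "Minor new features: news and tour"),
]
shown = sidebar_posts(posts, now=datetime(2008, 10, 29))
hidden = sidebar_posts(posts, now=datetime(2009, 3, 1))
```

Because the check runs against the newest post's date, publishing a new post brings the sidebar back without any extra step.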

The other new feature is the site tour. The tour is an attempt to explain many of the site's features. I always thought they were obvious, but when showing the site to new people in person I learned that they often didn't understand what different parts of the site were doing. The curse of knowledge in action! Since I can't show the site to everyone in person, I created a screengrab-heavy page with an explanation of the site's features (and foibles).

Saturday, October 25, 2008

New favicon

Yesterday I came up with a new favicon for the site.

The previous favicon -- actually favicons -- were shrunk-down letters cut from a now very old version of the logo. I couldn't decide which one to use, so I made a favicon for every letter in the name and had the site randomly select one every time the page was rebuilt.

Though not without a certain charm, the old favicons were getting a little too out of step with the new design, so I decided to make a new favicon based on the new logo, more specifically its first letter, b.

My first thought was to have just the first letter with a radial gradient from white to transparent. This would create a cool, unexpected halo effect when the icon is displayed in front of a background color (google's new favicon does this, if you haven't noticed).

So here's my starting point:


Sharper eyes will notice that I'm using a pretty old version of photoshop.

Anyway, shrunk down this looked like this:


And I thought that looked OK. The transparency effect when you selected a page with this icon in the history menu in Safari looked so much cooler that I decided to experiment with that.

My next attempt was to try changing the radial gradient from white to the sky blue color that the site uses as a background.

This ended up looking something like this:

... which didn't really do it for me.

Next I tried reversing the direction of the gradient. The result of this attempt was:

Now we're getting somewhere.

But there's a problem: the little 'b' just looks like a random scribble outside the context of other letters from this font, especially when scaled down this small.

I rasterized the type layer and used the clone stamp tool to make a fake 'b' that looks more recognizable under these conditions, and played around with the gradient scale.

At full size, this attempt looked like this:

A resized version of this is the final version of the favicon.

And that's my favicon story.

Wednesday, October 22, 2008

New feature: quote quality

I've added a new feature to Blogrevolution: quote quality thresholds.

Under each story we show a list of the blogs in Blogrevolution's database that have linked to that particular story. The way Blogrevolution's parser grabbed these quotes wasn't always ideal and often included cruft that wasn't related to the story in question.

To address this I created a statistical model of high- versus low-quality quotes. On the site it now limits the quotes shown to a smaller number of higher-quality ones. The remainder are still shown without context under the residual final list item, as seen below:

The statistical model is a Bayesian model, like all of the other decision-making models that Blogrevolution uses. It's still something of a work in progress, getting quote quality right about 6 times out of 7.

One of the interesting things about a model like this is that it discovers all sorts of unexpected relationships that you can nonetheless be sure of, through the magic of empiricism. For instance, presence of the word 'this' was a predictor of a high-quality quote, while appearance of 'that' was a predictor of a low-quality quote. Some of the others you can guess, like the percentage of all-caps words or use of passive voice verb constructions.
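
To make the idea concrete, here is a toy naive Bayes classifier in Python built on three of the signals mentioned above. It is a sketch, not Blogrevolution's model: the feature set, the 30 percent all-caps cutoff, the class name, and the four training quotes are all invented for illustration.

```python
import math
from collections import defaultdict


def features(quote):
    """Turn a quote into a few yes/no features."""
    words = quote.split()
    lowered = {w.lower().strip(".,!?") for w in words}
    caps = sum(1 for w in words if len(w) > 1 and w.isupper())
    return {
        "has_this": "this" in lowered,
        "has_that": "that" in lowered,
        "shouty": bool(words) and caps / len(words) > 0.3,
    }


class NaiveBayes:
    def __init__(self):
        self.label_counts = defaultdict(int)
        self.true_counts = defaultdict(lambda: defaultdict(int))

    def train(self, quote, label):
        self.label_counts[label] += 1
        for name, value in features(quote).items():
            if value:
                self.true_counts[label][name] += 1

    def log_score(self, quote, label):
        n = self.label_counts[label]
        score = math.log(n / sum(self.label_counts.values()))
        for name, value in features(quote).items():
            # Laplace smoothing keeps probabilities away from 0 and 1.
            p_true = (self.true_counts[label][name] + 1) / (n + 2)
            score += math.log(p_true if value else 1 - p_true)
        return score

    def classify(self, quote):
        return max(self.label_counts, key=lambda label: self.log_score(quote, label))


# Made-up training examples.
model = NaiveBayes()
model.train("This is the clearest account of the vote so far.", "high")
model.train("I read this report twice and it holds up.", "high")
model.train("CLICK HERE to subscribe NOW", "low")
model.train("Posted by admin at 5pm, comments that follow are closed", "low")

print(model.classify("This chart explains the whole story"))
print(model.classify("BUY NOW, that offer ENDS SOON"))
```

The model never needs to be told why 'this' helps and 'that' hurts; it just counts how often each feature turns up in each class of the training data.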

There is also a threshold on the uniqueness of the quote - if the context isn't unique enough compared to the other quotes, it will get shunted to the residual "also" category of shame at the bottom.

Finally, all of the quotes from the different sites are now listed in the order that the model scores them, rather than in order of recency as they had been previously.
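
The ordering and the uniqueness check fit together naturally, as this Python sketch shows. The scores, the 0.8 similarity cutoff, and the name `arrange` are assumptions for illustration, and the standard library's `difflib.SequenceMatcher` stands in for whatever comparison the site really uses.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Word-level similarity between two quotes, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def arrange(scored_quotes, max_similarity=0.8):
    """Split (score, quote) pairs into quotes to show and an "also" remainder.

    Quotes are visited best score first; a quote too similar to one that is
    already shown goes to the remainder instead.
    """
    shown, also = [], []
    for score, quote in sorted(scored_quotes, reverse=True):
        if any(similarity(quote, kept) > max_similarity for kept in shown):
            also.append(quote)
        else:
            shown.append(quote)
    return shown, also


# Hypothetical scores and quotes.
scored = [
    (0.7, "this is the best summary of the debate yet"),
    (0.9, "this is the best summary of the debate"),
    (0.8, "turnout numbers look wrong to me"),
]
shown, also = arrange(scored)
```

Visiting quotes in score order means that when two quotes are near-duplicates, the higher-scoring one is the one that stays visible.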

A new look

I've pushed live a redesign of the site's main user interface, partially as a result of some collaboration with (read: my pestering for advice from) some friends.

The use of subtle gradients based on the existing color scheme really makes the site seem to glide right off the screen, in contrast to the flatness that a predominantly solid color layout communicates.

The new look looks more professional and more interesting, and perhaps it will enhance the experience of visitors to Blogrevolution.

Tuesday, December 11, 2007

New version up

We've got a new version out today.

About the biggest new feature is the ability to cluster stories by topic.

If you click the new "cluster" link at the top of the page, the stories on the page will be rearranged into groups by topic. The rearrangement is performed entirely using javascript and does not consume an additional page load. This is taken as a preference and on subsequent views the page will be rearranged automatically as soon as it loads.

Other features include improved sentence parsing: the context information provided for related links will be clipped so that 1) not too much information is lost, but 2) the text is otherwise cut at likely sentence boundaries. The result is usually much more readable.
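
A rough Python sketch of this kind of clipping follows. The length limits, the regular expression used to guess sentence ends, and the name `clip` are assumptions for illustration; real sentence detection has to cope with abbreviations and other cases this ignores.

```python
import re

# A period, exclamation mark, or question mark followed by whitespace or the
# end of the text is treated as a likely sentence end.
SENTENCE_END = re.compile(r"[.!?](?=\s|$)")


def clip(text, min_len=40, max_len=120):
    """Cut text at the last likely sentence end between min_len and max_len."""
    if len(text) <= max_len:
        return text
    ends = [m.end() for m in SENTENCE_END.finditer(text)
            if min_len <= m.end() <= max_len]
    if ends:
        return text[:ends[-1]]
    # No usable sentence boundary: fall back to the last word break.
    return text[:max_len].rsplit(" ", 1)[0] + "..."


text = ("The polls close at eight. Early returns favor the incumbent by a "
        "wide margin. Nobody expected the rural counties to report so "
        "quickly this year.")
print(clip(text))
```

The `min_len` floor covers point 1 (not losing too much), and preferring the last sentence end under `max_len` covers point 2.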

We've also included a link so that you can find related stories right from the main page. Previously you needed to click on the "permalink" link for each story in order to see them, which most people may not have realized.

The next few enhancements will likely mostly be related to look and feel.