Saturday, January 31, 2009

The RSS Clutter

There's a fair amount of news out there. I use Google Reader to capture it all; however, it still can't make sense of the noise.

News and events are viral; they disseminate quickly throughout the internet. Take, for example, what happened today. Google accidentally labeled the whole internet as malware. Below every search result there was a nice warning: "This site may harm your computer." Even Google.com was labeled harmful. Interesting mistake.

While the dust settled and Google fixed the glitch, everyone blogged about it. Everyone. No, everyone. I looked at Google News and there were 407 articles from 89 sources about the story.

As I went through my Google Reader articles for the day, I read about it at least five times. Makes sense; I follow tech blogs. But we can do better. A nice feature for Google Reader would be to analyze the content of my unread items and show them grouped by story. A way for it to say, "this article about Google malware is similar to these four; look at all of them now." Or, one step further, look at your unread items by category: iPhone, Kindle, Google, etc.

This is hard; language is complicated. The fortunate part is that we know much more about the articles than what they say. We know when they were published, what sites and press releases they link to, the main subject (the tags), and the source. Combine these content clues with learning from the users, who can teach the computer by saying "Yes, they're similar" or "No, they're not," and you could build a very good content grouping engine.
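
To make the idea concrete, here is a rough sketch of grouping articles using only those clues: publish time, outbound links, and tags. Everything in it is an assumption for illustration. The Article fields, the 0.6/0.4 weights, the 24-hour window, and the 0.3 threshold are made up, and this is not how Google Reader or Google News works. The user feedback part is left out; in a real system those yes/no answers would be what tunes the weights and threshold.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Article:
    # Hypothetical fields; a real feed item would need parsing to fill these in.
    title: str
    source: str
    published: datetime
    links: frozenset = frozenset()  # URLs the article links to
    tags: frozenset = frozenset()   # subject tags from the feed


def jaccard(a, b):
    """Set overlap: 0.0 when nothing is shared, 1.0 when the sets are identical."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity(x, y, window=timedelta(hours=24)):
    """Score two articles on shared links and tags, but only if published close together."""
    if abs(x.published - y.published) > window:
        return 0.0
    # Assumed weights: shared links count a bit more than shared tags.
    return 0.6 * jaccard(x.links, y.links) + 0.4 * jaccard(x.tags, y.tags)


def group_by_story(articles, threshold=0.3):
    """Greedy grouping: an article joins the first group that holds a similar article."""
    groups = []
    for article in articles:
        for group in groups:
            if any(similarity(article, other) >= threshold for other in group):
                group.append(article)
                break
        else:
            # No existing group matched, so this article starts a new story.
            groups.append([article])
    return groups
```

Calling group_by_story on the day's unread items would return a list of lists, one per story, which a reader could then show as collapsed bundles. The greedy pass is simple but order-dependent; a proper clustering method would likely do better.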

Google News does it on a larger scale with the "All news and articles" link below each result. Let's try and leverage that for Google Reader.
