OddThinking

A blog for odd things and odd thoughts.

On Blog Categories

Categories

WordPress, like much blog software, allows the author to define a number of categories, and to assign each post into one or more of them.

I am not entirely happy with the choice of categories on OddThinking, which has led me to ask the obvious question: “What is the problem that categories are expected to solve?”

What are they good for?

I have argued before that the difference between newsgroups and blogs is that in a newsgroup you subscribe by topic, and in a blog you subscribe by person.

Killfiles in news-readers somewhat blur that distinction, allowing you to avoid reading posts from known plonkers.

Similarly, categories blur the distinction by letting you restrict yourself to the posts on a certain topic. If a blogger has two hobbies, discussing astro-physics and taking cute photographs of kittens, they can elect to categorise each article, and readers can be sure to read all of the articles that interest them, while not bothering to read the ones that don’t.

At least, that is what I think categories are meant for, and what I think categories currently fail to achieve.

Why aren’t they good for it?

There are two flaws, but I think both are surmountable.

A key aspect of the blog is support for RSS feeds. (I’m using RSS in the generic sense, including Atom and the like.) Yet blogs typically offer readers no way to subscribe by category. Until categories become “first-class citizens”, by allowing subscription, I believe that they will never have much weight on the blog.

This has an easy technical fix. I will be considering adding such a patch to OddThinking, as a low priority. I’d be interested to hear whether blog authors who have tried this have found it to be popular. If so, I’d like to see it appear in some of the default and more popular themes, so that the idea becomes familiar to the blog-reading public.
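To make the “easy technical fix” concrete, here is a minimal sketch of a per-category feed in Python. It is not the WordPress patch itself: the POSTS list, the category_feed function and the example.com addresses are all made up for illustration, and a real blog would read its posts from a database.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory posts; a real blog would read these from its database.
POSTS = [
    {"title": "Dark matter notes", "link": "http://example.com/1",
     "categories": ["astro-physics"]},
    {"title": "Kitten in a box", "link": "http://example.com/2",
     "categories": ["kittens"]},
    {"title": "Kitten looks at the moon", "link": "http://example.com/3",
     "categories": ["kittens", "astro-physics"]},
]


def category_feed(posts, category, site_title="Example Blog",
                  site_link="http://example.com/"):
    """Build an RSS 2.0 document containing only the posts tagged `category`."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires a title, link and description on the channel.
    ET.SubElement(channel, "title").text = f"{site_title}: {category}"
    ET.SubElement(channel, "link").text = site_link
    ET.SubElement(channel, "description").text = f"Posts in the category '{category}'"
    for post in posts:
        if category in post["categories"]:
            item = ET.SubElement(channel, "item")
            ET.SubElement(item, "title").text = post["title"]
            ET.SubElement(item, "link").text = post["link"]
            # List every tag on the post, not just the one subscribed to.
            for name in post["categories"]:
                ET.SubElement(item, "category").text = name
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    print(category_feed(POSTS, "kittens"))
```

The only design decision here is that the category name selects the items; everything else is the same feed the blog would already be producing.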

The second flaw is more substantial. Personal blogs tend to be fairly eclectic, and category-tagging tends to be unreliable. In such an environment, categories are better used subtractively, as filters, than additively. Using the example above, an unemotive, but avid, astro-physicist would be better off subscribing to all posts except kitten photos, rather than just subscribing to astro-physics discussions. This is analogous to kill-files, which reject posts that fit recognised patterns, rather than accepting only posts that fit recognised patterns.

After discussing this with Richard A, we imagined a subscription page which provided a list of categories, and allowed you to select “all articles with these category tags, except never any articles with these category tags”. The result would be a URL of an RSS feed ready for your aggregator.
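A rough sketch of how that subscription page might work, in Python. Everything named here is an assumption for illustration: the include and exclude query parameters, the feed_url, parse_feed_url and select_posts functions, and the example.com address. The rule it implements is the one described above: take posts with any of the wanted tags (or all posts if none are named), then drop any post carrying an unwanted tag.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical posts; only the category tags matter for this sketch.
POSTS = [
    {"title": "Dark matter notes", "categories": ["astro-physics"]},
    {"title": "Kitten in a box", "categories": ["kittens"]},
    {"title": "Kitten looks at the moon", "categories": ["kittens", "astro-physics"]},
    {"title": "Why my train was late", "categories": ["rant"]},
]


def feed_url(base, include=(), exclude=()):
    """Encode the reader's choices in a feed URL (parameter names are made up)."""
    params = {}
    if include:
        params["include"] = ",".join(sorted(include))
    if exclude:
        params["exclude"] = ",".join(sorted(exclude))
    query = urlencode(params)
    return f"{base}?{query}" if query else base


def parse_feed_url(url):
    """Recover the include and exclude sets from a URL built by feed_url."""
    query = parse_qs(urlsplit(url).query)
    include = set(query.get("include", [""])[0].split(",")) - {""}
    exclude = set(query.get("exclude", [""])[0].split(",")) - {""}
    return include, exclude


def select_posts(posts, include=(), exclude=()):
    """Keep posts with any included tag (all posts if none are given),
    then drop every post that carries an excluded tag."""
    include, exclude = set(include), set(exclude)
    selected = []
    for post in posts:
        tags = set(post["categories"])
        if include and not (tags & include):
            continue  # reader named wanted tags and this post has none of them
        if tags & exclude:
            continue  # exclusion always wins, kill-file style
        selected.append(post)
    return selected


if __name__ == "__main__":
    url = feed_url("http://example.com/feed", exclude=["kittens"])
    include, exclude = parse_feed_url(url)
    print(url)
    for post in select_posts(POSTS, include, exclude):
        print(post["title"])
```

Letting exclusion win over inclusion is what makes this behave like a kill-file: the post tagged with both kittens and astro-physics is dropped for the unemotive astro-physicist, which is the cautious choice when tagging is unreliable.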

I haven’t done much research to find out if such a technology already exists. It doesn’t appear to be available on many of the blogs that I wish had a higher signal-to-noise ratio.

OddThinking and Categories

Am I using categories in a way that helps my readers? I am not sure, which is what prompted this consideration.

Most of my categories are not about the topic of discussion. I don’t have a category about Editing Documents (despite several posts on or near the topic). I do, inexplicably, have one about hula-hooping though.

I think the reason for this is that I am not attempting to have a blog about a small handful of subjects (contrast that with Gluk, which only contains news for jugglers). Most of my categories are about the style of the writing, e.g. whether it is a rant or geeky or humorous.

Aside: I originally chose to place the categories under the posts, because I think irony works best when you don’t broadcast that it is an attempt at humour before the delivery. However, on the Internet, it is safer to make sure that the irony-blind are made aware of it afterwards. The recent changes in the site’s structure have unfortunately undermined that.

I guess categorising by style serves the same purpose as categorising by topics – people can elect to read items based on their format as much as their content.

However, I am also using the categories with an ulterior motive: to help me see the overall direction of the blog and what sort of stories are preferred, and to make sure I don’t stray too far.

For example, blogs with lots of articles about blogging and being a blogger sometimes seem to disappear up their own tail (to mix a metaphor), and become cliched. Therefore, I want to keep a tight rein on the percentage of blog articles about this blog. (Yes, I see the irony of posting this here, and I am aware I have strayed into the red zone in the past few weeks.)

So, to provide a way of monitoring my drift, I still plan to keep the categories even if I don’t address the flaws that I raised above. I feel guilty, though, about keeping a user-visible blog feature for my sake rather than yours. This article is partly to assuage that guilt by apologising and pleading mitigating circumstances.


Comments

  1. Nice to know I exist in your universe.

  2. Sunny,

    Yes, of course, I believe in you, but only in a philosophical sense. I posit the existence of “Sunny” as a “reader” as an abstract entity of the Platonic Universe.

    I am still trying to understand Descartes well enough to prove that you exist. “I blog, therefore I am read.”

    Hmmm… Now this is a little off-topic, which raises a whole new question of whether blog categories should be modified as the comments come in, or perhaps comments should have their own categories.

  3. Coming from the Groupware world, specifically the Lotus Notes/Domino platform, categories and folders have always been something that was available to users for organizing data, but not required or even very useful. The thing that kills categories and folders is full-text searching, something that Notes has had for over 10 years and Google understands with their Gmail product. As far as blogs are concerned, they may be a bit more useful for new visitors to get an idea of what the owner writes on a given topic, but are not as necessary as the month groupings that everyone has. Search engines have really made the “physical world” type of organizing an out-dated idea.

    That being said, I still use categories on my blog and allow users to subscribe to a few of them via RSS. The Domino platform makes this so easy that I can easily envision giving users the ability to personalize their feeds based on what they want to read. Currently, my category-only RSS feed URLs look like the following:

    http://www.phigsaidwhat.com/Phigmentb/Phigment.nsf/catcontent.rss?openview&restricttocategory=Article

    But they could easily be modified to handle keyword searches or fully boolean search conditions. This might be an example:

    http://www.phigsaidwhat.com/Phigmentb/Phigment.nsf/searchcontent.rss?openagent&restricttosearch=Article%20and%20Domino%20near%20Exchange

    As long as the search is valid, sending back a personalized RSS feed should be a snap to do. The only issue would be performance because each feed would have to be generated on the fly when the user accessed the page, but this is something that Domino does very well. The key is to have an integrated search engine in your platform of choice.

    Sean—

  4. Personally, I think all one needs is a “greatest hits” list — and that’s not even a category, but a simple flat list of your most popular/notable posts.

    “If you’re reading this, you are a low-value demographic”

    http://www.codinghorror.com/blog/archives/000421.html

  5. Sean,

    That’s an insightful comment. Maybe categories should be considered somewhat passé.

    I recently read seven arguments against meta-data by Cory Doctorow (via Jeff Atwood’s article) which supported your thoughts even further.

    I’ve been pondering both of these over the last few days.

    In the end, I couldn’t bring myself to call for the abolition of categories, even here. Perhaps in a blog which remained on topic (I give the example of Gluk, above), it may be feasible. However, for a blog like this one, I think users would rather choose broad categories, like humour or rant, which are not easy to search for.

    I think there is room for both Doctorow and me to be right here. Many of Doctorow’s arguments boil down to the idea that you can’t trust random strangers to mark up meta-data correctly. I think he is right.

    However, within the context of a blog, using past behaviour of the author, you can start to trust their category markups to be better than nothing. I hope that readers will start to trust my categories to that level. (If you are finding them unsatisfactory, I beg you to let me know! Jeff, I will respond to your suggestion soon.)

    “The only issue would be performance because each feed would have to be generated on the fly when the user accessed the page”

    I can see your concern. I am still in two minds about whether RSS can really scale. (Heaps of real-world evidence suggesting that it is working day-to-day on large web-sites is hardly sufficient to change my mind! 😉 )

    One solution here would be to change to a cached query approach. Rather than your clean solution of allowing users to specify the search terms in a URL, have them submit their terms to a web form, store the terms in a database and return an RSS feed with a record identifier.

    Each time a post is added, it would be tested against each of the RSS queries, and the record marked as appropriate.

    Each time a subscriber fetches the RSS details, it would be a simple database lookup to find the feed and the corresponding articles marked as matching the feed.

    This approach would be more appropriate if you had few posts, and many subscribers with similar interests (e.g. lots of them sharing common search terms).
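The cached query approach from the last comment could be sketched roughly as follows, in Python. This is only an illustration under stated assumptions: the FeedCache class and its method names are invented, plain dictionaries stand in for the database, and a “query” is simplified to a set of words that must all appear in the post body.

```python
class FeedCache:
    """Stores subscriber queries and records matching posts as they arrive."""

    def __init__(self):
        self._feed_ids = {}  # frozenset of terms -> feed id (shared by identical queries)
        self._terms = {}     # feed id -> frozenset of terms
        self._matches = {}   # feed id -> titles of matching posts, oldest first

    def subscribe(self, terms):
        """Store the terms and return a record identifier for the feed."""
        key = frozenset(term.lower() for term in terms)
        if key not in self._feed_ids:
            feed_id = len(self._feed_ids) + 1
            self._feed_ids[key] = feed_id
            self._terms[feed_id] = key
            self._matches[feed_id] = []
        return self._feed_ids[key]

    def add_post(self, title, body):
        """Test a new post against every stored query and mark the matches."""
        words = set(body.lower().split())
        for feed_id, terms in self._terms.items():
            if terms <= words:  # every query term appears in the post
                self._matches[feed_id].append(title)

    def fetch(self, feed_id):
        """Serving a feed is a plain lookup; no search runs at request time."""
        return list(self._matches[feed_id])


if __name__ == "__main__":
    cache = FeedCache()
    domino = cache.subscribe(["domino"])
    both = cache.subscribe(["domino", "exchange"])
    cache.add_post("Comparing platforms", "Domino and Exchange compared")
    cache.add_post("More on Domino", "Further notes on Domino")
    print(cache.fetch(domino))
    print(cache.fetch(both))
```

Subscribers who submit the same terms get the same record identifier, which is why the approach suits many subscribers with shared interests. One limitation of this sketch: a new feed only picks up posts added after it was created, so a real version would also need to test existing posts at subscription time.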
