Monday, January 14, 2008

8 things about blog post tagging, labeling, and taxonomy

Comments continue on the "8 things" meme.

Justin Kestelyn refers to it as Blogtaggate:

Whether a blog-tag = spam is not a topic I would have foreseen in this venue. Just goes to show you that a community has organic attributes - it comprises many mutually balancing interests, such as ego/altruism, privacy/desire to share, and so on. Which is another way of saying: one person's nourishment is another person's poison, and often you don't know which is which until after the fact.

I should send Justin a tweetgift for his comments...well, maybe not.

Dennis Howlett, after reading Kestelyn's and Eddie Awad's thoughts, added some thoughts of his own:

I had not thought of this game as something that would cause so much angst but it opens an interesting question. OraNA is seen as an aggregator for business communications so what the heck are people doing polluting it with fun stuff? This is the kind of reaction I would expect to hear from industry management concerned with workforce productivity....

I’m enough of an optimist to believe that the workplace has not become as Dilbertized as the objectors would have us believe. I certainly don’t see such memes as spam. Instead I see such learning as an opportunity to gain a deeper insight into the people with whom we interact. This becomes incredibly important as the notion of a distributed workforce takes hold and more people are working remotely.

But I think we're at the point where everyone's having fun with this. Doug Burns has written a post and entitled it 8 Useful Technical Posts....

Well, almost everyone. Howard Rogers has replied to my previous post that proposed some possible solutions. Let me recap my solutions (more details here), then I'll post Howard's reply.

  • Leave OraNA as is, but be selective in what you read.

  • Modify OraNA to include opt-in content.

  • Roll your own.

Here is a selection from Rogers' comments:

The problem I have with this post is that all your solutions mean someone has to do something that they needn't have done had not this nonsense been started.

Number 1 is based on a false premise, by the way. It seems to assume that people aren't selective in what they read and they need to start being so. But I already select out a lot of the Apex, Business Intelligence stuff because it is not especially relevant to me. Lots of people do likewise already, too. The real trouble, however, is that you can't skip over a flood. As new spam posts arrive, they drive ones with more useful content down the page until they disappear off it. One of my own posts disappeared off the front page in about 12 minutes, for example. You can't selectively read things which aren't there at all!

2. Modify OraNA: so now Eddie Awad (and let's not forget there are other Oracle blog aggregators out there, too) has to modify his code because of the actions of others? And I see you suggesting that blog owners now have to remember to tag their posts properly. So because about 50 people (and growing, unfortunately) have decided to "play a game", we all have to change the way we do things? Does that seem proportionate or fair to you? It doesn't to me.

3. Roll your own. Yup, everyone's saying I just need to learn how to use an RSS feed reader. As if I don't already! How do you propose I install such a tool on a work PC that has group policies set so that only system admins can install software? How do you think I'll go introducing such a piece of software in an environment that has a SOE that doesn't include one?...

I'd also take issue with your opening sentence here: OraNA got clogged with...things of a non-technical nature. I welcome non-technical blog posts. I make them all the time. I've blogged about the backyard wallabies, Benjamin Britten, a visit to Melbourne, Christchurch, conducting, books, music... you name it. I encourage all technical blog writers to share their non-technical sides whenever they want to. It's not the content that's at issue here. It's that this particular "8 things" content is enclosed in a viral wrapper ("pass it on to 8 others") that was explicitly designed to engineer an exponential growth in such posts. It's the flood of posts that's the issue. It's the fact it's a pyramid scheme that's at issue. It's the fact that it's driving out all other content that's the issue. And it's the fact that those participating in the flood either don't understand the consequences or (which is much worse) don't care about them. The latter in particular have given a giant, collective two-fingers 'up yours' to the Oracle community.

I'll confine my response to the second issue, that of tagging (although I'll note that I use Google Reader, which does not require software installation).

As time goes by, I have become convinced that tagging and labeling is a good policy to follow. When I started my first blog (the Ontario Empoblog) in 2003, it had the same "all-encompassing" spirit that this blog does. As time went on, I determined that I needed to segregate some of the content. Using the technology that I was aware of at the time, this meant that I had to start some entirely new blogs, most notably the Ontario Technoblog (which eventually became the repository for my observations on Oracle, my Motorola Q phone, etc.) and Oppose Traffic Calming Obstructions, a blog that dealt with the most important political issue that we face today. (When will the California presidential primary candidates share their views on speed bumps?)

When I decided to consolidate everything into a single blog, mrontemp, in February 2007, I specifically decided to use Blogger's labeling technology so that people interested (for whatever bizarre reason) in my technical views wouldn't have to read my religious views, and vice versa. As of this morning, my eight (heh) most frequently-used blog labels are as follows:
In addition, I have been able to create labels "on the fly" which have allowed me to address specific issues as needed. The three labels that deserve mention are openworld07, msnphotosserver, and adobenortonzyxelcpu, each of which has allowed me to focus on a specific event or technical issue.

Incidentally, the one thing that I didn't realize back in February is how often my posts overlap multiple labels. Rather than categorizing each post into the most appropriate label, I have tended to apply all applicable labels to a post. Ignoring my "stove top" link collection posts for a moment, this has resulted in some interesting behaviors. Although I haven't measured it, I'd be willing to bet that the vast majority of my "business" posts also have a "technology" label on them. This merely indicates that I'm primarily interested in technology businesses, or in the impacts of technology on business. Whether this detracts from the usefulness of the labels is up to you.
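For what it's worth, measuring that overlap would not take much. Here is a minimal sketch in Python, assuming each post is represented as a set of label names. The sample data and the function name label_overlap are made up for illustration; this is not an export of my actual posts, and it does not use any Blogger API.

```python
def label_overlap(posts, label, other):
    """Return the fraction of posts carrying `label` that also carry `other`."""
    with_label = [labels for labels in posts if label in labels]
    if not with_label:
        # No posts carry the first label, so there is nothing to measure.
        return 0.0
    both = sum(1 for labels in with_label if other in labels)
    return both / len(with_label)


# Hypothetical sample: one set of labels per post.
sample_posts = [
    {"business", "technology"},
    {"business", "technology", "oracle"},
    {"business"},
    {"religion"},
    {"technology", "music"},
]

share = label_overlap(sample_posts, "business", "technology")
print(f"{share:.0%} of business posts are also labeled technology")
```

A share close to 100% would suggest the two labels are nearly redundant for my blog; a lower one would suggest they are still doing separate work.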

Anyway, if the issue were merely the segregation of personal vs. technical posts (which it isn't - see Rogers' comments on the "viral wrapper" nature of the meme), labeling or tagging would be an ideal way to flag certain posts as personal, certain ones as technical, and so forth. As we pour more and more information out there, some type of meta-labeling of the content will soon become essential, and will become the responsible thing to do. In fact, it's possible that untagged/unlabeled content may be ignored, just because we won't have the time to figure out what the content is about.
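To make the opt-in idea concrete, here is a rough sketch in Python of how a reader or an aggregator might keep only the posts that carry a wanted label. The post structure (a dict with "title" and "labels" keys) and the function name opt_in_filter are assumptions for illustration; I have no idea how OraNA or any other aggregator is actually built.

```python
def opt_in_filter(posts, wanted_labels):
    """Keep only the posts that carry at least one of the wanted labels.

    Posts with no labels at all never match, so they are dropped.
    """
    wanted = set(wanted_labels)
    return [post for post in posts if wanted & set(post.get("labels", []))]


# Hypothetical feed entries.
feed = [
    {"title": "Tuning a slow query", "labels": ["oracle", "technology"]},
    {"title": "8 things about me", "labels": ["personal", "meme"]},
    {"title": "Untitled musings", "labels": []},
]

for post in opt_in_filter(feed, ["oracle", "technology"]):
    print(post["title"])
```

Note what happens to the third entry: it is not excluded because anyone objected to it, but simply because it has no label to opt in to. That is the fate I suspect awaits unlabeled content in general.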

As for the taxonomy of posts, this itself needs to be explored further. I do need to read this at some point. And, of course, this gets into how hashtags are defined, etc., etc., etc.

I should note that my taxonomy was initially derived from the way that I used my original account, before I eventually dumped it altogether. (Still have to write about THAT decision.)

