Over at Many2Many (which is an excellent group weblog btw!) there is an intense discussion about tagging and categorization.
The arguments used are wide-ranging and the discussion often strays into the realm of "the future of the web". Here's the latest post at the time of writing. You need to read the rest of the thread to be able to follow it.
Here's my 2 cents however.
From an information management point of view, any kind of categorization is a good thing. Seen in a broad sense, tagging is just another way of categorizing. Although the discussion about the relative value of professionally created tags versus the "total chaos" currently going on at Flickr and del.icio.us (amateur tagging) is very interesting, it seems to bypass what is, in my mind, a more important discussion.
Do we really need it? What would be the value of this sort of thing?
Note that localized, contained tagging (again, like Flickr and del.icio.us) can work rather well, but the idea being discussed seems to be to extend this to just about everything: websites, blogs, etc.
While this seems like a nice idea, what would be the point? Google already finds what I need, if it is text. Adding a category is not going to help much. Think about it. There will be a gazillion websites tagged with "cars" (or "car" or "automobile" or "voiture" or...but that's another part of the discussion).
What use would browsing that category in whatever way be? You'd still end up using some kind of full text search to get your actual results. Whether or not these results are filtered up-front by demanding a certain tag to be present does little to help. It reminds me of all these advanced search-engine startups that promised to semantically analyze entire sites to give you more relevant hits. They all died because their results really weren't that much better.
From my own experience in working with large media content databases I can also tell you that although librarians and documentalists love the idea of having category-based navigation, it sees very little use in real life scenarios. People find it easier to just search for plain text. The same goes again for semantic results (i.e. search for "car" also returns results for things like "Ford", "Citroen", "Bus", "Formula 1" etc.).
Fuzzy searching sees some use in real life, usually for finding possible misspellings (Is it "Khadafi", "Gadhafi", "Ghadhafi" or...?). This could to some extent be applied to unify and sanitize various tag sets, so that "car" and "cars" can be merged at some level, and similar tags can be combined across different sets (one from Flickr and one from del.icio.us, for instance). No ontology or translation table needed, just some simple algorithms. Nice, but of limited use.
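To give an idea of how simple such an algorithm could be, here is a rough Python sketch. It is only an illustration under my own assumptions: the function names, the naive "strip a trailing s" rule and the 0.7 similarity threshold are all made up for this example, and it leans on the standard library's difflib rather than anything Flickr or del.icio.us actually use.

```python
from difflib import SequenceMatcher


def normalize(tag: str) -> str:
    """Lowercase and trim a tag, then drop a trailing plural 's'.

    The plural rule is deliberately naive (an assumption for this sketch):
    it only fires on tags longer than three characters.
    """
    tag = tag.strip().lower()
    if len(tag) > 3 and tag.endswith("s") and not tag.endswith("ss"):
        tag = tag[:-1]
    return tag


def similar(a: str, b: str, threshold: float = 0.7) -> bool:
    """True when two tags look alike according to difflib's ratio."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def merge_tags(tags, threshold: float = 0.7):
    """Group raw tags under the first normalized spelling that resembles them.

    Returns a dict mapping a canonical tag to the raw tags merged into it.
    """
    groups = {}
    for tag in tags:
        key = normalize(tag)
        for canonical in groups:
            if similar(key, canonical, threshold):
                groups[canonical].append(tag)
                break
        else:
            # Nothing close enough yet, so this tag starts its own group.
            groups[key] = [tag]
    return groups
```

Feeding it a mixed list such as "car", "Cars", "Khadafi", "Gadhafi", "Ghadhafi" and "bus" puts the two car tags in one group and the three spellings of the name in another. The threshold is the fiddly part: set it too low and unrelated short tags start merging, set it too high and the misspellings stay apart.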
Of course you could monitor the popularity of tags, which is what some services are already doing (I really like Flickr's method of changing the size of the tag in question), but this still doesn't help for wide-scale deployment. The popular sites will be easy to find through Google anyway (more links etc.). The hard to find ones will fade even further into obscurity. Again, nothing is gained in terms of the retrievability of information.
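For what it's worth, the size-by-popularity display is easy to approximate. The Python sketch below is my own guess at the general idea, not Flickr's actual method: the function name, the pixel range and the linear scaling are all assumptions.

```python
from collections import Counter


def tag_cloud_sizes(tags, min_px: int = 10, max_px: int = 32):
    """Map each tag to a font size that grows linearly with how often it is used."""
    counts = Counter(tags)
    if not counts:
        return {}
    lowest, highest = min(counts.values()), max(counts.values())
    spread = highest - lowest
    sizes = {}
    for tag, count in counts.items():
        # If every tag is equally popular, put them all in the middle of the range.
        scale = (count - lowest) / spread if spread else 0.5
        sizes[tag] = round(min_px + scale * (max_px - min_px))
    return sizes
```

Note that this only ranks tags against each other inside one collection, which is exactly why it works on a contained service and tells you little about the web at large.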
There is one area where tags could see a lot of use, and that is non-text content. Looking at it that way also means that, as a whole, Flickr's tags are more useful than del.icio.us' ones, as del.icio.us refers to websites that usually have more than enough text to be retrievable by Google.
Images, videos and audio files may not have sufficient textual data around them to be indexed properly, so they will benefit from tagging. But again, the value does not come from the tag itself, but from the fact that someone took the time to add metadata to the content. Whatever method is used (commenting could work just as well, for instance), someone will have to do the work in the first place.
One other thing people are talking about is sharing your particular tags with others, or rather, letting others see what you tagged and how. The idea is that you will be able to find relevant sites more easily, because your FOAF network will filter the results for you. This is basically what del.icio.us already does. In fact, Furl does it better, as it also indexes the pages added.
I might be missing something big and important here (please tell me if that is so!), but it seems the whole thing isn't really worth the hype...