Not such a killer, perhaps

tubes by Hey Paul
Why article tagging doesn’t work:
[Via Bench Marks]

Reading William Gunn’s recent blog posting, Could this be the Science Social Networking killer app? got me thinking more about the many online scientific reference list repositories like Connotea, CiteULike and 2Collab, and why they are failing to catch on. William is suggesting a Pandora-like system of expert reviewers tagging papers to set up a recommendation system. I’m not sure this would be really helpful–what you get from a scientific paper is very different from what you get from listening to a song, and their interconnectedness works in very different ways. And it brings to mind the failings of organizing your references by tags.

If you’ve ever dealt with any of these social bookmarking sites, you know how incredibly tedious they are to use. Even for journals like CSH Protocols, where we have buttons on every article to add it directly to these sites, you still end up jumping through hoops, filling out forms, writing summaries, adding tags. You’re on the spot at that moment to come up with a list of tags that will remind you about the content of that paper. As your worldview changes over time, and with it your research priorities, you’re probably going to want to revisit many papers and add additional tags. Even with all this time-consuming work, you still may not have added an appropriate tag to let you find what you want to find at a given moment. Did you add a tag for every method used in the paper? Every conclusion, every subject referenced? That band on the gel in figure 3 that you’re ignoring today might be very important to you tomorrow. How are you going to tag the paper in case you need to find it again?


I totally agree with David. There are two kinds of people in the world – those who make lists and those who don’t. Applying tags to articles works well for the list-makers, but many, many scientists are just too time-deprived to fill in boxes or check off squares.

But the real problem, as David notes, is that in research the semantics change very rapidly. A paper that was really useful for its description of a new cell surface marker may, at a later date, become important for a particular technique. How are you supposed to know beforehand which tags to use?

And in many cases, since the research is at the cutting edge, there are no appropriate tags. So I make one up – call it IL-99. But someone else working on the same protein adds a new tag called EDFWR. How in the world do the tags properly link these papers?
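The mismatch is easy to see in a few lines of Python. This is only a sketch: the two tag sets and the `ALIASES` table are made up for illustration, and the canonical identifier `protein:0001` is hypothetical. The point is that free-form tags share nothing unless someone curates a mapping between them, and at the cutting edge that mapping usually does not exist yet.

```python
# Two researchers tag papers about the same protein with their own labels.
# All tags here are illustrative, not taken from any real tagging site.
paper_1_tags = {"IL-99", "cell surface marker"}
paper_2_tags = {"EDFWR", "flow cytometry"}


def shared_tags(a, b):
    """Return the tags two papers have in common."""
    return a & b


# Hypothetical curated alias table mapping free-form tags to one identifier.
ALIASES = {"IL-99": "protein:0001", "EDFWR": "protein:0001"}


def normalize(tags):
    """Replace any known alias with its canonical identifier."""
    return {ALIASES.get(tag, tag) for tag in tags}


# Raw tags: no overlap, so the papers are never linked.
print(shared_tags(paper_1_tags, paper_2_tags))
# Normalized tags: linked, but only because someone maintained ALIASES.
print(shared_tags(normalize(paper_1_tags), normalize(paper_2_tags)))
```
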

And, finally, no researcher gets any credit for really annotating a paper. Time spent doing this, or recommending a paper, is time they could be spending on getting tenure, getting a grant in, or writing a paper. Where is the payoff to the individual scientist?

Tagging research is not an easy problem to fix. We may all agree that it is worthwhile, but we are a long way from any reasonable solution.

One thought on “Not such a killer, perhaps”

  1. Thanks for your comments, Richard. I agree we haven’t yet arrived at a reasonable solution, for many of the reasons you outline. As I attempted to make clear to David, my proposal was to “add value to scientists’ existing collections of papers, without requiring any work from them in tagging their collections.”

    If you’re familiar with the mechanism by which Pandora recommends songs, you’ll know that the songs are scored according to certain attributes that all songs have, like Major/Minor key tonality, presence of harmony, etc. This is different in two major ways from how tagging usually works. First, various items of content can have various tags, from a fairly unrestricted vocabulary. Second, the tags are usually user-generated. With Pandora, the attributes of the songs are annotated by musical experts, working for Pandora, using a controlled vocabulary. These differences help explain why Pandora works well, whereas a folksonomy approach only sorta works.

    Having spent quite some time trying and testing social media tools for scientists, I’ve become even more jaded than Mr. Crotty himself about their potential. Anything that requires someone to put in even the minimal effort required to tag articles will probably see relatively low uptake. In contrast, a Pandora-like approach would allow a service implementing it to offer value to the user immediately, instead of only after building a large collection/tagging many articles/making a large number of contacts/etc. To get good recommendations from Pandora, all you do is enter a song or artist and then sit back and listen. Likewise, a corpus of papers which had been expertly scored according to a set of attributes such as “uses the technique of …”, “written by a student of …”, and so on could power a recommendation engine requiring little user input other than interesting/not interesting.
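To make the Pandora-like idea in the comment above a little more concrete, here is a minimal sketch of a recommender that needs nothing from the user beyond interesting/not interesting. Everything in it is an assumption for illustration: the attribute names, the paper identifiers and their scores are invented, and cosine similarity is simply one common way to compare attribute vectors, not a description of how Pandora or any existing service works.

```python
from math import sqrt

# Hypothetical controlled vocabulary that expert curators would score against.
ATTRIBUTES = [
    "uses_flow_cytometry",
    "uses_western_blot",
    "studies_cell_surface_markers",
    "from_hypothetical_lab_x",
]

# Hypothetical expert scores, one value per attribute (1 = present, 0 = absent).
PAPERS = {
    "paper_a": [1, 0, 1, 0],
    "paper_b": [1, 0, 1, 1],
    "paper_c": [0, 1, 0, 0],
    "paper_d": [0, 1, 0, 1],
}


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_profile(feedback):
    """Add the attributes of liked papers and subtract those of disliked ones."""
    profile = [0.0] * len(ATTRIBUTES)
    for paper_id, interesting in feedback.items():
        sign = 1.0 if interesting else -1.0
        for i, score in enumerate(PAPERS[paper_id]):
            profile[i] += sign * score
    return profile


def recommend(feedback, top_n=3):
    """Rank papers the user has not rated by similarity to their profile."""
    profile = build_profile(feedback)
    ranked = sorted(
        ((cosine(profile, scores), paper_id)
         for paper_id, scores in PAPERS.items()
         if paper_id not in feedback),
        reverse=True,
    )
    return [paper_id for _, paper_id in ranked[:top_n]]


# The user's only input: one paper marked interesting, one marked not.
print(recommend({"paper_a": True, "paper_c": False}))
```

The user never tags anything here. All of the annotation effort sits in the `PAPERS` table, which is exactly the part that would need paid experts and an agreed vocabulary.
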
