Science 2.0 and beyond
20 Oct
by tanakawho
Have a problem: Build a web resource:
[Via business|bytes|genes|molecules]
Via a post on Hacker News I ran into the Tulane School of Medicine Student Portal.
As one of the developers writes on Hacker News
Our goal is ‘making med school easier, one less click at a time’. We have no business model, just trying to make our own lives easier.
There is further description on the site
Hello, and welcome to the Tulane University School of Medicine’s Student Portal! This website came into existence over the course of the latter part of the 2007-2008 school year through the hard and volunteered work of a group of students concerned with making the lives of TUSOM medical students a bit easier. Our university community is a dynamic place with much do, and many resources to use on a daily basis. In an effort to reduce the amount of time and energy required to accomplish these activities, we developed this site. Our work has been led, and graciously supported by the Medical Student Government, the Office of Medical education, and the Deans.
This site was developed by students, and for students. As such, if there is a feature that you feel would benefit the community as a whole, please feel free to drop us a line at tmedweb@tulane.edu. We’ll take your suggestion into serious consideration, and see if it within our abilities to accomplish.
That’s the kind of initiative that one loves to see. With cheap hosting, good web frameworks, and people increasingly looking to the web for information, this will not be the last time you see something like this either. The key is realizing that often, if you have a problem, you can solve it yourself, and relatively inexpensively. Sometimes you can build a business out of it, a la 37signals.
They (or at least Niels Olson) have some pretty ambitious future plans as well.
Online tools are not only cheap, they are also mature enough that it no longer requires a CS degree and years of experience to put something together. Coupled with Open Source, an individual can create pretty complex sites.
A similar attitude is beginning to invade other areas, such as biotech. So many of the support functions (DNA synthesis, sequencing, etc.) are now available from standalone companies that we are almost at the point of ‘garage biotech’, where a very substantial amount of work can be done through contract for much less than the cost of creating the infrastructure yourself.
This will never be at the same level as IT sorts of processes but it can have a profound effect on the cost of doing the biotech business. Add in IT tools to facilitate rapid information dispersal and a small lab can accomplish things that required a much larger facility just 5 years ago.
Technorati Tags: Open Access, Web 2.0
16 Oct
by Marcin Wichary
An experiment in open access publishing:
[Via Bench Marks]
The new edition of Essentials of Glycobiology, “the largest, most authoritative volume available on the structure, synthesis, and biology of glycans (sugar chains), molecules that coat cell surfaces and proteins and play important roles in many normal and disease processes” came out yesterday. What’s particularly interesting about this edition is that it is simultaneously being released online in a freely accessible version, which will hopefully allow the textbook to reach a wider audience.
The theory often espoused is that online release of books leads to higher sales of the print edition, and for us, this is a good test case. Quoting from the press release, John Inglis, Executive Director and Publisher of CSHL Press notes that,
“We will be tracking its usage and how readers of the site respond to the availability of a print version, for both research and teaching purposes.”
“This is an innovative development in the distribution of an established textbook that we hope will benefit readers, authors and editors, and the publisher,” says Ajit Varki, M.D., the book’s executive editor and a leader of the Consortium of Glycobiology Editors, which initiated the project. Varki is Professor at the University of California, San Diego. The Consortium also includes Professors Richard Cummings, Emory University; Jeffrey Esko, UC San Diego; Hudson Freeze, Burnham Institute for Medical Research; Pamela Stanley, Albert Einstein College of Medicine, New York; Carolyn Bertozzi, UC Berkeley; Gerald Hart, Johns Hopkins University School of Medicine; and Marilynn Etzler, UC Davis.
The online edition of Essentials of Glycobiology can be found here, and the print version can be ordered here.
This is a very interesting experiment. I know that there are books I want to have on hand so I can access important data when I am not online, usually when I am writing. Being online can be distracting then.
But sometimes when I am online, I want a quick fact. Then finding it in an authoritative source is really important. I personally think that this sort of dual use could be very productive. It has been successful for some fiction works.
I too will be looking to see how well this works.
Technorati Tags: Open Access, Science, Web 2.0
15 Oct
by Corey Leopold
Knowledge wants to be connected:
[Via Science Commons]
That was the core message of a speech by Science Commons’ John Wilbanks at the Open Access and Research Conference 2008 a few weeks ago in Brisbane, Australia. The conference was an opportunity both to celebrate Australia’s burgeoning leadership in harnessing open approaches for advancing science and scholarship, and to talk about where the global open access (OA) movement is headed.
Thanks to the Web, we can gain knowledge about a meeting happening thousands of miles away. Then we can read what others thought of the meeting.
Here’s an excerpt from an article by the Australian Broadcasting Corporation’s Anna Sellah on the speech, which provides a succinct summary of the reasons why open approaches are vital for deriving value from the vast amounts of scientific data being produced:
“The value of any individual piece of knowledge is about the value of any individual piece of lego,” Wilbanks said in a keynote address to the Open Access and Research Conference held in Brisbane last week.
“It’s not that much until you put it together with other legos.”
He says the ability to connect knowledge brings scientific revolutions. For example Watson and Crick’s breakthrough on the structure of DNA involved them reading all the scientific papers on nucleotide bonding and encoding it in the form of a physical model, says Wilbanks.
But this kind of “human scale” analysis is no longer feasible in an age when automated laboratory processes generate vast amounts of information faster than the human mind can process it.
“For example, we have 45,000 papers about one protein or one gene,” says Wilbanks.
He says a scientist might once have analysed the impact of one drug on one gene, but now pipetting robots are capable of analysing 25,000 genes at a time.
“Most of the research says the smartest of us can handle five or six independent variables at once – not 25,000,” he says
You can read the full piece at the ABC website.
Those of you following news of the conference and developments in Australia may also be interested in Open Oz and Doing things with data, two posts by OA leader Dr. Alma Swan, who was also a keynote speaker at the event.
Social networks evolved to deal with large problems containing many variables (e.g., “What signs are present indicating that it’s safe to plant?”). If we can have large groups examine the problem, many more variables can be looked at. One question would be: does the number of variables a network can handle increase linearly with network size, or exponentially?
Technorati Tags: Social media, Web 2.0
15 Oct
Many Eyes = Many Brains:
[Via The Scholarly Kitchen]
Socially networked data visualization becomes a reality with Many Eyes.
[More]
I’ve seen this before and it is a great idea. Let’s make data visualization open and allow social networking approaches to help come up with new ways to look at data.
Visit and have some fun.
Technorati Tags: Social media, Web 2.0
14 Oct
by Snap®
Current Biomedical Publication System: A Distorted View of the Reality of Scientific Data?:
[Via Scholarship 2.0: An Idea Whose Time Has Come]
Why Current Publication Practices May Distort Science
Young NS, Ioannidis JPA, Al-Ubaydli O
PLoS Medicine Vol. 5, No. 10, e201 / October 7 2008
[doi:10.1371/journal.pmed.0050201]
Summary
The current system of publication in biomedical research provides a distorted view of the reality of scientific data that are generated in the laboratory and clinic. This system can be studied by applying principles from the field of economics. The “winner’s curse,” a more general statement of publication bias, suggests that the small proportion of results chosen for publication are unrepresentative of scientists’ repeated samplings of the real world.
The self-correcting mechanism in science is retarded by the extreme imbalance between the abundance of supply (the output of basic science laboratories and clinical investigations) and the increasingly limited venues for publication (journals with sufficiently high impact). This system would be expected intrinsically to lead to the misallocation of resources. The scarcity of available outlets is artificial, based on the costs of printing in an electronic age and a belief that selectivity is equivalent to quality.
Science is subject to great uncertainty: we cannot be confident now which efforts will ultimately yield worthwhile achievements. However, the current system abdicates to a small number of intermediates an authoritative prescience to anticipate a highly unpredictable future. In considering society’s expectations and our own goals as scientists, we believe that there is a moral imperative to reconsider how scientific data are judged and disseminated.
Full Text Available At:
[http://medicine.plosjournals.org/perlserv/?request=get-document&doi=10.1371/journal.pmed.0050201]
[More]
There are also several other article links at Scholarship 2.0 that are important to read. In particular, there is a discussion of the Winner’s Curse. This is the observation that the winner of an auction in which all the bidders have similar information will usually have overbid. The Winner’s Curse was first observed among oil companies bidding on offshore leases.
These authors make the point that publication in scientific journals may also suffer from a bias that resembles the Winner’s Curse. The Winner in an auction presents a price beyond the mean in order to succeed. The authors argue that in a similar way, papers often present data beyond the mean in order to get published.
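To make the auction half of that analogy concrete, here is a minimal simulation sketch. The setup is my own assumption, not anything from the paper: every bidder sees the same true value plus independent, normally distributed noise, and naively bids their estimate. The function names and the numbers (a true value of 100, noise with a standard deviation of 20) are arbitrary, illustrative choices.

```python
import random
import statistics


def winning_bid(true_value, n_bidders, noise_sd, rng):
    """Run one auction: every bidder bids a noisy estimate of the same value.

    Assumption for illustration: bidders bid exactly what they estimate.
    """
    estimates = [rng.gauss(true_value, noise_sd) for _ in range(n_bidders)]
    return max(estimates)  # the highest estimate wins the auction


def average_winning_bid(true_value=100.0, n_bidders=10, noise_sd=20.0,
                        n_auctions=5000, seed=0):
    """Average the winning bid over many simulated auctions."""
    rng = random.Random(seed)  # fixed seed so reruns give the same answer
    bids = [winning_bid(true_value, n_bidders, noise_sd, rng)
            for _ in range(n_auctions)]
    return statistics.mean(bids)


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        print(n, "bidders -> average winning bid",
              round(average_winning_bid(n_bidders=n), 1))
```

With a single bidder the average bid sits near the true value, but as bidders are added the average winning bid climbs above it, even though no individual estimate is biased. If you read "bidders" as labs and "winning bid" as the one result extreme enough to get published, the same selection effect is what the authors are worried about.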
It is an intriguing speculation and one that might deserve further examination. The data that do get published may be misleading and this may be a reason why early published clinical trials are often not replicated later.
And they make the point that the huge amount of data being generated has not seen a corresponding increase in the places to publish the data or conclusions based on the data. This introduces greater likelihood that what is actually published does not represent the ‘real value’ of the data.
I expect this work to produce some interesting discussions. I would be surprised if it is entirely true but it does propose some changes that might be worth implementing.
Technorati Tags: Open Access, Science
1 Oct
by *Micky
Zotero facing a lawsuit:
[Via Bench Marks]
I’ve written about Zotero before, it’s an intriguing tool, essentially a Firefox plug-in for managing your reference list and other pieces of information. It’s a bit of a hybrid between online management tools like Connotea and things like Papers which you store on your own computer.
The bad news is that Thomson Reuters, the manufacturers of EndNote, are suing George Mason University and the Commonwealth of Virginia because a new version of Zotero lets you take your EndNote reference lists and convert them for use in Zotero. Yes, this is the same Thomson of Thomson ISI, secret gatekeepers of journal impact factors. They really seem to be going out of their way to lose what little goodwill they have left with the scientific community. It will be interesting to see if this reverse engineering for interoperability holds up in court as something that should be prevented.
This is sadly typical. I loved EndNote back in the 90s because it was a great Mac product. Much better for my needs than its competition, Reference Manager, which was much more of a Windows product. Niles Software really listened to what people wanted and added some very useful features, such as linkage of the library to a Word document. Then you could put the citation directly into Word.
I convinced others at my company to buy it. I had searches for a wide variety of topics. The purchase of Niles Software by ISI (now part of Thomson) started a period of fitful Mac updates and costly upgrades. I have since moved to other applications (most recently Sente) that did what I wanted for a more reasonable price.
This lawsuit seems like a losing gambit to me since any user can convert their library to EndNote XML that any other application can read. All it will do is drive users away from their software as the customers find new uses for the data.
Because the database really belongs to the user, not to Thomson. But they try to obscure that by using a proprietary format. This hurts the end user. Say I have a 10-year-old EndNote library, with some 8,000 entries, that I forgot to convert, and an old version of EndNote that no longer works in OS X. How am I supposed to move it to what I currently use without having to purchase EndNote simply for this one use? My favorite example of this horrendous process is the Mac cookbook program Mangia!
In the early 90s, this was the best program of its type, bar none. It permitted one to have a huge recipe library, that could be easily displayed, searched and also permitted an easy grocery list to be printed. Many of us love it but it does not work in OS X.
Mangia is no longer produced by anyone. The database created was proprietary and undocumented. The program had no export feature. Now it no longer even runs on any computer. So every user now has a database that they created that is unusable. There are workarounds to try and get to the data but they are not satisfactory. They also require the user to be able to run Mangia, which is really impossible for virtually everyone using a Mac today (I think my mother kept an old Mac around just so she could still use this one program. She has a huge library of recipes).
So all I have on my computer is a dead database of Mangia recipes that can never be used again. All that work over years to create a database and it is useless. This is why people need to be careful when they choose a database application.
Companies that respond to enduser innovation by suing, rather than innovating, are not ones that I see being very successful in the long run. There are other programs that can do the same thing. They are often created by companies that are more responsive and user friendly than larger companies. Suing users just drives people to more open formats.
More from Bench Marks:
More importantly, it’s yet again, a lesson in tying yourself to one locked-down proprietary format for your data and your work tools. If you’ve put a huge amount of time and effort into maintaining your EndNote list and a better tool comes along and becomes the standard, all that work may go to waste and you’ll have to start over again. A similar lesson was learned last week from anyone who purchased music downloads from WalMart. Richard Stallman recently gave a warning along the same lines about the much-hyped concept of “cloud computing”.
As you experiment with new online tools for your research, heed these lessons well. Demand tools that support open standards and open formats, tools where if you put in an effort (and most of these tools demand a lot of effort), you can get that work out again so you don’t have to repeat it for the next tool you try. Further discussion here and here.
This gets at the same topic. Who owns the data? There are some very important and useful aspects to having data in the cloud. It makes it very easy for people to access their data from everywhere. Small groups can have a slew of Web 2.0 applications up and running for their group with little cost for maintenance or upkeep. This has some very real benefits.
But it must be balanced against the possibility that you no longer control the data. Your work is on servers belonging to someone else. They can change ownership and all of a sudden the cloud is not so free. To me, cloud computing is great for things that need rapid prototyping, easy access and are, at heart, ephemeral.
There are many types of data. Some of it is short term. It used to only be found on yellow sheets of paper or perhaps the multiple drafts of a paper. These data fit quite well in the cloud. I have an email address in the cloud that I only use for online purchases. Anything going there is a result of those purchases and does not clog up my real email.
But it is foolhardy for any organization to put the guts of its data anywhere that it does not have absolute control over. These are the things that would have severe ramifications for the business if access to them were lost.
So, echoing David, stay away from anything that ties you into a specific, closed format. It can come back to bite you big time.
Technorati Tags: Open Access, Web 2.0
17 Sep
I’ve mentioned some of the work by Everett Rogers on technology adoption. The bell curve seen refers to the adoption of innovations by a community. But what about individuals? Is there a process whereby they adopt new technology?
Turns out there is. You can read the work by George Beal and Joe Bohlen in 1957. There is a five step path that each individual appears to go through, although some people are slower to transition between steps.
Beal and Bohlen also described what sources of information were used at each stage. Through the first two, mass media and government agencies were most important.
This was really an attempt to get an ‘unbiased’ viewpoint since friends and salesmen (salesmen always came in last) were the next two sources. But for the last 3 stages, neighbors and friends were the largest source of information, more so than any other group.
So, early in the diffusion process, unbiased experts are sought. But when the evaluation process is started, the experiences of close ties within a local social network become the most important. For most people, the opinions and personal experiences of their friends are more important for adoption of an innovation than any external source.

Now the innovators in a community race through these steps. They often are connected to outside groups and use social interactions unavailable to others in the community to more rapidly move through the last 3 steps.
The early adopters take information from the innovators and use their own connections to move through the stages, not as fast as the innovators, but with reasonable speed.
But it is the majority of the community that relies on the early adopters and innovators within the community to inform themselves. Research has shown that they require much more information from trusted sources within the community than innovators and early adopters. Without this information from peers, they will not progress rapidly through the last 3 stages.
The laggards are the slowest to move through the 5 stages. They do not trust most outside sources, so the awareness and interest stages are slowed. Plus they will only listen to certain trusted sources within the community. Until those trusted sources make their own way through the 5 stages, the laggards will not progress.
So, to alter the rate of diffusion of innovation in a community, increased lines of communication must be available, increasing the information that can be provided to individuals. This helps with the first 2 steps, but mostly only for the 16% of the community at the left side of the curve.
However, of greatest importance are the connections between members within the community, particularly the thought-leaders found in the early adopters. About 70% of a community will not adopt new innovations unless they hear clear reasons why, from trusted individuals within the community.
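The 16% and "about 70%" figures line up with the standard category sizes from Rogers's bell curve. Those sizes are not spelled out in this post, so the table below is my addition: the split usually quoted is 2.5% innovators, 13.5% early adopters, 34% early majority, 34% late majority and 16% laggards. Reading "about 70%" as the two majority groups combined is my interpretation. A quick arithmetic check, with hypothetical names:

```python
# Commonly quoted adopter-category shares from Rogers's diffusion curve,
# in percent. These values are an assumption added for illustration;
# the post itself only quotes the 16% and ~70% totals.
ROGERS_SHARES = {
    "innovators": 2.5,
    "early adopters": 13.5,
    "early majority": 34.0,
    "late majority": 34.0,
    "laggards": 16.0,
}


def combined_share(*groups):
    """Add up the percentage of the community in the named groups."""
    return sum(ROGERS_SHARES[group] for group in groups)


if __name__ == "__main__":
    # Left side of the curve: the people helped most by better outside information.
    print("left side:", combined_share("innovators", "early adopters"))
    # The majority: the people who wait to hear from trusted peers.
    print("majority:", combined_share("early majority", "late majority"))
```

The left side of the curve comes to 16% and the two majority groups together to 68%, which is the "about 70%" that waits on trusted voices inside the community.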
No amount of salesmanship or external proof will easily move them. But given the right opinion from a community thought-leader, they will rapidly make the transition.
This is an area where Web 2.0 technologies can be of real value. Not only do they make it easier for members of a community to disperse information, they also help the community more accurately identify who is in each group, permitting more focused, explicit approaches to be used to move individuals through the 5 steps.
The thought-leaders can more rapidly progress through the stages and can extend their opinions much more rapidly to the majority because they are not required to be in the same place at the same time as the others in the group. Thus there will be more opportunities for their viewpoints to be assimilated by the majority.
Increasing the rate of diffusion of innovation in a community really means increasing the speed with which each individual progresses through the 5 steps.
Technorati Tags: Knowledge Creation, Open Access, Science, Web 2.0
2 Sep
by freeparking
London Science Blogging Conference on Friendfeed:
[Via Confessions of a Science Librarian]
Boy, do I ever love Friendfeed.
You can follow what’s going on at today’s London Science Blogging Conference in its very own Friendfeed room. Each session has its own thread with multiple people commenting on the proceedings. It actually gives a very good and surprisingly understandable impression of what’s going on in the sessions. Most of the sessions have dozens of comments. Check it out.
You can also check me out on Friendfeed (join, you won’t regret it). Michael Nielsen has also created a room for the upcoming Science in the 21st Century conference.
As a sort of aggregator of one’s life, FriendFeed can be especially useful for all sorts of ad hoc social meetings, such as conferences. I wonder what the ‘room’ looks like for a really large conference, the 10,000 attending ones? I’ll be sure and check out the Science in the 21st Century room.
Technorati Tags: Open Access, Science, Web 2.0
19 Aug
by mandj98
Mendeley = Mekentosj Papers + Web 2.0 ?:
[Via bioCS]
Via Ricardo Vidal: Mendeley seems to be a Windows (plus Mac/Linux) equivalent of Mekentosj Papers (which is Mac OS X only, and has been described as “iTunes for your papers”). In addition to handling your PDFs, it has an online component that allows sharing your papers and other Web 2.0 features (billing itself as “Last.fm for papers”).
Here, I’m reviewing the Mac beta version (0.5.6). I am focusing most on the desktop side and compare it to Papers, because I have a working solution in place and I would only switch to Mendeley if the experience is as good as with Papers. (I.e., my main problem is off-line papers management, Web 2.0 features are icing on the cake.)
By Mac standards, the app is quite ugly. Both Mendeley and Papers allow full-text PDF searches, which is important if you want to avoid tagging/categorizing all your papers. Papers can show PDFs in the main window, copy the reference of the paper and email papers. Mendeley in principle can also copy the reference, but special characters are transformed to gibberish in this beta version. Papers allows you to match papers against PubMed, Web of Science etc., while Mendeley only offers to auto-extract often incomplete meta-data. This matching feature is extremely useful as you get all the authoritative data from the source, and most often Papers can use the DOI in the PDF to immediately give you the correct reference. Update: Mendeley also uses DOIs to retrieve the correct metadata, if available. (Thanks, Victor for your comment.)
[More]
Well, this is a beta being compared to a product on the market (and Papers is quite a good application). I would expect some of the rough edges to come off as it progresses. What will be interesting to see is how the Web 2.0 aspects turn out. They could provide a route for useful filtering of information as people’s paper databases build up. By having these accessible, it will be much easier to see which papers are really being read and used.
The links between literature libraries, online profiles and readership are potentially very interesting. Something to keep an eye on, particularly as the edges are evened out.
Technorati Tags: Science, Social media, Web 2.0
19 Aug
by sylvar
It has been about a month since Science published Electronic Publication and the Narrowing of Science and Scholarship by James Evans. I’ve waited some time to comment because the results were somewhat nonintuitive, leading to some deeper thinking.
The results seem to indicate that greater access to online journals results in fewer citations. The reasons for this are causing some discussion. Part of what I will maintain is that papers from 15 years ago were loaded with references for two reasons that are no longer relevant today: to demonstrate how hard the author had worked to find relevant information and to help the reader in their searches for information.
Finding information today is too easy for there to be as great a need to include a multitude of similar references.
Many people feel the opposite, that the ease in finding references, via such sites as PubMed, would result in more papers being cited, not fewer. Bench Marks has this to say:
Evans brings up a few possibilities to explain his data. First, that the better search capabilities online have led to a streamlining of the research process, that authors of papers are better able to eliminate unrelated material, that searching online rather than browsing print “facilitates avoidance of older and less relevant literature.” The online environment better enables consensus, “If online researchers can more easily find prevailing opinion, they are more likely to follow it, leading to more citations referencing fewer articles.” The danger here, as Evans points out, is that if consensus is so easily reached and so heavily reinforced, “Findings and ideas that do not become consensus quickly will be forgotten quickly.” And that’s worrisome–we need the outliers, the iconoclasts, those willing to challenge dogma. There’s also a great wealth in the past literature that may end up being ignored, forcing researchers to repeat experiments already done, to reinvent the wheel out of ignorance of papers more than a few years old. I know from experience on the book publishing side of things that getting people to read the classic literature of a field is difficult at best. The keenest scientific minds that I know are all well-versed in the histories of their fields, going back well into the 19th century in some fields. But for most of us, it’s hard to find the time to dig that deeply, and reading a review of a review of a review is easier and more efficient in the moment. But it’s less efficient in the big picture, as not knowing what’s already been proposed and examined can mean years of redundant work.
But this was also true of journals stored in library stacks, before online editions. It was such a pain to use Index Medicus or a review article to find the articles that were really needed (reading a review article has always been the fastest way to get up to speed; that has nothing to do with being online or not). So people would include every damn one they found that was relevant. The time spent finding the reference had to have some payoff.
Also, one would just reuse citations for procedures, adding on to those already used in previous papers. The time spent tracking down those references would be paid back by continuing usage, particularly in the Introduction and Materials & Methods sections. Many times, researchers would have 4 or 5 different articles all saying similar things or using the same technique just to provide evidence of how hard they had worked to find them (“I had to find these damned articles on PCR generated mutagenesis and I am going to make sure I get maximum usage out of them.”)
There are other possible answers for the data that do not mean that Science and Scholarship are narrowing, at least not in a negative sense. A comment at LISNews leads to one possible reason – an artifact of how the publishing world has changed.
The comment takes us to a commentary on the Evans article. While this is behind the subscription wall, there is this relevant paragraph:
One possible explanation for the disparate results in older citations is that Evans’s findings reflect shorter publishing times. “Say I wrote a paper in 2007” that didn’t come out for a year, says Luis Amaral, a physicist working on complex systems at Northwestern University in Evanston, Illinois, whose findings clash with Evans’s. “This paper with a date of 2008 is citing papers from 2005, 2006.” But if the journal publishes the paper the same year it was submitted, 2007, its citations will appear more recent.
[As an aside, when did it become Evans's rather than Evans'? I'd have gotten points off from my English teacher for that. Yet a premier journal like Science now shows that I can use it that way.]
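Amaral's point about publishing times is really just subtraction, and it is easy to check with the years from his own example. A small sketch, with a hypothetical function name of my own:

```python
def apparent_citation_ages(cited_years, publication_year):
    """How old each cited paper looks, measured from the citing paper's date."""
    return [publication_year - year for year in cited_years]


if __name__ == "__main__":
    cited = [2005, 2006]  # the papers cited in Amaral's example
    # Written in 2007 but not published until 2008: a slow journal.
    print("published 2008:", apparent_citation_ages(cited, 2008))
    # Written and published in 2007: a fast journal.
    print("published 2007:", apparent_citation_ages(cited, 2007))
```

The reference list is identical in both cases, yet the citations look a full year younger when the journal publishes quickly. Some of the apparent shift toward recent citations could therefore come from faster publication rather than from any change in what authors read.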
The commentary also mentions work that appears to lead to different conclusions:
Oddly, “our studies show the opposite,” says Carol Tenopir, an information scientist at the University of Tennessee, Knoxville. She and her statistician colleague Donald King of the University of North Carolina, Chapel Hill, have surveyed thousands of scientists over the years for their scholarly reading habits. They found that scientists are reading older articles and reading more broadly–at least one article a year from 23 different journals, compared with 13 journals in the late 1970s. In legal research, too, “people are going further back,” says Dana Neacşu, head of public services at Columbia University’s Law School Library in New York City, who has studied the question.
So scientists are reading more widely and more deeply. They just do not add that reading to their reference lists. Why? Part of it might be human nature. Since it is so much easier to find relevant papers, having a long list no longer demonstrates how hard one worked to find them. Citing 8 articles at a time no longer means much at all.
That is, stating “PCR has been used to create mutations in a gene sequence 23-32” no longer demonstrates the hard work put into gathering those references. It is so easy to find a reference that adding more than a few looks like overkill. That does not mean that the scientists are not reading all those other ones. They still appear to be, and are even reading more; they just may be including only the most relevant ones in their citations.
Two others put the data into a different perspective. Bill Hooker at Open Reading Frame did more than most of us. He actually went exploring in the paper itself and added his own commentary. Let’s look at his response to examining older articles:
The first is that citing more and older references is somehow better — that bit about “anchor[ing] findings deeply into past and present scholarship”. I don’t buy it. Anyone who wants to read deeply into the past of a field can follow the citation trail back from more recent references, and there’s no point cluttering up every paper with every single reference back to Aristotle. As you go further back there are more errors, mistaken models, lack of information, technical difficulties overcome in later work, and so on — and that’s how it’s supposed to work. I’m not saying that it’s not worth reading way back in the archives, or that you don’t sometimes find overlooked ideas or observations there, but I am saying that it’s not something you want to spend most of your time doing.
It is much harder work to determine how relevant a random 10-year-old paper is than one published last month. In the vast majority of cases, particularly in a rapidly advancing field (say, neuroscience), papers that old will be chock full of errors based on inadequate knowledge. This would diminish their usefulness as a reference. In general, new papers will be better to use. I would be curious for someone to examine reference patterns in papers published 15 years ago to see how many of the multitude of citations are actually relevant or even correct.
Finally, one reason to include a lot of references is to help your readers find the needed information without having to do the painful work of digging it out themselves. This is the main reason to include lots of citations.
When I started in research, a good review article was extremely valuable. I could use it to dig out the articles I needed. I loved papers with lots of references, since it made my life easier. This benefit is no longer quite as needed because other approaches are now available to find relevant papers in a much more rapid fashion than just a few years ago.
Bill discusses this, demonstrating that since it is so much easier to find relevant articles today, this need to help readers in THEIR searches is greatly diminished.
OK, suppose you do show that — it’s only a bad thing if you assume that the authors who are citing fewer and more recent articles are somehow ignorant of the earlier work. They’re not: as I said, later work builds on earlier. Evans makes no attempt to demonstrate that there is a break in the citation trail — that these authors who are citing fewer and more recent articles are in any way missing something relevant. Rather, I’d say they’re simply citing what they need to get their point across, and leaving readers who want to cast a wider net to do that for themselves (which, of course, they can do much more rapidly and thoroughly now that they can do it online).
Finally, he really examines the data to see if they actually show what many other reports have encapsulated. What he finds is that the online access is not really equal. Much of it is still commercial and requires payment. He has this to say when examining the difference between commercial online content and Open Access (my emphasis):
What this suggests to me is that the driving force in Evans’ suggested “narrow[ing of] the range of findings and ideas built upon” is not online access per se but in fact commercial access, with its attendant question of who can afford to read what. Evans’ own data indicate that if the online access in question is free of charge, the apparent narrowing effect is significantly reduced or even reversed. Moreover, the commercially available corpus is and has always been much larger than the freely available body of knowledge (for instance, DOAJ currently lists around 3500 journals, approximately 10-15% of the total number of scholarly journals). This indicates that if all of the online access that went into Evans’ model had been free all along, the anti-narrowing effect of Open Access would be considerably amplified.
[See he uses the possessive of Evans the way I was taught. I wish that they would tell me when grammar rules change so I could keep up.]
It will take a lot more work to see if there really is a significant difference in the patterns between Open Access publications and commercial ones. But this give and take that Bill utilizes is exactly how Science progresses. Some data is presented, with a hypothesis. Others critique the hypothesis and do further experiments to determine which is correct. The conclusions from Evans’ paper are still too tentative, in my opinion, and Bill’s criticisms provide ample fodder for further examinations.
Finally, Deepak Singh at BBGM provides an interesting perspective. He gets into one of the main points that I think is rapidly changing much of how we do research. Finding information is so easy today that one can rapidly gather links. This means that even interested amateurs can find information they need, something that was almost impossible before the Web.
The authors fail to realize that for the majority of us, the non-specialists, the web is a treasure trove of knowledge that most either did not have access to before, or had to do too much work to get. Any knowledge that they have is better than what they would have had in the absence of all this information at our fingertips. Could the tools they have to become more efficient and deal with this information glut be improved? Of course, and so will our habits evolve as we learn to deal with information overload.
He further discusses the effects on himself and other researchers:
So what about those who make information their life. Creating it, parsing it, trying to glean additional information to it. As one of those, and having met and known many others, all I can say is that to say that the internet and all this information has made us shallower in our searching is completely off the mark. It’s easy enough to go from A –> B, but the fun part is going from A –> B –> C –> D or even A –> B –> C –> H, which is the fun part of online discovery. I would argue that in looking for citations we can now find citations of increased relevance, rather than rehashing ones that others do, and that’s only part of the story. We have the ability to discover links through our online networks. It’s up to the user to bring some diversity into those networks, and I would wager most of us do that.
So, even if there is something ‘bad’ about scientists having a shallower set of citations in their publications, this is outweighed by the huge positive of easy access for non-scientists. They can now find information that used to be so hard to find that only experts ever read it. The citation list may be shorter but the diversity of the readers could be substantially enlarged.
Finally, Philip Davis at The Scholarly Kitchen may provide the best perspective. He also demonstrates how the Web can obliterate previous routes to disseminate information. After all the to-do about not going far enough back into the past for references, Philip provides not only a link (let’s call it a citation) to a 1965 paper by Derek Price but also a quote:
I am tempted to conclude that a very large fraction of the alleged 35,000 journals now current must be reckoned as merely a distant background noise, and as far from central or strategic in any of the knitted strips from which the cloth of science is woven.
So even forty years ago it was recognized that most publications were just background noise. But what Philip does next is very subtle, since he does not mention it. Follow his link to Price’s paper (which is available on the Web, entitled Networks of Scientific Papers). You can see the references Price had in his paper, a total of 11. But you can also see which papers have used Price’s paper as a reference. Quite a few recent papers have cited this forty-year-old paper. Seems like some people maintain quite a bit of depth in their citations!
And now, thanks to Philip, I will read an interesting paper I would never have read before. So perhaps there will be new avenues to find relevant papers that do not rely on following a reference list back in time. The Web provides new routes that short-circuit this process, but they are not seen if people only follow databases of article references.
In conclusion, the apparent shallowness may only be an artifact of publishing changes, it may reflect a change in the needs of the authors and their readers, it may not correctly factor in differences in online publishing methods, it could be irrelevant and/or it could be flat out wrong. But it is certainly an important work because it will drive further investigations to tease out just what is going on.
It already has, just by following online conversations about it. And to think that these conversations would not have been accessible to many just 5 years ago. The openness displayed here is another of the tremendous advances of online publication.