spreadingscience

Science 2.0 and beyond

Archive for the ‘Science’ Category

Liking knol somewhat

drops by eye of einstein

So, after writing about my knol dislikes, let me discuss some of my positive thoughts about knol. Knol is the newest feature from Google.

As I wrote earlier, I do not like some of the non-Web 2.0 features seen in some of the early essays (i.e. lack of any links to other pages, static text, lack of effective conversations). But I do like some of the principles behind this product.

Huge amounts of information are collected inside a person’s head or on their computer. And it is not accessible to anyone else. Getting this tacit information out, making it explicit so that others can use it, is an important goal of many Web 2.0 tools.

By providing singular authorship, knols allow a more ego-driven approach for making the information explicit than Wikipedia does.

That is, Wikipedia also provides a means for moving tacit information into the explicit realm. But, there is no real sense of authorship, nothing to really plant a flag and say I did this, I am providing this to the world.

Knols permit this to happen, which should enlarge the amount of information seen on the web, because there is a plethora of experts who do not want an anonymous reputation built from Wikipedia but might want a renowned one from a knol.

Finding ways to transform tacit information into explicit information is crucial in today’s world. Wikis can do this. Blogs can do this. And so can knols. Knols will not replace other approaches. They provide a new path for the transformation to occur.

Being a scientist, I have a pretty healthy ego (almost a necessity to do research) and I want people to know about the work I do, not simply as an anonymous entry but with my name attached. It is why I want to have my name on a paper that as many people as possible can read. I can then point to it and say This is mine. That is also part of the reason I write a blog. Ego is good.

So, Web 2.0 tools that provide an avenue for ‘named’ publication will find a market.

Plus, a knol entry would probably look a little better to the tenure committee for an academic than a Wikipedia entry or even a blog. The author could use the web stats to demonstrate the impact of the article in ways that current journal articles cannot. It would be a novel approach for dissemination of scientific knowledge.

So, I can see a knol developing into some sort of secondary arena for publication of scientific research, for example. This would be a paradigm-shifting possibility. I will be very interested to see if this avenue is used much by the scientific community.

However, the same problem of filtering still matters. Knols could be a tremendous path to transforming tacit information but there will still be an information glut to deal with. My concerns come more from how these articles are found and dealt with.

In the scientific community, this sort of authorship is now dealt with by peer review and publication in reputable, high impact journals. If a knol is going to provide anything similar, especially when it comes to reputation, it will have to function a little differently than it does now.

I don’t worry strictly about people plagiarizing as much as diverting. An example off the top of my head:

I put something up about my latest, cutting edge research. Perhaps about some research in press but with more detail than normal. Perhaps I include some of the information from my grant requests.

The essay provides a spot for people to read about my work and comment, potentially providing insights and questions that can accelerate my ability to innovate, leading me to new areas of research while providing me with documentation for my performance reviews. Great.

This is what the Web provides that no other medium does. And everyone wins. I get my ego-driven scientific reputation enhanced and the world gets a lot of information made available for others to use.

But my article is written for a scientific audience.

Then someone (say a good science writer) takes that information, rewrites it and uses SEO approaches to make sure that people find that article before mine, creating ad-driven revenue for themselves.

They get very popular (as I hope they would be, since good science writers are important), get found first from search engines and my page views plummet. So then I have to justify why I wasted my time publishing my research.

No one here is doing anything wrong but it becomes more of a zero-sum game now instead of a win-win. I would no longer have any incentive to use this approach to write about my work.

So, at this point, I am still a little worried about how the articles will be found, just how reputation will be determined and how the filtering of information will be accomplished.

But I do think there is a place for this sort of approach. It may not be there yet but having information ‘owned’ by someone will provide new avenues of dissemination not seen with more anonymous approaches.

It just may take a few iterations to get to perfection.

  • 0 Comments
  • Filed under: Science, Web 2.0
  • What scientists are we talking about?

    research by SqueakyMarmot
    Obligatory Reading of the Day: Opening up Scientific Culture [A Blog Around The Clock]:
    [Via ScienceBlogs : Combined Feed]

    Why are so many scientists reluctant to make full use of Web 2.0 applications, social networking sites, blogs, wikis, and commenting capabilities on some online journals?

    Michael Nielsen wrote a very thoughtful essay exploring this question which I hope you read carefully and post comments.

    Michael is really talking about two things - one is pre-publication process, i.e., how to get scientists to find each other and collaborate by using the Web, and the other is the post-publication process, i.e., how to get scientists to make their thoughts and discussions about published works more public.

    Those of you who have been reading me for a while know that I am thinking along some very similar lines. If you combine, for instance, my review of Rainbows End by Vernor Vinge with "On my last scientific paper, I was both a stunt-man and the make-up artist", with "Journal Clubs - think of the future!" and with "The Scientific Paper: past, present and probable future", you will see a similar thread of thinking.

    But, what do you think?

    Michael Nielsen’s essay is well worth reading, since it goes into some detail about the need for openness in science. It has a lot of depth and it is very thought provoking.

    The comments are also very interesting, with an ongoing dialog between skeptics and believers. But a lot of these discussions only examine the barriers and pressures of a very small slice of the researchers in the US.

    The science that is discussed in these essays really only encompasses those scientists in research universities where tenure competition is the fiercest. Take a look at some recent statistics (2006):

    22 million scientists/engineers in US
    18.9 million actually employed
    69.4% work in the business sector
    11.8% work for the government
    8.2% work at 4 year institutions
    9.7% work in the business/industry sector for a non-profit

    This discussion seems to have focused on just a small fraction (but an important one) of the number of scientists who would benefit from these tools. These researchers are funded by grants and are in tenure-track positions at 4 year research universities.

    More scientists work at non-profits. What sorts of pressures are brought to bear there to prevent open collaboration? How different are these pressures from a research university? Those in business might also benefit from these approaches but have another set of barriers. Can they be surmounted?

    This discussion is really important but it also conflates a large number of scientists/engineers who have different needs and pressures. There are 12 million in business who will have different needs than the 1.6 million at research universities.

    How do Web 2.0 approaches impact them differently? Will some be more readily accepting of these tools than others?

    We need to realize that scientists encompass a much larger group than those in tenure track positions at universities.

  • 0 Comments
  • Filed under: Science, Web 2.0
  • Remembering is not enough

    teacher by foundphotoslj
    Why is genetics so difficult for students to learn?:
    [Via Gobbledygook]

    This Sunday morning at the International Congress of Genetics, Tony Griffiths gave an interesting presentation with the above title. He identified 12 possible reasons why students have problems learning genetics. His main argument: students should learn concepts and principles and apply them creatively in novel situations (the research mode). Instead, too many details are often crammed into seminars and textbooks. In other words, students often stay at the lowest level of Bloom’s taxonomy, the remembering of knowledge. The highest level, the creation of new knowledge, is seldom reached, although these skills are of course critical for a successful researcher.

    Andrew Moore from EMBO talked about the teaching of genetics in the classroom. He was concerned that a survey found that molecular evolution (or molecular phylogeny) was taught in not more than 30% of European classrooms. He gave some examples of how principles of genetics can be integrated into high school teaching.

    Wolfgang Nellen explained his successful Science Bridge project of teaching genetics in the classroom, using biology students as teachers. Interestingly, they have not only taught high school students, but also journalists and - priests (German language link here). Politicians were the only group of people that weren’t interested in his offer of a basic science course.

    Teaching is a very specific mode of transferring information, one that has its own paths. It is an attempt to diffuse a lot of information throughout an ad hoc community.

    But it is often decoupled from any social networking, usually having just an authority figure disperse data, with little in the way of conversations. There is little analysis and even less synthesis, just Remembering what is required for the next test.

    Bloom’s taxonomy is a nice measure of an individual’s progress through learning but it is orthogonal to the learning a community undergoes. Most instruction today is geared towards making the individual attain the highest part of the pyramid.

    How does this model change in a world where social networking skills may be more important? What happens to Remembering when Google exists? When information can be so easily retrieved, grading for Remembering seems foolish.

    The methods we use to teach at most centers of higher education are, at heart, based on models first developed over a century ago. It may be that they will have to be greatly altered before some of the real potential of online social networks can be realized.

    Medicine 2.0

    x ray by D’Arcy Norman
    Why Health or Medicine 2.0? [ScienceRoll]:
    [Via The DNA Network]

    While medicine is usually at the forefront of new technology for diagnosis and treatment, the patient-doctor interface has not followed. Perhaps that might change soon.

    Some interesting statistics have recently been published. According to Pharma 2.0:

    99% of physicians are online for personal or professional purposes
    85% of offices have broadband
    83% consider the Internet essential to their practice

    So doctors are online.

    At The Deloitte Center, you will find even more details about the web usage of health consumers. Yes, there will be many more patients who seek health-related information on the web and who want to communicate with their doctors via e-mail or Skype.

    And patients are ready.

    We have tools to work with:

    And we have concepts.

    So it will happen because patients and doctors need to have contact. The question is how long will it take?

  • 0 Comments
  • Filed under: Science, Web 2.0
  • Paper discussions

    conversations by b_d_solis
    Reputation Matters:
    [Via The Scholarly Kitchen]

    A new (and flawed) study reveals that reputation matters. In fact, it’s core to scientific expression.

    While the study may not be definitive, the ability to have a conversation on it helps tremendously. Research usually does not progress in a straight, ascending line. It switches back and forth, sometimes having to retrace its steps in order to find the right path.

    Being able to discuss the results of a paper, what it did right and what it did wrong, is not something that usually has occurred in public. Now it can. I expect there to be more and more such discussions as time goes on.

  • 0 Comments
  • Filed under: Science, Web 2.0
  • Browsing clouds, not papers

    Commentary: Summarizing papers as word clouds:
    [Via Buried Treasure]

    The web provides entirely new avenues for disseminating information and for visualizing it. It can be very time consuming to browse through the literature, even though the most creative research often comes from the intervention of Serendipity (the Wikipedia article lists many examples).

    Lars discusses some interesting numbers and comes up with an intriguing solution.

    For use in presentations on literature mining, I did a back-of-the-envelope calculation of how much time I would be able to spend on each new biomedical paper that is published. Assuming that all papers were indexed in PubMed (which they are not) and that I could read papers 24 hours per day all year around (which I cannot), the result is that I could allocate approximately 50 seconds per paper. This nicely illustrates the point that no one can keep up with the complete biomedical literature.
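The back-of-the-envelope estimate can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the paper count it prints is simply what a 50-second budget implies, not a figure taken from PubMed.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # reading 24 hours a day, all year round


def seconds_per_paper(papers_per_year: float) -> float:
    """Reading time available per paper, given a yearly publication count."""
    return SECONDS_PER_YEAR / papers_per_year


def implied_papers_per_year(seconds_each: float) -> float:
    """Yearly paper count implied by a per-paper time budget."""
    return SECONDS_PER_YEAR / seconds_each


if __name__ == "__main__":
    implied = implied_papers_per_year(50)
    print(f"50 s per paper implies about {implied:,.0f} papers per year")
    print(f"which is about {implied / 365:,.0f} new papers every day")
```

Running the calculation in both directions makes it easy to see how quickly the time per paper shrinks as the yearly output grows.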

    When I discovered Wordle, which can turn any text into a beautiful word cloud, I thus wondered if this visualization method would be useful for summarizing a complete paper as a single figure. To test this, I extracted the complete text of three papers that I coauthored in the NAR database issue 2008. Submitting these to Wordle resulted in three figures, one per paper.

    These sorts of rich figures could be very useful in a scientific setting, where being able to rapidly filter a large number of articles is important.

    However, he does notice that this approach may not work for all articles, unless there are changes made, either in how the articles are written or in the software that creates the visuals.

    …I think a large part of the problem is the splitting of multiwords; for example, “cell cycle” becomes two separate terms “cell” and “cycle”. Another problem is that words from different sections of the paper are mixed, which blurs the messages. These two issues could be solved by 1) detecting multiwords and considering them as single tokens, and 2) sorting the terms according to where in the paper they are mainly used.

    And it would be easy to adapt the visuals to scientific needs and then be able to track if they are actually useful in practice.
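As a rough illustration of the two fixes suggested in the quote, here is a minimal sketch, not Wordle's actual processing. The multiword list, the function names and the section labels are all hypothetical; a real tool would draw multiwords from a domain vocabulary or detect them statistically.

```python
import re
from collections import Counter

# Hypothetical multiword list, used only for illustration.
MULTIWORDS = ["cell cycle", "protein interaction"]


def tokenize(text, multiwords=MULTIWORDS):
    """Lowercase the text and keep each known multiword as a single token."""
    text = text.lower()
    for phrase in multiwords:
        # Join the words of the phrase with an underscore so they survive splitting.
        joined = phrase.replace(" ", "_")
        text = re.sub(r"\b" + re.escape(phrase) + r"\b", joined, text)
    return re.findall(r"[a-z_]+", text)


def terms_by_section(sections):
    """Count terms per section, then assign each term to the section that uses it most."""
    counts = {name: Counter(tokenize(body)) for name, body in sections.items()}
    grouped = {name: [] for name in sections}
    for term in set().union(*counts.values()):
        best = max(sections, key=lambda name: counts[name][term])
        grouped[best].append(term)
    return {name: sorted(terms) for name, terms in grouped.items()}


if __name__ == "__main__":
    paper = {
        "introduction": "The cell cycle is tightly regulated. Cell cycle genes matter.",
        "methods": "We extracted the text and counted every cycle of the pipeline.",
    }
    print(tokenize("Cell cycle control"))
    print(terms_by_section(paper))
```

Grouping terms by the section where they dominate is one simple way to keep the messages of different sections from blurring together; ties go to whichever section appears first.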

  • 0 Comments
  • Filed under: Science, Web 2.0
  • Now we have article 2.0

    ruby on rails by luisvilla*
    I will participate in the Elsevier Article 2.0 Contest:
    [Via Gobbledygook]

    We have been talking a lot about Web 2.0 approaches for scientific papers. Now Elsevier announced an Article 2.0 Contest:

    Demonstrate your best ideas for how scientific research articles should be presented on the web and compete to win great prizes!

    The contest runs from September 1st until December 31st. Elsevier will provide 7,500 full text articles in XML format (through a REST API). The contestant that creates the best article presentation (creativity, value-add, ease of use and quality) will win prizes.

    This is a very interesting contest, and I plan to participate. I do know enough about programming web pages that I can create something useful in four months. My development platform of choice is Ruby on Rails and Rails has great REST support. I will use the next two months before the contest starts to think about the features I want to implement.

    I’m sure that other people are also considering participating in this contest or would like to make suggestions for features. Please contact me by commenting or via Email or FriendFeed. A great opportunity to not only talk about Science 2.0, but actually do something about it.

    While there are not any real rules up yet, this is intriguing. Reformatting a science paper for the Internet. All the information should be there to demonstrate how this new medium can change the way we read articles and disperse information.

    We have already seen a little of this in the way journals published by Highwire Press are able to also contain links to papers published more recently that cite the relevant paper. Take for example this paper by a friend of mine: "ULBPs, human ligands of the NKG2D receptor, stimulate tumor immunity with enhancement by IL-15."
    Scroll to the bottom and there are not only links in the references, which look backwards from the paper, but also citations that look forward, to relevant papers published after this one.

    So Elsevier has an interesting idea. Just a couple of hang-ups, as brought out in the comments to Martin’s post. Who owns the application afterwards? What sorts of rights do the creators have? This could be a case where Elsevier only has to pay $2500 but gets the equivalent of hundreds if not thousands of hours of development work done by a large group of people.

    This works well for Open Source approaches, since the community ‘owns’ the final result. But in this case, it very likely may be Elsevier that owns everything, making the $2500 a very small price to pay indeed.

    This could, in fact, spur an Open Source approach to redefining how papers are presented on the Internet. This is because PLoS presents its papers in downloadable XML format, so the same sort of process that Elsevier is attempting could be done by a community for the entire community’s enrichment.

    And since all of the PLoS papers are Open Access, instead of the limited number that Elsevier decides to choose, we could get a real view of how this medium could boost the transfer of information for scientific papers.

    I wonder what an Open Source approach would look like and how it might differ from a commercial approach?

    *I also wonder what the title of the book actually translates to in Japanese?

    Better than I could say it

    Friday Fun: My discipline is more pure than yours, so there!:
    [Via Confessions of a Science Librarian]

    As usual, XKCD is right on target!

    Part of what I wrote a few days ago is much better illustrated above. It’s funny because it is true.

  • 0 Comments
  • Filed under: Science
  • Two a day

    hard drive platters by oskay
    15 human genomes each week:
    [Via Eureka! Science News - Popular science news]

    The Wellcome Trust Sanger Institute has sequenced the equivalent of 300 human genomes in just over six months. The Institute has just reached the staggering total of 1,000,000,000,000 letters of genetic code that will be read by researchers worldwide, helping them to understand the role of genes in health and disease. Scientists will be able to answer questions unthinkable even a few years ago and human medical genetics will be transformed.

    Some of this is part of the 1000 Genomes Project, an effort to sequence that many human genomes. This will allow us to gain a tremendous amount of insight into just what it is that makes each of us different or the same.

    All this PR really states is that they are now capable of sequencing about 45 billion base pairs of DNA a week. They are not directly applying all of that capability to the human genome. While they, or someone, possibly could, the groups involved with 1000 genomes will take a more statistical approach to speed things up and lower costs.

    It starts with in-depth sequencing of a couple of nuclear families (about 6 people). This will be high resolution sequencing equivalent to 20 passes of the entire genome of each. This level of redundancy will help edit out any sequencing errors from the techniques themselves. All these approaches will help the researchers get a better handle on the best processes to use.

    The second step will look at 180 genomes but with only 2 sequencing passes. The high level sequence from the first step will serve as a template for the next 180. The goal here is to be able to rapidly identify sequence variation, not necessarily to make sure every nucleotide is sequenced. It is hoped that the detail learned from step 1 will allow them to be able to infer similar detail here without having to essentially re-sequence the same DNA another 18 times.
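The depth figures in these two steps translate into raw sequencing volume roughly as follows. This is a sketch only: the genome size of 3 billion bases is an assumed round number rather than a figure stated here, and the names are hypothetical.

```python
GENOME_SIZE = 3_000_000_000  # assumed round figure for one human genome, in bases


def raw_bases(people: int, passes: int, genome_size: int = GENOME_SIZE) -> int:
    """Raw bases read when each person's genome is covered `passes` times."""
    return people * passes * genome_size


step1 = raw_bases(people=6, passes=20)   # deep sequencing of a couple of families
step2 = raw_bases(people=180, passes=2)  # light sequencing of 180 genomes
# Volume avoided by skipping the other 18 passes per person in step 2.
saved = raw_bases(people=180, passes=20) - step2

print(f"step 1: {step1:.2e} bases")
print(f"step 2: {step2:.2e} bases")
print(f"avoided in step 2: {saved:.2e} bases")
```

Under that assumption, most of the savings come from the second step, which is why inferring detail from the deeply sequenced template matters so much.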

    Once they have these approaches worked out, and have an idea of the level of genetic variation expected to be seen, they will examine just the gene coding regions of about 1000 people. This will inform them of how best to proceed to get a more detailed map of an individual’s genome.

    This is because the actual differences between any two humans’ DNA sequences are expected to be quite low. So they want to identify processes that will highlight these differences as rapidly and effectively as possible.

    They were hoping to be sequencing the equivalent of 2 human genomes a day and they are not too far off of that mark. At the end of this study, they will have sequenced and deposited into databases 6 trillion bases (a 6 followed by 12 zeroes). In December 2007, GenBank, the largest American database had a total of 84 billion bases (84 followed by 9 zeroes) that took 25 years to produce.

    So this effort will add over 60 times as much DNA sequence to databases as has already been deposited! It plans to do this in only 2 years. The databases, and the tools to examine them, will have to adapt to this huge influx of data.
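The comparison can be worked through directly from the figures quoted above. The sketch below uses only those numbers; the variable names are hypothetical, and the yearly rates are simple averages rather than measured throughput.

```python
PROJECT_BASES = 6 * 10**12  # 6 trillion bases planned by the project
GENBANK_BASES = 84 * 10**9  # GenBank total in December 2007
GENBANK_YEARS = 25          # time taken to accumulate the GenBank total
PROJECT_YEARS = 2           # planned duration of the project

ratio = PROJECT_BASES / GENBANK_BASES
genbank_rate = GENBANK_BASES / GENBANK_YEARS  # bases per year, averaged
project_rate = PROJECT_BASES / PROJECT_YEARS  # bases per year, averaged

print(f"project volume relative to GenBank: {ratio:.1f}x")
print(f"average yearly deposit rate relative to GenBank: {project_rate / genbank_rate:.0f}x")
```

The volume ratio comes out a little above 70, consistent with "over 60 times", while the averaged yearly rate differs by far more because the project compresses its output into 2 years.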

    And, more importantly, the scientists doing the examining will have to appreciate the sheer size of this. It took 13 years to complete the Human Genome Project. Now, 5 years after that project was completed, we can potentially sequence a single human genome in half a day.

    The NIH had projected that technology will support sequencing a single human genome in 1 day for under $1000 in 4 years or so. The members of 1000 genomes are hoping to be able to accomplish their work for $30,000-50,000 per genome. So, the NIH projection may not be too far off.

    But what will the databases look like that store and manipulate this huge amount of data? The Sanger Institute is generating 50 Terabytes of data a week, according to the PR.

    Maybe I should invest in data storage companies.

  • 2 Comments
  • Filed under: Science
  • Life scientists at Friendfeed

    Life Sciences likes this: Friendfeed:
    [Via OpenWetWare]

    FriendFeed
    I’m going to assume that only those currently using FriendFeed will understand the self reference in the title but if you didn’t that’s OK. Just keep on reading, you’ll get it, eventually.

    If you happen to be interested or work in the life sciences area I’d recommend you take a few minutes to read Cameron Neylon’s great post about FriendFeed and how it’s been embraced by the life sciences community.

    I won’t go into the details of how FriendFeed works, but it’s been rapidly gaining momentum as a medium for groups of users to network and discuss each other’s shared content.

    FriendFeed’s about page states:

    FriendFeed enables you to keep up-to-date on the web pages, photos, videos and music that your friends and family are sharing. It offers a unique way to discover and discuss information among friends

    The life sciences community has picked up on this great website like wildfire. A recently created room called The Life Scientists grew in a very short period (a week?) from just a few active online colleagues to a whopping 100+ users.

    FriendFeed rooms offer a way to share on-topic content and further discussion via comments. Commenting can be done on any shared items (yours or others). This has proven to be useful for rapid input and idea sharing amongst the room’s users.

    Amongst the 100+ users of the Life Scientists room, both Cameron from Science in the Open and Pedro from Public Rambling have found FriendFeed to be useful and explain why it works. Both great reads.

    This is the sort of tool that can very rapidly connect researchers, in ways that Twitter or Facebook do not. Not only can links be put up rapidly but comments are there very fast. It allows one to ask questions, post answers. It is a lot like how the Bionet newsgroup, which you can still access, used to be back in the old days (i.e. 1993-95) when Usenet ruled the Internet.

    This is the online equivalent of the water cooler where you can run into someone and strike up a conversation that could lead to innovative thinking. Only instead of two people having to occupy the same space at the same time, this approach decouples both, permitting a much wider circle of people to be involved.
