Confusing will not work

key by ul_Marga

There is a possibly interesting paper in Genome Biology by Barend Mons et al.: Calling on a million minds for community annotation in WikiProteins. I say possibly because the paper itself is quite confusing to me, but the overall goal seems to be a cool concept. This group has created, and is encouraging the use of, “WikiProteins,” a community annotation system for “community knowledge.” Sounds a bit fuzzy? Well, reading the paper does not completely help. For example, here is the abstract:

WikiProteins enables community annotation in a Wiki-based system. Extracts of major data sources have been fused into an editable environment that links out to the original sources. Data from community edits create automatic copies of the original data. Semantic technology captures concepts co-occurring in one sentence and thus potential factual statements. In addition, indirect associations between concepts have been calculated. We call on a ‘million minds’ to annotate a ‘million concepts’ and to collect facts from the literature with the reward of collaborative knowledge discovery. The system is available for beta testing at webcite.


This is an interesting attempt but the community they are asking for is not in existence yet. The goal is extremely worthwhile, since the best way to create knowledge from the huge mountain of data being created is to incorporate large social networks. But the community must be created first.

However, at the moment in the science community there is a large activation energy (yes, human social interactions also require energy to be expended in creating the network before the information flow can become self-sustaining). First, there needs to be demonstrable proof that putting time into community annotation will be productive and rewarding. There is no proof of this yet.

Second, most scientists are creatures of habit; they have developed a workflow that is successful. In order to get them to change, it had better be easy. Again, time is important, especially in the early phases of community building.

I spent some time at the site trying to get an idea of what was involved. I still did not really figure it out. I do not believe many working scientists will either.

However, this is an important site and one that should be watched. Simply because the initial site is not there yet does not mean it will not quickly get a lot closer to perfection. It is a beta. It is easy to incorporate feedback and move rapidly to something more usable. Lowering the barrier to entry would help a lot.

These sorts of tools are too useful for them to remain unused. A million minds will someday be involved in this work. But it will not happen until a strong community is created.

Online communities will be how we solve the difficult problems facing us. The sooner they are functional, the sooner we can begin finding solutions.


Enterprise is next

city by * etoile
Why Web 2.0 Is No Bubble: Corporations Are Willing to Pay for It:

Everyone seems to want an answer to the question “When will Web 2.0 startups start making money?” The implication is that unless we can answer the question, the “bubble” of Web 2.0 will burst and all of us who believe in this stuff will be revealed as fantasists.

The fact is, it’s incredibly hard to make money as a Web 2.0 startup aimed at consumers.

There are hundreds of these companies, and they all clamor to brief us at Forrester. Each has its own twist on blogs, social networks, ratings, user generated video, or whatever. It’s hard to get people to pay attention to a new tool, and the value of the tool depends on lots of participation — the classic chicken-and-egg problem. Your competitor is always one twist ahead of you. Some of these startups will succeed but the odds are one in a thousand — you need just the right idea, at the right time, with the right push or set of potential customers, and you need to take off with such velocity that you leave the competition in the dust.

So much of the focus is on the consumer area of Web 2.0. That is where the juice is but it is not where the really long term effects will be seen. That comes from enterprise uses. Computer games have had a real influence on our lives but the introduction of the PC into the enterprise is where real change occurred.

Web 2.0 is just beginning to alter business protocols and shift paradigms toward more decentralized creation of knowledge. Some organizations embrace this and will be the first to reap the rewards of enhanced decision making. There are a host of small companies, this one included, that are working to bring these tools into the enterprise.

The amazing thing is that there are a class of startup companies making good money right now from Web 2.0. They’re not flashy and they don’t grow like mushrooms. But they’ve got all the business they can handle and they are growing. I am talking about companies that serve corporate social application needs. This isn’t the typical Web 2.0 business paradigm, since serving corporate customers means lots of client service, which is people-intensive — it doesn’t lift off miraculously like a pure technology startup. In fact, in many of these companies, the technology itself is positively mundane. But the startups grow because they deliver value for which they can charge a premium and get customer loyalty. The customers of these companies don’t defect when something shiny and new comes along, because they like the service they’re getting.

Research organizations, particularly biotechnology and Big Pharma, are a little slower to embrace these tools, especially the research departments. The sooner these get up to speed with Web 2.0 the sooner they can begin to harness the problem solving aspects of online conversations.


Build it before you need it

seedling by Thiru Murugan
Build Your Network Before You Need Them:
[Via Web Strategy by Jeremiah]
Jeremiah brings up some very important points. A network is not useful if you need it before it really exists. It needs to already be present.

Here’s a few things I’ve learned, and hope you intake, invest, and pass on:

1) You’re always looking for the next opportunity, simply shutting down what else is in the market is foolhardy. It doesn’t mean you need to jump ship before 1 month, or 1 year, but it means you should be talking to recruiters, companies, and hiring managers to see what next skills are needed now, and in the future. This will actually help your current employer, as you continue to skill up, take on new projects, as they invest in you. Remember, even if you work for someone else, you are a company of one.

2) Those who ignore the party/conversation/network when they are content and decide to drop in when they need the network may not succeed. It’s pretty easy to spot those that are just joining the network purely to take & not to give. Therefore, be part of the party/conversation/network before you need anything from anyone. Start now, and continue to build relationships by giving now: share knowledge, help others, and become a trusted node and connector, not just an outlying ‘dot’ of a comet that swings in every 4 years or so.

That is how things work if we need a network while looking for a job. An online social network is similar. It is much better to begin creating it now than to wait until a better time.

A small seed now can naturally grow quite rapidly. It is harder to grow if it is under a lot of other pressures.


Tending a garden

garden by independentman
Getting Conversation Ready:

[Via Beth’s Blog: How Nonprofits Can Use Social Media]

Holly Ross wrote a good reflection piece about public conversations on blogs and how to get your audience ready for that conversation. She makes the point:

What I am saying is that your audience may not be ready to have the conversation that social media enables. That’s because social media does not just enable conversations. It enables PUBLIC conversations.

I think we have to remember that it takes time to build the community to have the conversation and that it doesn’t happen right away. You have to be ready as conversation facilitator. Alexandra Samuel did a workshop called “Bringing Your Community to Life” at Netsquared and offered some terrific practical advice about how you get the conversation started.

Some key points to encourage participation:

Focus on promoting conversation

Make it happen, don’t wait for it

Connect like-minded participants

Connect complementary threads

Plan pro-actively, implement reactively

A community is not built rapidly and a conversation does not always easily begin. It requires nurturing and time, just like a garden. It has to be curated by active, enthusiastic members. They have to reach out to others, to begin the dialogs that will enhance the entire network.

Just as an outstanding garden does not spontaneously come into being, an online community requires active management. A lot of work, sometimes. But like a well-tended garden, if given the right care, it can pay off handsomely.


I love the title

VIDEO: If the CIA can collaborate with Web 2.0 tools, so can you:
[Via Enterprise 2.0 Blog]

Having trouble trying to sell in Web 2.0-style collaboration to the higher ups in your enterprise organization? Are there VPs and CXOs that are shying away from wiki-style knowledge management because they don’t get it or they fear confidential information will be passed carelessly among employees and partners? Do they feel that the information is

Intellipedia is a great example of how well a wiki can work, even in organizations where access control is most important. The embedded video gives some nice details on how to sell this technology and how their greatest detractors became their biggest fans.


Fun inside the firewall

monument valley by Wolfgang Staudt
IBM Builds LOTS of Social Apps:

My friend Luke sent me this BusinessWeek article about enterprise social network tools. There’s lots here.

First, take away from this that the social network technologies you know about in the consumer space are being rebuilt inside the firewall for business. Why? Those apps are perfect for business, because they do a better job of communicating information the way humans figure it out.

Second, understand that there are people looking for more from their social applications than food fight and super fun wall. If you’re developing, consider what might make for good business applications.

Third, bear in mind that what you might be doing for fun and leisure right now on the social networks might give you an edge on using collaborative technologies in upcoming months. It might just be the thing you’re doing at work, and not just the thing you’re doing at home.

What do you think about all this?

Welcome to the new world. Entertainment is driving the leading wedge of Web 2.0 but the rocket it lights under social media is not escaping the notice of business. Capturing tacit information and helping to create new knowledge are two of the most useful aspects of Web 2.0.

When applied to science, these social tools will help get the right information to the right places. By making it useful to the end user, these tools will help workflow and increase productivity.

In this new world, weirdly, IBM may be leading the way, a distinct difference from how it dealt with the personal computer revolution.

Perhaps even the largest of companies can learn from past errors.


Irony abounds

thesis by cowlet
Case study of the IR at Robert Gordon U:
[Via Open Access News]

Ian M. Johnson and Susan M. Copeland, OpenAIR: The Development of the Institutional Repository at the Robert Gordon University, Library Hi Tech News, 25, 4 (2008) pp. 1-4. Only this abstract is free online, at least so far:

Purpose – The purpose of this paper is to describe the development of OpenAIR, the institutional repository at the Robert Gordon University.

Design/methodology/approach – The paper outlines the principles that underpinned the development of the repository (visibility, sustainability, quality, and findability) and some of the technical and financial implications that were considered.

Findings – OpenAIR@RGU evolved from a desire to make available an electronic collection of PhD theses, but was developed to become a means of storing and providing access to a range of research output produced by staff and research students: book chapters, journal articles, reports, conference publications, theses, artworks, and datasets.

Originality/value – The paper describes the repository’s contribution to collection development.

And it only costs £13.00. So an article describing an open archive is not itself open. What a shame because open archives will be the way to go. Learning how an organization put one together, especially one that contains more than just journal articles, would be useful.

But it did lead me to this, which describes two organizations that will serve as open archives for any paper for which the authors have retained copyright. What it also makes clear is that most researchers still retain the rights to any preprint versions of their work.

That is, the only version whose copyright is usually transferred is the one that was peer-reviewed and approved. Any previous version can be archived, at least for most journals. If the work was Federally funded, most journals permit archiving the approved version after a limited embargo time, such as 6 months.

There is a database that details the publication policies of many journals. Ironically, there is no copyright information for Library Hi Tech News, the publication containing the OpenAIR article.

Let’s look at some others.

For instance, Nature Medicine permits archiving of the pre-print at any time and the final copy after 6 months. They require linking to the published version and their PDF cannot be used. So just make your own.

On the other hand, Biochemistry restricts the posting of either the pre- or post-print versions. A 12 month embargo is imposed only for Federally funded research. Others apparently can never open archive. The only thing that can be published at the author’s website is the title, the abstract and figures.

Let’s see: one journal allows reasonable use of the author’s copyright to permit open archiving and the other only permits what is Federally mandated. I’m going to investigate this database further because my choice of journals to publish in will depend on such things as being able to use open archiving.

If my work is behind a wall, it will be useless in a Web 2.0 world. Few will know about it and others will bypass it. Just as the work on OpenAIR is not as useful as it should be.

More irony. Susan Copeland, one of the OpenAIR authors, has done a lot of work on online storage and access to PhD theses. She is the project manager for Electronic Theses at Robert Gordon University and received funding from the Joint Information Systems Committee (JISC), as part of the Focus on Access to Institutional Resources Program (FAIR). She just received the 2008 ETD Leadership Award for her work on electronic theses.

She has done a lot of really fine work making it easier to find the actual work of PhD students, something of real importance to the furtherance of science. Yet her article detailing some of her own work is not openly available to researchers.

And finally, ironically, the organization that funded some of her work, JISC, also funds SHERPA, the same database that I used to examine the publication issues of many journals.

In a well connected world, irony is everywhere.


Helping people change

I was discussing with one of our execs the progress we’d been making on social media proficiency internally.

And he asked a great question that made me think:

“So, has anyone fundamentally changed their work processes because of the platform?”

And I realized this is the next frontier on what’s turning out to be a large-scale social engineering project.

Getting Business Value Out Of Our Social Software

As we make progress in this journey, I’ve got my eye out for different categories of business value we’re seeing. I suppose, at the same time, I should also be keeping my eye out for business value we’re NOT seeing yet.

And, as I’ve mentioned before, we’re seeing business value — in many forms — across the board:

People with specific interests are finding other people with similar interests
Rather than searching big content repositories, people are asking other people for help and answers
A pan-organizational “social fabric” has been created that wasn’t really there before
Folks who spend time on the platform are better educated, and more engaged, in EMC’s business

And more

And, just to be clear, there’s no shortage of business benefits. I still stand behind the broad assertion that this has been one of the most ROI-positive IT projects I’ve seen in my career.

Interesting “value nugget” of the week: 

EMC runs a healthy program to bring a large number of interns and co-op students into the company.  They started introducing themselves to each other on the platform.

What started with “name, rank, serial number” blossomed into a wonderfully diverse set of conversations about careers, favorite hangouts, what it means to work at EMC, what is everybody doing, and so on.

I would argue that — whatever millions that EMC spends on this intern/coop program — we’ve now made it 10-20% more valuable, simply because we connected people to each other, and connected them all to the broader company. 

At zero incremental cost.

But we want more. Much more.

Right up front EMC can demonstrate easily how new technologies save money and create new opportunities. The problem comes from actually getting people to use the technologies.

Many companies are process-driven. If the process is working, why change? Of course, buggy whip manufacturers probably had a great process also. But if they did not change, they disappeared.

What is driving the world more and more is the rate at which innovations diffuse through an organization. This is a fascinating subject because there are also some hard data behind it, some of it generated over 70 years ago.

Using the rate of adoption of hybrid corn by farmers in the early 1930s, Ryan and Gross were able to derive some very important insights. These two researchers interviewed 345 farmers in Iowa about their use of hybrid corn, when the farmers first heard about it and when they started using it.

Here is a figure from their classic paper ‘The Diffusion of Hybrid Seed Corn in Two Iowa Communities’. Even though the hybrid corn had many important advantages it took almost 13 years for this innovation to diffuse throughout the entire community. The actual adoption curve (from their 1943 paper) is compared with a normal distribution curve (in black).

corn curve

If the data are plotted as the cumulative adoption of the innovation, they look like this:


Both of these types of curves have been seen again and again when the diffusion of innovation is examined. They seem to be derived from basic forces present in human social networks.
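A rough sketch of how the two curves relate: if the share of people adopting in each year follows a bell curve, then the running total traces the S-shaped curve. The numbers below (a 14-year window, a mean of 6.5 years and a spread of 2 years) are illustrative assumptions, not Ryan and Gross's data, and the function name is made up for this example.

```python
from statistics import NormalDist


def adoption_curves(mean_year, spread_years, n_years):
    """Return (per_year, cumulative) adoption shares for years 0..n_years-1.

    Assumes adoption times are normally distributed, which is an
    idealization and not the measured Iowa data.
    """
    dist = NormalDist(mu=mean_year, sigma=spread_years)
    # Share of the community that has adopted by the end of each year.
    cumulative = [dist.cdf(year + 0.5) for year in range(n_years)]
    # New adopters in a year = change in the cumulative share.
    per_year = [cumulative[0]] + [
        later - earlier for earlier, later in zip(cumulative, cumulative[1:])
    ]
    return per_year, cumulative


# Hypothetical parameters chosen only to make the shapes visible.
per_year, cumulative = adoption_curves(mean_year=6.5, spread_years=2.0, n_years=14)

for year, (new, total) in enumerate(zip(per_year, cumulative)):
    # One '#' per 2% of new adopters, one '*' per 5% of cumulative adopters.
    print(f"{year:2d}  {'#' * round(new * 50):<12} {'*' * round(total * 20)}")
```

The '#' column rises and falls like the bell curve, while the '*' column starts slowly, climbs steeply in the middle years and then levels off.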

Ryan and Gross made several key contributions besides the identification of the S-shaped curve. One was the process by which the innovation diffused. The other was the type of farmer who used the innovation.

They found that there were five stages in the adoption of an innovation by an individual: awareness, interest, evaluation, trial, and adoption. And there were at least 4 different types of farmers, of which the early adopters were the most important.

Early adopters heard about the corn from traveling salesmen and tried small plots to see how well it worked. Later adopters relied on the personal experience of other farmers, usually the early adopters. When there were enough positive reactions from the early adopters, when there were more stories of personal experience, the adoption rate took off.

It was the human social network that was critical for the rate at which the innovation was adopted. The more social connections an early adopter had, the more cosmopolitan they were, the more likely it would be that others would adopt use of the innovation.

Everett Rogers was instrumental in codifying many of the principles of innovation diffusion. Here is his famous rendition of the distribution:


Only 16% of a population is usually made up of the early adopters, the ones that are critical for spreading the innovation to the early majority. The key to the adoption of any innovation is the rate at which early adopters can transmit the knowledge of the benefits to the early majority. In the case of the farmers, it would often take 4 or more years for this to be converted from awareness to adoption.
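For anyone wondering where the 16% comes from, here is a small sketch. Rogers defines his adopter categories by how many standard deviations before or after the mean adoption time a person adopts, and he splits the first 16% into innovators and early adopters (the text above lumps them together). The code assumes an idealized normal distribution; the dictionary and variable names are just for illustration.

```python
from statistics import NormalDist

# Standard normal: adoption time measured in standard deviations from the mean.
std = NormalDist()

shares = {
    "innovators": std.cdf(-2),                    # more than 2 sd early
    "early adopters": std.cdf(-1) - std.cdf(-2),  # between 2 sd and 1 sd early
    "early majority": std.cdf(0) - std.cdf(-1),   # within 1 sd before the mean
    "late majority": std.cdf(1) - std.cdf(0),     # within 1 sd after the mean
    "laggards": 1 - std.cdf(1),                   # more than 1 sd late
}

# Everyone who adopts at least 1 sd ahead of the mean.
early_share = shares["innovators"] + shares["early adopters"]

for name, share in shares.items():
    print(f"{name:15s} {share:6.1%}")
print(f"{'first movers':15s} {early_share:6.1%}")
```

The exact areas under the curve differ slightly from Rogers's rounded figures (2.5%, 13.5%, 34%, 34%, 16%), but the first two groups together come to roughly 16% either way.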

In many areas of our world today, this is much too slow. Technology is disruptive, meaning that the people who adopt this technology actually deal with the world in entirely different ways than those who do not. It is similar to a paradigm shift, in that those on either side of the shift have a hard time communicating with each other. It is almost as if they inhabit separate worlds.


This can cause some problems because the early adopters are required to communicate with the early majority if an innovation is to diffuse throughout an organization. If they cannot, it creates a chasm, which has been described by Geoffrey Moore in his book Crossing the Chasm.

The organization has to take strong action to recognize that this chasm is present and to span it, either with training or, more effectively, with people who have been specially designated as chasm spanners. In many cases using Web 2.0 technologies, they are called online community managers.

Disruptive innovations seem to arrive almost yearly. Without a directed and defined process to increase the rate of diffusion in an organization, if just standard channels of communication are used, innovation will diffuse at too slow a rate for many organizations to remain competitive.


This matters because there is usually not just one innovation disrupting an organization at a time. Life is not that clean. There can be multiple innovations coursing through different departments, moving early adopters even further away from the rest of the group and expanding the chasm. This only makes communication harder.

So, a key aspect of being able to increase the rate of diffusion is to create a process where early adopters are identified and strong communication channels are created to permit them to pass information to the early majority.

It is no longer enough to simply let the early adopters go through their 5 stages of adoption and then tell others about it at the water cooler. Designated online community managers, with the training needed to enhance communication channels, will be critical in getting this information dispersed throughout an organization.

Organizations need to take pro-active approaches to span the chasm. Otherwise they will lose out to the organizations that do take such approaches.

Identifying and nurturing the 16% of the organization that are early adopters will be critical for this process. Having community managers who are well embedded in the social structure of the organizations will also be needed to help increase the rates of innovation diffusion.


This is more like it

Copy number by dbking
Copy Number Variation Detection:
[Via Bench Marks]

With the sequencing of the human genome came the startling revelation that the number of copies of a given gene can vary widely between individuals. This Copy Number Variation (or CNV), contributes to our species’ genetic diversity but it has also been linked to genetic diseases. This month’s issue of Cold Spring Harbor Protocols features a new method for detecting copy number variation. Like all of our monthly featured protocols, it’s freely accessible for subscribers and non-subscribers alike.

Copy Number Variation Detection Via High-Density SNP Genotyping
describes the use of PennCNV, a new computational tool for CNV detection in data from genomic arrays. Developed in the laboratory of Maja Bucan at the University of Pennsylvania, the software is freely available for download. Analysis with PennCNV will provide a more comprehensive understanding of genome variation and will aid in studies seeking the causes of genetic diseases. More information on PennCNV can be found in this Genome Research article, PennCNV: An integrated hidden Markov model designed for high-resolution copy number variation detection in whole-genome SNP genotyping data.

I took the liberty of showing the entire post from David’s blog because, in contrast to my story below, this demonstrates a very good approach for publishing scientific work online.

It highlights a useful new protocol that can be downloaded for free. It also links to a Genome Research article that I can also download for free. Nice. I can quickly get up to speed on a novel protocol.

Protocols, particularly new ones, are very useful to have. Making a small number available for free is a nice way to get people to check out the journal. I have it in my newsfeed. I like CSHP and enjoy David’s blog tremendously. Now I just need to find a way to become an adjunct professor at some research organization with an institutional license so I can read all the articles.


Some science journals are messed up

I posted this at my personal blog but thought it might be of interest here since it demonstrates just how current online tools have changed the way scientific research is published, presented and read.

flying snake by Beige Alert
Why snakes don’t have legs:
[Via 2collab public bookmarks]

Tags: Hox gene, Homeobox gene, Limb
Authors: Cunliffe, Vincent
Source: Trends in Genetics; 15, 8, Page 306; 1 August 1999
Sharing: Public

I’m providing a detailed examination of an online journey I took this morning that demonstrates how the Internet has altered the landscape for publishing of articles in scientific journals. Online access certainly changes how we search for and how we read articles. It is also changing where we choose to publish.

So I see this interesting name for an article – Why snakes don’t have legs – in my newsfeed. I click on thru (why it is on 2collab, I do not know) and get this page. Great. ScienceDirect, which usually charges for journal access. But this is an article from 1999. Surely it will be open by now?

Nope. They want $31.50 for a nine year old article. With no abstract or any other way to determine whether this article is worth the price. $31.50! First off, few articles in science today that are nine years old are worth $5, much less $31.50. Secondly, with no abstract how am I to even figure out if it is worth the price?

This greatly limits access to the article and encourages other routes for getting the information than reading it. Why would a scientist want to publish an article that no one will read? We want as many people as possible to see our wonderful work. This is not like literature or art where older is better.

Seems to me that this is a losing business model. I can see paying a premium for up-to-date work. I understand someone has to get paid and can easily pay a reasonable price. But $31.50?! For an article that is almost a decade old!? That makes no sense in an online world.

Very few articles in biology that are ten years old retain much value. Just a few years ago, I would have been stuck but now I have other tools.

I went to PubMed, the database of journal articles, and did a search for “snakes AND legs”. Got 48 articles. The critical one appears to be by Cohn and Tickle “Developmental basis of limblessness and axial patterning in snakes” in Nature from June 1999. Great. Now I have a subscription to Nature so this article is available to me but if you wanted to read it without a subscription it would cost $35! Wow! But at least now it has an abstract.

The evolution of snakes involved major changes in vertebrate body plan organization, but the developmental basis of those changes is unknown. The python axial skeleton consists of hundreds of similar vertebrae, forelimbs are absent and hindlimbs are severely reduced. Combined limb loss and trunk elongation is found in many vertebrate taxa, suggesting that these changes may be linked by a common developmental mechanism. Here we show that Hox gene expression domains are expanded along the body axis in python embryos, and that this can account for both the absence of forelimbs and the expansion of thoracic identity in the axial skeleton. Hindlimb buds are initiated, but apical-ridge and polarizing-region signalling pathways that are normally required for limb development are not activated. Leg bud outgrowth and signalling by Sonic hedgehog in pythons can be rescued by application of fibroblast growth factor or by recombination with chick apical ridge. The failure to activate these signalling pathways during normal python development may also stem from changes in Hox gene expression that occurred early in snake evolution.

Sounds really interesting to me but still not sure it is worth $35. But right above that link from PubMed is another one – from Current Biology with pictures. “How the snake lost its legs”. It is a ScienceDirect link also but this one is available for free. And it has nice pictures while discussing the Cohn and Tickle article.

So partial success. Now I have a better idea of the article’s content. All the other links from PubMed dealing with snakes and THEIR legs, as opposed to snakes and the legs they bite, have costs to access, up to $39.
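For readers who would rather script the PubMed search than click through it, here is a minimal sketch using NCBI's public E-utilities ESearch endpoint. The helper names are hypothetical, the fetch function is defined but not called here, and the number of hits returned today will differ from the 48 I got.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities search endpoint for Entrez databases such as PubMed.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_search_url(term, max_results=20):
    """Build an ESearch URL for a PubMed query (no network access needed)."""
    params = {"db": "pubmed", "term": term, "retmode": "json", "retmax": max_results}
    return f"{ESEARCH_URL}?{urlencode(params)}"


def search_pubmed(term, max_results=20):
    """Run the query and return (total hit count, list of PubMed IDs)."""
    with urlopen(build_pubmed_search_url(term, max_results)) as response:
        result = json.load(response)["esearchresult"]
    return int(result["count"]), result["idlist"]


# Same query as in the post; only the URL is built here.
url = build_pubmed_search_url("snakes AND legs")
print(url)
```

Calling search_pubmed("snakes AND legs") would return the hit count and the first batch of article IDs, which can then be looked up on PubMed one by one.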

Except for this nifty one from the Journal of Experimental Biology – “Becoming airborne without legs: the kinematics of take-off in a flying snake, Chrysopelea paradisi” (The picture above is of a flying snake.) Open access and more recently published. Not exactly on topic but it comes with movies! These were just not possible to see without online access. And the movies are really cool and help explain what the author of the paper was describing. You can actually see the difference between a J-loop takeoff and other modes. Plus, flying snakes sound like something from a B-movie.

Back to the topic. I went to Google and searched “Cohn Tickle snake”. The top response is from a USA Today article about why snakes do not have legs. In the article there are links to Martin J. Cohn and Cheryll Tickle. Clicking the Cohn link takes me to his page at the University of Florida. Not a lot here but there is a link to his personal site.

Now we get the Cohn lab page. I could just email him and ask for a copy of the paper (a slightly updated approach to the old method of sending reprint requests by snail mail). But there is a link to Publications.

And here we find the PDF to the paper I was looking for. A quick runthrough reveals that it is a paper I will find interesting (I love Hox stuff). But I would not have paid over $30 for it.

I certainly believe that downloading a paper from an open archive presented by the author of a paper is an ethical way to obtain the paper (It is just the online version of the reprint request, remember). So, it took me less than 10 minutes to find a copy of the article online. (And it turns out that if I had looked at my Google results just a little more, I would have found a direct link to the publications page, saving myself some time.)

I think that, except for the most highly paid of us, 10 minutes of time would be worth less than $10. This seems about right. A paper for $5 I would buy immediately, while at much over $10 I will go searching. I may not succeed but I can usually find an email link and request a copy from the author.
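The back-of-the-envelope arithmetic can be written out. This is only a sketch: the $60 hourly rate is an assumed figure, picked because it is the rate at which 10 minutes of searching costs exactly $10, and the function names are made up.

```python
def search_cost(minutes, hourly_rate):
    """Dollar value of the time spent searching for a free copy."""
    return minutes / 60 * hourly_rate


def break_even_minutes(price, hourly_rate):
    """Minutes of searching at which buying the paper becomes cheaper."""
    return price / hourly_rate * 60


HOURLY_RATE = 60.0     # assumed rate, not a figure from this post
ARTICLE_PRICE = 31.50  # the ScienceDirect price quoted above

ten_minute_cost = search_cost(10, HOURLY_RATE)
break_even = break_even_minutes(ARTICLE_PRICE, HOURLY_RATE)

print(f"10 minutes of searching costs about ${ten_minute_cost:.2f}")
print(f"Searching beats paying ${ARTICLE_PRICE:.2f} for up to {break_even:.1f} minutes")
```

At any rate below the assumed $60 an hour, 10 minutes costs less than $10, and it takes more than half an hour of fruitless searching before the $31.50 price starts to look like the better deal.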

Online archives by the authors are becoming more common and are a basic aspect of many Open Access initiatives. Paying a small premium for access to a current article is a reasonable price, especially if it is convenient. But any business plan that wants to charge a huge premium for decade old work needs serious rethinking.

So, for a few minutes of my time I got the article for free and also got to see some nice movies of snakes flying. Not a bad way to travel in an online world.
