Category Archives: Open Access

Data is useless without us

Photo by NightRPStar
In science, data is nothing without purpose:
[Via business|bytes|genes|molecules]

In an article on TechFlash, a VC, talking about trends in 2010, had this to say while talking about increased IT needs in cleantech and biotech

Both areas are generating terabytes of data and it is no longer just about science — it is about digesting mountains of data.

For some reason that statement scared me. Digesting mountains of data is all about the science. If we forget that, we are in big trouble. Yes, from a pure technology perspective it is about digesting mountains of data, but (a) that has to be looked at in the context of science (sense-making?), and (b) the digesting is a necessary pre-requisite to getting to the science. You really don’t have much of a choice, but if you forget about the science, you will end up with noise, a whole lot of it.

My advice to all the VCs out there is simple. Yes, life science is increasingly data intensive, and to make any sense of that data, you need to look at computing as a core aspect, but never lose sight of the fact that collecting all that data has a purpose, understand our molecular machinery and figuring out how we work, and what makes us stop working properly. If we forget that, a lot of money is going to get wasted.

[More]

Data just exists. Human interaction with data provides context, transforming the data into information.

Concentrating on the data does us no good at all. Finding better ways to store it might be useful but without putting a lot of work into being able to extract the data for human purposes, it is useless.

We need better ways for humans everywhere to interact with the data so that we can deal with the inherent information created.


Make it a pub

[Crossposted at A Man with a PhD]

pub by gailf548
Participation Value and Shelf-Life for Journal Articles:
[Via The Scholarly Kitchen]

Discussion forums built around academic journal articles haven’t seen much usage from readers. Lessons learned from the behavior of sports fans may provide some insight into the reasons why.

[More]

The scientific discussions that many researchers have found the most productive are often those sitting around a table in an informal setting, like a pub. These discussions are often wide-ranging and very open. They often produce really innovative ideas, which get recorded on cocktail napkins.

Some of the best ideas in scientific history can be found on such paper napkins. Simply allowing comments on a paper does not in any way replicate this sort of social interaction. But there are already online approaches that do. We call them blogs.

Check out the scientific discussions at RealClimate, ResearchBlogging or even Pharyngula. Often the scientific discussions replicate what is seen in real life, with lots of open discussion about relevant scientific information.

If journals want to create participatory regions in their sites, they might do well to mimic these sorts of approaches. David Crotty at Cold Spring Harbor has such a site. Although it has not reached the popularity of RealClimate, it is a nice beginning.

I would think that research associations, with an already large audience of members, would have an easier time creating such a blog, one that starts by discussing specific papers but is open to a wide-ranging, semi-directed conversation.


Staying up to date with Twitter services

Part 1: What are Twitter Lists?:
[Via Pistachio Consulting Inc.]

This is Part 1 of a 3-part series cross-posted from adelemcalear.com


WHAT IS IT?

Back on September 30th, Twitter announced on their blog that they would be launching their new Lists feature to a small group of users to beta test. Lists allow Twitter users to organize the people they follow into groups. By segmenting your following list into groups, you can then filter tweets from your main stream and just view the tweets originating from a selected list. You can also subscribe to other people’s lists.

[More]

Twitter is a social medium that has varying uses for different people. But it is obvious that it has some use for almost everyone.

When Twitter introduces a new service, like Lists, it is useful to get up to speed quickly. This nice little series discusses the new Lists feature. It helps provide some important insights into the potential of lists and their drawbacks.


Updated: Short answers to simple questions

fail by Nima Badiey

NIH Funds a Social Network for Scientists — Is It Likely to Succeed?

[Via The Scholarly Kitchen]

The NIH spends $12.2 million funding a social network for scientists. Is this any more likely to succeed than all the other recent failures?

[More]

Fuller discussion:

In order to find an approach that works, researchers often have to fail a lot. That is a good thing. The faster we fail, the faster we find what works. So I am glad the NIH is funding this. While it may have little to be excited about right now, it may get us to a tool that will be useful.

As David mentions, the people quoted in the article seem to have an unusual idea of how researchers find collaborators.

A careful review of the literature to find a collaborator who has a history of publishing quality results in a field is “haphazard”, whereas placing a want-ad, or collaborating with one’s online chat buddies, is systematic? Yikes.

We have PubMed, which allows us to rapidly identify others working on research areas important to us. In many cases, we can go to RePORT to find out what government grants they are receiving.

The NIH site, as described, also fails to recognize that researchers will only do this if it helps their workflow or provides them a tool that they have no other way to use. Facebook is really a place for people to make online connections with others, people one would have no other way to actually find.

But we can already find many of the people we would need to connect to. What will a scientific Facebook have that would make it worthwhile?

Most social networking tools initially provide something of great usefulness to the individual. Bookmarking services, like CiteULike, allow you to access/sync your references from any computer. Once someone begins using it for this purpose, the added uses from social networking (such as finding other sites using the bookmarks of others) become apparent.

For researchers to use such an online resource, it has to provide them new tools. Approaches, like the ones being used by Mendeley or Connotea, make managing references and papers easier. Dealing with papers and references can be a little tricky, making a good reference manager very useful.

Now, I use a specific application to accomplish this, which allows me to also insert references into papers, as well as keep track of new papers that are published. Having something similar online might be useful, especially if it allowed access from anywhere, such as my iPhone while at a conference.

If enough people were using such an online application then there could be added Web 2.0 approaches that could then be used to enhance the tools. Perhaps this would supercharge the careful reviews that David mentions, allowing us to find things or people that we could not find otherwise.

There are still a lot of caveats in there, because I am not really convinced yet that having all my references online really helps me. So the Web 2.0 aspects do not really matter much.

People may have altruistic urges, the need to help the group. But researchers do not take up these tools because they want to help the scientific community. They take them up because they help the researcher get work done.

Nothing mentioned about the NIH site indicates that it has anything that I currently lack.

Show me how an online social networking tool will get my work done faster/better, in ways that I can not accomplish now. Those will be the sites that succeed.


[UPDATE: Here is a post with more detail on the possibilities.]

A very big challenge for biopharma

Loose coupling and biopharma:
[Via business|bytes|genes|molecules]

A few days ago, via the typical following of links that is typical of a good search and browse section on the interwebs, I chanced upon a discussion about a presentation given by Justin Gehtland at RailsConf. The talk was entitled Small Things, Loosely Joined, Written Fast and that title has been stuck in my head ever since. Funnily enough, what was in my head was not software, and web architectures, cause today, I consider that particular approach almost essential to building good applications and scalable infrastructures, and most people in the community seem to understand that (not sure about scientific programmers though). What I started thinking about was if that particular philosophy could be extended to the biopharma industry.

Without making direct analogies, but without suspending too much disbelief, one can imagine a world where drug development is not done in today’s model, but via a system consisting of a number of loosely coupled components that come together to combine cutting edge research and products (drugs) in a model that scales better and does a better, more efficient job of building and sustaining those products. One of the tenets of the loose coupling approach to scalable software and hardware is minimizing the risk of failure that is often a problem with more tightly coupled systems and in many ways the current blockbuster model is very much one where risk is not minimized and one failure along the path can result in the loss of millions of dollars. I have said in the past that by placing multiple smart bets, distributed collaborations and novel mechanisms (like a knowledge and technology exchange), we can reboot the biopharma industry, reducing costs and developing better drugs more efficiently. I don’t want to trivialize the challenge, the numerous ways in which the process can go wrong, and the vagaries of biology, but resiliency is a key design goal of high scale systems, and is one we need to build into the drug development process, one where the system chooses new paths when the original ones are blocked.

How could we build such a network model? I know folks like Stephen Friend have their ideas. Mine are ill formed, but data commons, distributed collaborations, and IP exchanges are a key component especially in an age where developing a drug is going to be a complex mix of disciplines, complex data sets and continuous pharmacovigilance. I can’t help but point to Matt Wood’s Into the Wonderful which does point to some of those concepts albeit from a computational perspective

[More]

Designing great and awesome tools for researchers to use will be critical for successful drug development. But there also has to be a cultural change in the researchers themselves and the organizations they inhabit.

One is that the tools have to work the way scientists need them to, not the way that works well for developers. This is actually pretty easy now and many tools are really starting to reflect the world views of researchers in biotech, who, more often than expected, are somewhat technophobic.

This leads to the second area: researchers often need active facilitation in order to take up these sorts of tools. They need someone they trust to actually help convince them why they should change their workflows. Most will not just try something new unless they can see clear benefits.

Finally, the last thing is better training for collaborative projects. Most of our higher education efforts for training researchers make them less collaborative. They are taught to get publications for themselves in order to gain tenure. Plus, with the competition seen in science, letting others know about your work before publication can often be harmful. Large labs with many people can often quickly catch up to a smaller lab and its work.

As in the business world, whoever is first to accomplish something can be overtaken by a larger organization. So, many researchers are trained to keep things close to the vest until they have drained as much reputation as possible from the work.

But many of the difficult problems today can not be solved by even a large lab. It can require a huge effort by multiple collaborators. Thus, there is a movement towards figuring out how to deal with this and assign credit.

Nature just published a paper by the Polymath Project, an open science approach to solving an important math problem. They addressed the problem of authorship and reputation:

The process raises questions about authorship: it is difficult to set a hard-and-fast bar for authorship without causing contention or discouraging participation. What credit should be given to contributors with just a single insightful contribution, or to a contributor who is prolific but not insightful? As a provisional solution, the project is signing papers with a group pseudonym, ‘DHJ Polymath’, and a link to the full working record. One advantage of Polymath-style collaborations is that because all contributions are out in the open, it is transparent what any given person contributed. If it is necessary to assess the achievements of a Polymath contributor, then this may be done primarily through letters of recommendation, as is done already in particle physics, where papers can have hundreds of authors.

We need to come up with better ways to design useful metrics for those that contribute to such large projects. Researchers need to know they will get credit for their work. As we do this, we need to also help train them for better collaborative work, because that is probably what most of them will be doing.


Changing social rules

10 Golden Rules of Social Media:
[Via Nonprofit Online News]

Yes, the title is linkbait, but I like it anyway. Aliza Sherman has been doing this almost as long as I have and her digestion of 20 plus years of experience into 10 Golden Rules of Social Media are utterly simple and powerful. They could easily be a checklist for any social media project or campaign: (1) Respect the Spirit of the ‘Net. (2) Listen. (3) Add Value. (4) Respond. (5) Do Good Things. (6) Share the Wealth. (7) Give Kudos. (8) Don’t Spam. (9) Be Real. (10) Collaborate.

In fact, it makes me think that I ought to see if I could build some research around this list. Unfortunately, the most important one (indeed the one that leads to all the other nine, as far as I’m concerned) is a challenging one to test. “Respect the spirit of the Net.” I have a solid idea of what that means and oddly, I think it’s a large part of what people pay me for. But could I build an instrument for it? I’m not so sure.

[More]

One thing the Internet is doing is requiring us to change and adapt social interactions, to create rules that work in the new environment. Society does this in order to control behavior so that the interactions are the most productive.

We see this in many of our day-to-day interactions. It is found in how line cheats are frowned upon, how we decide who goes first through a door, how we move to the right when going up some stairs. There are many social rules that we use to function smoothly.

Now not everyone follows all the rules but the rest of us sure notice when they are broken. I would guess that much of the road rage seen is due to the apparent breaking of social rules that may not actually be appropriate when in the car.

The Internet is also a new social environment and we are creating social rules just like anywhere else. These 10 rules of social media are a good start. They all help enhance what makes the web so powerful. As we gain more experience with this new social setting, we will do a better job of training each other how to behave.

Not that trolls and spam will disappear but, just as we do with someone who breaks some of our current social rules, we will do a better job of isolating and ignoring their behavior so it does not do as much damage.


Mashing up

Photo by foodistablog

One of the great things about openness and transparency is the ability for people to mash together various things to suit themselves. So, look at this:

Listening to: Death of an Interior Decorator from the album “Transatlanticism” by Death Cab For Cutie.

I added that with a single click in ecto, the blog editing software I use to create and publish posts. Ecto has a nice add-on that grabs the info from the song I am listening to and puts it in the post. I can set up templates with formatting so it has the links, etc. But the original template created Google search links. I simply remade the template so it links to iTunes.

I’m doing the same thing with Twitterfeed. This has allowed me to push blog posts from my different blogs (Spreading Science, Path to Sustainable and A Man with a PhD). Now I’m seeing if I can push posts to my Facebook account.

So, a simple posting can also copy the post to both Twitter and Facebook. It looks like I do a lot but it all comes from simply clicking one button. That is what open APIs and other aspects of the web allow us to do.

It all makes it easier for the right people to get the right information at the right time.

Using crowds to solve problems

crowd by James Cridland
Get Ready to Participate: Crowdsourcing and Governance:
[Via Confessions of an Aca/Fan]

Crowdsourcing and Governance

by Daren C. Brabham

It’s been three years since Jeff Howe coined the term “crowdsourcing” in his Wired article “The Rise of Crowdsourcing.” The term, which describes an online, distributed problem solving and production model, is most famously represented in the business operations of companies like Threadless and InnoCentive and in contests like the Goldcorp Challenge and the Doritos Crash the Super Bowl Contest.

In each of these cases, the company has a problem it needs solved or a product it needs designed. The company broadcasts this challenge on its Web site to an online community–a crowd–and the crowd submits designs and solutions in response. Next–and this is a key component of crowdsourcing–the crowd vets the submissions of its peers, critiquing and ranking submissions until winners emerge. Though winners are often rewarded for their ideas, prizes are often small relative to industry standards for the same kind of professional work and rewards sometimes only consist of public recognition.

Recognizing that not all creativity and innovation resides in-house, some organizations are looking for connections to outside innovators. New social tools allow them to make connections, through such sources as InnoCentive. When done well, these approaches can not only produce new ideas but help vet these ideas for suitability.

This approach can work in areas other than for-profit settings. Think non-profit biomedical institutions or government.

Though you’d be hard pressed to see them ever use the word “crowdsourcing,” one such example of crowdsourcing in governance is Peer-to-Patent. Begun in June 2007, Peer-to-Patent is a project developed by New York Law School’s Institute for Information Law and Policy, in cooperation with the U.S. Patent and Trademark Office (USPTO). The pilot project engages an online community in the examination of pending patent applications, tasking the crowd with identifying prior art and annotating applications to be forwarded on to the USPTO. The project helps to streamline the typical patent review process, adding many more sets of eyes to a typical examination process.

Another attempt to use crowdsourcing in public decision-making is Next Stop Design, a project with which I am involved that asks the crowd to design a bus stop for Salt Lake City, Utah. With Thomas W. Sanchez and a team of researchers from the University of Utah, we’re working in cooperation with the Utah Transit Authority (UTA) and funded by a grant from the U.S. Federal Transit Administration. On the Next Stop Design Web site, you can register for free, submit your own bus stop designs and ideas, and rate and comment on the designs of others. Launched on June 5, 2009, the project runs through September 25, 2009, and the highest rated designs will be considered for actual construction at a major bus transfer stop in Salt Lake City. Winning designs will be publicly acknowledged and included on a plaque affixed to the built bus stop.

It will take some changes in viewpoints but the ability of the public to directly engage important aspects of government should only enhance policy. Obviously, this approach could not be used in every area but careful positioning of the approach could have real consequences.

There is much potential for crowdsourcing in government, certainly as one of an array of social media methods quickly being embraced by all levels of government. President Obama has made his intentions with technology and transparency in government clear. His appointment of Beth Noveck, the New York Law School professor who launched Peer-to-Patent, as Deputy Chief Technology Officer for Open Government, makes his intentions very clear. I predict over the next two years we’ll see in the U.S. a rapid proliferation of government by the crowd, for the crowd. Get ready to participate.

It will be interesting to see if this approach also harnesses some of the social commitments seen in the Millennials. This generation is already connected and has shown some strong willingness to work on social needs. I think that the impact of these approaches may be greater in non-profit settings than in for-profit. By engaging people in the charitable work in ways that easily make them a part of the process, non-profits have an advantage that few for-profits do.


It’s the people, not the network

network by Arenamontanus
Too much networking?:
[Via Cosmic Log]

Open-source communities may suffer from “an overabundance of connections,” an information policy researcher suggests in the journal Science.

Are geeks guilty of groupthink? A network expert argues that less social networking would produce more radical innovation on the Internet.

[More]

This is a provocative statement, that some communities are too connected, thus reducing diversity and the ability to innovate. I would make the statement that the level of connectedness is more reflective of the type of people involved and indicates a community that is not properly constructed to permit innovations to rapidly traverse the community.

That is, the problem stems from the type of people involved, not the network itself.

I’ve written a bit about how communities process innovations, how they propagate and how they are adopted. Each community has its innovators, its early adopters, the early and late middle and its laggards. In most communities, the relative numbers of each of these 5 groups follow a bell curve. Roughly 16% are early adopters and innovators, 16% are laggards and the majority is in the middle.

[Figure: Diffusion of innovation]
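As a rough sketch of where those percentages come from: the five groups are conventionally cut from a normal curve of adoption times at one and two standard deviations from the mean. The cut points below are my assumption about that convention, and the function names are invented for illustration only.

```python
import math

def normal_cdf(z):
    """Cumulative probability of a standard normal variable at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Assumed cut points, in standard deviations from the mean adoption time.
# Earlier adopters sit to the left (negative side) of the curve.
GROUPS = [
    ("innovators", -math.inf, -2.0),
    ("early adopters", -2.0, -1.0),
    ("early majority", -1.0, 0.0),
    ("late majority", 0.0, 1.0),
    ("laggards", 1.0, math.inf),
]

def adopter_shares():
    """Return the fraction of a community that falls in each group."""
    return {name: normal_cdf(hi) - normal_cdf(lo) for name, lo, hi in GROUPS}

if __name__ == "__main__":
    for name, share in adopter_shares().items():
        print(f"{name:15s} {share:6.1%}")
```

Under these assumptions the innovators and early adopters together come to about 16%, the laggards to about 16%, and the two middle groups to about 68%.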

One of the main differences seen between these 5 groups comes from the number and types of social connections they make. The early and late majorities mainly only connect with themselves and others in the community. These are the greatest sources of groupthink. They are the incremental thinkers, those that get together and talk about how to make small changes. They listen to each other and provide mutual support. They are often skeptical of new things but incremental works well for them.

The key measure of the majority is that they will usually only adopt a new innovation when told to do so by someone, from the community, that they trust and respect. They do not like to be the first one in a community to adopt an innovation. They are comfortable with what currently works.

The innovator group, on the other hand, has a large number of connections outside the community. They bring in the odd ideas, the weird bits of information that can generate new ideas and innovations.

They are the ones who say “Hey, my Uncle Bob heard about someone who fixed a similar problem, only he used this really weird algorithm.” Innovators love to solve problems and search the world for data that can help them learn.

Now, innovators are generally not held in very high esteem by the community. They are disruptive and as often have ideas that are useless as they do ones that are useful. They love new stuff because it is new, not always because it is useful. They are seldom seen as community leaders and often have greater freedom, either financial or situational, that allows them to pursue the novelties they love. Because of their extensive outside connections, if their work is not supported by the community, they can often leave to find those communities that will support them.

A lack of innovators means that fresh, creative ideas are not easily brought into the community.

Early adopters are the important filters here. They often have enough outside contacts to be able to understand where the innovator is coming from. They are very good at figuring out which of the many ideas that the innovator tosses out are actually useful to the group. They’re the interface between the community and innovators.

Early adopters are usually community leaders. They are the ones that promulgate the great ideas that the innovators come up with to the rest of the community. By being right, by helping the community, they gain a lot of power and prestige.

So, the majority looks to the early adopters to push innovation and change, not the innovators. The latter are just too disruptive to the clean, stable processes preferred by the majority in the middle.

A lack of early adopters means that innovators are not easily supported by the community and that useful new ideas have a much harder time getting the notice of the majority.

There are not enough filters to properly present useful ideas to the community. Innovators simply appear disruptive. Useful new ideas do not traverse the community because there are not enough trusted people presenting creative ideas.

I would suggest that the problem is not the vast number of connections amongst the groups, that the problem is not the Internet. It is that these online groups may have coalesced in ways that diminish the power of this 5-group adoption curve. In most real life communities, at least the successful ones, the innovators and early adopters number about 16%. About 68% make up the early and late middle.

Perhaps these online communities have very different makeups. Perhaps the percentage of the middle is much larger, since it is now so easy to connect, and the middle feels much more comfortable connecting with those that already think like them.

In describing these networks, the author makes the point that they mainly connect to each other. This sounds exactly like a group of early and late majority. If a community is made up of mainly people like this, say 80%, then the lack of enough early adopters could have a strong effect on the adoption of innovation.

The early adopters are the gatekeepers for novel and useful ideas in the community. If the number of early adopters is lower than normal, the number of new ideas that can traverse the community is greatly decreased. Consequently, there will be less support for the innovators, who may very well go find other communities that they can innovate with.

The ease with which the Internet allows connections to be made means that innovators will have many more routes available to them. In real life, they cannot easily move beyond the community they inhabit. On the Internet, it is easy. So they may leave for greener pastures.

This may also pull along some early adopters, who, after all, like to be the ones who act as filters and to gain the community respect that comes from helping to disperse new ideas. This could result in a positive feedback loop that greatly decreases the numbers of innovators and early adopters, leaving a community of mainly the middle. This would seem to fit the description of the article.

It is the makeup of the humans involved in the network, not the network itself, that is the problem.


I would suggest that the key bottleneck to innovation in Open Source projects is the lack of a sufficiently high number of early adopters.

This would explain the lack of outside connections, as early adopters and innovators have the majority of these. Without early adopters to funnel their ideas, innovators will leave for greener pastures.

On the flip side, if there were enough early adopters, their ability to pull in innovators who have ideas that would help the community, the key aspect of an early adopter, would allow the flow of innovative ideas into the community.

So how to increase the number of early adopters, which will then attract innovators and permit novel ideas to traverse the community? Well, one could advertise on Craigslist, I suppose. Far easier would be to take the early adopters already in the community and find a way to increase their power, to artificially raise their numbers.

Many of the ideas suggested in the article, such as skunkworks projects, are really just ways to isolate a group from the community. This would tend to increase the relative numbers of innovators and early adopters. They will be drawn to new things like ants to honey.

These are ways to prime the pump, to create a situation in the community where the early adopters have a much larger impact with higher representation than they do in the general community.

But this is somewhat indirect. Why not utilize the metrics available in the network to identify who falls into each group? Some companies are already doing this, because the way an early adopter appears in a network differs from the way a member of the late majority does.

Making a greater effort to identify and accumulate early adopters in the community by using the Internet itself would be very informative. Increasing the impact of early adopters would attract and support more innovators, providing more ideas to the community. If the level of early adopters is less than expected, say under 10%, then efforts must be made to increase this percentage, either actually or relatively.
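Here is a minimal sketch of what such a metric might look like, built on the distinction drawn earlier: the majority mostly connects inside the community, while innovators and early adopters hold more of the outside ties. Everything in it is hypothetical, including the function names, the cutoff values, and the idea that a single ratio is enough. It is not a description of what any company actually does.

```python
from collections import defaultdict

def external_tie_share(edges, community):
    """For each community member, the fraction of their ties that lead outside.

    edges: iterable of (person_a, person_b) pairs, treated as undirected.
    community: set of names considered members of the community.
    """
    inside = defaultdict(int)
    outside = defaultdict(int)
    for a, b in edges:
        # Count the tie once from each endpoint's point of view.
        for person, other in ((a, b), (b, a)):
            if person not in community:
                continue
            if other in community:
                inside[person] += 1
            else:
                outside[person] += 1
    shares = {}
    for person in community:
        total = inside[person] + outside[person]
        shares[person] = outside[person] / total if total else 0.0
    return shares

def label_members(shares, bridge_cutoff=0.3, outsider_cutoff=0.6):
    """Attach a rough label to each member. The cutoffs are made up for illustration."""
    labels = {}
    for person, share in shares.items():
        if share >= outsider_cutoff:
            labels[person] = "possible innovator"
        elif share >= bridge_cutoff:
            labels[person] = "possible early adopter"
        else:
            labels[person] = "majority-like"
    return labels
```

Given a list of who talks to whom, `label_members(external_tie_share(edges, community))` returns a first guess at each member's group. A real analysis would need better measures than one ratio, but even this crude cut would show whether the bridging group is unusually thin.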

To bring in more early adopters would require a campaign of some sort to attract people with the right connections. Initially, this may not be easy. Better to artificially increase the relative numbers of early adopters.

So, take the early adopters that are already present and create a ‘new’ community, an artificial one, where their numbers would be much higher. Put them together, along with some innovators and let them go at it.

Again, this is kind of what is suggested in the article, but with much less discussion about why it might work. In the real world, early adopters and innovators are usually kept separate from the main group by putting them in places like Research. A difficulty with online networks is that there is often no defined process to isolate these people and thus increase their relative numbers to the point where their talents are actually useful.

In the large, efficient networks that are possible using Internet technology, the early adopters and innovators may get swamped out, becoming too small a percentage to actually effect change.

The solution is to find ways to identify and isolate them from the community but in ways that use their important attributes to help the group.


An interesting view of IP issues

happy by Pink Sherbet Photography
Potential Confusion Avoided – rPath Video:
[Via Common Craft – Explanations In Plain English –]

Yesterday, we posted about a video by a company called rPath with the title “Cloud Computing in Plain English.” Read about it here.

The blog post came as a result of our unsuccessful efforts over six months to illustrate to rPath that their video, because of the combination of the “in Plain English” title and use of paper-cut outs on a whiteboard, was a source of confusion for Common Craft customers. Because rPath insisted on using legal means to communicate their stance, we chose to take a different route that didn’t involve lawyers.  We simply asked our fans to help us reduce confusion.

Over the course of the last 24 hours, we’ve learned a lot. First, let me say that we couldn’t have imagined the level of your response. We are very lucky to have people around us who feel passionately about helping us protect our brand. Within a couple of hours of the blog post, the message to rPath was clear and as you’ll see below, we have reached a resolution.  We thank you.

[More]

So here we have an established organization with a very easily identified image having to deal with a new company using the same approach, possibly causing confusion amongst clients. This is what trademark is supposed to deal with. But in some cases, the look and feel of the approach is also important. In the old days this might have taken a lot of lawyers and money to resolve.

Instead of having to deal with lawyers and pay them lots of money, the community responded to a request for help. It was able to deal with an organization that threatened the health of the community.

Because of the openness and transparency of the Internet, the community took action that allowed everyone else to see what was going on. It then resulted in an accommodation that works for everyone.

All without paying lawyers. While this approach might not work everywhere or every time, it is a nice demonstration of how a connected community can deal with some IP issues.
