User:WillWare/Automation of science

From BarCampWiki

The Adam robot

In a laboratory at Aberystwyth University, Wales, a robot scientist called Adam is trying to find the genes responsible for producing some important enzymes in yeast. Based on existing knowledge, Adam is coming up with new hypotheses and designing experiments to test them. He carries them out, records and evaluates the results, and comes up with new questions.

http://bit.ly/bGWtiG

http://bit.ly/aqYhRM

http://bit.ly/oEeID

http://en.wikipedia.org/wiki/Adam_%28robot%29

Lab robotics in science is nothing new.

Generating new scientific hypotheses is new, and designing experiments to test them is new.

Adam's limitations

Adam works in a very narrow domain of science.

Adam works alone and does not participate in the broader scientific literature.

Eureqa

Eureqa is software that takes a set of data and uses a genetic algorithm to find a terse mathematical function that fits the data. This is called symbolic regression.
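To make the idea of searching for a terse formula concrete, here is a minimal sketch in Python. It is not Eureqa's method: instead of a genetic algorithm it simply enumerates every small expression built from x, the constants 1 and 2, and the operators +, - and *, then keeps the one with the lowest error plus a small penalty for size. The function names, the building blocks, and the sample data are all assumptions made for illustration.

```python
import itertools

LEAVES = ["x", 1, 2]          # the variable and two small constants (illustrative choice)
OPERATORS = ["+", "-", "*"]   # binary operators allowed in a formula


def enumerate_expressions(depth):
    """Return every expression tree whose nesting depth is at most `depth`."""
    if depth == 0:
        return list(LEAVES)
    smaller = enumerate_expressions(depth - 1)
    combined = [(op, left, right)
                for op, left, right in itertools.product(OPERATORS, smaller, smaller)]
    return smaller + combined


def evaluate(expr, x):
    """Evaluate an expression tree at a given value of x."""
    if isinstance(expr, tuple):
        op, left, right = expr
        a, b = evaluate(left, x), evaluate(right, x)
        if op == "+":
            return a + b
        if op == "-":
            return a - b
        return a * b
    return x if expr == "x" else expr


def size(expr):
    """Count the nodes in an expression tree; smaller means terser."""
    if isinstance(expr, tuple):
        return 1 + size(expr[1]) + size(expr[2])
    return 1


def squared_error(expr, xs, ys):
    """Sum of squared differences between the formula and the data."""
    return sum((evaluate(expr, x) - y) ** 2 for x, y in zip(xs, ys))


def fit(xs, ys, depth=2, size_penalty=0.001):
    """Pick the candidate with the lowest error, preferring terse formulas."""
    candidates = enumerate_expressions(depth)
    return min(candidates,
               key=lambda e: squared_error(e, xs, ys) + size_penalty * size(e))


def to_text(expr):
    """Render an expression tree as a readable formula."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return f"({to_text(left)} {op} {to_text(right)})"
    return str(expr)


# Made-up data generated from y = x^2 + 2x + 1; the search is not told this.
xs = [-2, -1, 0, 1, 2, 3]
ys = [x * x + 2 * x + 1 for x in xs]

best = fit(xs, ys)
print(to_text(best), "error =", squared_error(best, xs, ys))
```

Exhaustive enumeration only works for very small formulas, because the number of candidates explodes with depth. That is the reason tools of this kind fall back on heuristic searches such as genetic algorithms.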

http://bit.ly/9FWRbf

The form of the function may suggest a causal mechanism. That in turn might suggest a hypothesis, then predictions, and then the design of experiments to test those predictions.

http://bit.ly/bbh3dZ

Let's see an example

http://bit.ly/buOSd9

http://bit.ly/9Xi5oL

How powerful is Eureqa? Well, it can derive Newton’s Second Law from the motion of a pendulum, without any input on the physical laws of mechanics, in just a few hours. Other researchers hope Eureqa will find the mathematical relations in their own work, which is much more complicated than simple Newtonian physics. If successful, Eureqa could speed up scientific research.

Machines as theoreticians

http://bit.ly/9S19ch

http://bit.ly/aIhu05

We have lots of machine experimentalists (lab robots). But we don't yet have a lot of machine theoreticians.

Create machine-readable languages and ontologies for scientific hypotheses, predictions, and experiments.

Then machines can fully participate in the scientific process.

Their insights will be different from ours. They won't replace human scientists. Instead they will open new vistas of scientific inquiry.
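As a rough picture of what machine-readable hypotheses, predictions, and experiments might look like, here is a minimal Python sketch. It is not an existing standard or ontology. The class names, the fields, and the placeholder names gene_X and enzyme_Y are all hypothetical, chosen only to echo the yeast example from the Adam section.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class Hypothesis:
    identifier: str
    subject: str    # e.g. a gene (placeholder name)
    relation: str   # e.g. "encodes"
    obj: str        # e.g. an enzyme (placeholder name)


@dataclass(frozen=True)
class Prediction:
    hypothesis: Hypothesis
    intervention: str       # what the experiment changes
    expected_outcome: str   # what should be seen if the hypothesis holds


@dataclass
class Experiment:
    prediction: Prediction
    observed_outcome: Optional[str] = None  # None until the experiment has run

    def verdict(self):
        """Compare the observation with the prediction."""
        if self.observed_outcome is None:
            return "untested"
        if self.observed_outcome == self.prediction.expected_outcome:
            return "consistent"
        return "refuted"


# Illustrative records only; no real gene or enzyme is implied.
hypothesis = Hypothesis("H1", "gene_X", "encodes", "enzyme_Y")
prediction = Prediction(hypothesis, "delete gene_X", "no enzyme_Y activity")
experiment = Experiment(prediction, observed_outcome="no enzyme_Y activity")

# A plain JSON record is something another program could read back in.
record = json.dumps(asdict(experiment), indent=2)
print(experiment.verdict())
print(record)
```

The point of the sketch is that each record links back to the one it tests, so a program can follow the chain from an observation to the hypothesis it bears on. A real vocabulary would need far richer terms than exact string matching.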

Machine-readable science language?

Probably best to start with things we already have.

Semantic web technology -- RDF / XML, inference engines, ontologies

Data mining and machine learning -- finding patterns in data
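The semantic web idea of facts plus an inference engine can be sketched in a few lines of plain Python. Facts are stored as subject-predicate-object triples, and two rules are applied repeatedly until nothing new appears: subclass relations are transitive, and an instance of a class is also an instance of its superclasses. This is a toy under stated assumptions, not RDF itself. The predicate names "type" and "subClassOf" and the class names are made up for illustration, and a real system would use an RDF library and a proper reasoner.

```python
def infer(triples):
    """Forward-chain two simple rules until no new facts are produced."""
    facts = set(triples)
    while True:
        new_facts = set()
        for (a, p, b) in facts:
            for (c, q, d) in facts:
                if b != c:
                    continue
                # Rule 1: subClassOf is transitive.
                if p == "subClassOf" and q == "subClassOf":
                    new_facts.add((a, "subClassOf", d))
                # Rule 2: instances inherit membership in superclasses.
                if p == "type" and q == "subClassOf":
                    new_facts.add((a, "type", d))
        if new_facts <= facts:
            return facts
        facts |= new_facts


# Hypothetical facts loosely based on the talk.
stated = {
    ("Adam", "type", "RobotScientist"),
    ("RobotScientist", "subClassOf", "LabRobot"),
    ("LabRobot", "subClassOf", "ScientificInstrument"),
}

closure = infer(stated)
for triple in sorted(closure - stated):
    print("inferred:", triple)
```

Only the three stated facts were entered by hand. The rest are derived, which is the kind of bookkeeping that semantic markup would let machines do across a body of literature.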

Afterwards

I was very fortunate to (A) have very little to say myself, so that I quickly got out of the way for others to discuss, and (B) have some very smart people in the room who got the idea immediately, some of them able to give the scientist's-eye view of this idea.

Discussion centered on a few topics. One was how comprehensive a role computers would play in the entire scientific process. There seemed to be consensus that computers could easily identify statistical patterns in data, and could perform symbolic regression in cases of limited complexity and not too many variables, but that the creation of scientific theories and hypotheses requires intuitive leaps that a machine can't make. Personally I believe that's true, but I imagine that computers might demonstrate an ability to make leaps that we humans can't, and I have no idea what those leaps would look like because they would be the product of an alien intelligence. If no such leaps occur, at least the collection of tools available to human scientists will hopefully have grown in a useful direction.

Another topic was the willingness of scientists to provide semantic markup for research literature. Only experts in the field are qualified to provide such markup, since it requires an in-depth understanding of the field as a whole and of the paper's reasoning process in particular. It's also likely to be a lot of work, at least initially, and there is as yet no incentive to offer scientists in exchange for such work. The notion of posting papers on some kind of wiki and hoping that semantic markup could be crowd-sourced was quickly dismissed. Crowd-sourcing doesn't work when there is a very precise correct answer and the number of people with that answer is very small.

There has been a lot of Twitter traffic around Bar Camp Boston, and I was able to find a few comments on my talk afterward. It looks like people enjoyed it and found it stimulating and engaging, so that's very cool. It turned out to be a good limbering-up for an immediately following talk on Wolfram Alpha. I found one particularly evocative tweet:

Has anyone approached a CS journal to have their content semantically marked up? #BCBos @BarCampBoston

Thinking about that question, I realized that computer science is the right branch of science in which to begin this work. The way to make it most palatable to scientists is to publish papers that demonstrate three things: how to do semantic markup as easily as possible at the time of publication (not as a later retrofit), how a scientist can benefit himself or herself by doing that work, and how to do interesting things with the markup of papers that have already been published. My quick guess is that some sort of literate programming approach (wiki) is appropriate. So there is lots to think about.

If you attended my talk, thanks very much for being there. I had a lot of fun, and hope you did too.
