Chemistry on the Web: How Can We Crowdsource Chemistry to Solve Important Problems?

Posted July 25, 2019



>> CAROLE: I am here to introduce Matthew Todd,
who's here from the University of Sydney to talk about Crowdsourcing Chemistry and the
Tropical Disease Initiative. So please welcome, Matthew Todd.
>> TODD: Okay. So, first, I should say… >> CAROLE: You don't need to use this one.
>> TODD: Oh, is it one of these two? Okay, I can stand here on this one, all right. So,
first, I should say, thanks, Carole, for introducing me and hosting the visit. Thanks also to
Christy Burner, who ultimately set up the visit. And, yeah, I'm an organic chemist.
So hopefully I'm going to do a bit of Chem 101 to make everyone familiar with what I'm
talking about. But also, I'm going to be talking a little bit about Open Science. And the purpose
of my visit today really is to preach that we need tools and applications. So, one of
the reasons for coming to Google is that your apps are very intuitive and so on, and we
need things for conducting science in the open which we currently don't have, and one
of the reasons we don't have them is because people haven't designed very intuitive user
interfaces for things to be used by scientists to record their work in the lab. The other
real reason for coming to Google today is because I recently bought a Nexus One phone,
and I lost the little black sleeve that comes with it and I'm hoping to get a new one, so,
if you have one. Okay. So, there's going to be a little bit of chemistry and a little
bit of stuff about science and how we do it and maybe how we should do it. So it's
quite a wide range of things and obviously, if you have any questions, please just stick
out your hand and we'll go. Okay, so a little bit of–I think I've still got animation in here,
a little bit of chemistry at the start. Okay. So I'm an organic chemist. I teach and
research organic chemistry at the University of Sydney in Australia. And what does that
mean? Well, we make molecules. I have graduate students and post-docs and undergraduates who
make molecules. So we put atoms together in specific ways. And one of the interesting
things about this is that as you do chemistry for a long time and you learn a lot about
organic chemistry, you learn about how to do this and you become proficient at doing
this both in your mind and also with your hands. So you become good at putting things
together, bringing atoms together in specific ways to make complicated molecules and this
can be done in several different ways. What we do is we buy things from commercial catalogues
and then we use chemistry in a rational way to put things together. And we make important
molecules that maybe are useful as pharmaceuticals or agrochemicals or fragrances and so on.
And to do that, sometimes you want to make a really complex molecule which has certain
properties, and you have to know how to do that, how to put an atom here, an atom there
in specific ways. And it can be quite complicated. In some cases, as shown on the top here,
the molecule on the top-left might be obviously related to the things that
you can buy. So on the right-hand side there, you've got things in the box, which you can easily
buy, and you put these things together in a kind of Lego manner and build up a complex
molecule, which maybe has some nice property. In other cases, there may be something available
from nature, like the molecule in the middle on the right, which you can easily transform
into something that you might want. So you can buy that in large quantities from some
natural source and you can convert it into something that you might want to use. So these
are kind of ways of using nature initially to make things which are kind of complicated.
But in many cases, the molecule that you might want, for example, that thing on the bottom-right
here is very–it's structurally unrelated to anything that you can buy or find. And
in order to make that you might have to think in a very lateral way about how you can buy
things and combine those things to make a complicated molecule. And frequently, we find
this, that some molecule that has some potent biological property is not simply
made by a logical combination of starting materials. And this is a creative
process, so a lot of people say that there's a lot of art in organic synthesis. To make
a complex molecule, you have to have a deep appreciation for the subject and think about
it a lot and perceive hidden patterns. Now, the reason why this is interesting for me
was because this struck a chord, as parallels between this and a chess game have been made before.
A chess game also has certain rules that you follow. And in order to get to some final
point, some complicated position or some winning position, you have to follow certain paths,
and the number of paths diverge from the starting point combinatorially. So the number of possible
games of chess, obviously, is a huge number. The number of different ways of combining
small molecules to get larger molecules is also colossal. So the question that came to
me was, well, "Can we analyze how to make a big molecule with a computer?" And, well,
the answer is obviously yes, but no one's done it. Well, people have tried, but the
progress is quite slow. The contrast that struck me is that Deep Blue, obviously a computer
program, beat Garry Kasparov at chess. This was a defining moment for A.I., I guess,
for computer power and also software development. Something as complicated as chess could be
mastered by a computer, which beat the reigning world champion; well, this hasn't
happened in organic chemistry so far. For some reason, people have not designed software
yet (they have made inroads, certainly, but haven't designed software yet) that can really
take on the masters, the big professors at various universities around the world, at
putting molecules together. So this–I wrote this article about this, appealing for maybe some
progress and the application of modern computational techniques to making organic molecules. And
so far this hasn't happened. Now the thing I'm talking about today is related to this,
why haven't we–why haven't people developed tools that help scientists to do science online
and in the open? That is the–that is I guess the message, why can't we do that yet? Okay,
so away from chemistry for a second, I have several projects in my lab and one of them is working in
the area of Neglected Tropical Diseases. Now, there are various diseases in the world;
cancer, AIDS, big diseases; and malaria, too. There are some which are neglected, which
means purely that the amount of money being spent on them and the amount of time being
spent on them is relatively small compared to their impact socially. And there are several
examples, here's a graph from a website that lists several that usually have rather complicated
names to say, but the one that I'm interested in is this thing called schistosomiasis, which
used to be called bilharzia. It's a parasitic disease carried primarily in the regions shown
on this map. So it's mainly a sub-Saharan problem. And it's a particularly nasty disease.
It's a parasitic disease: a parasite infects you and lays eggs in you, and these are excreted
into fresh water, and then the parasite matures and is taken up by
a snail in fresh water, and then the parasite matures again in the snail, and the snail
excretes it into fresh water, and then you pick it up again. There's a cycle rather like malaria
with a mosquito as the intermediate host, but instead, now, you have a snail and fresh water.
And it's unpleasant for several reasons. One is that the egg burden in your major
internal organs can become very bad and you begin to get very sick. You don't necessarily
die from this disease but it affects you by morbidity, so it makes you very sick, and
it means that children, for example, who get the disease can't develop properly.
They tend to have stunted growth and they're very tired, they're not going to go to school
and this kind of thing. So neglected tropical diseases like these are often measured by something
called a DALY, a Disability-Adjusted Life Year, which doesn't just take into account the
number of people who die, but tries to quantify the impact of a disease on a society.
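As a rough illustration of how a DALY is put together: in the standard formulation it is the sum of years of life lost to premature death (YLL) and years lived with disability (YLD), so a disease that rarely kills can still carry a large burden. The sketch below uses that simplified formula with no age weighting or discounting. The function name and every number in it are hypothetical, chosen only to show the arithmetic; they are not schistosomiasis data.

```python
def dalys(deaths, years_lost_per_death, cases, disability_weight, years_with_condition):
    """Simplified DALY = YLL + YLD (no age weighting, no discounting)."""
    # Years of life lost to premature death.
    yll = deaths * years_lost_per_death
    # Years lived with disability, scaled by a 0..1 severity weight.
    yld = cases * disability_weight * years_with_condition
    return yll + yld


# Hypothetical disease: few deaths, but many people living with moderate illness.
burden = dalys(
    deaths=100,
    years_lost_per_death=30,
    cases=1_000_000,
    disability_weight=0.02,
    years_with_condition=5,
)
print(burden)
```

With these made-up inputs almost all of the burden comes from the disability term rather than the mortality term, which is the pattern described for diseases that mainly cause morbidity.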
And by that measure, schistosomiasis is actually a big problem. More than 200 million
people have the disease and another 200 million people are at risk, pretty colossal numbers
actually. Now this is unpleasant but thankfully there is a good drug for it. As with many
tropical diseases, actually, there are drugs available to treat these things. They're not
tremendously good drugs necessarily but they're inexpensive, small, easy to make, and a few
people around the world have been suggesting that we really need to focus on this, that
maybe the drugs aren't fantastic but at least they're there and we can use them and that
we can distribute them for a low price. So for example in schistosomiasis, the Gates
Foundation way back in 2002, I think, but maybe I'm going to be corrected on that, funded
something called the Schistosomiasis Control Initiative which operates out of Imperial
College in London. The guy who heads it up is Professor Alan Fenwick. Now, his idea was
that we have one drug available to treat schistosomiasis, and I'll come to that in a minute. And what
we really need to do is distribute this drug enormously widely to reduce the morbidity
of the infected populations. So we take the drug and we just distribute it to whole populations
of countries, and this is now happening in select countries in Africa, and other countries
in Africa have also begun their own national control programs. So there's an article here
in Public Library of Science Neglected Tropical Diseases, a whole journal devoted to neglected diseases: "Africa's
32 Cents Solution for HIV/AIDS." It turns out also that this drug used for schistosomiasis
can also be used to try and slow the transmission of HIV/AIDS in Africa so there's renewed interest
in the drug also from the position of HIV. So in general, the idea is that we try and
use things that we already have. So here is the drug that's used for schistosomiasis.
Now, this is a very small molecule. These–for those of you who dropped out of chemistry,
organic chemistry–the lines are bonds, right? So when a line changes direction, it's a
carbon atom. And where there seem to be double lines, that means there are two bonds between
a couple of the atoms. The oxygen's obviously the O, and the nitrogen is N. They have double
and single bonds, there are rings there, but this is a small molecule, this is a very small
molecule. And this drug was found through a screen. It's not a naturally occurring compound.
It was found through a screen of similar compounds. Initially, actually, for a similar disease
in cattle, and I don't know the story of how it was worked out that it helps people, and
I don't necessarily want to know that story, but it was found out and the drug was developed.
Now, this is a very small molecule and can be made cheaply. So this drug, through market forces
actually, is now made by a chemical company, Shin Poong, out of South Korea, who supply the Schistosomiasis
Control Initiative with praziquantel, this drug. And it's quite striking that
this drug is now available for around 12 or 11 Euro cents per gram, which is absolutely
remarkable. If you look up most starting materials you might want to buy to try and make this
molecule, they will be more expensive than that. So really, it has been optimized
and optimized; it's off patent, obviously. The drug has been optimized and optimized and
is now available for a very low price. So this is great news. Sadly, the news maybe
is actually too good. When a drug is this cheap and is being distributed to this many
people and killing this many parasites, you have a problem. If you–evolution tells us
that if you try and kill something, it's going to do something to try and stop being killed,
right? So the parasite presumably is going to become resistant to this or develop tolerance.
And this is a big issue for schistosomiasis because there are no other drugs available
to treat this disease. So we're in a very dangerous situation, we're using a drug to
treat literally millions of people. Sometimes whole countries and villages and cities are
being treated with this drug and there are no backups for when this drug fails. Now,
there are obviously some people who will say that resistance will not appear and others
who say it will appear, and there's this debate going on. My take on this is it's better to
be safe than sorry, so a lot of the research that we're doing in my lab to do with schistosomiasis
is to, for example, develop new analogs of this drug before they're needed. So drugs with
a slight modification. It turns out in medicinal chemistry if you have a drug like this and
suddenly it becomes ineffective through the development of resistance, you can change
a little bit of it. You can introduce a little group on the left, a little group on the right
and you might be able to regain potency. So we were trying to look at–thinking about
simple modifications to the structure which is, I guess, that's what we do, we make molecules.
Another thing we might want to do is try to find how the drug works. And still after more
than 30 years of use, the mechanism of action of this compound isn't known, which is quite
extraordinary; apparently, quite a common situation in parasite medicine. However,
there's one thing that we can do right now. And we were in touch with the World Health Organization
to try and discuss how we can keep this drug good for as long as possible. And the World
Health Organization's perspective is that we need to try and use this drug maximally while
it's still good. And there's a simple thing we can do to try and postpone the development
of resistance. And what you do is you increase the dose of the drug, right? So you give more
of the drug to try and kill off partially resistant parasites. One of my students was
giving a talk about this and accidentally, he said that what you should do is give more
of the drug to kill off the partially resistant people. You can really get the wrong message.
The point is the parasites become resistant, and you don't want that to happen, so
if there are some parasites which are partially resistant, you want to try and kill
those off by increasing the dose by 15-20%. Unfortunately, the amount of drug you have
to take is large. This is a 600 milligram pill. And the field workers who've come to
conferences at which I've been tell me that compliance is a real issue. If you're trying to give
literally tens of millions of people a drug, sometimes in very remote areas, if the pill
is too big or tastes bad, it won't be taken. People may be naturally suspicious of artificial
compounds, right? What we call pills. It also turns out that praziquantel tastes terrible.
So there's another compliance issue. So if you can't necessarily make the pill bigger
but you want to increase the dose, how do you do that? Well, we as chemists, we see
a way of doing this. Okay. This–just a little bit more Chem 101 something–this is an issue
in organic chemistry. Let's take a few minutes out. This is an issue of organic chemistry
that I wish the public understood. It's one of the most profound and beautiful aspects
of the universe. And this is not known generally by the public, a fundamental feature of organic
molecules. Okay, so organic molecules are–they're three-dimensional, unlike some of the drawings
which I put up of two-dimensional things, they're three-dimensional things, they're
real things with depth and structure. And for larger molecules like proteins and DNA,
this structure becomes very large; they're three-dimensional objects. It turns out that
three-dimensional objects can have this property called chirality. And this is all about a
mirror image. So if you imagine some object, let's take a symmetrical object like a ball,
and you take a mirror image of that object, the ball you generate in the mirror looks
just like the thing you started with. So the two mirror images are superimposable,
they're the same thing basically. Other familiar objects don't have this property. The mirror
image is not superimposable back on itself. Actually, the majority of things you see in
nature are asymmetric like this. Your hands are a good example. Your hands ostensibly
look the same but if you try and put one on top of the other, you can't; they're not
superimposable, which is why your right hand doesn't fit in your left glove, quite frankly.
This is very important in nature because it turns out the molecules in nature, almost
all molecules in nature apart from water and ions and things, have this property, that they
have a certain three-dimensional orientation in space. And almost all the molecules in
nature have one orientation, not the other one. So if you imagine just for a second that
you're walking along the street and you meet someone you know, and you want to shake their
hand. That works one way around, so if I give my right hand to someone and they use their
right hand, we have a good handshake, right? If one is using their left hand then the handshake's
all wrong. And this is an important thing. When two molecules like these meet, you can
have different kinds of interactions. And it's very important that we get it right, so
for all of these drugs that we now take which are approved by the FDA, you have to define
exactly the three-dimensional arrangement of atoms in space and it can't be a mixture
of the two. This was made tragically obvious by the thalidomide story.
Thalidomide is shown in those two structures on the right of the screen. And the difference
in the structure really is only that in one case, see where the nitrogen is, you kind
of have a hashed line; that implies that that atom is behind the rest of the molecule. And on
the right-hand molecule, the nitrogen has a kind of thick wedge going to it; that implies
that that half of the molecule is in front of the rest of the molecule. These two things are mirror
images, they're not superimposable. And if you take one, it has a very different effect
from the other. So it turns out that one acts as a sedative, they were using it for morning sickness,
I think, but the other molecule inhibits the formation of limbs on fetuses.
And so this was tragically realized when these babies were born deformed through this drug.
And since then, it's been definitely required that we have to specify exactly what we give
people when they take drugs. It turns out actually this molecule interconverts between the
two when it's in the body so it's not that easy. But the principle remains the same,
that one of them is very bad for you and one of them isn't. This has rather nicer implications
in the case of the molecules on the bottom-left here, this is Limonene. And the reason I've
got this picture of my wife up there holding lemons and oranges is, one is a picture of
her and one's a mirror image and there's lemons and oranges in both hands. One of these molecules
is responsible for the smell of lemons and one is responsible for oranges, so they're
two mirror image molecules. If you smell one, you get lemons, if you smell the mirror image
molecule, you get oranges. So it's very important that we have molecules that specify exactly
the three-dimensional arrangement in space. Now, why am I telling you this? Well, probably
because I think it's the most beautiful thing, one of the most beautiful things I've ever
seen. That it is a very profound thing about the way the world is. And I'd love to explain
this more widely but of course I'm in a job. However, this is relevant to Praziquantel,
the drug that I'm talking about. So the structure I gave you before didn't have that little
wiggly H on the top. Now that little–that carbon in the middle there, where the H is
attached, is very special. It's got four different things attached to it. And that
makes a small molecule chiral. It means that in one case, if you imagine the hydrogen that's
on the left there, it would have a kind of wedge and it's coming towards you. And then on
the right side, you've got the hydrogen going away from you; these two molecules are mirror
images. You don't really see that because I've turned it around but they're mirror images.
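One way to make the mirror-image idea concrete is with coordinates. The sketch below is only an illustration, not a chemistry toolkit: it takes three hypothetical bond directions leading from a central carbon to three of its different groups and computes the sign of their scalar triple product, which acts as a "handedness". Rotating the arrangement leaves that sign unchanged, while reflecting it in a mirror flips it, so no amount of turning the molecule around converts one form into the other. The function names and coordinates are assumptions made for the example.

```python
import math


def handedness(a, b, c):
    """Return +1 or -1: the sign of the scalar triple product a . (b x c)."""
    cross = (
        b[1] * c[2] - b[2] * c[1],
        b[2] * c[0] - b[0] * c[2],
        b[0] * c[1] - b[1] * c[0],
    )
    triple = a[0] * cross[0] + a[1] * cross[1] + a[2] * cross[2]
    return 1 if triple > 0 else -1


def mirror(v):
    """Reflect a vector in the yz-plane (flip the x coordinate)."""
    return (-v[0], v[1], v[2])


def rotate_z(v, angle):
    """Rotate a vector about the z-axis by `angle` radians."""
    ca, sa = math.cos(angle), math.sin(angle)
    return (ca * v[0] - sa * v[1], sa * v[0] + ca * v[1], v[2])


# Hypothetical bond directions from the central carbon to three distinct groups.
groups = [(1.0, 0.0, -0.3), (-0.5, 0.87, -0.3), (-0.5, -0.87, -0.3)]

original = handedness(*groups)
rotated = handedness(*[rotate_z(v, 1.0) for v in groups])
mirrored = handedness(*[mirror(v) for v in groups])

print("rotation keeps handedness:", original == rotated)
print("mirror flips handedness:", mirrored == -original)
```

The same check only makes sense when the groups are all different; if two of them were identical, swapping them would give back the same molecule and the centre would not be chiral.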
The one on the left is the drug that works. And the one on the right, doesn't do anything.
In fact, it has mild side effects. And in fact, the one on the right that doesn't do
anything is the one that tastes bad. So we just want the one on the left, that's it.
How easy is that to do? Why don't we just–instead of making both and giving both to people,
so it tastes bad and there's too much drug, why don't we just give the one on the left
which has the right orientation? Well, that's actually quite difficult. So, these molecules
are difficult to make when you have to specify exactly the orientation in three-dimensions
of where all the atoms are. And to give you a sense of that, I guess if you think about
the ball here on the bottom-left, it's a symmetrical structure. It's very easy to make something
like this because you kind of like have a spinning wheel and it's simple to make something
that's symmetrical and round, right? In the middle plugging plus one, I wanted–on the
board a plus one, disclaimer, this is a more difficult thing to make. This is–it's not–sorry,
it's still symmetric but it's less symmetric than the ball. Suddenly you've got to make
the round part, which is quite easy, but putting the handle on, you're going to have to do that by
hand. This is a more complicated structure. When you get to asymmetric structures like
this Rodin sculpture, that's actually very difficult to make, right? Because you've got
to, with your hands, put things in various different places. So the less symmetric something
is the more difficult it is to make, I guess. And a lot of molecules in nature are extremely
asymmetric. They have little bits here and there that you have to install, well,
kind of by hand, except you can't because the molecules are very small, so how do we do this? Well, of
course, we have to design other molecules that act as our hands and install things in
certain places. It's very demanding. It's an area called asymmetric synthesis in organic
chemistry, and people devote their whole careers, like me too, to this area.
How do we make molecules in three dimensions? So it's very demanding. And so, unfortunately,
Praziquantel has this feature. And if you want to make one rather than the other, it's
difficult. Making both at the same time, it's very straightforward, easy. But making one
rather than the other is difficult. So this is where we were a few years ago. We thought,
well, the World Health Organization wants just the one active form of the drug, it's
called an enantiomer, the active enantiomer of the drug, they don't want an inactive one.
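The arithmetic behind wanting only the active form can be sketched in a few lines. The 600 mg pill size is the figure quoted earlier in the talk; the helper names are hypothetical, and the enantiomeric excess formula is the standard definition (the excess of the major mirror-image form over the minor one, as a fraction of the total). The sketch assumes the active form is the major one.

```python
def enantiomeric_excess(major_mg, minor_mg):
    """ee = (major - minor) / (major + minor): 0 for a 50:50 mix, 1 for a pure form."""
    return (major_mg - minor_mg) / (major_mg + minor_mg)


def active_mg(pill_mg, ee):
    """Mass of the active form in a pill, assuming the active form is the major one."""
    return pill_mg * (1 + ee) / 2


racemic_pill = 600  # mg, both mirror-image forms in equal amounts

print(enantiomeric_excess(300, 300))  # 0.0
print(active_mg(racemic_pill, 0.0))   # 300.0
print(active_mg(300, 1.0))            # 300.0
```

Under these assumptions a 300 mg pill of the pure active form carries the same active dose as the 600 mg mixed pill, and even a pill raised by 20% to 360 mg would still be well under the current size.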
They want that because then they can reduce the pill size by half. And the pill doesn't
taste bad. It's smaller and you can increase the dose even a little bit. So it's a much
more effective pill. The problem is that as soon as you're trying to make one enantiomer
rather than the other, it becomes an expensive thing. People like me have to get involved
and I have to think about ways of doing that. How do you install that hydrogen on one face
of the molecule rather than the other when a molecule is so small you can't put it there with your
hands? So we were thinking about this, and unfortunately, the drug is already very cheap,
so how can I justify assigning academic resources to reducing the price of something? I would
be fired if I did that, right? That's not academic work, to try and grossly reduce the price
of something, interest will be out in several years. This is not a problem that is solvable
by academia by any means. We're obviously in the business of trying to
get out high-impact research in new frontiers. We're not in the business of reducing the
cost of anything. On the other hand, if we turn to industry, they would not be interested
at all in this problem because there's no money to be made in tropical diseases. There's
no money to be made in taking an already cheap drug and trying to modify it a little bit. So,
there's no real market value for that. This is already a very, very cheap process to make
the combination of the two enantiomers. So, we were left with this problem a few years
ago. I was thinking, well, this is an important public health problem that the World Health
Organization would like to solve. We can't solve it with academia and we can't solve it
with industry. And this seems like an impasse, right? What do we do? If I try to–this graph
is meant to indicate that if I try to assign people to increase the ee, that would be the
enantiomeric excess, which is a measure of how much of one of those molecules you have
rather than the other, and you try and increase it by piling dollars into the program, it doesn't
really–you get to this point where you can't assign any more resources to this problem.
You reach a breaking point where you say, I simply can't justify this anymore. And simply,
as you assign more people to this problem, you think, "Well, is this really worth our
while? How do we do this in traditional models? We can't do that." So in the case of this–we
felt, well, I was actually on my honeymoon at the time and I was thinking, "How do we
solve this problem?" What I need to do is try and collaborate with as many people as
I can. So this was new to me a few years ago. And it turns out that of course, well, you
guys know a little about this. You guys know all about Open Source things. The idea of
Open Source is being somehow illustrated by this comparison between a cathedral and a
bazaar. So in academic research in chemistry, by and large we operate on the cathedral model.
I'm the Professor in charge of students and I have my resources and my grants. And we
work usually pretty much in isolation using the supposed intelligence of me and my students
to try and solve problems. We're this autonomous unit. We do collaborate occasionally, we select
people, but we are a closed unit. And we were thinking about how to try to solve this problem
and we couldn't. The contrast to a cathedral is the bazaar, where anyone can contribute
and everyone's opinion is valid and you listen to as many people as possible to try to solve
this problem. Now, you guys will know a lot more than I do. So this is new to me and
I haven't read the requisite literature about this but this was the essential contrast that
we were seeing then. Basically, we don't operate in science on the bazaar model, we use the
cathedral. So we don't tend to discuss problems openly with strangers and a large number of
people we don't know and try and get a solution through the community, this doesn't really
happen. You tend to publish papers in academic journals and people might respond to that.
But it's a slow process, there's not direct discussion between people unless you actively
collaborate. These are very different models. And as I understand it things have gone very
well for the bazaar model and Open Source. So, from my naïve non-computer science perspective,
it seems as if projects like Firefox, Chrome, Wikipedia, these things have gone extremely
well, extremely powerful programs; they develop really quality products. I think a lot of
people outside the movement tend to associate Open Source with
endeavors which are purely done by volunteers and which are not funded. And of course this
is wrong. From what I hear, a lot of projects that are Open Source have involved a funded
kernel of activity to which people then respond and help out. The example that's maybe
relevant here is, as I understand it, the Chrome browser, which was developed by guys here. But it's
Open Source, so that people beyond these walls can help change, modify and update it. That's
my understanding and I hope to be proved right on that. And so, schematically, I put these
together. These are two, again rather naïve, ideas of how you do things in two different
ways. I shouldn't have pointed that out. I should have put some women rather than men
icons, sorry, I just, I was in a bit of a rush. So on the left here, you have the traditional
way of doing science which involves people working in labs, submitting articles to peer-reviewed
journals, waiting a few months. And then the reviewers of that article, who are anonymous
usually, maybe one or two people say, "yes" or "no." The article gets published and then
people read that in the literature and then people design their own response to it, do
their own research. Publish an article again. The time scale is quite lengthy here, it
involves months of waiting around while the review process happens. And in some cases,
peer review is of course flawed. I don't want to get into that discussion right now, but peer
review is an excellent system but it does have these big flaws, that sometimes you can
get referees who are maybe not impartial, maybe they have vested interests, maybe sometimes
there's only like one referee on an article, and then that appears in print and it's then
seen as sanctioned as being valid. Typically, articles published in the academic literature don't
have feedback on them. If you want to criticize an article you have to publish a substantial
paper that refers back to the original. It's not a very interactive process. It's also
fairly slow, so the calendar icon indicates that it's a slow process, and it costs a lot of
money. We apply for a lot of money to do this. And in some cases, we're competing with people
who are doing similar research to us. So in many cases, we may be duplicating effort.
On the right, the idea here is that instead of doing that, why don't we post the problem
to the community, have as many people as we can reach helping us out with the scientific
problem and publish our results in real-time, which are then peer-reviewed after they
appear, so after publication. This is not a model that operates in science at the moment.
This is not how we do things at all. Science operates pretty much on the basis of peer-review
before publication, not after publication. So Wikipedia is anathema to a lot of scientists
because corrections happen after something is made public. Notice that in Open Science,
as shown on the right there, data would be published in real-time as it's acquired
and people would be able to respond as they see fit and collaborate with you in real-time,
even though you may not know who these people are. This is a very important qualification:
this is not the same thing as Open Access in journals. Open Access is where of course
you can read things for free, but the research may have been done in a very traditional way.
Open Access is a very worthwhile pursuit. It's very important that journals
are Open Access but it's not the same thing as Open Source or Open Science, where community
participants can actually have an input into the project itself. So if, for example, I
design an organic synthesis of a molecule, and I post that as an open science
project, anybody in the world could then come along and say, "No, you don't need to do
that. You should try this and in fact, I'm going to try it in my lab and I'll get back
to you with the results of that synthesis." So anyone can change the
project and repost it for further community input. That's a very different way of doing
things. Okay. So I should mention that a lot of people are doing this already. Some people,
they know who they are. Up in the top-left there, for example, is Jean-Claude Bradley,
who's at Drexel University in this country, who has a project called UsefulChem, the UsefulChem
Project, whose aim is to do Open Science, where he's trying to make molecules that will
eventually be used to treat malaria. Jean-Claude actually practices, and is a proponent
of, something called Open Notebook Science, the extreme form, I guess, of Open Science, where
your lab work is on the web completely for everybody to see. So every single datum
is published on the web. There's a picture of Steve Cook who has a biophysics lab in
the University of Arizona, who also has–everything he's doing is on the web, and a bunch of other
people. Even Billy Clark, Cameron Neylon and Daniel Mitchell who are among several zealots
of the Open Science movement and they're very frequent commentators on how we should do
this. So there's a very passionate community, lots of people I haven't mentioned, passionate
community about this. But it's still incredibly small and we tend to have meetings where we
all get together and talk about this thing. And the outside world, the larger scientific
area doesn't tend to pick up on what we're doing, unfortunately. However, there are examples
on the web of lots of projects which we've used open methods. This is a small guide board
of all of these things and they vary a lot. So, for example, on the top-right, the GenBank
Initiative; that's not really Open Science, it's a depository of information where people
can deposit genetic information that can be retrieved free. This isn't really Open Science
because you're not really changing things and collaborating on the site. But it's open
data, a very important massive open data resource. The Fold It Program is an interesting one.
If you haven't seen it, it's a game which the public can play to try and help people
to work out how proteins fold. So in a very ingenious bit of software development, somebody
made a computer program to allow the public to get involved with that. It turns out the
public are very intuitive about this and have really good ideas about how proteins should
fold, amazingly. The Open Dinosaur Project is something where the public are being involved
in measuring bones of dinosaurs from the literature and collecting data. Galaxy Zoo is something
similar in astrophysics, where people are classifying galaxies. These are projects
where public input is required and is used effectively. On the other hand, there's
something called the Tropical Disease Initiative, which was started by several people including
a guy who gave a talk here called Marc Marti-Renom, which is a sister site to the site that I'm
involved with. There's also a big movement for trying to find drugs, for example, for
TB called the Open Source Drug Discovery project, which is an Indian project, which was started
fairly recently. And then on the bottom-left here there is something called ChemSpider, which
is a community-centred resource for chemistry, very unusual. Chemistry has an unusual history.
We have a lot of very powerful, very wealthy organizations who've become involved in collecting
and curating chemical data. And these applications are usually
quite expensive, and universities tend to buy subscriptions to these things to gain access
to information. If you're not in a university which has
these resources, then it can be quite difficult to get your hands on that kind of information.
ChemSpider is something which is on the web, and anyone can upload chemical data to
ChemSpider. So it became a community-centred resource and was recently acquired by the
Royal Society of Chemistry, who are keen to promote this. So this is a selection of different
things where open projects are involved or open data are involved. And some of them, like
Galaxy Zoo and the Open Dinosaur Project, are actually ones where there's community input, so public
input to the project. OpenWetWare, the last thing I want to mention, is a site where anyone
can have a lab book online, free, and post data of any kind that they wish. It's a very
impressive initiative and operates on the basis of a wiki. So in order to have a page on
this, you do have to be a little bit savvy with how to write wiki pages. But it is completely
free and open, and anyone can have a lab and a lab book on the site. So it's very impressive.
We started out, a few years ago, with this website called the Synaptic Leap. So if you
would like to see more about it, please just Google us and have a look at it. The idea
here is it's basically a blog functionality. And we posted our problems with schistosomiasis
on this website. The intention of the site is actually to be an Open Science site for
anyone to conduct any projects they want in anything to do with biomedical research. We
started with the schistosomiasis project because there's a philanthropic angle, I guess. The
idea of spending some of your spare time helping us out when this project is to do with a neglected
tropical disease gives participants, you know, good karma, right? So people feel good
about contributing to this. But really, the aim of it is to have something which is wider,
so you can do any kind of collaborative biomedical research on the site. So up on the top there,
there's the Schisto Research Community and on the bottom there's an example of something
that my postdoc recently did, which is a chemical reaction which we did in the lab. And beneath
that diagram is all the data about how that worked and what happened, including the raw
spectral data of what we did. Now I just want to mention something about this. This has
limited functionality. It still operates as a blog. It's built on Drupal. And I don't
know about you, but whenever I see a blog post which is interesting and for which there
are several comments, by the time the comments get to about 10 or
so, I scroll down, like with a comic book, and I'm not really reading them. A blog functionality is really
quite limiting. You can't do a huge amount with this. And in terms of community
input, it's actually quite limited in what people can do. What we really want is to have
this, but much more intuitive and functional. At the moment, it's very linear and very flat.
We're doing our best with this, but I think we need something that's much more intuitive.
And that's really what I'm trying to get at here. Just before I get on to the nitty-gritty
of that, just a couple of other advantages of Open Science more generally, the idea
of doing this on the web. So the project we're posting here, of course, is asking the community
to help us devise a synthesis for this molecule for a very, very low price. The other advantage
of doing science like this, on the left, is transparency. You might have heard about the
controversy surrounding the climate change emails in the U.K., at the University of East Anglia.
The idea that the public took from that was that science was being hidden from public
view, where perhaps some research allegedly was being suppressed because it didn't agree
with the idea of climate change. It doesn't do science any good to have this kind of public
perception of what we're doing. And there's a real advantage in doing Open Science, which is
that everything is transparent and you can see what's going on and nothing is hidden.
And I think in terms of public engagement with science that's going to be very important.
In the middle of this picture is a starfish. Again, this is probably an IT analogy, the
starfish and the spider. Open projects don't have leaders. I mean, I am the leader of the
project that we're doing at the moment, but I don't have to be. The project is the important
thing, and if I decide to do something else or if I'm called away to do something else,
then somebody else will take over. It's a leaderless organization, which is a big advantage
for open projects. Somebody can take over; anybody can take over and anyone can take
on side projects and lead things. The third picture is meant to imply speed. To me, one
of the big advantages of Open Science is speed of progress. My theory, and the one thing
that we're trying to test out in the next few years, is that an Open Science project where
anybody can contribute and experts identify themselves will operate faster than a closed
lab project. That's the hypothesis. And we are hoping to try and test that in the near
future. But we're starting with our Synaptic Leap project for this drug. The picture of
the plankton here is meant to remind me that the one important lesson we've learned with
Open Science over the brief period that we've been doing it is something which, I think,
is very well known to you IT professionals who have been working in Open Source, which
is that it's not enough to post a problem and wait for the community input. You have to
post data first; you have to post results, a kernel of activity to which people respond.
And this was very clear to us with the Synaptic Leap which, for the first couple of years
of its existence, was very quiet because we had no funds and no resources to put people
on the project. We then went the long route and asked the Australian government for funding
for this project with the World Health Organization as our sponsoring partner. And we secured
a grant for this project in May 2008, which took a while to get signed off, but is now
active. Now, we have somebody working in the lab who's posting real research data as it's
going on. And that means that people have a lot more to get their teeth stuck into. We've
just started, but it means that people can now respond to us. It's crucial, I think, in
any Open Science endeavor, any Open Source endeavor, to have some kind of funded kernel
of activity which people can respond to. That's our important lesson. Okay, so I want to say
something about experimental science and then I want to do my appeal for applications. So
this is the real nitty-gritty. Being an experimental chemist, being an organic chemist, is very
like being a chef. You have your resources, you buy things in, you combine them with various
apparatus which you have in the lab. A lot of the things in the chemistry lab have a
correlate in the kitchen; it's amazing. You have, you know, a gas flame in the kitchen;
we have a Bunsen burner. You tend to boil things off, maybe you want to reduce something and
take water off something; we have something for that in the lab. Lots of things that
we use in the lab are very similar to the kitchen, kitchen kind of chemistry. There's
a picture of one of my students, (inaudible), whose bench is to the left there, and he has
all the stuff laid out there, and his fume cupboard, which is the thing he uses for toxic stuff,
is behind him. A bench chemist will come in early in the morning, 8 o'clock in the morning,
to think about what they're going to do. They'll design an experiment, they'll get their chemicals
together and use glassware and metal things and a bunch of different things to
do their chemistry. They'll run a reaction. They'll effectively taste it, as you do when
you're a chef by sticking a spoon in and then licking it. In the chemistry lab, you never
do that. But there are ways of testing what's happening in your reaction. And then
you isolate the thing you've tried to make. You analyze it with some instruments and then
you write up what you've done. This is what a chemist does in their life. The analysis is
kind of complicated. As a chef, you taste things because your tongue is a very sophisticated
thing. In the lab, you have to take the molecule that you think you've made and put it into
some very large, expensive instrument, which then analyzes whether that's the right thing. There's
a lot of data here. For example, with the instrument we use the most, we take our molecules and
we put them into this large superconducting magnet. And we spin this molecule very quickly
and we blast it with electromagnetic radiation. And rather like when you hit a bell, you listen
to what comes off the molecule after you've done that. So, when you strike a bell you're
going to listen to the tone. And, of course, what you get off the bell is always a sort
of vibrational data which is very complicated, and your ear transforms that into one note.
Similarly, with something called NMR spectroscopy in the lab, we take molecules and we blast
them with this radiation; we get this very complicated signature that comes off. We use something called
the Fourier transform and get these lines on a piece of paper. And the signature of
those lines and how they appear allows me to say "Yes, that's the molecule we thought
it was" or "No, it isn't." Lots of big instruments generating lots of data. But in a typical
day, a student will use all these different things. If that student is meant to tell another
student how they did something, how do they do that? Well, they can write up something
in the traditional paper. That often hides little things that you might have done which
are special, in the same way that with a recipe book, often recipes don't work: you don't quite follow
the instructions right, or something was just missed out, or there's a decimal point wrong somewhere.
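
As a rough illustration of the bell analogy above, the sketch below builds a made-up decaying signal from two tones and applies a plain discrete Fourier transform to recover them as two lines. It is only a toy under stated assumptions: the signal is synthetic rather than real NMR data, and the function names, frequencies, and sampling settings are all invented for the example.

```python
import cmath
import math


def synthetic_fid(freqs_hz, decay_s, sample_rate_hz, n_points):
    """Sum of exponentially decaying cosines: a toy stand-in for the raw signal."""
    signal = []
    for k in range(n_points):
        t = k / sample_rate_hz
        value = sum(math.cos(2 * math.pi * f * t) for f in freqs_hz)
        signal.append(value * math.exp(-t / decay_s))
    return signal


def dft_magnitudes(signal):
    """Naive O(n^2) discrete Fourier transform; magnitudes of the first n/2 bins."""
    n = len(signal)
    mags = []
    for m in range(n // 2):
        total = sum(signal[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
        mags.append(abs(total))
    return mags


def peak_frequencies(mags, sample_rate_hz, n_points, count):
    """Frequencies (Hz) of the `count` strongest local maxima in the spectrum."""
    peaks = [m for m in range(1, len(mags) - 1)
             if mags[m] > mags[m - 1] and mags[m] > mags[m + 1]]
    peaks.sort(key=lambda m: mags[m], reverse=True)
    return sorted(m * sample_rate_hz / n_points for m in peaks[:count])


# Assumed settings: 256 samples over one second, so each bin is 1 Hz wide.
SAMPLE_RATE_HZ = 256
N_POINTS = 256

fid = synthetic_fid([20.0, 50.0], decay_s=0.5,
                    sample_rate_hz=SAMPLE_RATE_HZ, n_points=N_POINTS)
spectrum = dft_magnitudes(fid)
lines = peak_frequencies(spectrum, SAMPLE_RATE_HZ, N_POINTS, count=2)
print(lines)
```

With these settings the two recovered lines sit at 20 and 50 on the frequency axis, matching the tones that went in. A real instrument would use a fast Fourier transform and far more data points; the slow version is used here only to keep the arithmetic visible.
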
If you want to capture the research process, you need ways of doing that that are really
quite data-rich. So you want to be able to capture things with audio and video. You'd
like to be able to post raw data to a website rather than the interpreted lines that we
tend to get, so maximizing the amount of data that you publish. And really, you want something
which is quite intuitive and rich because you want somebody to follow what you've done.
For example, also in the lab, that's me talking to one of my students, Althea. We have these
fume cupboards with kind of Perspex covers and often you write on them, you write on a whiteboard
at the back here. If you're going to collaborate with someone, you want to be able to easily
collaborate with them as if you were sitting next to them with a coffee, talking about
science. And really, at the moment, we can't do that outside my lab. We can't collaborate
with people who are outside and sitting outside my lab. If we've got a problem, we go down
the corridor and talk to a colleague. But if we want to throw this project open to the
world, we need a really intuitive way of collaborating as we would in a normal lab. How do we do
that? And how do we maximize the input that people are going to give us? Well, something
which I think maybe some of you will know as IT professionals is something called Stack
Overflow, which maybe some of you have heard about, which is a site where you can post code and
ask people to help you out solving certain problems. With code, that works really well
because you can cut and paste code and stick it on a webpage and people can rapidly respond.
It's also a very nice idea because you can have medals awarded to you for valor of service,
right, and your reputation increases. So it's a good way of trying to develop a reputation
for yourself as someone who helps people solve a problem. Of course that's good for text,
but for experimental science, this just doesn't really exist. Something has been started,
Chempedia Lab, by a guy in San Diego, Richard Apodaca, who has taken the basic functionality
and tried to use it for chemistry. At the moment, it's quite text-heavy, but it's a
very good idea to try and use the same idea in experimental science, because what we need
is something again that's still very much more intuitive and allows data-rich things
to be posted, allows links to online pages where all your data are posted, allows links
to online Lab Books for science. So basically, the structure we need is something which is
an intuitive Lab Book online, where all of the data are linked with your experiment and
which can easily be analyzed by somebody else as part of a collaboration, and collaborators
can post to your webpage. >> [INDISTINCT]
>> TODD: …type text and that's fine. And then you can post something like that, but
there is no chemical content in this; the words are not understood by a machine
to be referring to molecules or chemicals. So the text is quite dead and if anything
was going to search this, you wouldn't really have a lot of input from the computer about
what is chemical information and what isn't. To be able to take text and convert that to
something which is chemically rich, so where a word is associated with a molecule and can
be searched and analyzed and indexed, would be tremendous for HTML and XML. And a guy called
[INDISTINCT] wrote a language called CML, which is a chemically rich markup language,
and has worked with Microsoft to develop
a chemistry add-in for Word, where a Word document can be searched by a machine and the chemical
information can be annotated and extracted automatically. So when you hover over a word,
you get a structure and you can change the structure and that changes the word. So in
a PhD thesis, for example, this becomes a very rich document where all the chemicals are
part of the actual fabric of the text and are not simply words. Now an example that was
just recently brought to my attention is something called chemicalize.org. This
takes any given webpage and extracts chemical information. And you can see, what's happened
here is that in the usual Latin text you get on pages, it spotted beta-carotene, which
is a molecule, and if you hover your mouse over that, you get the structure of the molecule.
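
The kind of annotation described above, where a word in ordinary text becomes machine-readable chemistry, can be sketched in a few lines. This is a minimal illustration and not how any of the tools mentioned in the talk actually work: the lookup table, the function name `annotate_chemicals`, and the `chem` / `data-smiles` markup are all assumptions made up for the example, and a real tool would query a chemical database rather than a three-entry dictionary.

```python
import html
import re

# Hypothetical lookup table: chemical name -> SMILES string.
# A real annotator would consult a chemical database instead.
KNOWN_MOLECULES = {
    "benzene": "c1ccccc1",
    "ethanol": "CCO",
    "water": "O",
}


def annotate_chemicals(text):
    """Return HTML in which known chemical names are wrapped in annotated spans."""
    # Longest names first so a longer name is preferred over a shorter one.
    names = sorted(KNOWN_MOLECULES, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in names) + r")\b",
        re.IGNORECASE,
    )

    def wrap(match):
        word = match.group(0)
        smiles = KNOWN_MOLECULES[word.lower()]
        return '<span class="chem" data-smiles="{}">{}</span>'.format(
            html.escape(smiles, quote=True), word
        )

    # Escape the plain text first, then mark up the recognised words.
    return pattern.sub(wrap, html.escape(text))


print(annotate_chemicals("Dissolve the solid in Benzene & water."))
```

The design choice here is that the original word stays visible in the text while the structure travels with it as an attribute, so a search engine or a hover tooltip could pick it up without changing what the reader sees.
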
If you click on that, you get taken to a page where there's a bunch of chemical information
about the molecule. This is very useful, and it makes, for example, HTML pages rich
for chemists; that means they can be searched very effectively. We could do with something
like this actually for Drupal, given that Drupal is Open Source. What we really need
is an extra button on this menu up here which says, "Okay, take the text that I've just
entered here, scan it for chemical information and please annotate these individual words
so that on the page, when it's published, it's clear that these molecules are in there." So if
I write the word "benzene" and I click this little button, it's [INDISTINCT] molecule
and then when the HTML page is published, that becomes an active word that can then be searched
effectively and can be annotated in this way. That would be a very nice project that we could
do which would enhance Drupal a great deal for chemists and make the resulting web pages
that we make much more functional. Okay, so the last thing, just a summary
of where we are. This is my son, Harvey; he's playing with his first molecule, which is
great; start him early, you know? But this is how I feel with this, with Open
Science. I have absolutely no idea where we are going to be going and how we
are going to get there, but it feels right. Science that is truly open,
where anybody can help us out and nothing is kept secret, is the real spirit of science.
It's fast and it's transparent and it's generous of spirit. This project that we're doing, where
we are trying to get the price down of this drug with the World Health Organization, we've
got another two and three-quarter years to solve that problem. It needs to work; we
have to be able to show that by massively distributing a collaboration like this, where
everything is in the open, we can do that. And we need input from
chemists all around the world, particularly process chemists who work in industry, to
help us with this problem. The price constraint is extremely severe and it's a real challenge
for organic chemistry. Of course, what we would like to do eventually with Open Science
is to move beyond philanthropy. We are doing a project which is organic chemistry, so
it's hard physical science, but it does have a philanthropic element where
people might contribute because they feel good. What would be nice is to try and move into
an area which is academically hot, where there's a lot of activity at the moment and people
are competing with each other, to show that an Open Science project could actually generate
papers and results faster than traditional closed collaborations. That would be quite
exciting, but it's not something that we're doing right now. Generally, the dollar sign
there indicates that, contrary to what is maybe thought by many people about Open Source,
Open Science also may well need funded kernels of activity where small projects
are funded and lots of the scientific community can respond to those. Now that, to me, would be
extremely attractive if I was a funding agency who wanted to fund scientific projects. If
I was a government agency and I wanted to fund scientific projects, what I would do is try
to fund a kernel of activity and then have a wider group of people help me out,
so I leverage more activity for my funding dollars. There is an advantage there, of course,
that I've covered before, which is that once the funding runs out, the project doesn't have
to stop; it's this leaderless organization, so the project can continue. And I guess the
last thing is a more general point, that open data are very important. The idea of Open
Science, of course, is to share data with as many people as possible, and this is always going to be
a good thing; if we have data in labs which the public have funded through taxes,
those data should be available for anybody to see. And Open Science obviously necessitates
open data as part of its reason for being. And so, my main appeal, of course, though,
just to close, is that at the moment, we do not have really good intuitive tools
for scientists to collaborate over large distances effectively. We have tools that
require tutorials to use. My students are very busy; they're very busy making molecules
in the lab. And I know what's going to happen if I ask them to try and learn how
to write a wiki page or sit through a tutorial about how to use something. They are going
to say they're just too busy. Those are my students, and some of them are the most receptive
to this that I know of. Trying to ask an experimental science student to learn something before
they can post their data online, to me, is like asking Gordon Ramsay to learn Arabic,
right? This is silly. He wants to do his cooking and generate a product. He doesn't have to
learn a language to be able to do that. And with chemists and experimental science students,
we need applications which do not have tutorials on them, so that scientists
can rapidly get in and comment on each other's work and share data effectively. So, given
that I've never needed to look at a tutorial for a Google app, I came here, right? So, I
use Gmail and Google Docs and Picasa without reading anything. I just started using them
because they were intuitive. If we could develop that for a Lab Book, for an open shared electronic
Lab Book, that would be fantastic. My lab is collaborating with a guy called Jeremy
Frey of the University of Southampton to try and link online electronic Lab Books to
machines in a chemistry school, so data are meshed with an experimental technique. If
we could expand this to have a front-end that was incredibly intuitive to use, I think we'd
go places. And I think a lot of scientists would love to share what they are doing
if they had an effective intuitive tool to do that. And that's really my appeal, it's for
something like that to happen: a dialogue between IT guys and experimental guys, PhD students
in the lab who know what they need, to try and develop a real killer app for an online
shared electronic notebook. Okay, so with that, I won't take any more of your time. I
just want to thank Carole again, and Chris, and thank you guys for coming along and listening
to this idea. Thanks. >> [INDISTINCT]
>> Scientists still have to get tenure and get paid and all that kind of stuff, and how does that
work with conventions or spread outs or [INDISTINCT]. >> TODD: Yeah. Okay, so the idea that if you
open everything up to the world, you're going to commit professional suicide, yes. I mean,
at the moment, I think that that's certainly the perception. There are some brave
souls who are doing it anyway, a couple of people I just mentioned. It's interesting.
There is this sense, I'm sorry, there's this prejudice, that if you
publish data online, you can't then put it into an academic journal. And actually, for
some of the high-impact chemistry journals, that's true. So I can't publish in certain
big journals if I've already released data. In many cases, that is not true. So if
I have published everything on the web and I've decided to summarize it in a paper and
then submit to Nature, they'll take that paper. So there are also journals which are happy
to do that, but this is a test case. How many journals will take those kinds of papers and
how effective are they? I think that if you can do that, if you can still publish your
work in journals, which are traditional journals, then I think you'll be okay. The question
about whether you'll get scooped is another thing. So if you have a great idea and you
then put it on the web, what are you going to do? Of course, if something is commercially
sensitive and you want to patent it, then of course you can't do that; that's off limits.
If it's something which is not, something for which you don't foresee commercial
interest, my theory is that by opening it up and by recruiting more collaborators, your
science will actually go faster; that's my theory. But no one has done that yet; I
don't know if it's going to work. There was a nice example last year, I think, a guy called
Sean Cutler, who is a biologist; he's at Riverside, I think. He's a plant biologist who had
a nice result that he found, and rather than
publishing his nice result, he went to actively recruit his competitors. And then together,
they published a much larger piece of work in the journal Science, which is an incredibly
high-impact paper and amazing and well-cited. The idea is that you can go and actively get
people to help you out. It's anathema to a lot of scientists, who maybe don't usually
want to trust their competitors with what they've done. But I think this is
the next exciting frontier, to me: whether it accelerates your science. If it does, then
I think [INDISTINCT] committees would be rather excited about it. All right, thanks, thanks,
guys.

12 Replies to “Chemistry on the Web: How Can we Crowdsource Chemistry to Solve Important Problems?”

  1. Matthew Todd says:

    Well, Google the Open Source Malaria project for starters

  2. Riskteven says:

    What the world needs is Open Source Chemistry and then I mean, especially, Open source pharmacological chemistry. I have seen videos on here that suggest this need in the world.

  3. Matthew Todd says:

    By the way, if an hour is too much there's a 5-minute version. Google "open science ignite sydney" For the open scientist in a hurry…

  4. Matthew Todd says:

    @EndeligHelg I don't know. He wasn't there when I gave the talk. 🙂

  5. DJControllerC says:

    @Dottyeyes the analogy that comes to my mind is online forums- they have been ground breaking for me- because even if one poster gives bad info- this is usually caught in the same thread- and you can cross reference info and see how often it pops up- this gives you often or almost always a very good idea how reliable info is before you apply it- forums are now including photos and links to videos confirming the accuracy or inaccuracy of data presented in a forum- its a remarkable thing

  6. Dottyeyes says:

    @Dottyeyes : by "lucrative" I mean in the way that gmail eventually became commercially lucrative for Google, but also in the collaborative, scientific discoveries that might happen.

  7. Dottyeyes says:

    Well done! I'm not in the field; I just stumbled across this. An intuitive lab-book interface sounds lucrative! And collective intelligence can work. Even in average-intelligence audiences on "Who Wants to Be a Millionaire?" the ask-the-audience lifeline usually gets the right answer. But then, those Jay Leno man-on-the-street surveys argue to the contrary. In any case, open-source research is a great idea whose time has come!

  8. j1n3l0 says:

    Great talk. I'd be interested to see if it works.

  9. Matthew Todd says:

    @ZerqTM We wanted to use Drupal since it's open source, but a linear blogging platform is not the best way to collaborate, so we're looking around for alternatives, and there aren't any. We need a new intuitive lab notebook – hence this appeal.

  10. Alex Holcombe says:

    Very nice talk!

  11. michalchik says:

    @robertvk I am not an organic chemist but I got A's in O-chem and have studied it some. The first slides just seemed like overviews of approaches, not actual synthetic pathways.

  12. ZerqTM says:

    If you want intuative interfaces then stay the hell away from drupal lol X3
    I would go for dotnetnuke or mojoportal…
    PHP is just a rotten language with crapy unicode support…

    and drupal is just built wrong….
