Petros Koumoutsakos | Computing for the Benefit of Mankind || Radcliffe Institute


– What I would like
to talk about today is about my research, which
is not so diverse in the end. It’s all about computing. It’s just that computing
has applications in many different fields. And I would like to discuss
three elements that you will see repeatedly today. One is laws, not necessarily the law of the land, but physical laws. The second has to do with data, and the third is about humans. The dots that you
see here is what we have been doing over
the centuries in science: connecting and deriving physical laws by human intuition, by observation, by interaction with data. And then, in the last, let's say, 60, 70, or perhaps 100 years, came these computers. And these computers
are generating new paths that connect
all of these players, and I would like to discuss in particular these connections between physical laws and computers; between data and computers; and in particular between humans and computers. And I would like to discuss whether computing is or
can be structured for the benefit of mankind. Now computers and computing. You can say that the modern era
starts with people, like John [INAUDIBLE] and
Robert Oppenheimer, and perhaps with the
Manhattan Project. And the Manhattan Project
is infamous for this. [VIDEO PLAYBACK] [EXPLOSION] – There it goes. The fourth atomic bomb has been successfully detonated. – So computers, or computing,
can trace its beginning back to the Manhattan Project. And as you can see, people have
been dealing with computers, or they have been concerned
with computers, early on. [VIDEO PLAYBACK] – Good evening. I’m David Wayne, and
as all of you are, I’m concerned with the
world in which we’re going to live tomorrow. A world in which a new
machine, the digital computer, may be of even
greater importance than the atomic bomb. – So indeed, computers
have been part of my life. I have a longer
relationship with them than I have with my wife, for
example, starting back in 1981. And you can see that this
picture goes all the way to 1997. And I stopped in 1997,
because after that I was not programming or doing things. I became a professor, and
then I had students of mine that actually began programming
and doing great things, and doing great things
in some of the biggest supercomputers of the world. What is important
to see over this 36 years is that, going from 1981
to 2017, the speed of computers has increased by a factor
of about 20 million. So take any technology: imagine a car that runs at 60 miles an hour, and in the span of about 36 years it becomes able to run 20 million times faster. So this is a
tremendous technology, and I would like to show
you some of the achievements that we have been
doing with it today. So the people in
general they started becoming aware of computers
when computers actually started beating people in
what would be one of the signs of human intelligence,
and this being chess. And then people became very much
aware of the power of computers when the Deep Blue
beat Kasparov. Or perhaps, very few people
are aware that Deep Blue beat Kasparov because of a bug. In fact, at some point the algorithm had identified the best medium-term move, but because of the bug it did not play it; it played another move, which was actually random. And when Kasparov saw that, he said that this thing is impossible. Only a human could be
doing such a thing, like sacrificing a short term
advantage for something that is happening later. Now today, computers
have improved. Computers can actually beat humans at things that are even more sophisticated, like the game of Go, and more recently, a couple of months ago, they even beat humans at poker. And it's not only that they
are good at games. They are also good at replacing
testing for atomic bombs. And what we can do
today, we can actually do simulations that are
replacing the experiments and testing of atomic
bombs, precisely because we have developed
such computing capabilities. Now another game
changing technology that we are facing
today is data. Today there are about five billion devices, and about one billion people who have access to these devices. It's important to distinguish that these devices are producing data; they are not necessarily computing. But there is no escaping all these kinds of data that are coming together. And I would like to argue that
computers, humans, computing, and data are converging
today, and it’s a wonderful opportunity
that is presented to us. So a little bit about
what we do with computers, or what I call computing. So what is computing? It’s about information
processing, and it’s about what can be
computed and how to compute it. It’s not only about
computers, but it’s also about posing the right
question about creating mathematical models,
about making algorithms, creating software, being
aware of a particular computer architecture, performing a
simulation, analyzing the data, and eventually trying
to reach a decision. And in the end, of course,
you never stop there, but you keep looping and looping
and looping until you go back, and you change your question. And you try to get a better decision. To give you an idea, let me give you an example from ecology. So this is foxes and
rabbits, and foxes are of course chasing rabbits. And there is some bizarre biology that I am going to introduce, just to make the thing simpler. We're going to have rabbits. They're going to be eating grass and immediately giving birth, and they're going to be dying when they are eaten by foxes. We're going to have foxes that eat raw meat and give birth, but they also die, perhaps of old age. And I will introduce grass. Grass grows by rain. It is eaten by rabbits, and it can also die because of pollution. So this is something that maybe someone is interested to understand. What happens when you don't have
just one fox and one rabbit, but when you have many of them? So you write down an equation that tells you how the population of rabbits is changing with time, this is the dr/dt; then an equation that tells you how the number of foxes is changing with time; and then an equation that tells you how the grass is changing with time. Now this is great. What do you do with that? Well, you take the equations, you apply numerical algorithms, and you create software. And today we are blessed with amazing capabilities: one line of code can integrate all these equations, without even needing to develop numerical algorithms, study stability, and so on. And then, after you do that, you perform a simulation.
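To make that "one line" concrete, here is a minimal sketch in Python of such a rabbits-foxes-grass model; the rate terms and parameter values are illustrative guesses, not the ones used in the talk:

```python
# Toy rabbits-foxes-grass model; coefficients are made up for illustration.
from scipy.integrate import solve_ivp

def ecosystem(t, y, eps):
    r, f, g = y                                       # rabbits, foxes, grass
    drdt = 0.8 * r * g - 0.4 * r * f                  # rabbits eat grass and give birth; foxes eat rabbits
    dfdt = 0.1 * r * f - 0.3 * f                      # foxes eat rabbits and give birth; foxes die of old age
    dgdt = 0.6 * (1.0 - g) - 0.5 * r * g - eps * g    # rain grows grass; rabbits eat it; pollution kills it
    return [drdt, dfdt, dgdt]

eps = 0.0   # set to 0.2, say, to let pollution kill the grass and change the dynamics
sol = solve_ivp(ecosystem, (0, 200), [1.0, 0.5, 0.8], args=(eps,), dense_output=True)  # the "one line"
print(sol.y[:, -1])
```

In a sketch like this, raising eps above zero changes the dynamics in the spirit of what is described next: as the grass declines, the rabbit and fox populations can no longer sustain their cycles.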
Here is a parameter I want you to pay attention to: whether the grass is dying due to pollution depends on the parameter epsilon. If we set this parameter epsilon to zero, it means that the grass is not dying because of pollution. So we fix some other parameters
for rabbits and foxes. And this thing runs, and you see how the numbers are changing over time. You see this predator-prey pattern. Now if you change it, and you say, well, let's have some of the grass die because of pollution, then
the dynamics are changing. And you see that
indeed in the beginning you have that
rabbits are growing, and then the foxes are chasing
them and they’re growing. But then there’s less rabbits,
and then there are fewer foxes. But it's interesting to observe that as the grass eventually starts to die, you're going to see fewer and fewer foxes, first of all, and then fewer and fewer rabbits. So this is some of the things that we do in computing. Of course, today we don't do it only for three differential equations; we can do it for trillions of differential equations, which I will also show you in a second. So why do we compute? What does it give us? Well, it can complement theory and experiments. And there's a lot of things that
we can do with theory that we cannot do with computing, and
there’s a lot of things that we can do with experiments that
we cannot do with theory or computing. But I think the three together
is a very potent combination. We can test hypotheses. There is a lot of activity
happening today about geo engineering, and there is
a lot of crazy things being proposed about how we can
interact with the environment. Perhaps we can test such
things in the computer first. We also have data. We can process and
we can analyze data. We can optimize and design, and
we can also, as I said earlier, decide. So it’s omnipresent,
and it's omnipresent in every field of science. So in the end, in my opinion, what we do with computing is about acquiring and creating knowledge, and it's about
trying to predict. And I would like to elaborate a
little bit more on this topic. So what is knowledge? Knowledge, perhaps you
can start to classify it in different ways. One division of knowledge is between how you do something and what you know about something. And you can say that the what can be classified as classical knowledge, and the how can be classified as empirical knowledge. If we go back to the Greeks, we can call these episteme and techne, and the techne is what later became art. Episteme usually leads to principles and laws, whereas techne usually involves heuristics and structures. And in the end, heuristics and the development of structures is the job of engineers. And the development
of principles and laws have been traditionally
the job of scientists. What has been happening in knowledge, in my opinion, is that silos have been created, and often the two sides are not talking to each other. And what I consider computing to be enabling is that it gives us a common language, and a fantastic bridge to go between these two different types of knowledge. And I would like to
demonstrate that. So let’s look at
classical knowledge. Here is Leonardo da Vinci's "Old Man and Vortices." You're looking at
the flow; you are looking at the flow
around the plate here. And you can think about that. But then there came Newton,
and he put down these equations of how the flow is happening. These are the famous
Navier-Stokes equations, I’ll be showing you
again and again. And what is the classical thing? The classical thing with
equations and principles is that you can make
precise predictions, and you can provide
explanations. At the same time, you can do that with only a few, though rather expensive, data and observations. And at the same time, what you have developed may not be useful for applications. And what is happening today
is that these capabilities are greatly enhanced by computing. An example, very often
maybe you look up in the sky and you see this contrail. This is the condensation
of air that gives you actually markers of vorticity. Vorticity is what
makes airplanes fly. And this is experiments
that can be done, and people are looking at how
these vortices behind aircraft are being destroyed. So you can do experiments,
and this is more or less the knowledge that he
acquired from experiments. But if you perform
a simulation then it is possible to get
inside these contrails, and look at the
muscles of the flow, look at what creates the vorticity, look at what creates all these instabilities. And have very precise and very detailed information about instabilities,
and about how you can interfere with the flow in order to
be able to learn about it and eventually to control
and to manipulate it. So yes, we can do
all these things, but there is a lot of things
that we are challenged with. And one of the
challenges that we have: I told you that computers are 20 million times faster. This does not mean that the way we compute is 20 million times faster. On the contrary, there is a
big challenge in computing, and the people who are
dealing with computing, that this software is not
catching up with hardware. In fact, one can
say that this is one of the biggest
gaps of computing, and we are failing
Moore’s law in practice. And about 90% of the software
is using less than 10% of the hardware, and this
is an optimistic view. Now what does this mean? This means that this is an
inherent problem in energy, given how many computers you have and how much it costs to create them. So we make these expensive machines, but then we are underusing them. And another thing that the data actually tell us is that, for example, CO2 emissions in Germany from information and communication technologies, including the cooling of supercomputing centers, are higher than the pollution that is generated by air traffic. So one of the
challenges of computing is how to actually create
mathematics and software that can actually use the computers
in the most effective way. So we try to do that. And the way we did
that is by trying to solve an actual problem. We decided to write perhaps the fastest or the best code that ever was in computational fluid dynamics, and to try, instead of using 10% of the hardware, to see how high we can get. The problem we
decided to study is something that is called
cloud cavitation collapse. The idea is that when
you get to boil water, by increasing the temperature
of the water you get bubbles. There is one more
way to get bubbles, and that is by decreasing
the pressure of the water. You decrease the pressure of the water when the water is moving very fast. That's how the bubbles are generated over the airfoil, and when these bubbles collapse together they generate huge pressures. And even though these are little bubbles, the huge pressures can destroy propellers. And at the same time,
people are using them in order to destroy tissue that
exists between blood vessels and tumors, and
actually use them in order to destroy this tissue,
so that when drugs are coming out of the blood vessels
they can actually arrive on to the tumor. So they’re doing
that by putting– here they have injected bubbles
together with the drugs, and then by doing ultrasound
they're cavitating; the bubbles are collapsing and destroying the tissue, the same way that they destroy propellers. Now what people have
done in the past is limited, because you can understand that with the destructive power of this process it's very difficult to do experiments, and with the complexity of this process it's very difficult to do theory. In fact, the only theory that exists is for what happens to one bubble. Now what we do is we take the equations, and I can tell you these are continuous equations; you have to discretize them. You put them into the computer, and I'll tell you in a second in how many points we discretized them. I want to show you the
complexity of this, and this is one of the
largest simulations. Actually, it is the largest simulation ever made in computational fluid dynamics. This is the problem of cloud cavitation collapse. You see a part of these bubbles, and you see the yellow, which is the pressure that is building up. The movie that you see lasts a few seconds; in practice, this whole process lasts less than a microsecond. So that's what computers can do. They can go down to
less than a microsecond, and less than millimeters,
and be able to give you these processes. So we do these
simulations, but we do them in order to understand. And we understand it in a way that nobody has ever seen before, and actually, I'm happy to show you this for the first time. We find that when the bubbles start collapsing, there is fluid that is coming inside the bubbles. It is being accelerated by the stretching of the bubbles, and so we know now that cavitation collapse can be described as interactions of micro-jets. So I told you that we set out
to do the fastest code ever, and this is the
fastest code ever in computational fluid dynamics. And it says something about international collaboration that, coming from Switzerland, we were able to get access to the Lawrence Berkeley and, in this particular case, the Lawrence Livermore national laboratories. They gave us access to 1.6 million cores. We were able to run our simulations there, and what has been an amazing achievement for us: there is something like the America's Cup, as you know, for yacht racing; the Gordon Bell Prize is something similar that is done in computing. So we were fortunate, again,
thanks to collaboration with our colleagues between
Switzerland and the US, to be able to demonstrate simulations using trillions of computational points. Today we can reach 100,000 bubbles, and we can actually compress the data by a factor of 100, which is
also an important achievement. Just to give you an idea
of what has happened, when we ran our code, we
took our code actually and we ran it on the
machines in Germany. And the people in Germany told us, we cannot run your code. And we asked them, why can't you run our code? And this is what they told us: when you start to run, and instead of 10% you use 70% of the machine, the machine starts to shake, and they had to physically go and screw down the cooling units so that they could run our code. So this actually counts for much more than the previous numbers that you have seen,
but all these things have been resolved. But this is actually
real engineering in order to be
able to run a code. I want to show you one more landmark simulation that was done thanks to my students. This is another problem, which I will tell you about in a second. This is a microfluidics device, a copy in the computer of a microfluidics device that has been designed by [INAUDIBLE] here at Harvard, at the MGH. And what is the problem? We're looking into
cancer, and 90% of cancer is attributed to metastasis. When tumors are growing, they
are emitting growth factors. These growth factors are actually tricking the blood vessels into growing, and when the blood vessels grow, they arrive at the tumor. And the tumor cells find their way to escape through the blood vessels. So we want to find out if tumor cells are contained in the blood. And the processing that people have to do is to find one circulating tumor cell in one billion red blood cells. So there is an
ingenious device, that was developed by the group
of [INAUDIBLE] whereby taking advantage of
some fluid dynamics processes, they’re able through
empirical ways, actually, to distinguish cells
by size by taking them to go in different
parts of their domain. And what we have achieved, actually, is to be able to go down and simulate every red blood cell in up to 0.2 milliliters of blood. In this particular case, you are
visualizing 200,000 red blood cells in the same
microfluidics device. It’s a millimeter long
microfluidics device, to appreciate the
size of the domain. So we can repeat
the same experiment that the group of [INAUDIBLE] does, and we can actually try to optimize the different panels with the different geometries that you see here. So we put down the
red blood cells. And then, as you
will see in a second, we create a model of a
circulating tumor cell. And what we do is we take
it and we implanted there, and then we try to see if
by continuously flowing these things we are able
to separate red blood cells from tumor cells. This is an homage to Fantastic Voyage, for those of you who are old enough to have seen this movie. It's quite a lot of effort
to be able to do that. But to cut a long story short,
you start particles there, time zero, and indeed,
after about the same time that it will take in an
experimental facility, of course it takes
much longer to compute. We are able to separate tumor
cells from red blood cells. So this is the classical
way, I call it, of knowledge, where we’re
setting up physical laws, and we’re using the computer
in order to advance it. My original training has
been on empirical knowledge. It has been about using
dependencies from data, or using experience, in ship design. There has not been a lot of science in designing ships; there has not been a lot of science in designing the first airplanes. You are imitating what other shipbuilders have done. You're using heuristic rules left and right, but somehow a lot of the things that are flying and swimming today are based on this approach. So the empirical way is about
statistical predictions, at best. It’s not about the precision
that physical laws have. So it is expensive,
because it requires that you have a lot of data
and a lot of observations in order to do statistics. And you realize that these data and observations have to come in an inexpensive way. And of course, what is inexpensive is changing today because of computers. And the last thing is
that empirical knowledge has a practical utility, but has
a practical utility for a given application. When you do a training
as a naval architect, you cannot go and
become a cell biologist. You cannot go and even become
in aeronautics engineer. And so this empirical
knowledge is very useful, but it’s very limited. At the same time, this
is changing today, because of computing. And if you decide to focus
on computational aspects of empirical knowledge,
you are returning back to similar problems
that you encounter in classical knowledge. And for me, this
has been the pathway to go from naval
architecture to cell biology, by finding common
computational problems. So I would like to show you– well, these are
examples, actually, of empirical knowledge. These are machine learning algorithms, which I call empirical knowledge, and I will argue later that they can be better than humans at looking at images, better than lawyers at looking at text, using the power of computers in order to derive things. Of course, the code that does one is not the same code that does the other in terms of machinery, but both of them share the aspect of being empirical in the way they proceed. We also follow
that, and I was very happy to meet here actually
Manuel Dominguez Rodrigo through– Manuel is the husband
of Mary Pendergrast. He’s also a fellow,
and we work together. And over lunch he
asked, can we use some of the machine learning
things you do in order to distinguish marks made
by weapons versus marks made by trampling over bones. So we did that. There has been a
wonderful collaboration. And we’re basically
trying to answer when did the humans start to hunt? So by distinguishing
what is a cut mark and what is a trampling mark,
through machine learning we’re trying to answer. I think this is a very
important question. Let me give you a more
elaborate example of how you can use data,
and to give you a warning as to what machine learning can do, by looking at trajectories. So here are trajectories of viruses. And we have done classification using support vector machines, where by looking at characteristics of the particular trajectories we can distinguish what the virus is doing. And we are very much interested in the places that are marked in red, because these are the times when the viruses are moving on straight segments.
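As an illustration of this kind of empirical classification, here is a minimal sketch in Python that trains a support vector machine to separate directed from diffusive motion; the synthetic trajectories and the features are invented for illustration and are not the ones used in the actual study:

```python
# Classify synthetic trajectories as "directed" (1) or "diffusive" (0) with an SVM.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def make_traj(directed, n=100):
    drift = np.array([0.5, 0.0]) if directed else np.array([0.0, 0.0])
    steps = drift + rng.normal(scale=0.3, size=(n, 2))      # drifting walk vs pure random walk
    return np.cumsum(steps, axis=0)

def features(traj):
    steps = np.diff(traj, axis=0)
    speed = np.linalg.norm(steps, axis=1)
    straightness = np.linalg.norm(traj[-1] - traj[0]) / speed.sum()   # close to 1 for directed motion
    return [speed.mean(), speed.std(), straightness]

trajs = [make_traj(directed=d) for d in [True, False] * 200]
X = np.array([features(t) for t in trajs])
y = np.array([1, 0] * 200)

clf = SVC(kernel="rbf").fit(X, y)
print(clf.predict([features(make_traj(directed=True))]))    # expect [1], a "straight segment"
```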
Now, the knowledge about viruses, and how they move around inside the cell, is very limited. We know that they attach
themselves to microtubules, but microscopy is not
good enough to tell us how many attachments they have. And then we resort to movies. This is done by a company called
BioVisions, here at Harvard. These are microtubules,
they’re inside the cell. And then we get here
a vesicle, and there are proteins that are moving
on these microtubules. The viruses are recruiting them. And these proteins are bringing
them either to the center or to the outside of the cell. I was interacting on that with a
guy called Ari Helenius at ETH. And we actually wanted to
examine different hypotheses by looking at trajectory data. Looking at whether
kinesin and dynein, the blue and the green, act together; whether you have only
one of the two, or there is another secret
molecule that is acting. The way we did that is we
took that trajectories. We analyzed them, and we looked
up straight trajectories, because that’s when we
thought that the virus exists on the surface of microtubules,
and is doing a directed motion. And we specify the model; it's a stochastic model. It contains six equations. And these six equations tell you, for example, that dynein stops and dynein moves, that kinesin stops and kinesin moves (they move in opposite directions), that they unbind, and that they bind.
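For illustration, here is a minimal sketch in Python of simulating a stochastic model of this kind with a Gillespie-type algorithm; the six rates below are made-up placeholders, whereas in the actual work they are exactly the unknowns that have to be recovered from the trajectory data:

```python
# Toy stochastic model of a cargo carried by kinesin (steps +x) and dynein (steps -x).
import numpy as np

rates = {"kinesin_step": 5.0, "dynein_step": 4.0,    # events per second (illustrative values)
         "kinesin_stop": 1.0, "dynein_stop": 1.0,
         "unbind": 0.5, "bind": 2.0}
step = 0.008                                         # displacement per motor step, in micrometers

rng = np.random.default_rng(0)
t, x, bound = 0.0, 0.0, True
trajectory = []
while t < 10.0:
    events = list(rates.items()) if bound else [("bind", rates["bind"])]
    names, props = zip(*events)
    total = sum(props)
    t += rng.exponential(1.0 / total)                # Gillespie: waiting time to the next event
    event = rng.choice(names, p=np.array(props) / total)
    if event == "kinesin_step":
        x += step
    elif event == "dynein_step":
        x -= step
    elif event == "unbind":
        bound = False
    elif event == "bind":
        bound = True
    # the two "stop" events leave the position unchanged: the cargo simply pauses
    trajectory.append((t, x))                        # synthetic trajectory to compare against data
```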
And now the whole thing here is: yes, you write down this model, but you say, I do not know the rates at which these equations are operating. So you have to do some reverse engineering, and use the data in order to find these rates, because you cannot really do first principles. So we do that, and we specify
probability distribution of the velocities that we get
from the data and the distances that the viruses are doing. We’re doing some
expensive optimizations, in the sense that
it takes millions of function evaluations
to do such optimizations. And the net result
is that we are able to find out what
are all these rates, or when the experiments
and the simulations are agreeing in terms of their
probability distributions. We know what are
these different rates, and then we can
specify that it has been a stochastic tug of war. But we can do even more
besides these things. We can actually find out
by playing a little bit with our model where
these proteins are attaching themselves. So by optimizing
our model further, we find that we
can predict better the data when the binding
sites are between 10 and 15. Now, when the viruses are inside the cytosol, there are three proteins existing on their capsid, and these three proteins are the places where the kinesin and the dynein can attach. So now, if you
take a microtubule, and you project it over
one of the sides over here, then you find out
that the amount of places that you can attach
yourself are between 10 and 15, indeed. And this has to do with
these hexon-hexon interfaces, not with protein IX. So we did that, and then the biologists in our group destroyed protein IX,
and then after that, we put again the viruses. And the viruses were able
to move inside the cells. So this is a prediction
by a computer model, as to what part of
the virus you destroy. And what we find is we
published this paper, and actually a week
later another paper came out from a completely
experimental group that they actually have reached the
same conclusion by doing experiments and playing
with the different factors. They identified
that viral capsids– actually, the hexon
sub units is the place where kinesin and dynein
are attaching themselves, and they’re moving the viruses. So you can imagine that when
Google, or other places, have access to
your trajectories, they can process trajectories. And a lot of information
can be inferred. I will not be
surprised if people know what I have inside my
refrigerator by looking at what is my particular trajectory. Now I will come back to that. So this inference is
actually, again, this is a very powerful
process, and you can take data from trajectories
and look at structure. So this is, again, a comparison
of the two knowledges, and as I mentioned
earlier, I consider that now computing
is coming together, and is putting it together. We can actually use
first principles, and do statistical
type of calculations, because we can afford to
do many of these things. We can actually take
this possibility that the principles
are not useful, and combine them with
data and actually create new practical utilities
for given applications. But there’s two
more things that we can do that we could
not do before, I think, and this is thanks to computers. One is to, for the first
time, understand that Newton and all these people
that they put forward the equations
there is a problem. And the problem is that
there are imperfections in the ways people
derive equations, and there are uncertainties. This is the first thing that
I would like to tell you. And the second thing
is that heuristics, the same way that
first principles can give you guidance for
causation, heuristics, when combined with
first principles, can actually lead to causation. And again, both of them are
thanks to the capabilities of computers. So what is about the
imperfections and uncertainty? So I gave you a model
that I made earlier, but the model was about
foxes and rabbits. And then at some point, I
created a mathematical model about foxes and rabbits. Then I made the
computational model. I could have compared
with experimenters, as I have done in my other
simulations, but everywhere– everywhere when experiments
and mathematical models are being compared– when
computational and mathematical models are being compared
everywhere there are errors. There are errors in modeling. Maybe they are terms I forgot. There are error in experiments,
and errors in my models. There is discretization errors. You write down an equation,
but you corrupt it when you put it inside the computer. So out of all these
imperfections, you have to acknowledge
imperfections if you are to be able
to arrive to knowledge, and to be able to quantify
imperfections and uncertainties in your predictions. So this process of
uncertainty quantification and mathematical models
has been very important. And here is an example– [VIDEO PLAYBACK] – Tom Menino is
defending his decision not to close the schools there
today, despite a snowstorm. He says he got
conflicting forecasts. And when severe weather
is approaching– when you hear us talk
about the European models, we don’t mean anyone who wears
fancy clothes walking down a runway. Those multicolored spaghetti
lines on the weather maps when big storms
are approaching, those are all computer
models, predictions. Some are from our government
forecasters, but increasingly the European computer– – So this is the
big success story of the European computer models
was on the Sunday storm, where they had– [VIDEO PLAYBACK] – The question is,
do we need the push broom, or the great
big shovel out here? We really don’t need either. – Weather Channel’s
Jim Cantore– – So what does this–
what are all this– what this thing says,
basically, is that– that depending on the model
you get, you have parameters. You may get a different
forecast from one model. You may get a different
forecast from the other model. And I should tell you that
the European medium weather forecast model also
fails in other cases. So whenever you hear an
answer from a computation, the same way that
in the past we would go through experimentalists,
and ask them for error bars, you should go now to
modelers, people like myself, and
ask us also for error bars. So these error bars
are becoming important. For another, I’ll give you
another example about that. You look at the
experiments, people study how much water is
flowing in carbon nanotubes. The dimensions you see across
there is two nanometers. And then people are studying
how fast you can push water, and you can push that
orders of magnitude faster than if this were a much larger pipe, due to peculiarities of the molecular level. But what I want you to observe is that these axes are orders of magnitude. And you can see, even for the same length, two or three orders of magnitude of difference in the results from experiments. One experimentalist will run this experiment and get a factor of 1,000 difference from another. Both of them will
publish their papers, but the question is
why does it happen? And the same error bars
actually appear in simulations. So in simulations– we
do molecular simulations. We have parameters because of the models we use and because of the way we do the computing. And then we use measurements to calibrate. So depending on how
we calibrate, we can get also different models. And depending on what kind
of computing resources we can run, for example,
that weather model at the size of one kilometer
resolution or 100 meters– 100 kilometers resolution, and
that makes a big difference. So we go back to a theory,
that came back around the 1700s, and this is by Bayes. And the idea is that you have to use theories, but the theories have to be weighed in terms of probabilities in light of the evidence. This is Bayes' formula: what is the probability of a certain hypothesis given the data? You compute that by looking at the likelihood of the data and at the prior information that you have, and then you create the posterior. So once you have calibrated, and you have this probability distribution, then you can go and predict something. But you predict only with probabilities. So instead of giving you a single value for the epsilon that I set earlier to 0 or 0.2, now I would give a probability distribution, because this probability distribution came from data. So when I predict how fast the water is going inside the carbon nanotubes, I again give a probability distribution for my results.
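To make this concrete, here is a minimal sketch in Python of Bayesian calibration of a single parameter on a grid; the toy model, the data, and the noise level are invented for illustration:

```python
# Grid-based Bayesian calibration of one parameter "eps" from a few noisy measurements.
import numpy as np

def model(eps):
    return 1.0 - 0.5 * eps                 # toy prediction of an observable as a function of eps

data = np.array([0.93, 0.89, 0.91])        # toy measurements of that observable
sigma = 0.05                               # assumed measurement noise

eps_grid = np.linspace(0.0, 0.4, 401)      # one parameter -> a one-dimensional "integral"
d_eps = eps_grid[1] - eps_grid[0]
prior = np.ones_like(eps_grid)             # flat prior over eps
likelihood = np.array([np.prod(np.exp(-(data - model(e))**2 / (2 * sigma**2))) for e in eps_grid])

posterior = prior * likelihood
posterior /= posterior.sum() * d_eps       # normalize by the evidence

# The prediction is a distribution, not a single number: push the posterior through the model.
predictive_mean = (model(eps_grid) * posterior).sum() * d_eps
print(predictive_mean)
```

With one parameter this is a single one-dimensional sum; with many parameters the same normalization becomes a high-dimensional integral.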
Now where are the computers in this? Well, the computers are in there because, to compute these probability distributions, you have to do high
dimensional integrals. For every parameter, you
account for an integral for one dimension. This is a highly dimensional
problem, without computers. The theory of Bayes has
been there for 400 years. It’s only now that we
are able to access it, and we are able to do wonderful
things, in my opinion, with it. Last one, heuristics
and causality. This is about an algorithm
that you all know. And this is the algorithm,
by Charles Darwin. It’s about evolution. And we use this algorithm
in order to do design. Instead of genes, we consider the parameters of an engineering system. And then, by evolving the parameters of this engineering system in a probabilistic fashion, we're able to do reverse engineering in a most powerful way.
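As an illustration of this use of evolution for design, here is a minimal sketch in Python of a simple evolution strategy; the cost function and the settings are invented for illustration, and this is not the actual optimization used in the work described here:

```python
# Evolution strategy: perturb design parameters like "genes" and keep the best performers.
import numpy as np

def cost(params):
    return np.sum((params - np.array([0.3, -1.2, 2.0]))**2)   # toy design objective to minimize

rng = np.random.default_rng(0)
mean, sigma = np.zeros(3), 0.5            # current design and mutation strength
n_offspring, n_keep = 20, 5

for generation in range(100):
    offspring = mean + sigma * rng.normal(size=(n_offspring, 3))   # probabilistic perturbations
    best = offspring[np.argsort([cost(p) for p in offspring])[:n_keep]]
    mean = best.mean(axis=0)                                       # recombine the survivors
    sigma *= 0.95                                                  # slowly narrow the search

print(mean)   # approaches the optimum [0.3, -1.2, 2.0]
```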
In mathematical terms, you can cast evolution in that way. I will not go into the details, today at least, but I want to show you
some examples of what is it that we do. You can use that actually
to compute behavior, by reverse engineering,
again, data. So one of the data
that we look at– this is a little fish, and
he’s been chased by a big fish. And the little fish is
escaping constantly. So people have been fascinated
by the way the little fish escape in the water. So actually, to see it a little bit more up close and personal: what these guys do is they sense the water when the mouth of the predator is coming, and then they're able to escape. But the way they escape
is very, very peculiar, and it’s encoded in
their nervous system. What they do is when
they’re disturbed, they’re doing something
that is called a C-start, and the C-start is this
bending of the body so that it looks
like a C. And then you can see how fast it is
that the camera is actually losing it from the frame here. People have done experiments; you can do experiments, you can visualize the flow, but you still don't know if this is optimal. If you formulate it in the computer as an optimization problem, you can find out if nature is optimal. So what you do is you specify: you take the animal, and
you take the geometry. And you specify some parameters. You specify a cost
function, and then you give it degrees of freedom to
discover different motions. So you do that, and you
start with the motion that these fish like. And in this particular medium
or flow viscosity, if you like, it doesn’t go very far. But it goes very far when
it’s doing the C-start. So the C-start actually was an
outcome of the optimization. After 8,000 direct
numerical simulations, we were able to
discover that, and we were able to compare
with experiments. We find the center line
is the same deformation in the simulation and
in the experiment. But all this thing
is not explanation. The explanation comes
through a very nice idea of [? Vim Varis ?] who
is in the audience. So we look at this thing. This is vorticity. This is great, but what
[? Vim ?] and Mattia Gazzola did is they looked at what
this body of the fish is doing. This is a little animal,
so in order to push fluid, it has to use all its body. So what it does is it engulfs as much fluid as possible, and then with a swing of the tail it gets out of there. So using heuristics, using these evolution strategies, and using direct
numerical simulations, we have looked into
optimality in nature. And the last question that we
ask in science, if you’d like, is about fish schooling. So schooling is one of
these magnificent patterns that you see in nature. And you ask the
question, do these guys have a choice that they
school, or is it their fate that they are doing that? So we’re trying to
understand that, and we started in a more humble way: we start with two fish. So this is filmed with an iPhone. And so you look at these two fish, and you look again at what they do, and you try to understand what it is that they are doing. But before you do that, you have to find out if it is efficient to swim behind someone, or what it is that you gain. But to do that, you have to put two fish to swim together. And to put two fish to swim
together it’s very difficult. First of all, because no
simulation capabilities existed to have two
self-propelled bodies, but we have resolved that. But what you see here is you put
two fish, one behind the other, and you give them
an initial motion. But then the second fish
is actually speeding up, and is catching up
with the first guy. So you say, aha,
that’s what happens. But then you move the second
fish a little bit behind, and this optimization
does not work. We threw every optimization
technique we have. This is not an
optimization problem. This is something else. This is a reinforcement learning problem. A reinforcement learning problem is like Pavlov's dog. The idea is that you have an agent, and the agent experiences the environment. It acquires information, and the key thing is the reward. And once you have the reward, then you are able to find the optimal policy by which you move.
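To show what this looks like in practice, here is a minimal sketch in Python of tabular Q-learning on a toy one-dimensional world; the environment and the settings are invented for illustration and this is not the fish simulation itself:

```python
# Tabular Q-learning: an agent, states, actions, and a reward at the rightmost cell.
import numpy as np

n_states, n_actions = 10, 2           # actions: 0 = step left, 1 = step right
Q = np.zeros((n_states, n_actions))   # state-action values
alpha, gamma = 0.1, 0.95              # learning rate and discount factor
rng = np.random.default_rng(0)

for episode in range(500):
    s = 0
    for _ in range(1000):                            # one episode, with a safety cap
        a = int(rng.integers(n_actions))             # explore randomly; Q-learning is off-policy
        s_next = min(max(s + (1 if a == 1 else -1), 0), n_states - 1)
        r = 1.0 if s_next == n_states - 1 else 0.0   # reward only when the goal is reached
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next
        if s == n_states - 1:
            break

print(np.argmax(Q, axis=1))   # greedy policy: 1 (step right) in every non-terminal state
```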
Now, Guido Novati, who is in the audience, has done a lot of work on that. But just to give you an idea
about reinforcement learning, I use this example usually. It’s about bugs
solving problems. So here’s a bug, and
he has a problem. And the problem that he has
is what you see right here. So he has to solve it. So he tries very hard. This is a three-minute clip from the movie Microcosmos. You're not going to see
all three minutes of it, it has been edited. So you see he tries,
it doesn’t really work. And then what he’s going
to do, in a second, he is going to be almost
solving the problem– but not quite. Then he gets the idea
of trying another way. He goes around. Well then he is going to
totally miss the point. [LAUGHING] But to cut the long
story short, this guy is actually successful, after a
real time of 3 and 1/2 minutes. And you feel like applauding after seeing these three minutes. [LAUGHING] So this is reinforcement
learning without the math. So what we do is we take
actually fish and simulations, and we have actions,
which is how we move. We put states, which is
to see the other fish. And we put the reward to find
out if you are efficient. And what we find is that the second fish has the option not to swim behind the first fish, but through reinforcement learning it finds out that it can actually go there, harness the wake of the first animal, and increase its efficiency. This is more spectacular when
you see it in three dimensions. This is one of a kind simulation
again, and in my opinion it ranks up there with the cavitation and the red blood cells, because it is the first time that someone sees two self-propelled swimmers that have actually learned to follow each other. And this is an early stage of learning. And this is a later
learning that you will see, that the second guy totally
in an automated fashion, has learned to avoid
the vortices that will disturb it from the first
guy, and goes in between. So through
Navier-Stokes equations, and for the first time,
Navier-Stokes equations and deep reinforcement
learning, we do that. And actually, what we
are very much interested is to revisit some experimental
results that people argue that fish stay behind
stones, and they are exploiting vortices to
reduce their muscle activity. So we’re moving into
things like cyber fish. This is how I fell asleep
by reading books to my kids. This is from a great
book called Flotsam, but I thought it
was magnificent, because these are
some of the ideas that we would like
to do, combining the cyber and the real. And the cyber and the real have a long history together. So now we're going to talk about computers, humans, and data. And there are writings about machina sapiens, and how the humans and the computers and the data are coming together. So here's David Wayne. [VIDEO PLAYBACK] – Even the scientists
argue that. I don’t believe that we can
say yet that machines do think. I have a basic question,
which I always ask, and that is are these
producing anything really new. Until I see a machine
producing genuinely new things, I will not agree
that machines think. [VIDEO PLAYBACK] – I confidently expect,
that within a matter of 10 or 15 years,
something will emerge from the laboratories which
is not too far from the robot or science fiction thing. – This is Claude Shannon, of
Shannon Information Theory. And this is, again, a
movie from the MIT archives that I found from the 1960s. And this is perhaps
how humans think. This is the fixation
of The Thinker, one of my favorite paintings
by Giorgio de Chirico. And talking about people that
have looked into machines, a lot of the things
happened back in the 80s. And if you are very fast
and you read all this, you will find people like
Stephen Wolfram in there, and you will look at all
the cellular automata was one of the early ideas
of putting machines together. You will find,
actually down here, the Connection Machine that I had the honor to compute on. And you will find
also one more person, Gerard Vichniac, who
is the husband of Judy, that actually happened to be
one of the pioneers working in this interface between
machines and intelligence. And actually at
that time, people were thinking how to
combine the two in order to create computers. Moving forward,
today this interface of machines and humans has
taken another dimension. We’re talking about
society, but we are actually coming to the level of talking
about the digital society. And now machine
learning can also be used for the digital society. [VIDEO PLAYBACK] – –machine learning and
artificial intelligence making Facebook different for people. How has it changed
Facebook, the platform? [VIDEO PLAYBACK] – Yeah, I think, you know,
AI and machine learning are really a key tool
that will help people manage just the tremendous
amount of information you see everyday. You know, every single
time you log into Facebook, there are hundreds or
possibly thousands of things your friends have shared that we could show you. And our job is to try to find the things that are the best and
most interesting to you, the thing you
want to see and share. – So information is growing. And this is a picture of
me trying to hail an Uber, and, of course, this is the
picture that I get from Uber. And you know the recent stories
about what pictures Uber might be presenting you. So my information,
which is actually I consider it to be my
product and to be considered I consider it also to be my
property, is being shared. At the same time, I don’t
receive the same information from Uber. For example, I don’t know
how many people are waiting. So perhaps I can hitch a
ride with someone else. So the story of the information is that the asymmetry between the information that we produce and the information that people have access to is becoming larger and larger. And this leads to inequality. And I would like to use here
a quote by Larry Lessig, who is a professor of Law at
Harvard: "A culture without property, or in which creators can't get paid, is anarchy, not freedom." And I would let people
make their own comments, but there is a lot of agitation
that is happening today, because of all these
technologies, all these machine learning, that they told
you it’s a wonderful tool to do discovery. At the same time,
this possibility of accessing human
data raises the level of actually questioning
whether democracy will survive big data and
artificial intelligence. And there are other
voices who are actually questioning what is the
effect of computing in people. So there is this professor
from the University of Michigan, Kentaro Toyama,
who actually showed– this is the US poverty rate, and
he looks at digital technology. And then you would
have expected that when the internet appeared, that
poverty would start to go down, or when Google
appeared, something will happen to poverty. But poverty doesn’t
change, and it’s actually it’s indifferent to
all these developments on digital technology. You would think
that this would be a change in the charitable
giving in the United States. Digital technology does
not change that either. At the same time, what
digital technology is doing is increasing the productivity
of people, but at the same time the compensation of people is
not increasing proportional to their productivity. To put it on the other
side at the same time, we go back, and thanks to
UPS and thanks to Facebook and thanks to all these
technologies, people in very remote areas– this is an example,
from people from ETH, that they are actually going and
through a phone they actually can examine children in
remote areas in South America. You can actually use the Google
Street View images in order to identify information
about poverty, and perhaps one can do
something about that. So all I’m saying is
that we live in times that perhaps at
some point we have to choose in which way we want
to push these technologies. So I argue that this
is a critical time. And at the same time, something
that I have experienced that last time that
I’ve been here, you start to realize that
also the mind, the human mind, needs assistance. We believe in untruths. We cannot distinguish things. We need some assistance
in order to face this onslaught of information. So we have to put
computers in good use. And I told you
about Moore's Law. Here's where Moore's Law is going, and here's where the brain is going. So when you are able to
harness all the spectrum, I think there is
great potential. Another way that I see it
is computing as thinking, and there is this great
book by Daniel Kahneman, Thinking, Fast and Slow. And I consider the classical to be the slow way of thinking, and the empirical to be the fast way of thinking. And I think it's possible
to combine the two, and to come up with ideas
similar to the fast and slow questions that we
have, provided you know when to use what between
these two ways of approaching knowledge, and when to
combine the two together. Here's a suggestion about how we do science that came from an interview with Alex Pete at Caltech; I get this magazine as an alumnus from there. What he says is: forget about making hypotheses as humans. Get as much data as possible, and without too much of a theory begin to analyze the data. And the idea is that there would be some reinforcement learning in there, and it will say: you have a particular
hypothesis. And then the machine
will look at the data, and then you will find that
your hypothesis may not be true. If you want my personal
opinion, I actually find this to be a great and
a very interesting idea, if it’s used at the right
and appropriate way, because in a sense, very often
when I advise my students, I also do perhaps what their
machine could be doing better than me. At the same time, computers and their suggestions can be dangerous. And thanks to Radcliffe again, I was exposed to Mark Lombardi. And after being exposed to Mark Lombardi, I was exposed by my fellows
here to give the board. And actually, I
found the statement, by the board that was done
back in 1964, where it’s talking about there
is all aspects of technological development. It focuses on communications. And he says that
isolation of individuals, as well as control
of these individuals by means of
suggestions broadcast by all sorts of leaders. So we have to take
suggestions, but we have to think about
these suggestions. And we have to create the
knowledge by which we process this particular suggestions. Which brings us to one or two
before the last about this idea of computers, and about AI. AI, I consider it to be a
core technology in many fields today. AI has tremendous power
that goes beyond humans. And the same way
computing complements theory and
experiments, I think AI can complement the classical way
that we are doing our thinking. At the same time, we
have to be able to create new ways of interfacing
with machines and computers. We have to be able to ask the
computer, why did you do that? When do you succeed? When do you fail? You should be able
to ask the computer, can I trust you in
what you gave me? Can you explain what
you just produced? This slide is adapted from David Gunning, from DARPA. It is a great effort
that’s happening now, about explainable AI to
understand, to trust, and to manage AI. In closing, I think
I’d like to offer a definition for computing. Computing is knowledge that
helps discover, predict, and explain structure. It’s something that helps
you to join boundaries. To go from fish schooling
to nanoparticles for cancer therapy. And actually I think
there is a musical way to understand this
interdisciplinarity. [VIOLIN MUSIC] This is from a group called Lambarena, and it's a combination of music
from Bach and from Africa. It’s about listening
to the violin of Bach, it’s about listening
to the drums of Africa, and finding the way of
putting the two together in perfect harmony in order to
be able to create great music. This is where I think we have
the opportunity to go today. So computing is knowledge. I would like to put this sign up and say that the most important thing is the last one: it's about education. We have to change a lot of things in education. How do we teach
people about computers? How do we teach about
software making? How do we teach
about data handling? So it’s a lot about education,
education, education. And I think we should take into
account these different modes of thinking, and the way we try
to teach the people of the future. And I would like to tell you that, with the powers that we have today, I think we've never been in a better position, having something that is 20 million times faster than it was 36 years ago. If we put it to good use, I think we can address a lot of the problems that exist today, and I think a lot of people are using computers and computing to do just that. So I'd like to thank you, and I would also like to thank all my students and postdocs that are listed up there. The ones that are
underlined have actually had an association with Harvard. Three of them are in the room. I’ll be happy to
take any questions. [APPLAUSE]
