DiffDock
Summary
TLDR: The SBGrid YouTube channel hosts a webinar series where experts discuss advanced topics in structural biology. In a recent webinar, Gabriele Corso, Hannes Stark, and Bowen Jing presented DiffDock, a deep learning approach for small molecule docking. They explained that traditional docking methods can be time-consuming and sensitive to inaccuracies in protein structures, particularly with computationally generated structures like those from AlphaFold. DiffDock addresses these challenges by using a generative model based on diffusion models, which are adept at handling complex probability distributions. The model predicts the 3D coordinates of a small molecule's atoms relative to a protein without prior knowledge of the binding pocket. It samples from a noisy distribution and progressively refines the pose towards the true binding pose. The DiffDock model has shown promising results, outperforming traditional methods, especially when docking to predicted structures. The webinar also touched on the upcoming DiffDock-pocket, which enhances the original DiffDock by allowing for control over a specific binding pocket and predicting side chain rearrangements. The presenters concluded with a Q&A session where they discussed the potential for local refinement, the incorporation of reliability information into the DiffDock process, and the practical aspects of using DiffDock, including its speed and memory requirements.
Takeaways
- 🎓 **SBGrid Webinar Series**: The video is part of a webinar series by SBGrid, focusing on software tutorials, lectures by structural biologists, and unique content related to structural biology and computational methods.
- 📅 **Upcoming Talks**: The channel has scheduled talks on 'DeepFoldRNA' by Robin Pearce and 'DIALS' by Graham Winter from Diamond Light Source, indicating a commitment to continuous learning and updates in the field.
- 🤖 **DiffDock Presentation**: Gabriele Corso, Hannes Stark, and Bowen Jing present DiffDock, a deep learning approach for small molecule docking that predicts 3D coordinates of molecules in relation to a protein structure.
- 🧠 **Blind Docking**: DiffDock performs blind docking, considering the entire protein structure without prior knowledge of the binding pocket, which is a more challenging task compared to pocket-level docking.
- 🔍 **Methodology**: The method uses a generative modeling approach with diffusion models to handle the large search space and uncertainty in docking, as opposed to traditional regression-based deep learning methods.
- 📈 **Performance**: DiffDock demonstrates higher performance in docking tasks, especially when dealing with predicted protein structures like those from AlphaFold, where traditional methods struggle due to inaccuracies.
- 🔧 **Practical Usage**: The tool is designed to be used in practice with inputs including protein structures and small molecules, providing multiple candidate outputs with scores for further analysis.
- 🔗 **GitHub and Colab**: Detailed instructions, models, and Colab notebooks for DiffDock are available on GitHub, facilitating easy access and use for the scientific community.
- 🔄 **DiffDock-Pocket**: An upcoming tool called DiffDock-pocket aims to address the limitations of controlling for specific binding pockets and predicting side chain rearrangements upon binding.
- ⚙️ **Technical Aspects**: The generative model operates on a non-Euclidean manifold space defined by accessible ligand poses through torsion angle adjustments, which is a key technical detail of the DiffDock approach.
- 🚀 **Future Research**: The presenters discuss the potential for incorporating prior knowledge into diffusion sampling processes and the active research in this area, suggesting opportunities for further development and improvement of the tool.
Q & A
What is the primary focus of the DiffDock approach?
-DiffDock is a method for small molecule docking using deep learning approaches. It focuses on blind docking, where the entire protein structure is considered to find the binding site of a small molecule, rather than focusing on a known pocket.
How does DiffDock handle the uncertainty in the docking task?
-DiffDock uses a generative modeling approach with diffusion models to handle the uncertainty in the docking task. It aims to populate all possible modes, accounting for both aleatoric uncertainty (multiple poses) and epistemic uncertainty (model indecision).
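To make the regression-versus-generative contrast concrete, here is a toy, hypothetical sketch (one-dimensional, not DiffDock itself): with a two-mode target, the prediction that minimizes mean-squared error collapses to the mean of the modes, a point where the ligand never actually binds, while samples from a generative model land on the modes themselves.

```python
import random

# Toy target: a binding "pose" (1D for illustration) is equally likely
# to sit at -2.0 or +2.0 -- two distinct binding sites.
rng = random.Random(0)
data = [rng.choice([-2.0, 2.0]) for _ in range(1000)]

# A regression model trained with mean-squared error is pulled toward
# the conditional mean of its targets...
mse_optimal_prediction = sum(data) / len(data)   # near 0.0, far from both modes

# ...whereas a generative model tries to reproduce the distribution
# itself, so its samples land on the actual modes.
generative_samples = [rng.choice([-2.0, 2.0]) for _ in range(10)]
```

This is exactly the malaria drug target example from the webinar: the regression prediction sits in the middle of the two sites, while generative samples populate both.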
What are the inputs and outputs of the DiffDock tool?
-The input to DiffDock is the 3D structure of a protein and the 2D chemical graph of a small molecule. The output is the 3D coordinates of every atom of the small molecule, along with scores for multiple candidate poses.
How does DiffDock differ from traditional docking methods?
-Traditional docking methods use a scoring function and a search algorithm to find the minimum energy conformation of the ligand with respect to the protein. DiffDock, on the other hand, uses deep learning to predict the binding pose directly, without relying on an energy function.
What are the advantages of using DiffDock over traditional docking methods?
-DiffDock can handle large search spaces more efficiently than traditional methods, is less sensitive to inaccuracies in the protein structure, and can deal better with scenarios where the binding pocket is not already known.
How does the generative model in DiffDock work?
-The generative model in DiffDock uses diffusion models to gradually remove noise from an initial, randomly positioned ligand pose. It predicts vectors for translation, rotation, and torsional adjustments to iteratively refine the pose towards the true binding pose.
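As a rough illustration of what one such update does geometrically (a hypothetical sketch in plain Python, not DiffDock's actual score network or manifold machinery), a single denoising step can be decomposed into twisting each rotatable bond, rotating the ligand rigidly about its centroid, and translating it:

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]

def rotate_point(p, axis, angle, origin):
    # Rodrigues' rotation formula: rotate p about a unit-length axis
    # passing through origin.
    v = tuple(pi - oi for pi, oi in zip(p, origin))
    kv = cross(axis, v)
    c, s = math.cos(angle), math.sin(angle)
    d = dot(axis, v) * (1.0 - c)
    return tuple(v[i]*c + kv[i]*s + axis[i]*d + origin[i] for i in range(3))

def denoise_step(coords, translation, rot_axis, rot_angle, torsions):
    # One hypothetical update of the kind the score model drives:
    # twist each rotatable bond, then rotate the whole ligand about its
    # centroid, then translate it. `torsions` is a list of
    # (atom_i, atom_j, angle, moving_atom_indices) tuples, where bond i-j
    # is the rotation axis and `moving_atom_indices` are the atoms on the
    # side of the bond that move. Bond lengths and angles are untouched.
    coords = list(coords)
    for i, j, angle, moving in torsions:
        axis = tuple(b - a for a, b in zip(coords[i], coords[j]))
        norm = math.sqrt(dot(axis, axis))
        axis = tuple(a / norm for a in axis)
        for k in moving:
            coords[k] = rotate_point(coords[k], axis, angle, coords[i])
    centroid = tuple(sum(c[i] for c in coords) / len(coords) for i in range(3))
    coords = [rotate_point(c, rot_axis, rot_angle, centroid) for c in coords]
    return [tuple(ci + ti for ci, ti in zip(c, translation)) for c in coords]
```

In DiffDock itself the translation, rotation, and per-bond torsion quantities come from a learned score network and shrink with the noise schedule over roughly 20 iterations; here they are plain arguments for illustration.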
What is the role of the confidence model in DiffDock?
-The confidence model in DiffDock is used to rank the generated poses. It is trained to classify poses as being within two angstroms RMSD of the ground truth pose or not, helping to select the most accurate poses for further analysis.
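The selection logic the confidence model supports is easy to illustrate. Below is a hedged sketch with hypothetical helper names (the real confidence model is a trained neural network, not these functions): computing the RMSD between two poses and ranking candidate poses by a confidence score:

```python
import math

def rmsd(pose_a, pose_b):
    # Root-mean-square deviation between two poses, each given as a list
    # of (x, y, z) atom coordinates in the same atom order (no alignment
    # or symmetry correction is performed in this toy version).
    assert len(pose_a) == len(pose_b)
    sq = sum((a - b) ** 2 for pa, pb in zip(pose_a, pose_b) for a, b in zip(pa, pb))
    return math.sqrt(sq / len(pose_a))

def rank_by_confidence(poses, confidences, threshold=None):
    # Sort candidate poses by confidence score, highest first; optionally
    # drop poses whose score falls below a threshold.
    ranked = sorted(zip(confidences, poses), key=lambda t: t[0], reverse=True)
    if threshold is not None:
        ranked = [(c, p) for c, p in ranked if c >= threshold]
    return [p for _, p in ranked]
```

During training, poses whose RMSD to the crystallographic pose is under two angstroms would be the positive class for such a classifier; at inference time only the confidence scores are available, since the ground truth is unknown.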
How does DiffDock handle the issue of steric clashes?
-DiffDock does not explicitly handle steric clashes during its training or generative process. It focuses on achieving a geometrically close pose to the ground truth without considering whether the pose clashes with the protein side chains.
What are the computational requirements for running DiffDock?
-DiffDock is designed to run on GPU and can produce samples in either 10 or 40 seconds per run, depending on the number of samples taken. The speed can be adjusted by changing the number of samples and the batch size.
How does DiffDock perform on predicted protein structures, such as those from AlphaFold?
-DiffDock retains a good level of performance on predicted structures like those from AlphaFold, even when the side chains are not accurate, which is a challenge for traditional docking methods.
What are the potential applications of DiffDock?
-DiffDock can be used for blind docking to discover new binding sites, for virtual screening of potential drug candidates, and for understanding the mechanism of action of new drugs by reverse screening on specific pathways.
What is the DiffDock-pocket and how does it improve upon the original DiffDock?
-DiffDock-pocket is a follow-up work that addresses the ability to control for a specific binding pocket and predict the rearrangement of side chains upon binding. It introduces pocket conditioning and side chain torsional flexibility into the diffusion process, improving performance on these tasks.
Outlines
😀 Introduction to SBGrid Webinar Series
The first paragraph introduces the SBGrid YouTube channel, which features software tutorials, lectures, and unique content for structural biologists. The host announces upcoming webinars with Robin Pearce discussing DeepFoldRNA and Graham Winter from Diamond Light Source talking about DIALS. The current webinar features a group presentation by Gabriele Corso, Hannes Stark, and Bowen Jing on DiffDock, a deep learning approach to small molecule docking. The host encourages audience interaction and questions.
🔬 DiffDock: An Overview and Methodology
The second paragraph delves into the specifics of DiffDock, a tool for small molecule docking using deep learning. The presenters, Hannes, Gabriele, and Bowen, explain the task at hand, which involves predicting the 3D coordinates of a small molecule's atoms in relation to a protein's 3D structure. They discuss the limitations of traditional docking methods and the potential of deep learning to overcome these challenges. The paragraph also touches on the generative modeling approach that DiffDock employs, contrasting it with regression-based methods.
🧬 Generative Modeling and Diffusion Models in DiffDock
The third paragraph focuses on the generative model mechanics within DiffDock, specifically the use of diffusion models. These models have been successful in various fields, including AI-generated art. The explanation covers how diffusion models add noise to data and learn to remove it, using a neural network to approximate a complex function. The generative process is outlined, detailing how the model operates on the space of ligand poses, using chemically consistent noise and training to remove torsional, positional, and orientational noise.
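The add-noise/remove-noise mechanics can be illustrated with a self-contained toy (a one-dimensional stand-in, not DiffDock's learned model): an annealed, score-guided sampler on a two-mode distribution, where an analytic score function plays the role of the neural network. Samples start from a broad noisy distribution and settle into the modes as the noise level is annealed down:

```python
import math
import random

def score(x, sigma):
    # Analytic score (gradient of log-density) of a two-mode Gaussian
    # mixture with modes at -2 and +2, smoothed at noise level sigma.
    # In DiffDock this function is what the neural network approximates.
    var = sigma * sigma
    a1 = -(x + 2.0) ** 2 / (2.0 * var)
    a2 = -(x - 2.0) ** 2 / (2.0 * var)
    m = max(a1, a2)                        # stabilize the exponentials
    w1, w2 = math.exp(a1 - m), math.exp(a2 - m)
    return (w1 * (-2.0 - x) + w2 * (2.0 - x)) / ((w1 + w2) * var)

def sample(n_steps=200, seed=None):
    # Draw from a broad noisy distribution, then follow the score field
    # with Langevin-style updates while annealing the noise level down.
    rng = random.Random(seed)
    x = rng.gauss(0.0, 3.0)
    for i in range(n_steps):
        sigma = 3.0 * (0.05 / 3.0) ** (i / (n_steps - 1))
        step = 0.5 * sigma * sigma
        x += step * score(x, sigma) + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)
    return x

samples = [sample(seed=s) for s in range(40)]
```

Run repeatedly, the samples populate both modes rather than collapsing to their mean, which is the property the presenters want for docking problems with multiple plausible poses.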
📈 Results and Performance of DiffDock
The fourth paragraph presents the results and performance benchmarks of DiffDock. The tool is trained on PDBBind, a standard benchmark with high-quality structures. The performance is evaluated on both holo and predicted structures, with DiffDock showing higher reliability and performance on the latter. The paragraph also highlights the successful application of DiffDock in reverse screening by Tim Peterson's group and provides a summary of how to use the tool, including accessing models and notebooks on GitHub.
🔍 DiffDock-Pocket: Enhancements and Follow-up Work
The fifth paragraph introduces DiffDock-pocket, an enhancement to address specific binding pockets and predict side chain rearrangements upon binding. It discusses the improvements made to the original DiffDock, including pocket conditioning and side chain torsional flexibility. The performance of DiffDock-pocket is compared to traditional methods, showing significant advantages in certain scenarios. The paragraph also outlines the process of using the tool, from input to output, and the incorporation of a confidence model for pose selection.
🚀 DiffDock's Sampling Process and Future Research
The sixth paragraph discusses the sampling process in DiffDock, emphasizing that it is data-driven and not based on physics. It addresses the possibility of incorporating reliability information about the protein structure into DiffDock and suggests that manual adjustments can be made during inference. The paragraph also explores the idea of using prior information, such as multiple structure predictions, to guide the docking process. It concludes with a discussion on the general benchmarks for speed and the potential for future research in this area.
🤖 DiffDock's Scoring Function and Its Implications
The seventh paragraph compares the scoring function of DiffDock, particularly the confidence model, with traditional scoring functions. It explains that the confidence model is trained to select poses within two angstroms RMSD of the ground truth, which contrasts with traditional scoring functions that consider interatomic distances and steric clashes. The discussion highlights the advantages of DiffDock's scoring function in terms of simplicity and avoidance of local minima. It also touches on the potential for handling larger ligands and conformational spaces.
🎨 Analogies Between DiffDock and AI Image Generation Tools
The eighth paragraph draws parallels between DiffDock and AI image generation tools, noting that the same underlying model is used and that concepts from image diffusion have analogs in molecular diffusion. It suggests that intuitions developed from using image generation tools could transfer to DiffDock, although specific adjustments like sampler steps or batch size may vary. The paragraph concludes with a note on the potential for further exploration and research in these areas.
📝 Final Questions and Closing Remarks
The ninth and final paragraph wraps up the discussion with final questions from the audience. The presenters address questions about the size of ligands that can be sampled with DiffDock and the potential for exploring larger conformational spaces. They also discuss the limitations when predicting side chain flexibility and the advantages of the diffusion approach over traditional methods. The session concludes with thanks to the presenters and the audience for their participation.
Keywords
💡DiffDock
💡DeepFoldRNA
💡DIALS
💡Small molecule docking
💡Blind docking
💡Deep learning
💡Generative modeling
💡Diffusion models
💡PDBBind
💡ESMFold
💡AlphaFold
Highlights
SBGrid YouTube channel hosts software tutorials and lectures by structural biologists.
Upcoming webinars feature Robin Pearce discussing DeepFoldRNA and Graham Winter talking about DIALS.
Gabriele Corso, Hannes Stark, and Bowen Jing present DiffDock, a deep learning approach for small molecule docking.
DiffDock is designed for blind docking, using the entire protein structure to find small molecule binding sites.
The output of DiffDock includes 3D coordinates of every atom of the small molecule and scores for multiple candidates.
Traditional docking methods are compared with deep learning approaches, highlighting the challenges in large search spaces.
DiffDock addresses the sensitivity issues related to inaccuracies in protein structures, such as those generated by AlphaFold.
The generative modeling approach is introduced as a solution for the docking problem, as opposed to regression-based methods.
DiffDock utilizes diffusion models to sample from complex probability distributions, inspired by molecular structures.
The generative model for molecular poses is constructed by adding chemically consistent noise and training a model to remove it.
DiffDock can sample multiple poses and is capable of identifying multiple binding modes, unlike regression models.
The model is not trained with any notion of steric clash, focusing solely on geometric accuracy.
DiffDock is tested on PDBBind, a standard benchmark containing about 19,000 high-quality structures from PDB.
Results show DiffDock outperforms traditional methods, especially when docking to predicted structures like those from AlphaFold.
DiffDock has been successfully used in reverse screening to understand the mechanism of action of new drugs.
DiffDock-pocket, an upcoming tool, aims to address the limitations of controlling for specific binding pockets and predicting side chain rearrangements.
DiffDock-pocket shows improved performance in predicting correct side chain rearrangements compared to traditional methods.
The scoring function of DiffDock is trained to select poses within two angstroms RMSD of the ground truth, differing from traditional scoring functions.
DiffDock's smoothed energy surface allows for exploration of larger conformational spaces and more rotatable bonds.
The principles and intuitions from AI-generated image tools can transfer to DiffDock, as they are based on the same type of model.
Transcripts
Welcome to the SBGrid YouTube channel,
software tutorials by developers,
lectures by structural biologists, unique content
brought to you by SBGrid.
[MUSIC PLAYING]
Hello, everybody.
Welcome to the SBGrid webinar series continuing next week
we're going to be-- oh, next month on December 12th,
we're going to have Robin Pearce talking about DeepFoldRNA
and then in January, Graham Winter from Diamond Light
Source is going to be joining us to talk about the DIALS.
And today we have a group presentation
from Gabriele Corso, Hannes Stark, and Bowen Jing.
They're here to talk to us about DiffDock,
which is an approach for using some deep learning
approaches for small molecule docking,
and they will explain it much better than I will.
So I'm happy to hand it over to them.
If you have questions, feel free to use the Q&A function
or send messages to one of the hosts in the chat
and we'll moderate until the end.
And with that, Gabriele, Hannes, and Bowen,
thank you for joining us again, and take it away.
Excellent.
Thank you very much for the nice introduction.
So we'll be talking about DiffDock here with--
and I'm the Hannes guy.
This is Gabriele and this is Bowen.
So first we get a little bit into very concretely
what is the task that we're considering here,
input, output.
And then we'll get a little bit into how the method works,
and then we'll get into some results
and into how to use it in practice.
OK, then let's get started because yeah,
the inputs and outputs are very simple.
As input, we have the 3D structure of a protein
and the 2D chemical graph of a small molecule.
So of the small molecule, we do not know the 3D structure yet
and we do not know the 3D structure with respect
to the protein.
And for the protein, we have the whole protein as input,
and we're not considering the pocket level
docking scenario, where we maybe have
a bounding box of some pocket that we already know.
Now instead, we're doing blind docking,
where we have the whole protein and want to find out
where the small molecule binds.
And the output of the tool is the 3D coordinates,
the 3D coordinates of every single atom
of the small molecule.
OK.
And there will then also be some further outputs
because we can produce multiple candidates,
and we also have a score for all the candidates,
but we'll get into that later.
But then we wanted to motivate this a little bit.
So what do we traditionally do with our usual docking methods,
and why now do this with deep learning?
Well, the traditional docking
methods, they're based on a scoring function that
ranks every single conformer, every single 3D
position that the ligand can take
with respect to the protein.
And we have this energy function, this score function,
and then we use a search algorithm
to search over to find the minimum of this energy
function.
But of course, if we have a very large search
space of blind docking and we don't already know the pocket,
then this can be quite--
it can take a long time to find the minimum.
And another issue is the sensitivity
to slight inaccuracies in the protein structure,
such as if we, for example, have computationally
generated structures where maybe a side chain might
be a little bit off.
And there has been evidence in some papers
that when we're docking to computationally generated
AlphaFold structures, for example,
these classical methods struggle with this a bit.
And with that, we now have our question of,
what can we do with deep learning for docking?
And for that, so far we've seen the regression-based
approaches, say, where the deep learning method would
have some graph neural network, where the nodes of the protein
are given by the protein residues,
the nodes of the small molecule are given by its atoms,
and they are also associated with locations.
And then we would do some message passing.
For example, this regression approach,
it would make its prediction by predicting key points
for the protein, where the model thinks
that the small molecule should bind, or where the model thinks
that the pocket is, and it would predict key points,
like interaction points for the small molecule.
And then we would calculate the translation and the rotation
to optimally align those key points
and apply the same transformation
to the small molecule to end up with the final location
prediction.
But these types of regression approaches,
they did not meaningfully improve the performance
that we were able to achieve compared to traditional docking
methods, where here in red we're showing deep learning methods,
and in blue we're showing traditional search
based methods.
Yeah.
And now we argue that--
we have this little summary here where we argue.
In our search based method, we have our energy function.
We learned this ground truth energy function [INAUDIBLE].
We learned the scoring function.
And then we use a search algorithm.
We start somewhere at a random location,
and then we use our search algorithm
to find the modes of this energy function,
or the minimum of this energy function.
But we might very easily get stuck in local minima.
Meanwhile, if we have our deep learning regression
approaches--
yeah, I should also mention here,
in green in this visualization, we
would have the ground truth what we want to predict,
and in yellow, we have what we get as output with the method.
So this is for the traditional search based methods.
And then here we have our regression
based deep learning methods, where we have this distribution
which, during training, we sample our data from,
and then we make our prediction with our model.
And our prediction tries to minimize the mean square error.
And then the best it can do if it makes a single prediction
to minimize the mean square error to samples
from this distribution is to put its prediction at the mean,
but this is actually not what we're interested in here.
We're interested in the sample from the modes,
the global mode, and this is why we argue another approach.
Generative modeling should be the approach taken
for the docking problem, and that's
what Gabriele will talk about now.
OK.
So Hannes has motivated kind of why traditional docking
methods really struggle with a very large search space.
Why the previous deep learning methods based on regression
also struggle in this task.
And to give another kind of intuition
for why this is the case, what's particularly
hard about docking.
And what's particularly hard is that there is
a lot of uncertainty in the task.
And this is both aleatoric, which
just means that there might be multiple poses,
and epistemic, which basically means
that the model will be undecided between multiple poses.
And if we have regression models as I'll show you
on the next slide, we're going to get some kind of mean that's
not very useful, while with our generative model,
we'll try to populate all these modes.
And let me give you a couple of concrete examples.
So here we have this protein in gray.
This is actually a drug target against malaria.
And you can see this inhibitor in green.
It docks in these two sites in the protein.
And if we run one of the regression models,
we obtain a prediction that is right in the middle.
This is clearly not a useful prediction,
while with the generative model that will show,
we are actually able to sample both modes.
Similarly, we have here another complex where instead we have
a single true docking pose.
However, these regression
methods still really struggle, either putting
a large part of the ligand in steric clash with the protein
or having this completely unphysical conformation,
and instead we will see that we are actually
able to sample relatively accurately the pose
with the generative model.
And Bowen will kind of introduce how we're actually constructing
this generative model.
OK.
All right.
All right, so I'll briefly talk about the mechanics
of the model and how we actually have this generative model
for molecular poses.
Now there are many different classes
of generative models in deep learning,
and what we're going to use is diffusion models.
These have recently been quite famous.
If you've heard in the news about AI generated
art or photorealistic imagery, this all came from diffusion
models which are very good at modeling very
complex probability distributions,
which makes them very well suited for molecular structures
as well, and it's this aspect that we leverage
in developing DiffDock Now I want
to briefly outline how these models work just
to ground everyone in some common language.
You've probably heard people say that diffusion models add noise
to the data and then learn to remove noise.
So what that looks like is, you have your data
and you can imagine some kind of diffusion process happening
here.
So you can imagine red is like concentration
of the data at a particular point in space,
and this diffusion process is adding noise.
And then with your neural network,
you learn a vector field that points
in the direction of higher concentration or probability
density.
And this, intuitively, is the part of the diffusion framework
that corresponds to removing noise.
And this vector field is going to be
evolving over time, right?
So it's a pretty complex function
that we're going to approximate with the neural network called
the score function.
And then what you do at inference time
is you draw random samples from your initial noisy distribution
and follow the neural network as it tells you
how to remove noise by following this vector
field into eventually you get to the data distribution.
Hopefully the visualization is clear
even though the arrows are kind of small.
So now this right here is just a toy diffusion on a 2D space.
What we're going to want to do in DiffDock is think
about how this generalizes to the space of ligand poses,
right?
So just to emphasize that even though we're
talking about diffusion with very
similar mathematical formalism to physical or chemical
diffusion, this is really diffusion over the space
that the data distribution lives in, right?
So in the case of ligand poses, we're
going to want to think about what that space is
and how to diffuse over that.
So the space that we're actually going to look at
is actually quite inspired by the way
that traditional docking methods have
thought about the space of ligand poses.
So in GLIDE or Vina you're probably
familiar that you provide a conformer
of the ligand as input, and then what GLIDE or Vina will do
is it will move this ligand around with rigid body motions
and update its torsion angles, but it will not
disrupt the bond lengths, bond angles and ring
structures of the ligand.
So we're going to take away from this paradigm is
that the space of ligand poses is actually
this non-Euclidean manifold that is described
by the space of poses accessible by twisting these torsion
angles and moving around the ligand.
So this is going to be the space of ligand poses
that takes the place of that 2D toy Gaussian example
that we saw on the previous slide
when it comes to thinking about diffusion models.
So the diffusion that we're going to want to construct,
the kind of noise that we're going to want to add,
is going to be this kind of like chemically consistent noise
with the initial conformer.
So the noise is not going to be in the ligand's internal structure,
but rather in the ligand torsion angles and its rigid body
motion.
And when we talk about doing diffusion
over this space, what we're going to do
is train a model that removes noise of this kind.
This is going to be a model that removes
torsional noise, positional noise,
and orientational noise from a randomly seated initial ligand
pose in order to move it towards the distribution
of the true pose.
And so I will skip these technical details here,
but what that really looks like is shown in the bottom
left corner here.
So, recall in the earlier visualization
that we had a neural network that
tried to learn a vector field on this 2D space
so that it could point towards the direction of lower
noise in the direction of higher probability density.
That's exactly what's going to be happening in the score model
that we have DiffDock except that the noise is torsional,
orientational, and positional.
So what that means is that the score model is going
to look at the current ligand pose,
and when I say current what I mean here
is because diffusion is an iterative generative process,
so the input the beginning of a diffusion generative process
is going to be some randomly positioned ligand
pose that you're going to just initialize the diffusion
process with.
And the assumption is that this random pose has a lot of noise,
and the generative model
is going to be progressively removing
that noise one step at a time.
And there are three kinds of noise: translational noise,
orientational noise, and torsional noise.
So the score model is going to predict a vector,
this brown vector here.
So that's kind of like the direction of removing
translational noise.
You can think of that as like a linear velocity
or linear momentum.
We're going to have a rotation vector, which is
removing orientational noise.
This is like an angular momentum or an angular velocity.
And then for each torsion angle we're
also going to predict a quantity that tells us
how quickly to twist that particular rotatable bond
and in which direction in order to make the pose look less
noisy.
And this is, I guess you can say,
it's like an angular velocity around that particular torsion
angle.
And all of this is done with a particular kind of message
passing neural network, which I will not
get into the details of, but the upshot of all this
is that the generative process looks
something like the following.
So, again, similar to GLIDE or Vina or Autodock
or any of these very well established docking tools,
the input to our method is again going
to be a conformer either from rdkit or maybe
from the Cambridge crystallography database
if you prefer that.
And what our model will then do is
it will first sample from the equivalent
of that initial Gaussian distribution, which
in this case means that we're going to randomly position
the ligand relative to the protein.
We're going to completely randomize its orientation
and its torsion angles.
And so the distribution of poses looks something
like what you see at the bottom left here.
And what this corresponds to is kind of like the noisiest state
possible for the pose, and then we're
going to just progressively use our model
to figure out how to remove noise from the pose,
both translationally, orientationally, and
conformationally by adjusting the torsion
angles. We do this multiple times, in practice about 20 times,
and then all of the poses will hopefully
move towards a low noise state which
hopefully corresponds to them being in the binding pocket.
Now I do want to emphasize that here we
are showing independent samples and independent trajectories.
So when these poses are moving relative to each other,
they are not interacting with each other in any way
whatsoever.
This illustration is just showing multiple instantiations
of that process.
Now, of course, finally, you do want
to have an ability to select from this distribution of poses
a high ranking pose for downstream analysis,
and we also provide a bespoke confidence model.
This is trained as an under-two-angstrom RMSD classifier
and it will select out from the many samples
that you can draw from the generative model which
pose you would use for downstream analysis.
Here is just kind of a final visualization of that.
So what we have here at the beginning
is a cloud of randomly initialized ligand poses,
and then as you can see, they move
towards the binding pocket.
One thing that I do want to emphasize here
that is quite interesting, and maybe this
is a key point of difference between this iterative process
and a traditional docking process,
is that as you can see, during the course of this denoising
the ligand oftentimes will pass straight through the protein.
It will look like it passes through very
energetically unfavorable regions of state space.
Now this is a good thing because this
is what allows us to actually reach
the global minimum of this data driven energy
function in the first place.
Because if we otherwise had to deal with this rugged energy
landscape, we would not get there.
But the other aspect that I want to highlight
is that DiffDock is not trained with any notion
of steric clash.
It is trained only with the objective
of getting as geometrically as close to the ground
truth pose as possible.
So what this means in practice is
that the output pose, for example,
if you're doing cross docking or if you're
docking to an AlphaFold structure or a structure
where the side chains are wrong, DiffDock
will put the pose in what it thinks is the right binding
pocket but without any regard for whether
or not it clashes with the side chains.
So when you look at a DiffDock pose
and evaluate whether or not you like it,
the energy under a scoring
function, for example, will probably not be the best metric
that you will want to use.
You will probably want to do some relaxation first,
because DiffDock, while it will generally
get the geometry of the pose right,
it will not try to do anything about these energetics,
so that's something to keep in mind.
And then I will hand it over to Gabriele to talk about results.
OK.
And if there are any questions on the method side,
I guess we can take them also now.
But, otherwise, also happy to take them at the end.
OK, so let's see some results and some summaries.
So first of all, what do we train this on?
So we use PDBBind, which is a standard benchmark.
It contains about 19,000 structures from the PDB
that have been curated to be of higher quality
and, obviously, to contain ligands.
We do a time-based split, so we train
on complexes that were resolved before 2019
and we test on newer complexes.
For the complexes that we test on, we
make sure that no ligands from our training set
appear, to look more
for some kind of generalization.
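A sketch of such a time-based split with ligand filtering, on made-up example entries (the PDB codes and ligand names here are purely illustrative), could look like:

```python
def time_split(complexes, cutoff_year=2019):
    """Time-based split: train on complexes resolved before the cutoff and
    test on newer ones, excluding any test complex whose ligand already
    appears in the training set."""
    train = [c for c in complexes if c["year"] < cutoff_year]
    seen_ligands = {c["ligand"] for c in train}
    test = [c for c in complexes
            if c["year"] >= cutoff_year and c["ligand"] not in seen_ligands]
    return train, test

complexes = [
    {"pdb": "1abc", "year": 2015, "ligand": "ATP"},
    {"pdb": "2def", "year": 2018, "ligand": "NAD"},
    {"pdb": "3ghi", "year": 2020, "ligand": "ATP"},  # dropped: ligand seen in training
    {"pdb": "4jkl", "year": 2021, "ligand": "XYZ"},
]
train, test = time_split(complexes)
```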
And we have various different kinds of baselines
both from traditional methods and deep learning ones.
OK, so the first set of results that we are going to see
is when we provide the methods with the holo structures.
So this means that we actually feed into the methods
the exact structure that the protein
takes when it is bound.
And this is typically the way that these methods are
evaluated, although one could argue
that it's not very realistic.
But we can still see here that for this blind docking task,
neither the traditional methods nor the deep learning methods
really get a good success rate, where we measure success
as the proportion of predictions with a top-1 RMSD
below 2 angstroms.
And we can see that the success rate is below 25%.
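The metric itself is simple to compute; a sketch with hypothetical top-1 RMSD values:

```python
def success_rate(top1_rmsds, threshold=2.0):
    """Fraction of complexes whose top-ranked pose falls below
    the RMSD threshold (2 angstroms by convention)."""
    return sum(1 for r in top1_rmsds if r < threshold) / len(top1_rmsds)

# Hypothetical top-1 RMSDs (in angstroms) over five test complexes
rate = success_rate([1.2, 3.5, 0.8, 6.1, 1.9])   # 3 of 5 under 2 A
```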
Now this can be increased by combining a pocket-finding
method, like P2Rank, or EquiBind itself,
with one of these traditional methods focused
on that specific pocket.
But then we actually see that DiffDock itself
can achieve a significantly higher performance.
We can also look at the performance
on predicted structures.
So this is the setting where, instead of feeding the ground
truth structure, so the ground truth bound structure,
we feed a computationally generated structure.
So we feed the sequence of the protein into ESMFold,
we get the structure, and then we
try to dock the ligand to that structure.
And as Hannes said at the beginning,
the traditional methods have been shown,
in previous works but also here,
to really struggle on this task.
The success rate
drops all the way to below 5%.
And the reason is that often the side chains
of the structures are not accurate,
in the sense that they will change upon binding.
And on the other hand, you can see that DiffDock
is a lot more reliable and loses a much smaller fraction
of its success rate, and so it retains
a good level of performance also on AlphaFold
or other fold-predictor structures.
Now we actually wanted to give some shout-outs
to some of the works that have used DiffDock
since we published.
This, for example, is a very interesting work
where DiffDock was used to do reverse screening.
It came from the group of Tim Peterson at Washington
University in Saint Louis, where they used DiffDock
to dock a newly discovered drug against a series of proteins
in a particular pathway, to try to understand
its mechanism of action.
And this is, for example, a very promising application
where we think blind docking, and in particular blind docking
to AlphaFold or in silico structures in general,
will have a big impact.
To summarize a bit how to use the tool itself:
first of all, you can find more detailed instructions
on our GitHub, where you will also find the models and Colab
notebooks in case you prefer using those.
And so the input is a protein structure.
Here you can either give a structure file,
if you have, for example, a crystal structure,
or you can even feed just the sequence, in which case
the model will fold the protein with ESMFold.
And then you have to provide the ligand.
Here, again, this can be either a structure file or a SMILES
string, but in either case, we
don't assume that the provided conformation of the ligand
is accurate.
And then the reverse diffusion runs,
we run the confidence model, and then
the output will be files with the predicted ligand
poses, where you will also find the rank and the confidence
of each pose in the name of the file.
And that's mainly it for using the tool.
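As a small illustration, parsing the rank and confidence back out of a pose filename, assuming a hypothetical naming pattern of the form `rank<k>_confidence<score>.sdf` (check the actual filenames your DiffDock version produces), might look like:

```python
import re

def parse_pose_filename(name):
    """Extract rank and confidence from a pose filename, assuming the
    (hypothetical) pattern 'rank<k>_confidence<score>.sdf'."""
    m = re.match(r"rank(\d+)_confidence(-?\d+(?:\.\d+)?)\.sdf$", name)
    if m is None:
        raise ValueError(f"unexpected filename: {name}")
    return int(m.group(1)), float(m.group(2))

rank, confidence = parse_pose_filename("rank1_confidence-0.23.sdf")
```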
We also wanted to give a brief overview of the follow-up work
that we are going to be releasing soon, called
DiffDock-pocket, where we tackle two of the problems
with the original DiffDock commonly reported
by the many people that have been using it.
One was the inability to control
for a specific binding pocket.
And the second is that, although I've shown
you some good performance on predicted structures,
there was not really any way of predicting
the rearrangement of the side chains upon binding,
potentially making the relaxation step that Bowen
was talking about harder.
So what we do here is make a few changes to get this pocket
conditioning, where we restrict the focus to the pocket
and give the model access to the full atomic coordinates,
and then we also add side-chain torsional flexibility
built directly into the diffusion process.
And you can see here the holo docking
performance on PDBBind when conditioned on a specific pocket.
So when we are dealing with holo structures,
or often with cross-docking structures,
we are at least as good as or better
than some of the traditional methods that
were designed for this task.
But again we have significantly better performance
when docking to predicted structures, and we can see here also
that, when it comes to predicting the correct rearrangement
of the side chains, we do significantly better
than some of the traditional methods.
And with that, we'd like to thank you for listening
and open the floor for any questions.
Thank you.
That was a very interesting talk.
And I have to admit that DiffDock-pocket answered
one of the questions I had queued up
before I even got a chance to ask it.
So, well prepared.
So we have one question that came in from Joseph DeCorte.
Beyond relaxation, do you recommend local refinement
of the DiffDock pose with one of the more
traditional algorithms?
Yeah.
People have used combinations
of different tools.
In general, I think it depends on what you're
going to use down the line.
Whenever you're using, for example,
some scoring function or some energy function,
it's best to either relax or do
some pose refinement using that same scoring function,
because different scoring functions,
and also DiffDock's intrinsic scoring function,
have slightly different properties.
And so in general, depending on the tool
that you want to use downstream,
you should do the relaxation
with that same tool.
OK.
Thank you.
Since you touched on the sampling a little bit,
is that a case where you've tested multiple different ways
of doing the scoring during sampling, or could
you just say a little bit more about how you're doing that?
Scoring during sampling?
What is meant by that?
Is that just purely geometric, or is
that one of the traditional atomic force fields?
So the sampling process itself is not driven by any physics
based energy or scoring function.
Oh, are you talking about this?
This part?
The ranking of the poses?
No, I think going back further.
So you were answering the right area.
Yeah, the actual sampling itself.
So, I mean, as I kind of overviewed here,
in diffusion models the neural network
is actually predicting the movement
that would bring the pose towards the data distribution.
Or rather, maybe that was not the best illustration;
maybe this is the slide, sorry for so many slides.
So in this slide here the neural network
is quite literally predicting a set
of two vectors and a torsional velocity
around each one of these.
So it's all coming from the neural network
and it's all data driven.
There's no physics based sampling involved.
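To make the "predicted movement" concrete, here is a toy sketch of applying one rigid-body update (a rotation, restricted to the z-axis for brevity, plus a translation) to ligand coordinates. This is an illustration of the idea only, not the actual DiffDock update code; in the real model the torsional velocities additionally rotate atoms about each rotatable bond.

```python
import math

def apply_rigid_update(coords, translation, angle_z):
    """Apply one predicted rigid-body update to ligand coordinates:
    a rotation (only about the z-axis here, for brevity)
    followed by a translation."""
    c, s = math.cos(angle_z), math.sin(angle_z)
    moved = []
    for x, y, z in coords:
        xr, yr = c * x - s * y, s * x + c * y   # rotate in the xy-plane
        moved.append((xr + translation[0],
                      yr + translation[1],
                      z + translation[2]))
    return moved

coords = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
moved = apply_rigid_update(coords, (0.0, 0.0, 1.0), math.pi / 2)
```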
So is there a way to incorporate any reliability information
about the protein structure into the DiffDock
way of thinking about it?
So in general, the question of how
to incorporate this prior knowledge
into these diffusion sampling processes
is an active area of research.
Yeah, there are lots of interesting ways to go here.
Yeah, I was just thinking, like, what if you have B-factors?
What if you have multiple AlphaFold or ESMFold
predictions and you want to use a cluster?
And it sounds like I should wait and see for future research.
Yeah.
Yeah.
I mean, one very first pass idea that one could have is,
if you want to restrict to a certain binding pocket,
then you can manually adjust this translation vector.
So that's the one in brown here.
So one thing that you can do, if you
want to dock to a specific pocket without retraining
DiffDock, is, if the brown vector is
pointing towards a different pocket, just correct it.
But what one generally finds is that when
it is possible to retrain the model
to explicitly use the prior information as input,
it generally does better.
So maybe that's why the focus has
been a bit more on how to incorporate
the different kinds of knowledge that you
would want to incorporate.
But it is true that there are a number of inference time tricks
that you could consider doing.
And the rough paradigm is just, like, you can manually
adjust any of these updates if you feel like they're wrong.
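A minimal sketch of that inference-time trick, with a hypothetical `redirect_translation` helper that swaps the predicted translation update for a unit vector pointing toward a chosen pocket center:

```python
def redirect_translation(predicted_update, ligand_center, pocket_center):
    """Inference-time hack: replace the model's predicted translation
    update with a unit vector pointing from the current ligand center
    toward the chosen pocket center."""
    d = [p - c for p, c in zip(pocket_center, ligand_center)]
    norm = sum(x * x for x in d) ** 0.5
    if norm == 0.0:
        return predicted_update          # already centered on the pocket
    return [x / norm for x in d]

predicted = [0.0, 1.0, 0.0]              # model update pointing elsewhere
update = redirect_translation(predicted, [0.0, 0.0, 0.0], [3.0, 0.0, 4.0])
```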
One other approach, if you wanted
to target a pocket before DiffDock-pocket came out:
it sounds like you could just edit your input structure
and delete the parts that you didn't want to have docking on,
and that would be ugly and feel kind of like a hack, but--
Yeah, yeah.
And there are different ways
that one can think about it.
I mean, one could also just sample multiple times until you
get poses inside of that pocket and do a manual kind
of confidence filtering.
But in general that's not really making
the model kind of like--
We would expect the performance when
restricting to a particular pocket
to be better, because the model should have an easier task,
and so this was one of the motivations
for actually training a model fully on the pocket
and also trying to predict pocket rearrangements
at the same time.
Yeah.
Thank you.
Another question: could you talk a little bit
about the general benchmarks as far as speed,
like the number of molecules per day per GPU?
And obviously there's going to be a lot of range.
Yeah, let me see, I think it might be on one
of the supplementary slides.
So it's always a bit hard to compare these methods.
First of all, here is the general number:
depending on how many samples you take
when you run DiffDock, it takes either 10
or 40 seconds per complex on a GPU.
And based on that you can do some calculations
on how many complexes a day you can obtain.
There are a range of tricks that one can do to accelerate them.
And the obvious one is you can take fewer samples
if you have access to less computational resources.
And you can see in our paper, and I
think we also have it here, that there is
obviously a curve.
Here we've presented results with 40 samples;
you have relatively little loss in performance just taking
10 samples, and that's, for example, four times faster.
So there is a range of hyperparameters
that one can play with if speed is an important consideration.
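The throughput arithmetic is straightforward; a back-of-the-envelope sketch using the rough per-complex timings quoted above:

```python
def complexes_per_day(seconds_per_complex):
    """Back-of-the-envelope throughput on a single GPU
    (86,400 seconds in a day)."""
    return 86400 // seconds_per_complex

# Rough numbers from the talk: ~10 s with 10 samples, ~40 s with 40 samples
fast = complexes_per_day(10)   # fewer samples, more complexes per day
slow = complexes_per_day(40)
```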
Something I should also say
is that these models do run best on GPU, so running on CPU
often takes even longer than the traditional methods when
compared on CPU.
And could you say a little bit about the RAM requirements?
Because some things are very RAM heavy,
some things are not.
Nobody has enough GPU time.
Yeah, so it really depends, to be honest, on the size
of the protein and the size of the ligand.
But something that can be controlled,
and is one of the hyperparameters that
can be fed into DiffDock, is
what we call the batch size, which is basically
how many samples we
take in parallel.
So if you are taking 40 samples, by default
we take them in batches of 10.
But if you have less memory,
you can scale this down to, for example, four or eight
to fit your GPU memory.
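A sketch of how that batching trades memory for more iterations, with a made-up helper (not the actual DiffDock internals):

```python
def make_batches(n_samples, batch_size):
    """Split the requested samples into batches of at most batch_size,
    so peak GPU memory scales with the batch size rather than
    with the total number of samples."""
    return [min(batch_size, n_samples - i)
            for i in range(0, n_samples, batch_size)]

default_batches = make_batches(40, 10)   # the default batching
small_gpu = make_batches(40, 4)          # same samples, less memory at once
```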
Thank you.
We've got a couple more questions coming in.
So could you say something about the scoring for DiffDock-pocket,
or DiffDock in general: how would you compare that
to traditional scoring?
And I believe this is something that you addressed earlier,
but I might be wrong about that.
Well, maybe to answer that question
without further knowledge of what specifically is meant
by the comparison: the scoring by this confidence model
is based on a trained classifier, where the model is
trained to predict whether the pose is under two angstroms
RMSD or above two angstroms RMSD.
So what that means is that, compared
to a traditional scoring function that
is based on pairwise interaction terms,
this scoring function will give a very good score to a pose
that is, say, 1.5 angstroms away but has overlapping atoms
with one of the side chains.
But it will not give a good score
to a ligand that is in the wrong pocket
but happens to have a very good energy inside of that pocket.
So maybe to summarize and hit home the message:
this scoring function is trained with the sole purpose
of selecting poses that are geometrically within two angstroms
RMSD of the ground-truth ligand pose.
So if ideally trained, it is a convex energy surface.
There will be no spurious local minima.
Of course, that may not be the case in practice.
But this is in sharp contrast with traditional scoring
functions, which can be very, very sensitive, for example,
to specific interatomic distances for avoiding
steric clashes, and which have a very rugged energy surface.
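To hit home how purely geometric the training target is, here is a sketch of the kind of binary labels such a classifier would be trained on (illustrative code, not the actual training pipeline):

```python
def confidence_labels(rmsds, threshold=2.0):
    """Binary training targets for the confidence classifier: 1 if a
    sampled pose lies within the RMSD threshold of the ground-truth
    pose, else 0.  The target is purely geometric; steric clashes and
    energetics play no role."""
    return [1 if r < threshold else 0 for r in rmsds]

labels = confidence_labels([0.9, 1.5, 2.4, 7.0])
```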
Hopefully that answers the question.
I think it does, and it illustrates,
"Is A better than B?", well, better for what?
Exactly, exactly.
Jason has his hands up so I assume he has another question.
Yeah, I did.
If you're taking this approach where we are not really
paying attention to clashes, we don't really
have to worry about physics, we've got a big set of conformers,
we just dive in and let the--
what are the ramifications for the size of the ligands
that you can sample?
Can you go to bigger conformational spaces?
Can you use more rotatable bonds?
Have you looked at things like--
Typically, with these approaches, people
use fragments or maybe even 20-25 atoms,
but can you go even bigger with this,
and do you pay a penalty performance-wise?
Yeah, definitely, that's a great point:
the fact that you're using these kinds of smoothed-out
energy surfaces gives particular improvements
when the space that you're searching over is
higher-dimensional.
And so we do expect it to do better
as you add more and more degrees
of freedom.
Now, I don't know if we have some of these results
in the presentation, but the other, somewhat
perpendicular, problem
is that larger and larger ligands are
potentially out of the domain where the model was trained,
so that's a potential caveat.
But I think one setting where this becomes
very clear is
when you look at predicting the flexibility of side chains.
So many of the traditional methods
allow you to enable some flexibility in the side chains
by also modeling the torsion
angles of the side chains, similarly to the way
DiffDock-pocket does.
But when you actually see how these models do,
they work pretty terribly.
And I think, from my intuition, the reason why they
really struggle is that the degrees of freedom
really increase, and so the search algorithms that they
use really, really struggle.
And so this is, I think, one
setting where we can really see the advantage of having
this kind of diffusion-based approach versus the energy-based
and search-based approaches
of traditional methods.
Yeah, that's interesting.
I hadn't thought about that: the diffusion-based approach,
well, it kind of skirts that limitation
in the sense that it doesn't have
to worry about steric interactions,
but it is limited by what it's trained on.
So expanding beyond that is probably challenging.
I think that may have run through all of our questions.
So, if there are no more from the audience,
then thank you for a great talk, and I enjoyed it,
and I hope the audience did too.
Thank you.
Thank you very much.
Thank you very much.
It was great.
I guess maybe one final question, on a lighter note,
you'd mentioned the image generation AI tools,
do you think that the intuitions that people would develop
playing around with those tools would transfer to something
like DiffDock in terms of, oh, I need more sampler steps,
oh, I need a different batch size, or is
that just, you know?
Well, I mean, it is the same kind of model,
so at the end of the day, there's
a ton of shared language.
And pretty much every concept that
has been applied in image diffusion
has some kind of analog in molecular diffusion.
And if they haven't been explored already, then
they're active areas of research.
Cool.
Well, thank you again.
Thank you very much.
[MUSIC PLAYING]