Hi. I’m David Baker. I’m a professor at the University of Washington, and this is part 2 of my iBio seminar, and today I’m going to be talking about the design of new protein functions. In the first part, I spoke about designing brand new protein structures, and now I’m going to show you, today, how we can go beyond designing structure, to designing new protein functions. The motivation for this is really presented by nature. The exquisite functions of naturally occurring proteins really solved the challenges that were faced during biological evolution remarkably well. So, if you think what living things are able to do, they’re able to capture energy from the sun, they’re able to use that energy to build up molecules, build up complex organisms, and eventually to think, and for me to talk and listen to you… and for you to listen. So, in all those processes… they are largely mediated by proteins. In our genomes, of course, are genes, and those genes give the blueprint for life, but they do so by encoding proteins. Proteins are what actually do the work. And again, the protein complement we have in our bodies, and the other living things currently existing on earth, are really exquisitely tuned by natural selection to solve the problems that were relevant during evolution. However, in today’s world we face challenges that were not faced during natural evolution. There are diseases like cancer and Alzheimer’s that were not really issues during evolution because we didn’t live long enough. We’re heating up the planet, we’re running out of fuel, and there are new types of viral epidemics that are coming around, and one can have reasonable confidence that if we had another billion years to wait, and there was adequate selection pressure, that all of these problems would be solved beautifully by natural selection. 
But most of us don’t have a billion years to wait, and so what if we could design a whole new world of synthetic proteins that solved today’s problems as well as naturally occurring proteins solved the problems that arose during evolution? That’s really the grand challenge of protein design. The methods that are used in the calculations I’m going to tell you about today I reviewed in part 1 of my iBio seminar, but I’ll go over the basic ideas quickly again now. The basic principle is that proteins fold to their lowest-energy states, and so if we want to design new proteins that fold up into new structures that carry out new functions, we have to be able to calculate energies reasonably accurately and we have to be able to sample through the different possible protein conformations to find the lowest-energy state. And, over the years, my group, in collaboration with many groups around the world, has developed the Rosetta protein structure modeling and design software to carry out these calculations. If we want to design proteins with new functions, we need hypotheses about the shape of the protein, the configuration of atoms, that would best carry out that function. And the final point is the most important one: we can design models of new molecules as much as we want on the computer, but if we don’t go to the lab and test them, they remain purely science fiction. So the final step in everything I tell you about, after doing the protein design calculation and coming up with a new amino acid sequence that encodes a protein that’s predicted to have the desired function, is to manufacture a synthetic gene encoding that new protein, a brand new protein that never existed before, and then take that synthetic gene, put it into bacteria, make the protein, and then see whether the protein does what it was designed to do. The way the protein design calculations work is shown very schematically here for the simplest possible case.
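As a rough illustration of the "calculate energies, then sample for the lowest-energy state" idea in its simplest form (a fixed backbone, with only the sequence varying), here is a toy Python sketch. It is not Rosetta and does not use Rosetta's energy function: the `toy_energy` score, the `buried` flags, and the function names are all assumptions made up for this example, and the search is a plain Metropolis Monte Carlo over single-residue substitutions.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AFILMVWY")  # crude, illustrative classification


def toy_energy(sequence, buried):
    """Hypothetical stand-in for a real energy function (lower is better).

    Rewards hydrophobic residues at buried backbone positions and
    polar residues at exposed positions.
    """
    energy = 0.0
    for residue, is_buried in zip(sequence, buried):
        matches = (residue in HYDROPHOBIC) == is_buried
        energy += -1.0 if matches else 1.0
    return energy


def design_sequence(buried, steps=5000, temperature=0.5, seed=0):
    """Search sequence space on a fixed backbone with Metropolis Monte Carlo."""
    rng = random.Random(seed)
    sequence = [rng.choice(AMINO_ACIDS) for _ in buried]
    energy = toy_energy(sequence, buried)
    best_sequence, best_energy = list(sequence), energy
    for _ in range(steps):
        position = rng.randrange(len(sequence))
        previous = sequence[position]
        sequence[position] = rng.choice(AMINO_ACIDS)  # trial substitution
        new_energy = toy_energy(sequence, buried)
        delta = new_energy - energy
        # Always accept downhill moves; accept uphill moves with Boltzmann odds.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            energy = new_energy
            if energy < best_energy:
                best_sequence, best_energy = list(sequence), energy
        else:
            sequence[position] = previous  # reject: restore the old residue
    return "".join(best_sequence), best_energy


if __name__ == "__main__":
    # Hypothetical burial pattern for a 10-residue backbone.
    pattern = [True, False, True, True, False, False, True, False, True, False]
    print(design_sequence(pattern))
```

A real design calculation would replace `toy_energy` with a physically based all-atom energy and would also sample sidechain conformations, but the loop has the same shape: propose a change, score it, keep it if the energy goes down.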
This is the problem where we have a protein backbone we want to make, and we want to find a sequence which is very low energy in this backbone. So, we keep the backbone fixed and we search through the different combinations of amino acids for an amino acid sequence which is very low in energy in this structure. Then, as I said, once we have that sequence, we can go to the lab and make it and experimentally test it. So, the first example I’m going to give you concerns the influenza virus. A schematic of the influenza virus is shown on the upper left, and then in the middle two panels is a blow-up of a surface protein on the influenza virus called the hemagglutinin, and in yellow in the middle panel are two parts of that viral surface protein, this hemagglutinin, which are very highly conserved during evolution. The virus is constantly mutating to evade our immune systems, that’s why we need new vaccines every year, but there are certain regions which absolutely don’t change because they’re critical to the function of the virus. There’s a region I’ll refer to as the stem region, in the middle of the structure, and then on the top, where the protein is actually attaching to cells in our bodies, this is how the virus gets into our cells, there’s a second site called the receptor-binding site. What I’m going to tell you about today is the design of proteins which bind to these sites shown in yellow and block the virus function; they prevent the virus from getting into our cells. So, using the methods that I briefly outlined, we’ve designed proteins which block the virus that bind at both the site in the stem region on the side and then on the surface, but I’m going to tell you in detail about the ones that bind at the stem site today. So, the design process has two steps, and I’m going to illustrate them for you here. 
On the left you see a blow-up of that stem region of the influenza virus hemagglutinin, the region that was in yellow in the middle of the protein on the previous slide, and you can see that there’s kind of a deep groove that we decided we would try and design proteins to bind into. The design calculation has two parts. The first part consists of placing amino acid sidechains into the groove in ways that they make very good interactions. An analogy for our approach is to think of this like a climber would think about a climbing wall, where there’s some region that you want to hold onto, like this groove, and the first problem is to find handholds and footholds that allow you to really get a grip on this, and then you have to figure out how you’re going to place your body so that you can have your hands and feet in all the good places for them at the same time. So, we start by figuring out where the handholds and footholds are, that is, where we can place disembodied amino acids into this cavity to make really good interactions, and the second part is to place the body, and this can either be a protein that already exists or one that we design de novo. And so what you see here, again in the solid surface representation, is the flu virus protein, and you see the sidechains that we placed in the preceding slide docked up against the surface, and now the ribbon-y thing is a brand new designed protein that we’ve made that holds these critical sidechains up against the virus in exactly the right orientations. One of the components of the design calculations is electrostatic interactions, favorable interactions between positive atoms and negative atoms, so on the right you see a very red region on the virus, that’s negatively charged, and we’re putting a blue sidechain, which is positively charged, right into that to get more binding energy.
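To put a number on why a positive sidechain in a negatively charged region adds binding energy, here is a minimal Python sketch of a pairwise Coulomb sum. The charges, coordinates, and dielectric value are made-up illustrative assumptions, and this simple formula is only a stand-in, not the electrostatics term actually used in the design calculations.

```python
import math

# Converts e^2/Angstrom to kcal/mol (standard Coulomb constant in these units).
COULOMB_CONSTANT = 332.06


def electrostatic_energy(atoms_a, atoms_b, dielectric=10.0):
    """Sum Coulomb energies over all atom pairs between two groups.

    Each atom is (partial_charge, (x, y, z)) with coordinates in Angstroms.
    The dielectric value is an assumption chosen for illustration.
    """
    energy = 0.0
    for charge_a, pos_a in atoms_a:
        for charge_b, pos_b in atoms_b:
            distance = math.dist(pos_a, pos_b)
            energy += COULOMB_CONSTANT * charge_a * charge_b / (dielectric * distance)
    return energy


if __name__ == "__main__":
    # Hypothetical positively charged sidechain tip and negatively charged patch.
    positive_sidechain = [(+1.0, (0.0, 0.0, 0.0))]
    negative_patch = [(-0.5, (3.0, 0.0, 0.0)), (-0.5, (0.0, 3.5, 0.0))]
    # Opposite charges give a negative (favorable) energy.
    print(electrostatic_energy(positive_sidechain, negative_patch))
```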
The two designs that I’m going to tell you about are shown here, again, with the influenza virus in yellow and the design in magenta. You see the sidechains fitting into that pocket on the virus, and you see the backbone of the designed protein in the ribbon diagram. Something that’s important for me to emphasize is that when we do these calculations, only a fraction of the computed designs that are predicted to bind the virus actually fold up to structures that bind the virus when we test them experimentally. These two proteins bind the virus and they bind quite tightly, but most of the designs in fact don’t, and it turns out the reason that they don’t is probably that these sequences don’t really fold up to the designed structures. Our calculations are not quite good enough, so we get some designs which simply don’t fold properly, but the thing that’s very powerful now is that it’s very easy to make synthetic genes, so we can make many, many different designs that have been found in these computer calculations, test them all, and identify those which actually function. Now, I told you that those two proteins in fact do bind the virus, but it’s important to know how they bind the virus and how similar that is to the way that we designed them to bind the virus. So, on this slide I show crystal structures, determined in the laboratory of Ian Wilson at Scripps, where the influenza virus protein is shown on the left in magenta and cyan and the design model is in purple, and it’s binding, again, in the middle of the influenza virus protein in that stem region, and in red is the crystal structure.
What you can see is that in the crystal structure, this protein we’ve designed, this one on the left is called HB36, is binding to the virus exactly like we designed it to bind, and in that inset there in the middle you can see that even the designed sidechains in the crystal structure are exactly where they were supposed to be. And the same thing is true for the other designed protein that I described, called HB80. The crystal structure is, again, nearly identical to the design model. So, while I told you that a large fraction of our designs simply don’t bind at all, the ones that do bind bind to the virus in essentially exactly the same way that they were supposed to. The proteins, after some experimental optimization of the sequence, bind with picomolar affinity to the virus, they’re very tight binding proteins, and our collaborator Merika Treats, a graduate student in the laboratory of Deb Fuller, has some very exciting results now showing that mice who would die from a lethal infection with the flu virus are completely protected when these designed proteins, actually the one that was on the left, are given to them, and the protein can be given to them up to 24 hours before or 24 hours after they are infected with the virus. So, we’re very excited now about the possibility that this could become a new type of flu therapeutic, for when either you’re going into an area that’s infected or you’ve just been infected. Such designed proteins might be a future treatment for the flu. We’re designing proteins now, using the techniques that I’ve described, to bind not only to other pathogens but also to proteins on the surfaces of cancer cells and normal cells to modulate biological function.
I don’t have time today to tell you about that, but we’re able to make proteins that are also useful for figuring out some fundamental biological questions, because we can design proteins that knock out specific interactions, and so that allows biologists, then, to probe what the function of that interaction is. But now I’m going to switch gears and talk about the design of proteins to bind small molecules, and we use a very similar approach. On the left is the structure of a small molecule called digoxigenin, which is used as a therapeutic to treat some heart conditions, but if you get too much of it it’s very, very dangerous and patients can die. So, we were interested in trying to design a protein that could essentially be a therapeutic sponge and soak it up. The designed protein is shown on the bottom right. In magenta is this dig molecule, I’ll call it for short, and in green is a protein we’ve designed which makes very complementary interactions, those are hydrogen bonding interactions shown in the dashed lines, and it surrounds the dig. Another view of it is shown in the upper panel, where you can see a space-filling view of the designed protein, and you can see it fits really snugly around the surface of the small molecule. So again, this is purely a computer calculation, but we then go to the lab and make the protein, and when we made the protein we found it bound the small molecule, and Barry Stoddard’s group was then able to solve the crystal structure, and that’s shown here.
In magenta is the design model, that’s what I already showed you, the model of the designed protein bound to this small molecule, and in cyan is the crystal structure. First of all, you can see that this designed protein has the correct structure, and second, you can see that the small molecule binds to that structure in almost exactly the way that was designed, making those same hydrogen bonding interactions. And the left panel shows you the shape complementarity in the crystal structure of this small molecule with the protein. This design was very exciting because it, again, binds the small molecule with picomolar affinity, and we are now using this method to design proteins which bind a number of different types of molecules, both toxins and other types of drugs, and these types of designed proteins could be useful not only for soaking up dangerous molecules in the body, but also for detection of molecules and other purposes. And, I’m going to conclude by telling you about our work on designing new materials. So, many of the materials that you’re familiar with, like silk and wool, are made out of proteins, and biology has lots of examples of more specialized nanomaterials; viruses, for example, have these very elaborate and beautiful coat structures which they use to protect their DNA. The principle of all these materials in biology is self-assembly, where there’s a subunit that’s encoded in a gene, and then that subunit interacts with other copies of itself to make a larger structure. And, I’m going to show you now how we can design brand new proteins which self-assemble with other copies of themselves to make larger structures. So, in this first example, what we’ve done is to take a protein that’s shown on the left, and place it on the corners of a cube.
And so, there are eight corners on a cube, so we’ve taken eight copies of this protein and arranged them on the corners of the cube in such a way that the surfaces of these different copies on the different corners touch each other. And we then designed the sequences of these interfaces where they touch to make very low energy interactions, so that when this protein is made in cells, what we hope is that it will self-assemble into the cubic structure, stabilized by these designed interactions that we’ve made. And, in the lower panel here, you can see an electron micrograph of cells that are making this designed protein, and you can see that these cells are filled with these cubic structures, and the averages of these images are shown in the right column of this panel, and you can see they look quite a bit like the designed model. They look like little dice. In fact, what we’d like to be good enough to do is to put different numbers on different sides. We’re not quite there yet. When the crystal structure was solved in Todd Yeates’ lab, it was found to be nearly identical to the designed model, which we were very excited about. So, we can make these types of nanomaterials and enclosed structures with very high accuracy. This shows another view. The left three columns show the same design I just described, but now viewed down the different symmetry axes of the cube. So for example, the third column is the four-fold axis of a cube, and in the upper row is the designed model, what we were trying to make, and in the lower row is the crystal structure, those are the structures that we actually found experimentally, and you can see they’re essentially identical. On the right is a second example where we were trying to design proteins to come together to form a tetrahedron, and again you can see that the designed models in the top row are very similar to the actual crystal structures that were solved experimentally in the bottom row.
And Yang Hsia, a graduate student in the lab, has more recently used this approach to try and make even bigger structures like the icosahedron shown on the top left. This is more or less like the play structures that they have in some playgrounds, except this is a complete icosahedron. And, when Yang made this protein in the lab, very recently, he was excited when Shane Gonen, who he sent the protein to to do electron microscopy, sent back the pictures that I’m showing you here. You can’t quite see the whole icosahedron but, for example in the lower row on the middle panel, you see something that looks very much like it. So, we’re currently trying to solve the high-resolution structure. So, these were materials that were made out of just one component that was identical that was then interacting with other copies of itself. We can make this more sophisticated by, instead of having one component, we can have two components. So, in panel A here, I’m showing two tetrahedra that are inverted relative to each other, one green and one blue. And so, what we’re doing here is we’re taking one building block, the green one, and putting it at the corners of the green tetrahedron, and another building block, the blue one, and putting it at the corners of the blue tetrahedron, and then as shown in the middle panel here, we can move them… we can slide them closer and further away from the center of these tetrahedra, and we can also rotate each one, and we do this until we find a way in which these fit together in a very shape-complementary way, and that’s shown in panel C. At this point it becomes a calculation very similar to what I showed in that movie that I showed at the beginning of my talk, where we now have to design… find an amino sequence… amino acid sequences on both sides, on both the green side and the blue side, which fit together very well and make very strong interactions. 
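The slide-and-rotate search described above can be sketched in a few lines of Python. This is a heavily simplified illustration under stated assumptions, not the actual docking protocol: each building block is a made-up cloud of points, only one green block and one neighboring blue block are scored (a real calculation would generate all symmetric copies), and `interface_score` is a hypothetical contact-minus-clash count rather than a real shape-complementarity measure. The two axes, (1, 1, 1) and (1, 1, -1), are corner directions of two tetrahedra inverted relative to each other.

```python
import itertools
import math


def unit(v):
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def place(points, axis, radius, angle_deg):
    """Rotate a block about its own z axis, slide it out by `radius`,
    then orient its z axis along `axis` (must not be parallel to x)."""
    w = unit(axis)
    helper = (1.0, 0.0, 0.0)
    dot = sum(h * c for h, c in zip(helper, w))
    u = unit(tuple(h - dot * c for h, c in zip(helper, w)))
    v = cross(w, u)
    a = math.radians(angle_deg)
    placed = []
    for x, y, z in points:
        xr = x * math.cos(a) - y * math.sin(a)
        yr = x * math.sin(a) + y * math.cos(a)
        zr = z + radius
        placed.append(tuple(xr * u[i] + yr * v[i] + zr * w[i] for i in range(3)))
    return placed


def interface_score(green, blue, clash=3.0, contact=6.0):
    """Hypothetical score (lower is better): reward contacts, punish clashes."""
    score = 0.0
    for p, q in itertools.product(green, blue):
        distance = math.dist(p, q)
        if distance < clash:
            score += 10.0
        elif distance <= contact:
            score -= 1.0
    return score


def dock(green_block, blue_block, radii, angles):
    """Grid search over one radial slide and one rotation per component."""
    green_axis, blue_axis = (1.0, 1.0, 1.0), (1.0, 1.0, -1.0)
    best = None
    for rg, ag, rb, ab in itertools.product(radii, angles, radii, angles):
        green = place(green_block, green_axis, rg, ag)
        blue = place(blue_block, blue_axis, rb, ab)
        score = interface_score(green, blue)
        if best is None or score < best[0]:
            best = (score, rg, ag, rb, ab)
    return best


if __name__ == "__main__":
    # Made-up three-fold symmetric block: three points 8 Angstroms off-axis.
    block = [(8.0 * math.cos(math.radians(120 * k)),
              8.0 * math.sin(math.radians(120 * k)), 0.0) for k in range(3)]
    radii = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
    angles = [float(a) for a in range(0, 120, 15)]  # 120 degrees repeats the block
    print(dock(block, block, radii, angles))
```

Each component has just two degrees of freedom here, a slide along its axis and a rotation about it, which is why an exhaustive grid is feasible before the more expensive sequence design step.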
And, when we’ve done that, we again order or make synthetic genes that encode both proteins. We make them in bacteria and then we look to see whether there’s anything that’s assembled, and I’m going to show you the results on the next slide. These are electron micrographs of two of these materials. These are, again, two components, with a green component and a blue component, and the designed models are shown on the lower part of the slide, with one component in green and one component in blue. In the upper panels are electron micrographs of what we get out of E. coli cells, bacterial cells that are expressing these two proteins, and first of all what you can see is that, for each design, we get remarkably homogeneous particles, so all the particles in these images look essentially identical, and if you look closely you can see that, for the differently shaped designs, we get differently shaped structures and they correspond to the shapes that we’re trying to design. So, I think in the middle panel, you can see that the holes are a little bit bigger than in the particles on the left panels. And, what’s exciting about this for the applications I’ll describe is not only that the shapes are coming out right, as we designed, but that every particle is the same. So, for example, say you wanted to make a new type of drug delivery vehicle to target a toxic compound specifically to the tumor you want to kill. There are various ways of making particles for drug delivery now, but when you look at the particles those methods produce, they’re always very heterogeneous, so it’s hard to predict what they’ll do inside the body. With this technique, we can make particles that are very precise, and each one is identical to every other one. So, Todd Yeates’ group was again able to solve crystal structures of these two-component materials.
So, in the upper rows are the designed models, shown down two of the symmetry axes of these particles, and in the lower rows are the crystal structures of these designs. So again, the process is, you have the computer model, which is what’s on the top row, then you order synthetic genes which encode both of the designed proteins, you put these synthetic genes into bacteria, you make the proteins, and then you purify them out of E. coli and you look to see what you’ve got. And then, in this case, we go one step further to determine the X-ray crystal structures, and what you can see here is that, again, the crystal structures are essentially identical to the designed models. So, we can make these designed nanomaterials very, very precisely. So, the different types of nanostructures that I’ve described so far are the ones on the left, and the question is, what could they be good for? One very exciting possibility is targeted drug delivery, where, as I mentioned, you could put a chemotherapy agent inside the cage and then target it to the tumor, so you don’t have to take it systemically. You can also put targeting domains on the outside so that it goes exactly where you want it to go, and a first-year student in the lab is now exploring different ways of putting nucleic acid inside these to make synthetic viruses, not for bad purposes but for good purposes, so we can deliver, say for gene therapy or for other types of therapy, RNA or DNA molecules exactly where in the body they would be good to go. Another application is vaccines. One of the things we’re trying to display now is the HIV coat protein; we can display it on the outside of these cages, it will be there in many copies, and hopefully trigger a strong immune response. We can also put molecules called adjuvants inside these cages to stimulate a stronger response.
Now, there are other types of particles, other types of nanomaterials that we can design. For example, the wire on the right side. You could imagine things like this being useful for transporting ions or maybe even electrons in some sort of nanoelectronic device. And, my last example today is going to be what you see in the middle: a designed, repeating, 2-dimensional layer, and this is the work of graduate student Shane Gonen. Here is his design. It’s a hexagonal lattice where these proteins are designed to assemble first into hexagons, which then interact with other copies of themselves to tile the plane, and when he makes this protein in E. coli he gets this… this is straight out of a broken E. coli cell. He sees these large arrays that have the geometry one would expect for his design, and if he averages his data, a representation of the density map that comes from this data is shown in the lower-left panel, and you can see that his model fits into that quite well. But, as you can imagine, we really aren’t satisfied until we’ve determined the high-resolution structure, which Shane is currently working on. I’ve been very fortunate to have absolutely outstanding colleagues that actually did all the work that I described. Their names are listed on this slide and, more generally, I hope I’ve given you a sense, today, for the potential of protein design to create a whole new world of designed proteins to solve challenges that we collectively face today.