Many naturally occurring protein assemblies have dynamic structures that allow them to perform specialized functions. Although computational methods for designing novel self-assembling proteins have advanced substantially over the past decade, they primarily focus on designing static structures. Here we characterize three distinct computationally designed protein assemblies that exhibit unanticipated structural diversity arising from flexibility in their subunits. Cryo-EM single-particle reconstructions and native mass spectrometry reveal two distinct architectures for two assemblies, while six cryo-EM reconstructions for the third likely represent a subset of its solution-phase structures. Structural modeling and molecular dynamics simulations indicate that constrained flexibility within the subunits of each assembly promotes a defined range of architectures rather than nonspecific aggregation. Redesigning the flexible region in one building block rescues the intended monomorphic assembly. These findings highlight structural flexibility as a powerful design principle, enabling exploration of new structural and functional spaces in protein assembly design.
Discrete protein assemblies ranging from hundreds of kilodaltons to hundreds of megadaltons in size are a ubiquitous feature of biological systems and perform highly specialized functions. Despite remarkable recent progress in accurately designing new self-assembling proteins, the size and complexity of these assemblies have been limited by a reliance on strict symmetry. Here, inspired by the pseudosymmetry observed in bacterial microcompartments and viral capsids, we developed a hierarchical computational method for designing large pseudosymmetric self-assembling protein nanomaterials. We computationally designed pseudosymmetric heterooligomeric components and used them to create discrete, cage-like protein assemblies with icosahedral symmetry containing 240, 540 and 960 subunits. At 49, 71 and 96 nm diameter, these nanocages are the largest bounded computationally designed protein assemblies generated to date. More broadly, by moving beyond strict symmetry, our work substantially broadens the variety of self-assembling protein architectures that are accessible through design.
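The subunit counts quoted above (240, 540 and 960) are consistent with the Caspar-Klug triangulation scheme commonly used to describe viral capsids, in which an icosahedral shell contains 60T subunits and T = h² + hk + k². The abstract itself does not state triangulation numbers, so the mapping below is an assumption offered only as a check on the arithmetic; the function names are illustrative.

```python
def triangulation_number(h: int, k: int) -> int:
    """Caspar-Klug triangulation number T = h^2 + h*k + k^2."""
    return h * h + h * k + k * k


def find_hk(subunits: int, max_index: int = 10) -> list:
    """Return all (h, k) with h >= k >= 0 whose 60*T shell has `subunits` subunits."""
    matches = []
    for h in range(1, max_index + 1):
        for k in range(0, h + 1):
            if 60 * triangulation_number(h, k) == subunits:
                matches.append((h, k))
    return matches


# Subunit counts taken from the abstract; the (h, k) assignment is inferred.
for count in (240, 540, 960):
    print(count, find_hk(count))
```

Under this assumption the three cages correspond to T = 4, 9 and 16, that is, (h, k) = (2, 0), (3, 0) and (4, 0).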
Carbohydrates and glycoproteins modulate key biological functions. However, experimental structure determination of sugar polymers is notoriously difficult. Computational approaches can aid in carbohydrate structure prediction, structure determination, and design. In this work, we developed a glycan-modeling algorithm, GlycanTreeModeler, that computationally builds glycans layer-by-layer, using adaptive kernel density estimates (KDE) of common glycan conformations derived from data in the Protein Data Bank (PDB) and from quantum mechanics (QM) calculations. GlycanTreeModeler was benchmarked on a test set of glycan structures of varying lengths, or "trees". Structures predicted by GlycanTreeModeler agreed with native structures at high accuracy for both de novo modeling and experimental density-guided building. We employed these tools to design de novo glycan trees into a protein nanoparticle vaccine to shield regions of the scaffold from antibody recognition, and experimentally verified shielding. This work will inform glycoprotein model prediction, glycan masking, and further aid computational methods in experimental structure determination and refinement.
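To illustrate the idea of drawing conformations from an adaptive kernel density estimate, the sketch below samples a single torsion angle from a Gaussian KDE whose bandwidth varies per observation. It is not GlycanTreeModeler's implementation: the observed angles are hypothetical toy values, the nearest-neighbour bandwidth rule is one simple choice of adaptive kernel, and all function names are illustrative.

```python
import random


def wrap_degrees(angle: float) -> float:
    """Map an angle in degrees onto the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def circular_distance(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    return abs(wrap_degrees(a - b))


def adaptive_bandwidths(observations, k=2, floor=2.0):
    """Per-point bandwidth: distance to the k-th nearest neighbour, with a minimum floor."""
    bandwidths = []
    for i, x in enumerate(observations):
        dists = sorted(
            circular_distance(x, y) for j, y in enumerate(observations) if j != i
        )
        bandwidths.append(max(dists[k - 1], floor))
    return bandwidths


def sample_torsion(observations, bandwidths, rng):
    """Draw one angle from the KDE: pick a kernel uniformly, then add Gaussian noise."""
    i = rng.randrange(len(observations))
    return wrap_degrees(rng.gauss(observations[i], bandwidths[i]))


# Hypothetical observed torsions (degrees): a dense cluster and a sparse one.
observed_phi = [-75.0, -80.0, -70.0, -78.0, 60.0, 65.0]
bandwidths = adaptive_bandwidths(observed_phi)
rng = random.Random(0)
samples = [sample_torsion(observed_phi, bandwidths, rng) for _ in range(5)]
print(bandwidths)
print(samples)
```

With this rule, kernels in the dense cluster get narrow bandwidths and kernels in the sparse cluster get wide ones, which is the general motivation for adaptive over fixed-width kernels.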
Computational design of non-porous pH-responsive antibody nanoparticles
Yang EC, Divine R, Miranda MC, Borst AJ, Sheffler W, Zhang JZ, Decarreau J, Saragovi A, Abedi M, Goldbach N, Ahlrichs M, Dobbins C, Hand A, Cheng S, Lamb M, Levine PM, Chan S, Skotheim R, Fallas J, Ueda G, Lubner J, Somiya M, Khmelinskaia A, King NP, Baker D. (2024) Nat Struct Mol Biol. 31(9):1404-1412 | doi:10.1038/s41594-024-01288-5
Programming protein nanomaterials to respond to changes in environmental conditions is a current challenge for protein design and is important for targeted delivery of biologics. Here we describe the design of octahedral non-porous nanoparticles with a targeting antibody on the two-fold symmetry axis, a designed trimer programmed to disassemble below a tunable pH transition point on the three-fold axis, and a designed tetramer on the four-fold symmetry axis. Designed non-covalent interfaces guide cooperative nanoparticle assembly from independently purified components, and a cryo-EM density map closely matches the computational design model. The designed nanoparticles can package protein and nucleic acid payloads, are endocytosed following antibody-mediated targeting of cell surface receptors, and undergo tunable pH-dependent disassembly at pH values ranging between 5.9 and 6.7. The ability to incorporate almost any antibody into a non-porous pH-dependent nanoparticle opens up new routes to antibody-directed targeted delivery.
The design of protein-protein interfaces using physics-based design methods such as Rosetta requires substantial computational resources and manual refinement by expert structural biologists. Deep learning methods promise to simplify protein-protein interface design and enable its application to a wide variety of problems by researchers from various scientific disciplines. Here, we test the ability of a deep learning method for protein sequence design, ProteinMPNN, to design two-component tetrahedral protein nanomaterials and benchmark its performance against Rosetta. ProteinMPNN had a similar success rate to Rosetta, yielding 13 new experimentally confirmed assemblies, but required orders of magnitude less computation and no manual refinement. The interfaces designed by ProteinMPNN were substantially more polar than those designed by Rosetta, which facilitated in vitro assembly of the designed nanomaterials from independently purified components. Crystal structures of several of the assemblies confirmed the accuracy of the design method at high resolution. Our results showcase the potential of deep learning-based methods to unlock the widespread application of designed protein-protein interfaces and self-assembling protein nanomaterials in biotechnology.
Computationally designed multi-subunit assemblies have shown considerable promise for a variety of applications, including a new generation of potent vaccines. One of the major routes to such materials is rigid body sequence-independent docking of cyclic oligomers into architectures with point group or lattice symmetries. Current methods for docking and designing such assemblies are tailored to specific classes of symmetry and are difficult to modify for novel applications. Here we describe RPXDock, a fast, flexible, and modular software package for sequence-independent rigid-body protein docking across a wide range of symmetric architectures that is easily customizable for further development. RPXDock uses an efficient hierarchical search and a residue-pair transform (RPX) scoring method to rapidly search through multidimensional docking space. We describe the structure of the software, provide practical guidelines for its use, and describe the available functionalities including a variety of score functions and filtering tools that can be used to guide and refine docking results towards desired configurations.
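The hierarchical search mentioned above can be pictured as a coarse-to-fine scan: score a coarse grid, keep the best few cells, split each into finer cells, and repeat. The sketch below shows this on a single rotational degree of freedom with a toy score. It does not use RPXDock's API or its RPX score; the function `hierarchical_search`, its parameters and the cosine score are all hypothetical stand-ins.

```python
import math


def hierarchical_search(score, low, high, coarse_steps=12, levels=4, beam=3):
    """Coarse-to-fine maximisation of `score` over the interval [low, high]."""
    step = (high - low) / coarse_steps
    # Start from the centres of the coarse cells.
    candidates = [low + step * (i + 0.5) for i in range(coarse_steps)]
    for _ in range(levels):
        # Keep only the best-scoring cells at the current resolution.
        best = sorted(candidates, key=score, reverse=True)[:beam]
        # Split every surviving cell into two children of half the width.
        step /= 2.0
        candidates = [c + d for c in best for d in (-step / 2.0, step / 2.0)]
    return max(candidates, key=score)


def toy_score(angle_deg):
    """Toy stand-in for a docking score, peaked at 47 degrees."""
    return math.cos(math.radians(angle_deg - 47.0))


# Search one asymmetric range of a three-fold axis (0 to 120 degrees).
print(hierarchical_search(toy_score, 0.0, 120.0))
```

Only `beam` cells are refined at each level, so the number of score evaluations grows linearly with the number of levels rather than with the size of the finest grid. The trade-off is that a narrow optimum missed at the coarse level is never revisited.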
As a result of evolutionary selection, the subunits of naturally occurring protein assemblies often fit together with substantial shape complementarity to generate architectures optimal for function in a manner not achievable by current design approaches. We describe a "top-down" reinforcement learning-based design approach that solves this problem using Monte Carlo tree search to sample protein conformers in the context of an overall architecture and specified functional constraints. Cryo-electron microscopy structures of the designed disk-shaped nanopores and ultracompact icosahedra are very close to the computational models. The icosahedra enable very-high-density display of immunogens and signaling molecules, which potentiates vaccine response and angiogenesis induction. Our approach enables the top-down design of complex protein nanomaterials with desired system properties and demonstrates the power of reinforcement learning in protein design.
Computationally designed protein nanoparticles have recently emerged as a promising platform for the development of new vaccines and biologics. For many applications, secretion of designed nanoparticles from eukaryotic cells would be advantageous, but in practice, they often secrete poorly. Here we show that designed hydrophobic interfaces that drive nanoparticle assembly are often predicted to form cryptic transmembrane domains, suggesting that interaction with the membrane insertion machinery could limit efficient secretion. We develop a general computational protocol, the Degreaser, to design away cryptic transmembrane domains without sacrificing protein stability. The retroactive application of the Degreaser to previously designed nanoparticle components and nanoparticles considerably improves secretion, and modular integration of the Degreaser into design pipelines results in new nanoparticles that secrete as robustly as naturally occurring protein assemblies. Both the Degreaser protocol and the nanoparticles we describe may be broadly useful in biotechnological applications.
Rigorous undergraduate research experiences are essential for improving student success in graduate education and STEM careers. During the COVID-19 pandemic, undergraduate researchers in our institution lost their work-study positions due to a pause of in-person research activities. Losing their work-study positions posed a great financial burden on the students, and eliminated their opportunity to experience undergraduate research. To address the need for new research opportunities, we created a paid, fully-remote, and cohort-based computational research curriculum. This research curriculum used previously-developed protein design methods as a platform to educate and train undergraduate student researchers. In this program, students learned computational design methods to modulate the stability of previously-characterized, designed protein assemblies. Their results uncovered gaps in the accuracy of current nanomaterial assembly design algorithms. Our program provides a model for other educational organizations with access to basic computing infrastructure to offer structured research training opportunities to a larger number of undergraduate researchers than is typically possible in a wet lab environment.
Natural molecular machines contain protein components that undergo motion relative to each other. Designing such mechanically constrained nanoscale protein architectures with internal degrees of freedom is an outstanding challenge for computational protein design. Here we explore the de novo construction of protein machinery from designed axle and rotor components with internal cyclic or dihedral symmetry. We find that the axle-rotor systems assemble in vitro and in vivo as designed. Using cryo-electron microscopy, we find that these systems populate conformationally variable relative orientations reflecting the symmetry of the coupled components and the computationally designed interface energy landscape. These mechanical systems with internal degrees of freedom are a step toward the design of genetically encodable nanomachines.
2021
Designed proteins assemble antibodies into modular nanocages
Divine R, Dang HV, Ueda G, Fallas JA, Vulovic I, Sheffler W, Saini S, Zhao YT, Raj IX, Morawski PA, Jennewein MF, Homad LJ, Wan YH, Tooley MR, Seeger F, Etemadi A, Fahning ML, Lazarovits J, Roederer A, Walls AC, Stewart L, Mazloomi M, King NP, Campbell DJ, McGuire AT, Stamatatos L, Ruohola-Baker H, Mathieu J, Veesler D, Baker D. (2021) Science. 372(6537) | doi:10.1126/science.abd9994
Multivalent display of receptor-engaging antibodies or ligands can enhance their activity. Instead of achieving multivalency by attachment to preexisting scaffolds, here we unite form and function by the computational design of nanocages in which one structural component is an antibody or Fc-ligand fusion and the second is a designed antibody-binding homo-oligomer that drives nanocage assembly. Structures of eight nanocages determined by electron microscopy spanning dihedral, tetrahedral, octahedral, and icosahedral architectures with 2, 6, 12, and 30 antibodies per nanocage, respectively, closely match the corresponding computational models. Antibody nanocages targeting cell surface receptors enhance signaling compared with free antibodies or Fc-fusions in death receptor 5 (DR5)-mediated apoptosis, angiopoietin-1 receptor (Tie2)-mediated angiogenesis, CD40 activation, and T cell proliferation. Nanocage assembly also increases severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pseudovirus neutralization by α-SARS-CoV-2 monoclonal antibodies and Fc-angiotensin-converting enzyme 2 (ACE2) fusion proteins.
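The antibody counts quoted above (2, 6, 12 and 30) follow from simple point-group arithmetic if each dimeric antibody sits on a two-fold axis: a component placed on an n-fold axis appears (group order / n) times. The sketch below checks this. Two points are assumptions rather than statements from the abstract: that the dihedral architecture is D2, and that the antibodies occupy two-fold positions in every architecture. The names are illustrative.

```python
# Orders of the rotational point groups considered here.
ROTATION_GROUP_ORDER = {"D2": 4, "T": 12, "O": 24, "I": 60}


def copies_on_axis(group: str, fold: int) -> int:
    """Number of copies of a cyclic n-fold component placed on an n-fold axis."""
    order = ROTATION_GROUP_ORDER[group]
    if order % fold != 0:
        raise ValueError(f"{group} has no {fold}-fold axis")
    return order // fold


# Antibodies treated as two-fold (dimeric) components.
for group in ("D2", "T", "O", "I"):
    print(group, copies_on_axis(group, 2))
```

Under these assumptions the four architectures give 2, 6, 12 and 30 antibodies, matching the numbers in the abstract.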
Organizing matter at the atomic scale is a central goal of nanotechnology. Bottom-up approaches, in which molecular building blocks are programmed to assemble via supramolecular interactions, are a proven and versatile route to new and useful nanomaterials. Although a wide variety of molecules have been used as building blocks, proteins have several intrinsic features that present unique opportunities for designing nanomaterials with sophisticated functions. There has been tremendous recent progress in designing proteins to fold and assemble to highly ordered structures. Here we review the leading approaches to the design of closed polyhedral protein assemblies, highlight the importance of considering the assembly process itself, and discuss various applications and future directions for the field. We emphasize throughout the exciting opportunities presented by recent advances as well as challenges that remain.
Recent advances in computational methods have enabled the predictive design of self-assembling protein nanomaterials with atomic-level accuracy. These design strategies focus exclusively on a single target structure, without consideration of the mechanism or dynamics of assembly. However, understanding the assembly process, and in particular its robustness to perturbation, will be critical for translating this class of materials into useful technologies. Here we investigate the assembly of two computationally designed, 120-subunit icosahedral complexes in detail using several complementary biochemical methods. We found that assembly of each material from its two constituent protein building blocks was highly cooperative and yielded exclusively complete, 120-subunit complexes except in one non-stoichiometric regime for one of the materials. Our results suggest that in vitro assembly provides a robust and controllable route for the manufacture of designed protein nanomaterials and confirm that cooperative assembly can be an intrinsic, rather than evolved, feature of hierarchically structured protein complexes.
In recent years, new protein engineering methods have produced more than a dozen symmetric, self-assembling protein cages whose structures have been validated to match their design models with near-atomic accuracy. However, many protein cage designs that are tested in the lab do not form the desired assembly, and improving the success rate of design has been a point of recent emphasis. Here we present two protein structures solved by X-ray crystallography of designed protein oligomers that form two-component cages with tetrahedral symmetry. To improve on the past tendency toward poorly soluble protein, we used a computational protocol that favors the formation of hydrogen-bonding networks over exclusively hydrophobic interactions to stabilize the designed protein-protein interfaces. Preliminary characterization showed highly soluble expression, and solution studies indicated successful cage formation by both designed proteins. For one of the designs, a crystal structure confirmed at high resolution that the intended tetrahedral cage was formed, though several flipped amino acid side chain rotamers resulted in an interface that deviates from the precise hydrogen-bonding pattern that was intended. A structure of the other designed cage showed that, under the conditions where crystals were obtained, a noncage structure was formed wherein a porous 3D protein network in space group I2₁3 is generated by an off-target twofold homomeric interface. These results illustrate some of the ongoing challenges of developing computational methods for polar interface design, and add two potentially valuable new entries to the growing list of engineered protein materials for downstream applications.
Computational protein design provides the tools to expand the diversity of protein complexes beyond those found in nature. Understanding the rules that drive proteins to interact with each other enables the design of protein-protein interactions to generate specific protein assemblies. In this work, we designed protein-protein interfaces between dimers and trimers to generate dodecameric protein assemblies with dihedral point group symmetry. We subsequently analyzed the designed protein complexes by native MS. We show that the use of ion mobility MS in combination with surface-induced dissociation (SID) allows for the rapid determination of the stoichiometry and topology of designed complexes. The information collected along with the speed of data acquisition and processing make SID ion mobility MS well-suited to determine key structural features of designed protein complexes, thereby circumventing the requirement for more time- and sample-consuming structural biology approaches.
Antifreeze proteins (AFPs) are small monomeric proteins that adsorb to the surface of ice to inhibit ice crystal growth and impart freeze resistance to the organisms producing them. Previously, monomeric AFPs have been conjugated to the termini of branched polymers to increase their activity through the simultaneous binding of more than one AFP to ice. Here, we describe a superior approach to increasing AFP activity through oligomerization that eliminates the need for conjugation reactions with varying levels of efficiency. A moderately active AFP from a fish and a hyperactive AFP from an Antarctic bacterium were genetically fused to the C-termini of one component of the 24-subunit protein cage T33-21, resulting in protein nanoparticles that multivalently display exactly 12 AFPs. The resulting nanoparticles exhibited freezing point depression >50-fold greater than that seen with the same concentration of monomeric AFP and a similar increase in the level of ice-recrystallization inhibition. These results support the anchored clathrate mechanism of binding of AFP to ice. The enhanced freezing point depression could be due to the difficulty of overgrowing a larger AFP on the ice surface and the improved ice-recrystallization inhibition to the ability of the nanoparticle to simultaneously bind multiple ice grains. Oligomerization of these proteins using self-assembling protein cages will be useful in a variety of biotechnology and cryobiology applications.
Nature provides many examples of self- and co-assembling protein-based molecular machines, including icosahedral protein cages that serve as scaffolds, enzymes, and compartments for essential biochemical reactions and icosahedral virus capsids, which encapsidate and protect viral genomes and mediate entry into host cells. Inspired by these natural materials, we report the computational design and experimental characterization of co-assembling, two-component, 120-subunit icosahedral protein nanostructures with molecular weights (1.8 to 2.8 megadaltons) and dimensions (24 to 40 nanometers in diameter) comparable to those of small viral capsids. Electron microscopy, small-angle x-ray scattering, and x-ray crystallography show that 10 designs spanning three distinct icosahedral architectures form materials closely matching the design models. In vitro assembly of icosahedral complexes from independently purified components occurs rapidly, at rates comparable to those of viral capsids, and enables controlled packaging of molecular cargo through charge complementarity. The ability to design megadalton-scale materials with atomic-level accuracy and controllable assembly opens the door to a new generation of genetically programmable protein-based molecular machines.
The dodecahedron [corrected] is the largest of the Platonic solids, and icosahedral protein structures are widely used in biological systems for packaging and transport. There has been considerable interest in repurposing such structures for applications ranging from targeted delivery to multivalent immunogen presentation. The ability to design proteins that self-assemble into precisely specified, highly ordered icosahedral structures would open the door to a new generation of protein containers with properties custom-tailored to specific applications. Here we describe the computational design of a 25-nanometre icosahedral nanocage that self-assembles from trimeric protein building blocks. The designed protein was produced in Escherichia coli, and found by electron microscopy to assemble into a homogenous population of icosahedral particles nearly identical to the design model. The particles are stable in 6.7 molar guanidine hydrochloride at up to 80 degrees Celsius, and undergo extremely abrupt, but reversible, disassembly between 2 molar and 2.25 molar guanidinium thiocyanate. The dodecahedron [corrected] is robust to genetic fusions: one or two copies of green fluorescent protein (GFP) can be fused to each of the 60 subunits to create highly fluorescent ‘standard candles’ for use in light microscopy, and a designed protein pentamer can be placed in the centre of each of the 20 pentameric faces to modulate the size of the entrance/exit channels of the cage. Such robust and customizable nanocages should have considerable utility in targeted drug delivery, vaccine design and synthetic biology.
We recently reported the development of a computational method for the design of coassembling multicomponent protein nanomaterials. While four such materials were validated at high-resolution by X-ray crystallography, low yield of soluble protein prevented X-ray structure determination of a fifth designed material, T33-09. Here we report the design and crystal structure of T33-31, a variant of T33-09 with improved soluble yield resulting from redesign efforts focused on mutating solvent-exposed side chains to charged amino acids. The structure is found to match the computational design model with atomic-level accuracy, providing further validation of the design approach and demonstrating a simple and potentially general means of improving the yield of designed protein nanomaterials.
The self-assembly of proteins into highly ordered nanoscale architectures is a hallmark of biological systems. The sophisticated functions of these molecular machines have inspired the development of methods to engineer self-assembling protein nanostructures; however, the design of multi-component protein nanomaterials with high accuracy remains an outstanding challenge. Here we report a computational method for designing protein nanomaterials in which multiple copies of two distinct subunits co-assemble into a specific architecture. We use the method to design five 24-subunit cage-like protein nanomaterials in two distinct symmetric architectures and experimentally demonstrate that their structures are in close agreement with the computational design models. The accuracy of the method and the number and variety of two-component materials that it makes accessible suggest a route to the construction of functional protein nanomaterials tailored to specific applications.
Molecular self-assembly offers a means by which sophisticated materials can be constructed with unparalleled precision. Designing self-assembling protein structures is of particular interest as a result of the unique functional capabilities of proteins. Custom-designed protein materials could lead to new possibilities in therapeutics, bioenergy, and materials science. Although the field was long hampered by the challenges involved in designing such complex molecules, novel approaches and computational tools have recently led to remarkable progress. Here we review recent design studies in the context of three fundamental aspects of self-assembling materials: subunit organization, subunit interactions, and regulation of assembly.
We describe a general computational method for designing proteins that self-assemble to a desired symmetric architecture. Protein building blocks are docked together symmetrically to identify complementary packing arrangements, and low-energy protein-protein interfaces are then designed between the building blocks in order to drive self-assembly. We used trimeric protein building blocks to design a 24-subunit, 13-nm diameter complex with octahedral symmetry and a 12-subunit, 11-nm diameter complex with tetrahedral symmetry. The designed proteins assembled to the desired oligomeric states in solution, and the crystal structures of the complexes revealed that the resulting materials closely match the design models. The method can be used to design a wide variety of self-assembling protein nanomaterials.
In nature, many proteins have evolved to have self-complementary shapes. This drives them to assemble into supramolecular structures, sometimes of great complexity, and often carrying out sophisticated cellular functions. Designing novel proteins that can self-assemble into similarly complex structures is a longstanding goal in bioengineering. New ideas, combined with continually improving computer algorithms, are making it possible to advance on that goal, bringing wide-ranging applications in synthetic biology within reach. Prospective applications range from vaccine design to molecular delivery to bioactive materials. Recent strategies and examples of successfully designed protein cages, layers, and crystals are reviewed.