HN Debrief

The Unreasonable Redundancy of Nature's Protein Folds

The post looks at protein folds, the recurring three-dimensional shapes that protein chains collapse into, and argues that natural biology is much more repetitive than the raw combinatorics of amino-acid sequences would suggest. Different proteins with different sequences and even different functions often converge on the same handful of structural motifs. That is not the same claim as “sequence does not matter.” Sequence still determines structure and function. The sharper point is that many sequences map onto the same stable fold, and evolution appears to keep reusing a limited library of folds rather than spreading evenly through all possible protein architectures.
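
For a sense of scale behind "raw combinatorics," here is a minimal sketch. It assumes only the 20 canonical amino acids and counts every possible chain, whether or not it would fold. The function names are illustrative and do not come from the post or the thread.

```python
import math

AMINO_ACIDS = 20  # assumption: canonical amino-acid alphabet only


def sequence_space_size(length: int) -> int:
    """Count the distinct chains of a given length over the 20-letter alphabet."""
    return AMINO_ACIDS ** length


def log10_sequence_space(length: int) -> float:
    """Order of magnitude (base 10) of the sequence count for a given length."""
    return length * math.log10(AMINO_ACIDS)


if __name__ == "__main__":
    # Chain lengths chosen for illustration only.
    for length in (10, 50, 100, 300):
        exponent = log10_sequence_space(length)
        print(f"length {length:>3}: roughly 10^{exponent:.0f} possible sequences")
```

The count grows by a factor of 20 with each added residue, which is why any limited library of recurring folds implies a heavily many-to-one mapping from sequence to structure.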

If protein fold space is far more constrained by evolutionary path dependence than by physics, synthetic biology and protein design may have much more room to create useful structures than biology ever found on its own.

Discussion mood

Interested but not astonished. Most people saw the core observation as standard protein biochemistry, then got engaged by the deeper implication that evolution may be heavily reusing historically accessible scaffolds rather than exhausting what physics allows.

Key insights

  1. The key unresolved issue is not whether proteins tolerate lots of mutations.
    It is whether the small number of natural folds reflects a true physical limit or just the tiny region evolution had time and machinery to search. The comments sharpened that into a sampling problem versus an intrinsic-property problem. They also pointed out that huge sequence diversity collapsing into the same fold cuts both ways. It shows why nature looks redundant, but it also leaves open the possibility that many viable folds remain undiscovered because evolution reaches new functions mostly by modifying old scaffolds.

    Natural fold counts do not tell you whether protein design space is small. They may mostly tell you how path dependent evolution is.
      Attribution:
    • DrScientist #1 #2
    • Windchaser #1
  2. Protein reuse is not a vague metaphor.
    It is already embodied in specific fold families like the Rossmann fold and the TIM barrel, which support many different biochemical jobs. The comments also clarified why this is possible. Enzymes usually rely on a small set of residues in the active site, while large parts of the surrounding scaffold mainly preserve geometry and stability. That makes folds portable across functions, and sequences far more interchangeable than non-specialists expect.

    A protein fold is often a reusable chassis, not a one-function artifact. Function rides on a few critical residues embedded in a tolerant scaffold.
      Attribution:
    • resiros #1
    • flobosg #1 #2
    • jyounker #1
  3. AlphaFold-style models are a poor instrument for answering “what proteins could exist.”
    They are grounded in natural sequence and structure data, so they excel at interpolation inside biology’s existing catalog but say much less about folds based on non-canonical amino acids, altered chemistries, or structures that natural evolution never sampled. The comments tied that to real drug design work, where modified peptides and proteins still rely heavily on direct experimental structure determination.

    Prediction on natural proteins is not exploration of full protein possibility space. If you want novel chemistries and folds, training on biology’s historical record is a built-in ceiling.
      Attribution:
    • photochemsyn #1
    • flobosg #1
  4. Accessibility in protein space is not just about energy minima.
    It is also about what a cell can afford to build and fold. Chaperones, cofactors, ribosomal folding constraints, and the size of the basin of attraction in sequence space all determine whether evolution can ever find a fold, even if the fold is perfectly stable in principle. That reframes “possible” into at least three different notions: physically stable, biologically supportable, and evolutionarily reachable.

    A viable fold can still be invisible to evolution. Biology only discovers structures that are stable and reachable with the machinery it already has.
      Attribution:
    • hirenj #1
    • gilleain #1
    • pfdietz #1

Against the grain

  1. The article’s main observation was presented as fresher than it is.
    Basic biochemistry has long taught that protein structure is more conserved than sequence, that only a few residues are often essential for catalysis, and that homologous proteins can diverge dramatically in sequence while keeping the same fold. From that perspective, the redundancy of natural folds is not a new revelation but a restatement of standard protein science.

    For domain experts, the novelty is not that folds are reused. The novelty only starts if the post says something deeper about why.
      Attribution:
    • jyounker #1
  2. The scarcity of natural folds may be mostly an artifact of evolution reworking existing proteins, not a deep constraint of folding physics.
    One commenter pushed the stronger claim that first-principles protein design would likely not show the same limitation, because nature explores sequence space conservatively and locally. That is a more optimistic view of synthetic protein design than the dominant bounded-space framing.

    Do not confuse nature’s search strategy with nature’s limits. Engineering may be able to escape the reuse bias entirely.
      Attribution:
    • spwa4 #1 #2

Reference links

Evolution and protein design framing

  • François Jacob, Evolution and Tinkering
    Quoted to frame evolution as a tinkerer that reuses available parts instead of engineering from scratch.
  • Vault organelle
    Mentioned as an example of unusual cellular machinery that hints other folding environments might unlock new structures.
