Chemist uses mathematics to do the near-impossible

January 6, 1995

By Lynn Yarris

Not so very long ago, the idea that one could calculate even simple molecular structures from mathematical equations was treated with hearty skepticism. Now, however, Teresa Head-Gordon, a physical chemist in the Life Sciences Division, is using mathematical algorithms to solve one of the most complex structural problems in molecular biology: the folding of proteins.

Proteins are large molecules made from the linking together of specific sequences of amino acids. A key to the multitude of chemical tasks performed by proteins is that their chains of amino acids can bend and twist, enabling a protein molecule to fold over and around itself. Understanding the mysteries of protein folding may make it possible for scientists to repair defective proteins that cause disease and other health problems, and to design new and improved synthetic proteins for biotechnical applications.

Such understanding will also enable scientists to more quickly apply the knowledge gained from the Human Genome Project.

"Biologists want to know the structure of a protein just from an amino acid sequence," says Head-Gordon. "However, it is not certain you can make algorithms that predict protein structures from primary sequences without a better understanding of the forces that drive folding."

Head-Gordon is taking a holistic approach to unraveling the mysteries of protein folding. She is first designing specific algorithms that address three basic issues: folding kinetics; folded structure and thermodynamics; and the accuracy of computational predictions. These specific algorithms are expected to provide the basic knowledge needed to develop comprehensive structure-predicting algorithms.

One of Head-Gordon's first successes was an algorithm she calls "antlion," after the insect that digs a hole and waits for its prey to fall in. From a given amino acid sequence, the antlion algorithm can predict highly accurate structures for small proteins like melittin, a component of bee venom that is composed of 26 amino acids. Predictions are determined by mathematically "trapping" a protein into its lowest, and hence most structurally stable, potential energy state.
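The idea of "trapping" a structure in its lowest energy state can be pictured with a toy calculation. The sketch below is not Head-Gordon's antlion code: the one-dimensional double-well energy, the harmonic penalty, and every name and number in it are assumptions chosen only for illustration. It shows a plain downhill search getting stuck in a shallow well, while a temporarily reshaped energy surface funnels the same starting point into the deepest one.

```python
# Toy illustration only: a 1-D "energy landscape" with two wells.
# The deeper well (near x = -1) stands in for the most stable structure.
# All functions and constants here are hypothetical, not from the article.

def energy(x):
    """Unmodified toy energy: double well, tilted so the left well is lower."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def gradient(x, penalty=0.0, target=-1.0):
    """Slope of energy(x) plus an optional penalty * (x - target)**2 term."""
    return 4.0 * x * (x * x - 1.0) + 0.3 + 2.0 * penalty * (x - target)


def descend(x, penalty=0.0, steps=2000, rate=0.01):
    """Plain gradient descent: repeatedly step downhill from x."""
    for _ in range(steps):
        x -= rate * gradient(x, penalty)
    return x


start = 1.5  # begin on the "wrong" side of the energy barrier

# 1) Ordinary minimization settles into the nearby, shallower well.
naive = descend(start)

# 2) With the penalty switched on, the reshaped surface has a single well,
#    so the search is funnelled toward the target region.
funnelled = descend(start, penalty=2.0)

# 3) Removing the penalty and relaxing again lands in the deepest well
#    of the original, unmodified energy.
trapped = descend(funnelled)

print("naive:   x = %.3f, energy = %.3f" % (naive, energy(naive)))
print("trapped: x = %.3f, energy = %.3f" % (trapped, energy(trapped)))
```

In this toy case the penalty strength was picked so that the reshaped surface has only one minimum. For a real protein, with thousands of coordinates, deciding how to reshape the surface is the hard part of the problem.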

Although the antlion method provides a framework for predicting the structures of large proteins, Head-Gordon says it also has limitations, such as its failure to explain what forces cause a protein to fold in the first place.

"There has been much debate over whether the early stages of folding are controlled by environmental influences, such as water, or by the specific sequence of a protein's amino acids," she says.

Her latest algorithm, which models proteins in pure water, may shed new light on the debate. Proteins are formed in an aqueous solution and their hydrophobic (water-repelling) nature is thought to be responsible for a collapse to a compact structure. This collapse is believed to trigger the folding process.

Using her "water" model, Head-Gordon has been able to accurately characterize hydrophobic interactions for methane, a molecule in which a single carbon atom is bonded to four hydrogen atoms. Methane is analogous to the side chain of alanine, one of the smallest amino acids.

"This water model can be used to fully characterize the thermodynamics of protein folding," she says. "Future work with it will involve characterizing hydration for other amino acid sidechains and backbones to ultimately piece together a simple but physically realistic picture of early folding events."

Her work with the water model to date implies that folding starts as a consequence of structural collapse, which would mean that the environment, rather than the amino acid sequence, drives the process. However, Head-Gordon says further studies with solutions more realistic than pure water are needed.

The key to developing comprehensive predictive models of protein structures, she says, will be the continued development of reliable "neural networks" that associate patterns of amino acids with known protein structures. Out of a suspected 40,000 or more proteins, only about 400, roughly 1 percent, have had their structures determined.
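To give a sense of what "patterns of amino acids" might look like to a neural network, here is a minimal sketch of one common way to turn a sequence into numbers: a sliding window of residues, each encoded as a row of zeros with a single one. The article does not say how Head-Gordon's networks are fed, so the window scheme, the function names, and the short example sequence are all assumptions made for illustration.

```python
# Hypothetical input encoding for a sequence-to-structure neural network.
# Nothing here is taken from Head-Gordon's work; it only illustrates the idea
# of presenting local amino acid patterns to a network as numbers.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard amino acids


def one_hot(residue):
    """Return 20 numbers: a 1 at the residue's slot, 0 elsewhere (all 0 for padding)."""
    vec = [0] * len(AMINO_ACIDS)
    if residue in AMINO_ACIDS:
        vec[AMINO_ACIDS.index(residue)] = 1
    return vec


def windows(sequence, width=5):
    """Yield (center residue, feature list) for every position in the sequence.

    Each feature list concatenates the one-hot codes of the residues in a
    window centered on that position; sequence ends are padded with '-'.
    """
    half = width // 2
    padded = "-" * half + sequence + "-" * half
    for i in range(len(sequence)):
        window = padded[i:i + width]
        features = [bit for res in window for bit in one_hot(res)]
        yield sequence[i], features


# Arbitrary short example sequence, chosen only to exercise the code.
for center, features in windows("GIGAV", width=3):
    print(center, len(features), sum(features))
```

A network trained on such windows from proteins of known structure would then be asked to label the center residue of windows taken from proteins whose structures are unknown.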

"We have already achieved a significant improvement in the accuracy of our computational predictions by paying more attention to neural network designs," she says.

Expanding the number of known protein structures and their associated neural networks could help scientists find out how protein folding takes place.

"Predicting the structures of proteins from computational algorithms is certainly possible," she says. "The ultimate test will be to see how our models compare to experimental results."