Scientists have adapted a language model to design enzymes that are just as active as those found in nature.
Need an enzyme to produce a new kind of drink, paint or medicine? The new AI program ProGen can help. Scientists from the University of California, San Francisco, and other institutions believe that this software will revolutionize the field of enzyme design. They published their invention in Nature Biotechnology.
Nobel Prize in Chemistry
Enzymes speed up chemical reactions and are vital in your body. But this type of protein is also widely used in industry. No wonder the 2018 Nobel Prize in Chemistry went to research into the evolution of proteins. One established method for making new enzymes is directed evolution.
In directed evolution, the scientist first deliberately introduces random mutations into the gene of a natural enzyme. This creates many mutated variants of the original protein, each with slightly different properties. If you are looking for a heat-resistant enzyme, for example, you then raise the temperature, and only the mutants with that new property remain active. Through this imitation of natural selection you end up with an improved enzyme that suits your specific application.
But the team, led by James Fraser, now believes that with ProGen they have created a computer model that designs enzymes much better than directed evolution can. How? By building on a language model.
Such a language program (of the kind Google Translate also uses) writes text based on predictions. For example, it predicts which word is likely to come next in a sentence. The model has learned this from the many texts it has read and remembered.
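The mutate-and-select loop described above can be sketched in a few lines of Python. This is a toy illustration only: the sequences are made up, and `toy_fitness` is a hypothetical stand-in for a real lab measurement such as activity at high temperature.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the twenty standard amino acids

def mutate(sequence, rng, rate=0.05):
    """Return a copy of the sequence with random point mutations."""
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < rate else residue
        for residue in sequence
    )

def toy_fitness(sequence, target):
    """Hypothetical assay: fraction of positions that match a made-up target."""
    return sum(a == b for a, b in zip(sequence, target)) / len(target)

def directed_evolution(start, target, rounds=30, library_size=200, seed=0):
    rng = random.Random(seed)
    best = start
    for _ in range(rounds):
        # Step 1: build a library of random mutants of the current best variant.
        library = [mutate(best, rng) for _ in range(library_size)]
        # Step 2: selection. Keep the highest-scoring variant (parent included).
        best = max(library + [best], key=lambda s: toy_fitness(s, target))
    return best

start = "MKALIVLGLVLLSVTVQG"   # made-up starting sequence
target = "MKAWIFLGLVHLSVTCQG"  # made-up "ideal" sequence, unknown in a real experiment
evolved = directed_evolution(start, target)
print(toy_fitness(start, target), toy_fitness(evolved, target))
```

Because the parent competes with its own mutants in every round, the score can never go down. In a real experiment the selection step is a laboratory screen, not a comparison with a known target.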
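The idea of predicting the next word from previously read text can be shown with a very small counting model. This sketch is not how ProGen or Google Translate work internally (those rely on large neural networks); it only illustrates the principle, and the example sentence is made up.

```python
from collections import Counter, defaultdict

def train_bigram_model(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        followers[current][following] += 1
    return followers

def predict_next(model, word):
    """Return the most frequently seen follower of `word`, or None if unseen."""
    counts = model.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

corpus = "the enzyme cuts the cell wall and the enzyme cuts the sugar chain"
model = train_bigram_model(corpus)
print(predict_next(model, "enzyme"))  # prints "cuts"
```

Replace the words with the twenty amino acid letters and the same principle yields a model that proposes the next building block of a protein.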
Amino acid sequences
Instead of words, ProGen uses amino acids, the building blocks of proteins and enzymes. Fraser and colleagues first trained the AI program on the amino acid sequences of 280 million different existing proteins. This is how the program learned which sequences are logical and effective.
The system took a few weeks to process this information. The researchers then fine-tuned the model on 56,000 sequences from a particular group of enzymes: lysozymes. These are found in tears, saliva and milk, among other places, where they kill bacteria by destroying their cell walls.
Breaking down the cell wall
Finally, the model was put to the test. It generated a million sequences of artificial enzymes that were supposed to work. The research team chose five of them to synthesize and test for activity in the lab. Two of the five artificial lysozymes did indeed break down the cell walls of bacteria.
That is remarkable, because only 18 percent of the sequence of the new enzymes matched that of natural lysozymes. The AI was also able to predict the correct shape of the proteins based purely on the raw sequence data. That is notable, because long amino acid chains can fold in all kinds of ways and thus take on many different shapes.
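A figure such as "18 percent" expresses sequence identity: the share of positions at which two sequences carry the same amino acid. A minimal sketch, assuming the two sequences have already been aligned to equal length without gaps (real comparisons first align the sequences with dedicated software); the example sequences are made up:

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two aligned sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)

# Two made-up ten-residue fragments that differ at two positions.
print(percent_identity("MKALIVLGLV", "MKAWIFLGLV"))  # prints 80.0
```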
According to Fraser and colleagues, the possibilities with ProGen are endless. Just imagine: if you build a protein of 300 linked amino acids (the length of a lysozyme) from the twenty available amino acids, you can make exactly 20^300 (20 to the power of 300) different protein sequences. That is more than the number of atoms in the entire observable universe!
Enzyme designers will no doubt rub their hands at this thought, because it means a suitable enzyme could be found for every possible application, from food to medicine.
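The size of that number, 20 to the power of 300, is easy to check. The sketch below compares it with 10^80, a commonly quoted rough estimate of the number of atoms in the observable universe; that estimate is an assumption brought in here and does not come from the study.

```python
AMINO_ACIDS = 20             # number of standard amino acids
LENGTH = 300                 # chain length used in the example above
ATOMS_IN_UNIVERSE = 10**80   # rough, commonly quoted estimate (assumption)

possible_sequences = AMINO_ACIDS ** LENGTH
print(len(str(possible_sequences)))            # prints 391 (number of digits)
print(possible_sequences > ATOMS_IN_UNIVERSE)  # prints True
```

The number of possible sequences therefore has 391 digits, against 81 digits for the estimated number of atoms.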
Breakthroughs in the future
“This is clearly the decade of AI,” says chemical biotechnologist Dick Janssen of the University of Groningen. “We already saw this in predicting protein structures, which has been drastically changed by AlphaFold. Expanding into enzyme design was an expected next step.”
“Incidentally, this work did not produce a completely new enzyme (with a completely new function, ed.), but ‘only’ a lysozyme with a new amino acid sequence,” he continues. “But I expect that AI technology will be developed further in the future, and breakthroughs can certainly follow.”
Biochemist Ron Wever of the University of Amsterdam is also impressed by the study. “This is a great development based on AI and could be important for applications of enzymes in the biotechnology and pharmaceutical industries. This work also shows that as long as the active site of the enzyme (important for its functioning, ed.) is more or less preserved, the rest of the amino acid sequence can vary greatly in composition.”
Sources: Nature Biotechnology, University of California, San Francisco, via EurekAlert!