The genomes of 47 people reveal the most complete human ‘instruction code’ to date

by time news

2023-05-10 17:00:05

On April 6, 2000, the draft of the first human genome was announced: the ‘recipe’ that contains all the information a person needs to develop and grow. In February 2001, the prestigious journals ‘Nature’ and ‘Science’ published the sequencing results. The work was not finished, however: there were ‘gaps’ (the first draft covered only 92% of the genome), and it was not until 2021 that the first complete sequence was generated, the entire recipe for the functioning of a human being.

But, as with any good recipe, everyone has their own version. Although we all belong to the human species, we are a compendium of genetic diversity. That diversity means, for example, that the Inuit tolerate the cold better; that members of the Bajau Laut people of Southeast Asia, also called ‘sea nomads’, can hold their breath for up to 13 minutes; and that Europeans and Asians carry 2% Neanderthal DNA, which influenced how they fared during the Covid-19 pandemic. You don’t have to travel far to find differences: on average, 0.4% of your genome separates you from other people, including your closest relatives.

All those variations escaped the Human Genome Project, even when it was completed two years ago. The problem was that its data mainly reflected the information of a single person (although some of the ‘gaps’ were filled in with DNA from twenty more). Now, an international team of dozens of scientists from different centers (including the Barcelona Supercomputing Center) has produced the first draft of the pangenome, a tool that brings together nearly complete genomic data from 47 people whose ancestry traces back to different populations around the world. The results and their first applications have just been published in four studies in the journal ‘Nature’.

“Everyone has a unique genome, so using a single reference genomic sequence for every person can lead to bias,” explains Adam Phillippy, Principal Investigator in the Statistical and Computational Genomics Branch within the Intramural Research Program of the NHGRI and co-author of the main study. “For example, predicting a genetic disease might not work as well for someone whose genome is less similar to the reference.”

In this regard, specific cases are already known. The risk of angioedema (swelling of the deep layers of the skin) caused by angiotensin-converting enzyme inhibitors, which are used in drugs to treat high blood pressure, is three times higher in black patients than in patients of other races. Coughing, another reaction to cardiovascular medications, is nearly three times more common in Asian patients than in Caucasians. And more differences of this kind are discovered every day.

“In essence, we are 99.9% similar when we compare any pair of human beings. But if we have 3 billion pairs of letters in our genome (3 billion from our mother and another 3 billion from our father), this represents at least between 3 and 6 million letters that differ between any pair of human beings,” Lluis Montoliu, researcher and deputy director of the National Center for Biotechnology (CNB-CSIC) and the Center for Biomedical Network Research on Rare Diseases (CIBERER-ISCIII), explains to ABC.

“Among these differences will be the one responsible for the congenital disease that we suffer from.” However, finding it requires knowing how to compare against sequences not associated with disease while taking all that individual variability into account. “Until now, we did not usually take this into account systematically in genetic analyses. That is the main novelty of the pangenome,” Montoliu points out.
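
Montoliu’s range can be checked with simple arithmetic. The short Python sketch below assumes a flat 0.1% difference rate (the complement of the 99.9% similarity he cites), applied first to one parental copy of the genome (3 billion letters) and then to both copies (6 billion). The function and variable names are illustrative and do not come from the studies.

```python
# Back-of-the-envelope check of the "3 to 6 million letters" figure.
# Assumption: a flat 0.1% difference rate, i.e. 100% minus the 99.9% quoted.

HAPLOID_LETTERS = 3_000_000_000         # letters inherited from one parent
DIPLOID_LETTERS = 2 * HAPLOID_LETTERS   # both parental copies together
DIFFERENCE_PER_THOUSAND = 1             # 0.1% expressed as 1 in 1,000


def differing_letters(genome_length: int,
                      per_thousand: int = DIFFERENCE_PER_THOUSAND) -> int:
    """Return how many letters differ at the given rate (integer arithmetic)."""
    return genome_length * per_thousand // 1000


low = differing_letters(HAPLOID_LETTERS)
high = differing_letters(DIPLOID_LETTERS)
print(f"{low:,} to {high:,} differing letters")
# prints: 3,000,000 to 6,000,000 differing letters
```

The lower bound counts differences in a single copy of the genome; the upper bound counts them across both the maternal and paternal copies.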

In search of DNA

The samples for these first 47 genomes (a number the team intends to expand to 350 by 2024) were taken from people who participated in the 1000 Genomes Project, an international effort completed in 2015 that sought to capture the greatest possible genetic diversity using the DNA of more than a thousand people of diverse ancestry (although the most represented here were African and Asian). This genetic database is openly available to any researcher.

This initiative has led to several discoveries. For example, each healthy individual was found to carry in their genome, on average, around 150 anomalies that cause premature termination of proteins (which can lead to diseases such as Alzheimer’s, Parkinson’s or Huntington’s) and another 30 implicated in rare diseases. It also confirmed that all living humans share a single origin in a common ancestor between 150,000 and 200,000 years ago.

However, these results came from analyzing only certain regions of the genetic material. The pangenome will now allow researchers to ask broader questions, or even to discover things they did not know were there.

A ‘fold-out’ genome

This new tool works something like a ‘fold-out book’: the bases of the genomes are compared with one another at their exact positions, and users can zoom in on specific areas to see where the differences lie.

Top bar: how the human genome has been represented until now. Middle bar: how it looks overlaid with the 47 genomes. Bottom: a section of the middle bar unfolded and enlarged, showing the different genomic variations.

National Human Genome Research Institute

These differences can be small, involving only one or a few ‘letters’ (bases) of DNA, or they can be larger pieces, called structural variants, spanning 50 base pairs or more. Scientists know that the latter can have important health implications. However, until now, it has not been possible to identify more than 70% of them because only one reference genome existed.
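
The size cut-off described above can be written as a simple rule. This Python sketch is only an illustration: the 50-base-pair threshold comes from the text, while the function name, the category labels and the idea of describing a variant by a single length are simplifying assumptions.

```python
# Illustrative only: sort a variant into "small" or "structural" by its length.
# The 50 bp threshold is from the article; names and labels are hypothetical.

STRUCTURAL_VARIANT_MIN_BP = 50


def classify_variant(length_bp: int) -> str:
    """Classify a variant by the number of base pairs it spans."""
    if length_bp < 1:
        raise ValueError("a variant must span at least one base pair")
    if length_bp >= STRUCTURAL_VARIANT_MIN_BP:
        return "structural variant"
    return "small variant"


for length in (1, 12, 50, 4_000):
    print(length, "->", classify_variant(length))
```

Under this rule a single-letter change and a 12-base change both count as small variants, while anything spanning 50 bases or more counts as structural.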

The pangenome has added 119 million new bases, of which approximately 90 million derive from structural variation. These variable pieces have proved to be very complex and of different types: sequence inversions, insertions, deletions and tandem repeats (a segment of two or more bases repeated numerous times). The new bases will help researchers study regions of the genome for which there was previously no reference and could, in future studies, make it possible to associate structural variants with diseases.

“Now, we can map more structural variants, so we’re finding features and areas in the genome that just weren’t there before,” explains Karen Miga, of the University of California, Santa Cruz, one of the main drivers first of completing the human genome in 2021 and now of the first draft of the pangenome. “It’s exciting because it allows us to look at gene regulation in a unique way that we couldn’t study before, because those areas would likely have been inappropriately mapped or simply ignored entirely.”

Looking for more genetic variants

Even so, the authors note that there is still work to be done. “Since the year 2000, we’ve had a series of increasingly accurate representations of a genome,” explains David Haussler, scientific director of the UCSC Institute for Genomics, who led the UCSC team on the original Human Genome Project and is an adviser to the pangenome project.

The next steps will involve expanding the sample beyond the 1000 Genomes Project by introducing DNA from more isolated ethnic groups. Even then, it will not be enough to represent all the genetic diversity of our species. “No matter how accurately you represent a genome, it will not represent all of humanity. With this we are at a tipping point: it is no longer the genomics of a single standard human genome, but genomics for all,” Haussler says.

Even so, his colleague Karen Miga is confident: “We are, without a doubt, facing a powerful new era of medicine.”
