Advanced software tool discovers new cancer-causing genes


An advanced software tool for analyzing DNA sequences from tumor samples has discovered new genes that may cause cancer, in a study led by researchers at Weill Cornell Medicine.

In the study, published Sept. 26 in Nature Communications, the researchers designed the software, known as CSVDriver, to map and analyze the locations of large mutations, called structural variants (SVs), in datasets from tumor DNA. They then applied the tool to a dataset of 2,382 genomes from 32 different cancer types, analyzing cancer genomes from different organ systems separately. The results confirmed the likely roles of 47 genes in driving cancer, tentatively linked several of them to certain types of cancer for the first time, and pointed to another 26 genes as likely drivers of cancer, even though they had never been linked to cancer before.

“Our results show that CSVDriver could be broadly useful to the cancer research community, providing new insights into cancer development as well as potential new targets,” said the study’s senior author, Dr. Ekta Khurana, associate professor of physiology and biophysics and co-director of the cancer genetics and epigenetics program at the Meyer Cancer Center at Weill Cornell Medicine.

The study’s first author was Dr. Alexander Martinez-Fundichely, an instructor in physiology and biophysics at Weill Cornell Medicine and a member of the Khurana lab.

Cancers usually arise, and progress to greater malignancy, when DNA mutations occur in a single cell and effectively remove or reverse the usual brakes on cell division. Over the past few decades, cancer biologists have documented hundreds of these cancer-causing mutations, and many of them are now the targets of drug treatments. Yet the discovery of carcinogenic mutations is far from complete.

The vast majority of mutations in cancer cells are not driver mutations. They are so-called passenger or background mutations that do not promote tumor growth or survival. These passenger mutations are spread across the genome, and it can be difficult to pick out the driver mutations amid all this “background noise.” Researchers have made great strides in distinguishing drivers from passengers in the simplest class of DNA mutations, point mutations, also known as single nucleotide variants. But they’ve made less progress with SVs, which are larger and more complex mutations, including deletions and extra copies of sometimes long stretches of DNA.

In the new study, the researchers developed CSVDriver to analyze SV datasets in cancer genomes to uncover likely drivers of cancer.

“The general idea here was to model the distribution of background mutations that we would expect for a given cancer type, and then identify, as candidate driver locations, regions where mutations occur more often than expected in a large proportion of patients.”

Dr. Alexander Martinez-Fundichely, Instructor in Physiology and Biophysics, Weill Cornell Medicine

CSVDriver represents an advance over previous efforts in this area because it models the expected SV background in a way that accounts for tissue-specific factors that can influence this background, such as the three-dimensional folding of DNA.
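The general idea described above, comparing observed mutation counts against an expected background and keeping regions that are recurrent across patients, can be illustrated with a toy sketch. This is not the CSVDriver algorithm: the Poisson background model, the fixed genomic bins, the thresholds, and every function and parameter name below are assumptions chosen for illustration, and a real analysis would also derive the expected counts from tissue-specific data and correct for multiple testing.

```python
import math


def poisson_sf(k, lam):
    """Upper-tail probability P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    cdf = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)


def candidate_bins(breakpoints, expected, n_patients,
                   alpha=0.01, min_recurrence=0.05):
    """Flag genomic bins with more SV breakpoints than the background predicts.

    breakpoints    -- iterable of (patient_id, bin_index) pairs (hypothetical input format)
    expected       -- dict mapping bin_index to the expected background count
    n_patients     -- number of patients in the cohort
    alpha          -- p-value cutoff (illustrative, no multiple-testing correction)
    min_recurrence -- minimum fraction of patients with a breakpoint in the bin
    """
    counts = {}
    patients = {}
    for patient_id, bin_index in breakpoints:
        counts[bin_index] = counts.get(bin_index, 0) + 1
        patients.setdefault(bin_index, set()).add(patient_id)

    hits = []
    for bin_index, observed in sorted(counts.items()):
        p_value = poisson_sf(observed, expected[bin_index])
        recurrence = len(patients[bin_index]) / n_patients
        # Keep bins that are both enriched and hit in many patients.
        if p_value < alpha and recurrence >= min_recurrence:
            hits.append((bin_index, observed, p_value, recurrence))
    return hits
```

Requiring both a small p-value and a minimum recurrence reflects the two conditions in the quote: a bin with many breakpoints that all come from one heavily rearranged tumor is enriched but not recurrent, so it is not reported.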

In total, the analysis of the large SV dataset identified, as suspected cancer drivers, 53 protein-coding genes, three DNA segments that code for regulatory RNAs, and 24 sites known as “enhancers” because they attract transcription factor proteins that can stimulate the activity of other genes. Many of these suspects were already known to be cancer drivers from previous research, so to that extent the results validated the algorithm.

However, CSVDriver also demonstrated its value as a discovery tool by identifying certain known cancer-related genes as probable drivers of cancers to which they had not previously been linked, for example the gene DMD in esophageal cancer and NF1 in ovarian cancer. The results also highlighted, as likely cancer drivers, 26 genes that had not previously been linked to cancer at all.

“These are findings that can be followed up with further studies in the wet lab and animal models to explore the impacts of mutations in these genes, and which in turn could lead to the development of new cancer treatments targeting these mutations,” said Dr. Khurana, who is also a WorldQuant Foundation Research Fellow at Weill Cornell Medicine.

Most of the genomes analyzed in the study came from primary cancers, but Drs. Khurana and Martinez-Fundichely and their colleagues now plan to use CSVDriver to uncover the drivers of advanced metastatic cancers, which bring the worst prognoses and have few effective treatments.


Journal reference:

Martinez-Fundichely, A., et al. (2022) Modeling tissue-specific breakpoint proximity of structural variations from whole-genomes to identify cancer drivers. Nature Communications.

