New open-source software tool could speed up genetic discoveries


Commercially viable biofuel crops are essential for reducing greenhouse gas emissions, and a new tool developed by the Center for Advanced Bioenergy and Bioproducts Innovation (CABBI) should accelerate their development, as well as advances in gene editing in general.

Crop genomes are shaped through generations of selection to optimize specific traits, and until recently breeders were limited to selecting on natural diversity. CRISPR/Cas9 gene-editing technology may change that, but the software tools needed to design and evaluate CRISPR experiments have so far been based on the editing needs of mammalian genomes, which do not share the same characteristics as the genomes of complex crops.

Enter CROPSR, the first open-source software tool for genome-wide design and evaluation of guide RNA (gRNA) sequences for CRISPR experiments, created by scientists at CABBI, a bioenergy research center (BRC) funded by the Department of Energy. The tool is described in a study published in BMC Bioinformatics.

“CROPSR provides the scientific community with new methods and a new workflow for performing CRISPR/Cas9 knockout experiments,” said CROPSR developer Hans Müller Paul, a molecular biologist and Ph.D. student working with co-author Matthew Hudson, professor of crop science at the University of Illinois at Urbana-Champaign. “We hope the new software will speed up discovery and reduce the number of failed experiments.”

To better meet the needs of crop geneticists, the team built software that lifts the restrictions imposed by other packages on the design and evaluation of gRNA sequences, the guides used to locate targeted genetic material. Team members also developed a new machine learning model that does not avoid guides for the repetitive genomic regions often found in plants, a problem with existing tools. The CROPSR scoring model provided much more accurate predictions, even in non-crop genomes, the authors said.

“The objective was to integrate functionalities to make life easier for the scientist.”

Hans Müller Paul, CROPSR Developer

Many crops, especially bioenergy feedstocks, have very complex polyploid genomes, with multiple sets of chromosomes. Some gene-editing software tools based on diploid genomes (like those of humans) struggle with the peculiarities of crop genomes.

“It can sometimes take weeks or months to realize that you don’t have the result you expected,” Müller Paul said.

For example, a trait may be regulated by a collection of genes, particularly a trait involved in plant stress responses, where backup systems are useful. A scientist might design an experiment to knock out one gene without realizing that another gene performs the same function. The problem may not be discovered until the plant matures with the trait unchanged. This is a particular problem with crops that require specific weather conditions to grow, where missing a season can mean a year’s delay.

Using a genome-wide approach allowed scientists to adapt CROPSR for plant use by removing the built-in biases found in existing software tools. Because they are based on human or mouse genomes, where multiple copies of genes are less common, these tools penalize gRNA sequences that match the genome in more than one position, to avoid causing mutations in unintended places. But with crops, the goal is often to mutate more than one position to eliminate all copies of a gene. Previously, scientists sometimes had to design four or five mutation experiments to knock out each gene copy individually, which required extra time and effort.
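The contrast between the two scoring policies described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not CROPSR's scoring model or that of any existing tool: the function names, the penalty formulas, and the gene identifiers are all made up for the example.

```python
# Toy illustration only: names, formulas, and gene identifiers are hypothetical
# and are not taken from CROPSR or any other guide-design tool.

def mammalian_style_score(hits):
    """Penalize every extra genomic site a guide matches."""
    extra_sites = max(len(hits) - 1, 0)
    return 1.0 / (1 + extra_sites)

def crop_style_score(hits, target_genes):
    """Reward covering all intended gene copies; penalize only unintended hits."""
    hit_set = set(hits)
    covered = len(hit_set & target_genes)
    unintended = len(hit_set - target_genes)
    return (covered / len(target_genes)) / (1 + unintended)

# A gene present in three copies in a polyploid genome (made-up identifiers).
copies = {"geneA_copy1", "geneA_copy2", "geneA_copy3"}

guide_single = ["geneA_copy1"]                               # matches one copy
guide_multi = ["geneA_copy1", "geneA_copy2", "geneA_copy3"]  # matches all copies

for name, hits in [("single", guide_single), ("multi", guide_multi)]:
    print(name, mammalian_style_score(hits), crop_style_score(hits, copies))
```

Under the first policy the guide that matches all three copies scores lower than the single-copy guide; under the second it scores highest, which is the behavior a knockout of every copy calls for.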

CROPSR can generate a database of CRISPR guide RNAs that covers the entire genome of a crop. This process is time consuming and computationally intensive, usually requiring several days, but researchers only have to do it once to build a database that can then be used for ongoing experiments.

So rather than searching for a targeted gene in an online database, then using current tools to design separate guides for five different locations and running multiple rounds of experiments, scientists could search for the gene in their own database and see all available guides. CROPSR would also indicate other locations in the genome that each guide targets. Researchers could select a guide that targets all copies of the gene, which would make the design of the experiment easier and faster.
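The build-once, query-many workflow described above can be sketched as follows. This is a minimal sketch, not CROPSR's implementation or interface: it assumes the common SpCas9 convention of a 20-nucleotide guide followed by an NGG PAM, scans only the forward strand, and uses made-up gene names and sequences. `build_guide_db` and `guides_covering` are hypothetical helpers.

```python
import re
from collections import defaultdict

# Assumed convention: 20-nt guide followed by an NGG PAM (SpCas9).
# The lookahead lets overlapping candidate sites be reported.
SITE = re.compile(r"(?=([ACGT]{20})[ACGT]GG)")

def build_guide_db(genes):
    """One-time pass: map each candidate guide to every (gene, position) it matches."""
    db = defaultdict(list)
    for gene, seq in genes.items():
        for match in SITE.finditer(seq.upper()):
            db[match.group(1)].append((gene, match.start()))
    return dict(db)

def guides_covering(db, wanted):
    """Query: guides that match every gene in `wanted` and no other gene."""
    wanted = set(wanted)
    return sorted(
        guide for guide, hits in db.items()
        if {gene for gene, _ in hits} == wanted
    )

# Made-up sequences: two copies of one gene share a guide site; a third gene does not.
shared = "ACGTACGTACGTACGTACGT"
genes = {
    "geneA_copy1": "TT" + shared + "AGG" + "TTTT",
    "geneA_copy2": "CCC" + shared + "TGG" + "A",
    "geneB": "TTGACCATGCAATCGTACCA" + "TGG",
}

db = build_guide_db(genes)  # slow step, done once per genome
print(guides_covering(db, ["geneA_copy1", "geneA_copy2"]))  # fast, repeated lookups
```

In this toy data the query returns the one guide shared by both copies of the gene, while the guide found only in `geneB` is left out. A real genome-wide run would also scan the reverse strand and score each guide, which this sketch omits.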

“You can just access the database, retrieve all the information you need, be ready to go and start working,” Müller Paul said. “The less time you spend planning your experiments, the more time you can spend doing your experiments.”

For CABBI scientists, who often work with repetitive plant genomes, having a gRNA tool that allows them to design functional guides with confidence “should be a step forward,” he said.

As its name suggests, CROPSR was designed with crop genomes in mind, but it applies to any type of genome.

“CROPSR is also based on human genes, because the data availability for crop genes is just not there yet,” Müller Paul said, “but we are looking at collaborations with other BRCs to provide prediction performance based on biophysics to help alleviate some of the problems caused by the lack of data.”

In the future, he hopes researchers will record their failures as well as their successes to help generate the data needed to train a crop-specific model. If the collaborations materialize, “we could see some very exciting advances in training machine learning models for CRISPR applications, and potentially for other models as well.”


Source:

University of Illinois at Urbana-Champaign Institute for Sustainability, Energy, and the Environment

Journal reference:

Müller Paul, H., et al. (2022) CROPSR: an automated platform for complex genome-wide CRISPR gRNA design and validation. BMC Bioinformatics.

