Case Study

AI/ML Optimization of Coding Sequences Reduces Genome Truncations in AAV Vectors


Context

The coding sequence (CDS) of a gene of interest (GOI) in an AAV gene therapy provides little opportunity for optimization beyond evolutionarily optimized codon usage frequency. However, if the CDS of a GOI is prone to causing AAV genome truncations, it can create significant manufacturing challenges and leave few options for subsequent bioprocessing improvement. Here, we explored using AI/ML to optimize the CDS with the aim of reducing truncation events in this genomic element.


Methods

We used FORMsightAI to predict the truncation risk of two vector designs (Table 1): AAV5_P1_PRO1_GOI, which contains a reporter gene with a P1 promoter, and P1_PRO1_GOI, which contains a housekeeping gene with a P1 promoter. We then optimized the CDS regions using FORMsightAI to create AAV5_P1_PRO1_GOI_OPT and P1_PRO1_GOI_OPT, and evaluated them using long-read sequencing.

Table 1: Vector designs used in this experiment.

Results

Although the coding sequences of AAV5_P1_PRO1_GOI (Figure 1) and P1_PRO1_GOI (Figure 2) showed few truncation issues and therefore limited room for improvement (purple dots), FORMsightAI optimization still reduced truncation events for both GOIs in AAV5_P1_PRO1_GOI_OPT and P1_PRO1_GOI_OPT (pink dots). Truncations fell by 14% for AAV5_P1_PRO1_GOI and by 30% for P1_PRO1_GOI, reductions that can have a significant impact on manufacturing.
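
The case study does not state how the percentage reductions were calculated. As a rough illustration only, the sketch below shows one plausible approach: sum the per-position truncated-read estimates that rise above the noise floor, then compare the totals for the initial and optimized constructs. The function names, the noise-floor filtering step, and all numeric values are hypothetical and are not taken from FORMsightAI or from the data behind the figures.

```python
from typing import Sequence


def total_truncated(counts: Sequence[float], noise_floor: float = 0.0) -> float:
    """Sum per-position truncated-read estimates that exceed the noise floor.

    Filtering at the noise floor is an assumption made for this sketch.
    """
    return sum(c for c in counts if c > noise_floor)


def percent_reduction(initial: float, optimized: float) -> float:
    """Relative reduction of the optimized total versus the initial total, in percent."""
    if initial <= 0:
        raise ValueError("initial total must be positive")
    return 100.0 * (initial - optimized) / initial


# Hypothetical per-position estimates, NOT values from the case study.
initial_counts = [120.0, 50.0, 5.0, 30.0]
optimized_counts = [90.0, 40.0, 4.0, 20.0]
noise_floor = 10.0  # hypothetical noise floor

initial_total = total_truncated(initial_counts, noise_floor)
optimized_total = total_truncated(optimized_counts, noise_floor)
reduction = percent_reduction(initial_total, optimized_total)
print(f"Truncation reduction: {reduction:.1f}%")
```

Both constructs would need to be normalized to the same total read depth before comparison, so that a change in the summed counts reflects the sequence and not the sequencing yield.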

Figure 1: Model-estimated count of truncated reads (y-axis) by location along the CDS, with annotations (x-axis), for the initial AAV5_P1_PRO1_GOI construct (purple dots) and the optimized construct (pink dots). The forward replication sequence is shown. The dashed black line marks the estimated noise floor. Total read depth is assumed to be 500,000 over the full construct.
Figure 2: Model-estimated count of truncated reads (y-axis) by location along the CDS, with annotations (x-axis), for the initial P1_PRO1_GOI construct (purple dots) and the optimized construct (pink dots). The forward replication sequence is shown. The dashed black line marks the estimated noise floor. Total read depth is assumed to be 500,000 over the full construct.

Impact

FORMsightAI can de-risk AAV manufacturing by optimizing the CDS regions of GOIs to reduce the formation of partial genomes.
