Designing AAV Vectors To Produce Functional Proteins

Explore Form Bio's AI models that predict, optimize, and de-risk AAV vector design outcomes.

Anusha Sriraman, PhD

April 23, 2024

Adeno-associated viruses (AAVs) have emerged as a foundational vector for gene therapy delivery. They can deliver a transgenic payload into both dividing and non-dividing cells and support either extrachromosomal or chromosomally integrated expression of therapeutic genes.1 These properties have led to their wide use across applications including gene editing, gene replacement, and gene addition.2

Choosing the appropriate viral vector for your gene therapy requires a thorough understanding and careful consideration of AAV vector components, including serotype, promoter, transgene sequence, and more.3 In this post, we discuss the components of an AAV vector and the key design considerations that are essential for producing functional proteins.

The Components of An AAV Viral Vector

AAVs were first discovered in 1965, and since then, significant strides have been made in understanding the biology of AAVs.4 Here are some basics on AAV capsids, genome organization, and recombinant constructs.

AAV viral capsids

There are 13 naturally occurring, well-characterized AAV serotypes, each with its own tissue tropism. These differences stem from variations in the cap ORF, which encodes the capsid's structural proteins: VP1, VP2, and VP3.5 The serotypes use distinct receptors and co-receptors, which govern cell recognition, internalization, and payload delivery.6

AAV genome organization

The AAV genome is a 4.7 kb single-stranded DNA molecule containing the rep and cap genes, flanked at the 5′ and 3′ ends by inverted terminal repeats (ITRs) that are crucial for genome packaging, replication, and gene expression.7 The rep ORF encodes proteins for genome replication,8 integration,9 transcription,10 and encapsidation.11 While the AAV genome is typically single-stranded, self-complementary (sc, considered ‘double-stranded’) recombinant AAV (rAAV) vectors bypass the need for second-strand DNA synthesis, which can lead to faster and more potent transgene expression.12
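The 4.7 kb genome size also acts as a practical ceiling on what a recombinant cassette can carry, so a quick length check is a natural early step in design. The Python sketch below is illustrative only. The ITR length (about 145 nt, as in AAV2), the rule of thumb that a self-complementary vector has roughly half the capacity, and the element lengths in the example are assumptions, not values from this article.

```python
# Approximate genome size taken from the text above (4.7 kb).
AAV_GENOME_BP = 4700
# Assumption: each ITR is about 145 nt, as in AAV2.
ITR_BP = 145


def cassette_length(elements):
    """Total length in bp of an ITR-flanked cassette.

    `elements` maps a hypothetical element name to its length in bp.
    """
    return sum(elements.values()) + 2 * ITR_BP


def fits_capsid(elements, self_complementary=False):
    """Rough size check against the packaging ceiling.

    Assumption: a self-complementary vector carries the sequence twice,
    so only about half of the capacity is available for the cassette.
    """
    limit = AAV_GENOME_BP // 2 if self_complementary else AAV_GENOME_BP
    return cassette_length(elements) <= limit


# Hypothetical element lengths, for illustration only.
example = {"promoter": 600, "transgene": 2800, "polyA": 250}
print(cassette_length(example))
print(fits_capsid(example))
print(fits_capsid(example, self_complementary=True))
```

A real design check would use the actual annotated lengths of each element and whatever packaging threshold the team has validated for its capsid.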

AAV expression cassette

rAAV vectors express transgenes in target tissues using constitutive, tissue-specific, inducible, or synthetic promoters. Additional regulatory elements such as enhancers, miRNAs, and introns can be included to fine-tune expression.13 The coding sequence itself can also be optimized, either to improve transcription and translation of the transgene in host cells or to ensure efficient packaging of rAAV genomes into capsids.14
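One constraint that any coding-sequence optimization has to respect is that the encoded protein must not change. The Python sketch below illustrates that idea with a toy synonymous recoding. The partial codon table covers only the codons used here, and the "preferred" codon choices are assumptions made for illustration, not a measured usage table or the optimization method described in this article.

```python
# Partial standard codon table: only the codons needed for this toy example.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "TTA": "L", "TTG": "L", "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L",
    "AAA": "K", "AAG": "K",
    "TAA": "*", "TAG": "*", "TGA": "*",
}

# Assumed preferred codon per amino acid (illustrative choices only).
PREFERRED = {"M": "ATG", "A": "GCC", "L": "CTG", "K": "AAG", "*": "TAA"}


def translate(cds):
    """Translate a coding sequence codon by codon ('*' marks a stop)."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds):
    """Replace every codon with the assumed preferred synonymous codon."""
    return "".join(PREFERRED[aa] for aa in translate(cds))


original = "ATGGCTTTAAAATAA"
optimized = recode(original)

# The DNA changes, but the protein it encodes does not.
assert translate(optimized) == translate(original)
print(original, "->", optimized)
```

Real optimization weighs many more signals than codon preference, but the same invariant applies: translate before and after, and confirm the protein sequence is identical.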

Molecular details of AAV capsid structure, genome biology, and their implications for rAAV development are still being elucidated. This incomplete understanding has contributed to pre-clinical and clinical roadblocks for many therapeutic programs.

Key Considerations in AAV Vector Design

The characteristics of AAV discussed previously are crucial considerations when designing gene therapy vectors that are effective and safe. Early decisions in pre-clinical studies greatly impact long-term therapeutic success in human clinical trials.

Outlined below are specific factors to consider when designing AAV vectors for gene therapy applications.

Selecting the Right AAV Serotype

AAV serotypes are crucial in vector design because they dictate tissue tropism and transduction efficiency. Commonly used serotypes such as AAV2, AAV5, AAV8, and AAV9 differ in the tissues they preferentially infect, giving researchers a range of options for targeted gene delivery. For instance, AAV2 targets hepatocytes, retina, skeletal muscle, CNS, and renal tissue, while AAV5 targets vascular endothelial cells, smooth muscle, retina, CNS, airway epithelia, and hepatocytes.5 Distinct primary receptors and co-receptors guide these differing tropisms.5 Choosing the right serotype is critical for effective and safe gene therapy, and the choice should weigh target tissue, transduction efficiency, immune response, and potential off-target effects.
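As a small illustration of how tropism narrows the choice, the Python sketch below encodes only the AAV2 and AAV5 tropisms listed above and looks up which of the two is a candidate for a given tissue. The data structure and the function name are hypothetical, and a real selection would also have to account for transduction efficiency, immune response, and off-target effects.

```python
# Tropisms for AAV2 and AAV5 exactly as listed in the paragraph above.
TROPISM = {
    "AAV2": {"hepatocytes", "retina", "skeletal muscle", "CNS", "renal tissue"},
    "AAV5": {
        "vascular endothelial cells", "smooth muscle", "retina",
        "CNS", "airway epithelia", "hepatocytes",
    },
}


def candidate_serotypes(target_tissue):
    """Return the serotypes (sorted by name) whose listed tropism includes the tissue."""
    return sorted(s for s, tissues in TROPISM.items() if target_tissue in tissues)


print(candidate_serotypes("retina"))
print(candidate_serotypes("renal tissue"))
print(candidate_serotypes("airway epithelia"))
```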

Minimizing the Formation of Truncated Genomes is Essential to Produce Functional Proteins

Truncated genomes are incomplete or shortened versions of the full-length viral genome carried by rAAVs.15 They arise from factors such as vector design flaws or limitations of the viral capsid. Truncated genomes yield non-functional proteins and impurities that necessitate higher dosing, which in turn can drive life-threatening immune activation and adverse events such as T-cell-mediated liver injury and thrombotic microangiopathy associated with complement activation.16

Form Bio’s FORMsightAI is a biologically validated AI/ML technology that plays a crucial role in minimizing the truncation risk of AAV vectors, essential for generating functional proteins during pre-clinical and clinical development of gene therapies. Through in silico assessments, FORMsightAI evaluates viral vector yield, immunogenicity, and manufacturability, empowering researchers to optimize and de-risk vector sequences and manufacturing protocols. 

FORMsightAI simulations provide a comprehensive overview of how different vector elements contribute to truncation risk. This predictive power allows potential designs to be screened rapidly: vector elements such as promoters, ITRs, introns, and polyA signals are varied to identify the best combination for a given transgene. This significantly reduces the trial-and-error cycles in vector development and accelerates the pace at which new therapies can be developed.
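The screening idea can be sketched as a loop that enumerates element combinations and ranks them by a predicted risk score. The Python example below is a minimal sketch under loud assumptions: the element names, the integer penalty values, and the `toy_risk` function are all invented for illustration. It is not FORMsightAI and does not reflect how that model scores designs.

```python
from itertools import product

# Hypothetical options per vector slot. Names and penalty values
# (arbitrary units) are invented purely for illustration.
OPTIONS = {
    "promoter": {"promoter_A": 30, "promoter_B": 10},
    "intron": {"intron_A": 5, "no_intron": 0},
    "polyA": {"polyA_A": 20, "polyA_B": 15},
}


def toy_risk(design):
    """Placeholder score: sum of the invented per-element penalties.

    A real workflow would call a validated predictive model here instead.
    """
    return sum(OPTIONS[slot][name] for slot, name in design.items())


def rank_designs(options):
    """Enumerate every combination of elements and sort by ascending risk."""
    slots = list(options)
    designs = [
        dict(zip(slots, combo))
        for combo in product(*(options[slot] for slot in slots))
    ]
    return sorted(designs, key=toy_risk)


ranked = rank_designs(OPTIONS)
best = ranked[0]
print(len(ranked), "designs screened; lowest toy risk:", best, toy_risk(best))
```

The structure matters more than the numbers: swapping the placeholder scorer for a real model leaves the enumerate-and-rank loop unchanged.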

Importantly, FORMsightAI technology can be used to determine how amenable a given vector is to transgene optimization. Optimizing the coding sequence (CDS) can reduce truncation risk, which improves manufacturability and safety.

In a recent experiment conducted in collaboration with a major contract development and manufacturing organization (CDMO), FORMsightAI optimization of the CDS of a housekeeping gene removed truncation peaks, resulting in an increase of up to 28% in full reads.

AI Disclosure: The feature image was generated with Midjourney, an AI image generation tool.


  1. Lundstrom K. Viral Vectors in Gene Therapy: Where Do We Stand in 2023? Viruses. 2023;15(3):698.
  2. Wang JH, Gessler DJ, Zhan W, et al. Adeno-associated virus as a delivery vector for gene therapy of human diseases. Sig Transduct Target Ther. 2024;9:78.
  3. Viral Vectors Within Gene Therapy Clinical Trials. Form Bio Resource Center. Published February 23, 2023. Accessed March 24, 2024.
  4. Atchison RW, Casto BC, Hammon WM. Adenovirus-Associated Defective Virus Particles. Science. 1965;149(3685):754-756.
  5. Issa SS, Shaimardanova AA, Solovyeva VV, Rizvanov AA. Various AAV Serotypes and Their Applications in Gene Therapy: An Overview. Cells. 2023;12(5):785.
  6. Wang D, Tai PWL, Gao G. Adeno-associated virus vector as a platform for gene therapy delivery. Nat Rev Drug Discov. 2019;18(5):358-378.
  7. Maurer AC, Weitzman MD. Adeno-Associated Virus Genome Interactions Important for Vector Production and Transduction. Hum Gene Ther. 2020;31(9-10):499-511.
  8. Zhou X, Zolotukhin I, Im DS, Muzyczka N. Biochemical characterization of adeno-associated virus rep68 DNA helicase and ATPase activities. J Virol. 1999;73(2):1580-1590.
  9. Kotin RM, Siniscalco M, Samulski RJ, et al. Site-specific integration by adeno-associated virus. Proc Natl Acad Sci USA. 1990;87(6):2211-2215.
  10. Pereira DJ, McCarty DM, Muzyczka N. The adeno-associated virus (AAV) Rep protein acts as both a repressor and an activator to regulate AAV transcription during a productive infection. J Virol. 1997;71(2):1079-1088.
  11. King JA, Dubielzig R, Grimm D, Kleinschmidt JA. DNA helicase-mediated packaging of adeno-associated virus type 2 genomes into preformed capsids. EMBO J. 2001;20(12):3282-3291.
  12. McCarty DM. Self-complementary AAV vectors; advances and applications. Mol Ther. 2008;16(10):1648-1656.
  13. Kolesnik VV, Nurtdinov RF, Oloruntimehin ES, Karabelsky AV, Malogolovkin AS. Optimization strategies and advances in the research and development of AAV-based gene therapy to deliver large transgenes. Clin Transl Med. 2024;14(3):e1607.
  14. Ketz N. Leveraging Model Interpretability Methods to Predict Gene Therapy Manufacturing Failures. Form Bio Resource Center. Accessed March 9, 2023.
  15. Aldridge C. Distinguishing AAV Full/Empty and Fragmented Capsid Ratio. Form Bio Resource Center. Published August 23, 2023. Accessed February 2, 2024.
  16. Food and Drug Administration. CTGTAC Meeting #70: Toxicity Risks of Adeno-Associated Virus (AAV) Vectors for Gene Therapy. September 2-3, 2021.
