AI and Gene Therapy: The Next Frontier in Life Science Innovation

Uncover the emerging role of AI in the revolutionary field of gene therapy and how it is paving the way for groundbreaking advancements.

Joe Nipko, PhD

July 11, 2023

By harnessing genetic engineering, gene therapies aim to correct or replace faulty genes within an individual's cells, offering the potential of one-dose cures for a broad range of diseases. Since the first gene therapy approval in 2017, the FDA has approved gene therapies for retinal dystrophy, spinal muscular atrophy, hemophilia B, bladder cancer, and, most recently, Duchenne muscular dystrophy and hemophilia A, with the latter two marking major advances this year [1-7].

Alongside these landmark advancements in medicine comes a host of discovery, development, manufacturing, and clinical challenges for biopharmaceutical teams to overcome. Safety concerns in gene therapy clinical trials were responsible for the large spike in clinical holds in 2021 [8]. Biomanufacturing these complex biologics remains difficult to scale while maintaining high-quality production processes. With so much activity and focus in the gene therapy space, many biotech and pharma companies are (or will be) attempting to solve these problems to bring safer and more effective therapies to market faster and cure a vast array of life-threatening diseases.

Following the recent breakthroughs in artificial intelligence (AI), large language models and the transformer architecture hold significant promise for solving many of the challenges that gene therapy developers face [9-10]. In this post, we’ll discuss the benefits of AI, applications of AI in gene therapy, and the current and long-term practical challenges of implementing AI in your gene therapy development workflow.

The Rise of AI and its Future in Our Society

As AI is still a relatively new area for life scientists, let's begin by addressing a fundamental question: What is the origin of AI and what has led to its remarkable growth in various applications?

The foundation for AI was first laid in the arts in 1923, with science fiction stories about artificial humans [11]. As a technical field, AI emerged from the computational work of visionaries John von Neumann and Alan Turing at the beginning of the 1950s [12]. They laid the foundation of cybernetics, unlocking the potential for human-machine interaction. Then, in 1956, AI took shape as the pursuit of crafting computer programs capable of executing high-level cognitive tasks like learning, memory, and reasoning, and the term ‘artificial intelligence’ was coined [13].

Fast forward to today, and the emergence of AI allows us to solve problems, improve efficiency, and navigate through uncharted territories.  From self-driving cars to virtual assistants and beyond, the possibilities are endless. 

So, how can we harness AI's power to improve our world, safely?  

There's a growing movement among government, public, and private organizations to collaborate on leveraging the opportunities that AI offers while mitigating its risks. Recent initiatives include the efforts of the President's Council of Advisors on Science and Technology and, more recently, Stanford's hosting of the President for discussions on the topic [14-15].

Improving Gene Therapy Development with AI

Over the past decade, the pipeline of gene therapies has expanded significantly. In 2023, 10 approval decisions on cell and gene therapies are expected, half of them for gene therapies targeting rare diseases [16,17].

Even so, several areas need improvement. First, the cost of FDA-approved gene therapies limits accessibility for patients [18]. Second, although gene therapies based on adeno-associated virus (AAV), the favored vector for gene delivery, are largely considered safe, challenges remain in managing adverse events and preventing deaths. Tragically, by 2021, four boys had died in a Phase I/II trial of Astellas Pharma’s resamirigene bilparvovec, an AAV-based gene therapy for X-linked myotubular myopathy (XLMTM) [19,20].

As an industry built to help people live longer, healthier lives, we need better outcomes.

In a typical AAV gene therapy development process, most biotech companies make AAVs by transfecting plasmids into cells. These plasmids encode the genes needed to make functional AAVs. One plasmid encodes the ‘payload’ (usually a therapeutic human gene designed to treat disease), which gets coiled up and packaged into the capsid. By tweaking the sequences encoded on these plasmids, one can alter an AAV’s shell or the payload inside, which can lead to better clinical outcomes.

Now, we’re seeing the use of AI to help us understand and optimize AAV viral vector engineering and beyond. With AI tools gaining more acceptance and trust, the cell and gene therapy development lifecycle is experiencing rapid advancements in research, discovery, clinical trials and biomanufacturing.

Application of AI in Gene Therapy Development

Beyond AAV construct design, AI has the potential to drive transformative advancements in several pre-clinical and clinical gene therapy development stages.

Personalized Medicine

By analyzing genomic data and identifying disease-associated genetic variations, AI can contribute to personalized treatment strategies. It assists in designing tailored gene therapy approaches and predicting therapeutic outcomes for individuals. Furthermore, AI enables real-time monitoring of patient data, facilitating adaptive therapies. It analyzes gene expression profiles, clinical parameters, and treatment responses, providing insights for personalized adjustments to treatment plans.

Biomarker Predictions

AI can uncover meaningful patterns and potential biomarkers by analyzing genomic and molecular datasets. This discovery process aids in identifying novel therapeutic targets, developing patient stratification approaches, and establishing biomarkers to monitor treatment effectiveness.

Target Identification and Validation

AI algorithms can identify potential gene targets and validate their therapeutic relevance by analyzing large-scale genomic and molecular datasets. This enables the discovery of novel gene candidates and the evaluation of their suitability for gene therapy interventions.

Gene Delivery Optimization

AI also plays a vital role in optimizing gene delivery systems, such as viral vectors or non-viral carriers. AI algorithms use computational modeling and simulation to improve the efficiency, safety, and specificity of gene delivery methods.

Clinical Predictions

AI can predict gene therapy efficacy and safety by integrating diverse datasets. Predictive models built on this data help assess treatment outcomes and optimize protocols, with the aim of improving patient responses.

What Are The Current Challenges of Using AI in Gene Therapy Development?

While there is considerable promise for AI in gene therapy research, it is important to acknowledge the significant challenges that still need to be overcome. Many problems in AI stem from the quality of the data that is used to develop the AI algorithm. By addressing these challenges, we can ensure that the current partnership between AI and gene therapy development drives sustainable change. With continued advancements, research, and collaboration, we can unlock the full potential of AI in revolutionizing gene therapy and improving patient outcomes.

Limited and Heterogeneous Data

AI algorithms typically require large amounts of high-quality data to train and produce reliable results. Gene therapy data is often limited due to the rarity of certain diseases and the high costs associated with data collection. Additionally, data obtained from different sources may be heterogeneous in format, quality, and labeling standards. These variations make it difficult to train models that are accurate, reproducible, and reliable.
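
To make the heterogeneity problem concrete, here is a minimal Python sketch of mapping records from two sources onto one shared schema before training. The field names, aliases, and example records are hypothetical and chosen only for illustration; they do not come from any real lab system or standard.

```python
from typing import Any

# Hypothetical aliases that different labs might use for the same field.
FIELD_ALIASES = {
    "sample_id": ("sample_id", "SampleID", "id"),
    "titer_vg_per_ml": ("titer_vg_per_ml", "titer", "Titer (vg/mL)"),
    "serotype": ("serotype", "Serotype", "capsid"),
}


def harmonize(record: dict[str, Any]) -> dict[str, Any]:
    """Map one source record onto the shared schema; missing fields become None."""
    clean: dict[str, Any] = {}
    for canonical, aliases in FIELD_ALIASES.items():
        # Take the first alias that is present in this record, if any.
        clean[canonical] = next((record[a] for a in aliases if a in record), None)
    # Normalize types and casing so downstream code sees consistent values.
    if clean["titer_vg_per_ml"] is not None:
        clean["titer_vg_per_ml"] = float(clean["titer_vg_per_ml"])
    if clean["serotype"] is not None:
        clean["serotype"] = str(clean["serotype"]).strip().upper()
    return clean


# Made-up example records from two imaginary labs.
lab_a = {"SampleID": "A-001", "titer": "1.2e13", "Serotype": "aav9 "}
lab_b = {"id": "B-17", "Titer (vg/mL)": 8.5e12}
harmonized = [harmonize(r) for r in (lab_a, lab_b)]
```

Keeping missing values as explicit `None` entries, rather than dropping the record, lets a later step decide how to handle gaps. Real pipelines would also need unit conversion and validation, which this sketch leaves out.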

Extensive Compute Power

The requirement for extensive computational power and large volumes of data to train AI systems stands as one of the most significant challenges. Regrettably, these resources are frequently accessible only to industry giants, leading to an imbalance within the AI development ecosystem. This disparity jeopardizes the fundamental scientific research necessary to achieve our ambitious technological goals. Resolving this imbalance calls for a substantial investment in AI from the public sector and effective public/private partnerships. 

Data Privacy and Access

Gene therapy data, especially patient-specific genomic data, is highly sensitive and subject to privacy regulations. Balancing the need for data access while respecting privacy constraints is a significant challenge. Access to diverse and representative datasets is crucial for training robust models, but strict regulations and consent requirements can limit data availability.

Labeling and Annotation

Deep learning models require labeled data for supervised training. However, labeling gene therapy data can be complex and time-consuming. Expert knowledge is often required to annotate and interpret genomic sequences, which may limit the scalability and efficiency of data labeling processes.

Interpretable Models 

Gene therapies involve complex biological processes, and understanding the underlying mechanisms is crucial for successful treatment. Deep learning models, particularly deep neural networks, are often considered black boxes, making it difficult to interpret their predictions and understand the reasoning behind them. Developing interpretable models that provide insights into the biological processes will enhance the trust and acceptance of AI within the life science industry.

Generalization Across Datasets

Deep learning models trained on one dataset may struggle to generalize well to other datasets due to differences in data distribution, batch effects, or technical variations. Achieving robust and generalizable models for gene therapies requires careful consideration of dataset biases and developing strategies to minimize these biases during model training.
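
One common way to check for this problem is to hold out an entire data source during evaluation rather than a random subset of rows. The sketch below illustrates the idea with a deliberately simple nearest-centroid classifier on made-up, one-feature data in which one site carries a batch shift. The site names, values, and classifier are illustrative assumptions, not a method described by the author.

```python
from statistics import mean

# Toy, made-up records: (source, feature, label). Purely illustrative.
records = [
    ("site_a", 0.9, 1), ("site_a", 1.1, 1), ("site_a", 0.1, 0), ("site_a", 0.2, 0),
    ("site_b", 1.0, 1), ("site_b", 0.8, 1), ("site_b", 0.0, 0), ("site_b", 0.3, 0),
    # site_c has a batch shift of roughly +1.0 on the feature.
    ("site_c", 2.0, 1), ("site_c", 1.9, 1), ("site_c", 1.0, 0), ("site_c", 1.2, 0),
]


def fit_centroids(train):
    """Return the mean feature value per label (a nearest-centroid classifier)."""
    return {label: mean(x for _, x, y in train if y == label) for label in (0, 1)}


def predict(centroids, x):
    """Predict the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def leave_one_source_out(data):
    """Hold out each source in turn and report accuracy on the held-out source."""
    scores = {}
    for held_out in sorted({source for source, _, _ in data}):
        train = [r for r in data if r[0] != held_out]
        test = [r for r in data if r[0] == held_out]
        centroids = fit_centroids(train)
        correct = sum(predict(centroids, x) == y for _, x, y in test)
        scores[held_out] = correct / len(test)
    return scores


scores = leave_one_source_out(records)
for source, accuracy in scores.items():
    print(f"held out {source}: accuracy {accuracy:.2f}")
```

A random row-level split would mix all three sites into both training and test sets and could hide the effect of the shift. Splitting by source exposes it, which is the point of the exercise.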

Data Integration and Fusion

Gene therapy research often involves multiple types of data, including genomic, clinical, imaging, and more. Integrating and fusing heterogeneous data sources to create a comprehensive dataset can be complex. Developing effective techniques to combine and leverage different data modalities is essential for building accurate and holistic deep learning models.

Addressing these challenges requires collaboration between genomics, bioinformatics, and AI experts, along with advancements in data sharing frameworks, privacy-preserving techniques, and interpretability methods. Overcoming these obstacles will enable the development of robust deep learning models that can accelerate advances in gene therapies.

The Future of AI in Gene Therapy: Solutions on the Horizon

With the long-term solutions listed below, AI-enabled biotechnology has the potential to drive the approval of safer, more effective, and more accessible gene therapies for a wide range of diseases.

Data availability and data generation standards

The absence of standardized data poses a significant hurdle as it hampers the ability to validate and reproduce results. Establishing robust data generation standards is essential to overcome this reproducibility problem.

Ethics and regulatory considerations

Addressing ethical and regulatory concerns becomes paramount as AI becomes more integral to gene therapy research, development, and manufacturing. Safeguarding patient privacy, ensuring informed consent, and establishing guidelines for responsible AI usage are essential for maintaining trust and ethical standards.

Converting data into valid insights

Extracting meaningful insights from the collected data is a critical challenge. Developing advanced algorithms and methodologies that can effectively process and interpret complex genomic and other ‘omics data is necessary to derive valid and actionable insights.

Internal infrastructure development

Establishing robust and secure internal systems, such as federated data systems, can facilitate collaborative research and data sharing while ensuring data privacy and security [21].
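
As a rough illustration of the federated idea, the Python sketch below computes a pooled mean across sites while each site shares only aggregate statistics, never individual records. The site names, values, and `LocalSummary` type are hypothetical. Production federated systems typically add protections such as secure aggregation and access controls, which are not shown here.

```python
from dataclasses import dataclass


@dataclass
class LocalSummary:
    """Aggregate statistics a site shares instead of patient-level rows."""
    count: int
    total: float


def summarize_locally(values: list[float]) -> LocalSummary:
    """Run at each site: reduce raw measurements to a count and a sum."""
    return LocalSummary(count=len(values), total=sum(values))


def federated_mean(summaries: list[LocalSummary]) -> float:
    """Run at the coordinator: combine site summaries into one pooled mean."""
    n = sum(s.count for s in summaries)
    if n == 0:
        raise ValueError("no observations reported by any site")
    return sum(s.total for s in summaries) / n


# Made-up measurements held at three imaginary sites.
site_values = {
    "hospital_a": [2.0, 4.0],
    "hospital_b": [6.0],
    "hospital_c": [1.0, 3.0, 8.0],
}
summaries = [summarize_locally(values) for values in site_values.values()]
pooled_mean = federated_mean(summaries)
```

The pooled result matches what a centralized computation over all six values would give, but the coordinator only ever sees one count and one sum per site.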

How Form Bio is Accelerating Gene Therapy with AI

Harnessing the power of AI in gene therapy development and manufacturing requires a robust AI infrastructure and overcoming several hurdles [22]. To provide full-service solutions to common problems in gene therapy development, we’ve built a data management system, formed a partnership with Google to streamline multiomic data analysis, and launched FORMsightAI, an AI platform designed specifically for derisking viral vector production.

FORMsightAI uses advanced and customizable AI models so you can simulate and optimize viral constructs before spending millions and significant time on biomanufacturing runs. In partnership with a publicly traded gene therapy company, we’ve seen this play out in increased overall yield and quality of viral production. Leveraging FORMsightAI early in preclinical gene therapy development programs will ultimately speed up time to clinical trials, lower the cost of gene therapies, and get safe, life-saving therapies to patients.


How is AI Used in Gene Therapy?

AI is used in various aspects of gene therapy to enhance the development, optimization, and implementation of treatments. Applications include target identification and validation, drug discovery and development, biomarker prediction, viral vector and biomanufacturing optimization, and personalized medicine.

How is AI being used with CRISPR?

AI is being used with CRISPR-based gene therapies to enhance the efficiency and accuracy of gene editing. AI models are being used to predict off-target effects and optimize target selection and delivery for more precise and effective genome modifications.

What Are The Current Challenges with AI?

The current challenges with AI in gene therapy include limited and heterogeneous data, the need for extensive computational power, data privacy and access concerns, labeling and annotation complexities, interpretability of AI models, generalization across datasets, and effective integration and fusion of diverse data modalities.

AI Disclosure: The feature image was generated with the AI image tool Midjourney.


  1. FDA Approved Cellular and Gene Products. FDA website.  Content current as of 6/30/23. Accessed July 1, 2023.
  2. FDA approves novel gene therapy to treat patients with a rare form of inherited vision loss.  FDA website.  Content current as of 3/16/2018.  Accessed July 1, 2023.
  3. FDA approves innovative gene therapy to treat pediatric patients with spinal muscular atrophy, a rare disease and leading genetic cause of infant mortality.  FDA website.  Content current as of 5/24/2019.  Accessed July 1, 2023.
  4. FDA Approves First Gene Therapy to Treat Adults with Hemophilia B.  FDA website.  Content current as of 11/22/2022.  Accessed July 1, 2023.
  5. FDA D.I.S.C.O. Burst Edition: FDA approval of Adstiladrin (nadofaragene firadenovec-vncg) for patients with high-risk Bacillus Calmette-Guérin unresponsive non-muscle invasive bladder cancer with carcinoma in situ with or without papillary tumors.  FDA website. Content current as of 1/20/23.  Accessed July 1, 2023.
  6. FDA Approves First Gene Therapy for Treatment of Certain Patients with Duchenne Muscular Dystrophy.  Content current as of 6/23/2023.  Accessed July 1, 2023.
  7. FDA Approves First Gene Therapy for Adults with Severe Hemophilia A. Content current as of 6/30/2023.  Accessed July 1, 2023.
  8. Cell and Gene Therapy Fueled Large Surge in Clinical Holds Last Year.  Published March 7, 2022.  Accessed July 1, 2023.
  9. Shortening Gene Therapy Development Time with Strategic Use of Large Language Models.  Form Bio Resource Center. Published March 2, 2023.  Accessed July 1, 2023.
  10. Exploring the Versatality of Large Transformer Architectures and Their Implications for Gene Therapy Developers.  Form Bio Resource Center.  Published March 22, 2023.  Accessed July 1, 2023.
  11. Karel Capek. R.U.R. A Fantastic Melodrama in Three Acts and an Epilogue. Published Jan 1, 1923.
  12. AI and Genomics:  A Constructive 21st Century Affair. Colossal Laboratories and Biosciences website.  Accessed July 1, 2023.
  13. Council of Europe website. History of Artificial Intelligence. Accessed July 1, 2023.
  14. The White House Office 2023 Meetings. May 18-19, 2023.  Day 2: Artificial Intelligence (AI) Enabling Science and AI Impacts on Society. Accessed July 1, 2023.
  15. Twitter @POTUS. Published June 21, 2023. Accessed July 5, 2023.
  16. FDA Braces for Looming Boom in Cell and Gene Therapy Submissions.  Published June 12, 2023. Accessed July 5, 2023.
  17. What the FDA’s New Safety Draft Guidance for AAV Gene Therapies Means and How to Be Ready.  Published March 15, 2023.  Accessed July 3, 2023.
  18. Cell and Gene Therapy Manufacturing Costs Limiting Access.  Published February 21, 2023.  Accessed July 1, 2023.
  19. Maurya, S., Sarangi, P. & Jayandharan, G.R. Safety of Adeno-associated virus-based vector-mediated gene therapy—impact of vector dose. Cancer Gene Ther 29, 1305–1306 (2022).
  20. Fourth Boy Dies in Trial of Astellas Gene Therapy Candidate.  Genetic Engineering and Biotechnology News.  Accessed July 1, 2023.
  21. IBM website.  Federated Systems. Last updated April 13, 2021.
  22. Developing Machine Learning Powered Solutions for Cell and Gene Therapy Candidate Validation. Form Bio Resource Center.  Published December 2022.  Accessed July 5, 2023.
