AI Platform for DNA Diagnostics, Therapeutics & Ancestry

R. Peterson, J. Kahn
DNA Analytics,
United States

Keywords: DNA, RNA, Locked Nucleic Acids, drug development, gene diagnostics


DNA Analytics has developed AccuMatch, a platform algorithm for scientists and clinical laboratories that predicts how strongly a synthetic gene probe (an oligonucleotide) will match its target gene. Using both newly obtained experimental data and re-analyses of existing data, scientists at the company have trained this algorithm for 13 of the most valuable probe chemistries, including DNA, RNA, and an engineered chemistry called Locked Nucleic Acid (LNA), accounting for variation in the position of modified components.
DNA is very personal to each of us. It reveals our genetic ancestry and it can affect our health, as with genes that control cancer risk. DNA can also be used as a drug, to prevent certain faulty genes from acting. Importantly, DNA tests and drugs rely on synthetic gene probes made of DNA and related chemistries such as LNA. DNA is made of four chemical units referred to as A, C, G, and T. When a probe has the right sequence of As, Cs, Gs, and Ts, it adheres strongly and specifically to its gene.
The difficulty in making DNA probes for drugs, diagnostics, and ancestry is that humans have 3.2 billion As, Cs, Gs, and Ts ordered in sequence along our chromosomes. Because a probe could match starting at almost any one of these positions, there are nearly 3.2 billion possible match locations. The gene probe challenge is to identify a probe that adheres strongly to its gene but does not adhere strongly anywhere else along our chromosomes. As you can imagine, this challenge has not yet been completely solved. We identified a study of 80,000 experiments that shows what happens when scientists try to develop a synthetic gene probe to match a gene. In this study, 17% of the experiments failed, and failure in gene matching accounted for two-thirds of these failures.
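To make the scale of these failure figures concrete, the short Python sketch below works through the arithmetic. It uses only the three numbers quoted above (80,000 experiments, a 17% failure rate, two-thirds of failures due to gene matching). The function name and the rounding are illustrative choices, not part of the cited study.

```python
# Figures quoted in the abstract above.
TOTAL_EXPERIMENTS = 80_000
FAILURE_RATE = 0.17       # 17% of experiments failed
MATCHING_SHARE = 2 / 3    # share of failures attributed to gene matching


def failure_breakdown(total, failure_rate, matching_share):
    """Return approximate failure counts implied by the quoted rates.

    The name and the rounding are illustrative assumptions; the study
    itself is only summarized by the three figures above.
    """
    failed = total * failure_rate
    matching_failed = failed * matching_share
    return {
        "failed": round(failed),
        "matching_failed": round(matching_failed),
        # Gene-matching failures as a percentage of ALL experiments.
        "matching_pct_of_all": round(100 * failure_rate * matching_share, 1),
    }


breakdown = failure_breakdown(TOTAL_EXPERIMENTS, FAILURE_RATE, MATCHING_SHARE)
print(breakdown)
```

Under these figures, roughly 13,600 experiments failed, about 9,067 of them because of gene matching, which is about 11.3% of all experiments in the study.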
Failed experiments lead to an expensive revision process of fabricating more probes and testing them in the laboratory. This delays product development by months or years, and it may require another clinical trial. Most importantly, it impedes patient care by delaying reliable clinical tests or gene therapies. Our vision is to remove this bottleneck.
The prediction algorithms in use today are based on data from studies published largely between 1995 and 2000. The current prediction benchmark for RNA, from 1998, improved on the prior benchmark’s prediction of the “melting temperature” by half a degree Celsius. The melting temperature describes binding stability, and an accurate prediction of it is critical for experimental design, so it is the standard measure by which prediction accuracy and precision are judged. AccuMatch’s improvement in melting temperature prediction is dramatic relative to the 1997 and 1998 studies: in the best case it is up to 7 degrees Celsius, compared with the half-degree improvement of the landmark study. AccuMatch gives scientists the right gene match and reduces the 17% failure rate.
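For readers unfamiliar with melting temperature prediction, the sketch below shows the simplest textbook estimate, the Wallace rule, which assigns 2 degrees Celsius to each A or T and 4 degrees Celsius to each G or C in a short DNA probe. This is not the AccuMatch algorithm, nor the 1997 and 1998 benchmark models. It is only a rough rule of thumb for short, unmodified DNA oligonucleotides, and it does not handle RNA or LNA. The function name is hypothetical.

```python
def wallace_tm(probe: str) -> int:
    """Rough melting temperature (degrees Celsius) via the Wallace rule.

    Tm = 2 * (A + T) + 4 * (G + C)

    Assumption: a short, unmodified DNA probe. This is a textbook
    approximation shown for illustration, not the AccuMatch method.
    """
    seq = probe.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("probe must be a non-empty string of A, C, G, T")
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2 * at_count + 4 * gc_count


# A 16-base probe with 8 A/T and 8 G/C bases: 2*8 + 4*8 = 48.
print(wallace_tm("ACGTACGTACGTACGT"))
```

Against a rule this coarse, prediction differences of several degrees between models become easier to picture: the estimate depends only on base counts and ignores base order, chemistry, and the position of any modified components.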