Rare diseases remain underdiagnosed and underfunded, particularly in low- and middle-income countries. AI is revolutionising research in rare diseases by bridging data gaps, uncovering genetic insights, and enabling faster diagnoses and more effective treatments.

Searching for answers on rare diseases often leads to dead ends; AI-driven companies are improving diagnostic pathways and care. Photo: Nathana Rebouças
In 2020, amid the global chaos of the Covid-19 pandemic, Stefano Pacifico, like many people, faced unimaginable personal challenges. While mourning the loss of his father to cancer, he grappled with his one-year-old son’s devastating diagnosis of an ultra-rare sarcoma (a type of cancer that begins in bone or soft tissue) known as malignant ectomesenchymoma (MEM).
After the initial shock, Pacifico, a computer engineer originally from Italy and now based in New York, started to search for more information about the disease. “I realised that, as with many rare diseases, there was so little information available just to help me understand,” he says.
Pacifico quickly came to learn that his situation was not unique and that there were hundreds of thousands of parents, doctors, and healthcare professionals also facing the same information void for rare diseases. Limited clinical trials, conflicting diagnoses, and a lack of standard treatments are some of the most common problems.
Rare disease research is chronically underfunded, with too little investment from government or industry to find cures. “I realised that what I needed was out there, but it was buried, difficult to find and even harder to connect to,” Pacifico explains.
This experience became a turning point for Pacifico and the catalyst for the development of Epistemic AI, a company he co-founded with David Heeger, a professor of neuroscience at New York University. It set out to develop a platform to simplify and accelerate access to biomedical knowledge, which is a key element of research into rare diseases.
Five years later, Pacifico’s son is in good health and Epistemic AI has just received a US$4 million minority investment to scale up its activities. Its knowledge discovery platform is up and running, helping researchers make sense of hundreds of repositories of knowledge including regulatory documents, publications, clinical trials, and molecular biology databases. Users can quickly interrogate vast amounts of knowledge, uncovering hidden connections at a reasonable cost.
Tracking rare diseases data in one platform
Epistemic AI addresses one of the most pressing issues in rare disease research: the fragmented and inaccessible nature of biomedical data. Many rare diseases are caused by gene mutations, which are reported in scientific journals or at conferences. The scientific community may become aware of many of these mutations, and pharmaceutical companies may target some of them with drugs, but others are stored away in a database and ultimately forgotten.
Scientists, biopharma companies, and rare disease advocacy organisations struggle to keep track of information scattered across countless databases. Epistemic AI addresses the problem by aggregating and mapping data into a cohesive, user-friendly platform.
In one instance, a database of gene mutations for one rare disease was added to the Epistemic AI platform. While integrating this database, the team discovered that some of the mutations were labelled incorrectly, and others were out of date compared with other databases or new discoveries.
“What’s critical, especially for rare diseases, is that nobody knows in advance what connections exist that could be useful for the patient, even more so when the data quality suffers from age or other defects,” says Pacifico, who worked at the financial news organisation Bloomberg before co-founding Epistemic AI.
“We’re like Bloomberg for life sciences. Just as Bloomberg connects data for finance professionals, we provide tools and knowledge for researchers across the entire spectrum of biomedical discovery.”
One powerful application of Epistemic AI is drug repurposing, which is especially critical in rare diseases. The platform enables researchers to identify existing drugs that might be repurposed for new indications and understand potential toxicity and adverse events linked to an intervention.
Epistemic AI also maps links between drug targets and disease mechanisms, helping researchers identify promising therapies and detect trends across clinical trials and publications to understand emerging research landscapes.
The future integration of patient registries into the platform could further enhance its utility, allowing people researching rare diseases to combine real-world patient data with genomic and clinical information.
Reducing time to rare disease diagnosis
Meanwhile, Saventic Health, a Polish med-tech startup, is already leveraging cutting-edge AI to access electronic health records, lab results, imaging data, and clinical findings. This information helps to reduce diagnostic delays, improve access to treatments, and transform the lives of people with rare diseases.
Saventic Health has developed an advanced platform that combines AI algorithms, natural language processing (NLP), and medical data analytics to identify patterns and markers associated with rare diseases. The insights generated are then used to assign risk scores, highlight potential diagnoses, and suggest next steps, such as additional testing or specialist referrals.
“Patients with rare diseases face an average of five years and consultations with eight doctors before receiving a diagnosis,” explains Maciek Klein, global chief business officer at Saventic Health. “Our AI system reduces this time by a factor of 10 to 15, offering critical improvements in identifying and treating these conditions.”

Saventic Health is leveraging AI to identify markers of rare diseases. Photo: Maciek Klein
The platform operates as a standalone system, installed directly within hospital infrastructure. It is currently being used in 30 hospitals across Poland, Germany, France, Canada, and Brazil. “To comply with strict data privacy laws, we bring our own computer to the hospitals, conduct screenings on-site, and ensure that all data remains securely within the hospital’s IT systems,” Klein says.
The screening process uses pseudonymised data (records in which directly identifying details have been replaced with codes) that is extracted by the hospital IT team and uploaded to Saventic Health’s computer. The AI processes the data locally, generating reports that identify high-risk patients for more than 30 rare diseases, including rare blood disorders like Castleman disease, lung conditions such as pulmonary arterial hypertension, and lysosomal storage disorders.
Once the analysis is complete, the data remains in the hospital’s system, and any temporary files on Saventic Health’s computer are deleted. This decentralised approach reassures both hospitals and patients that sensitive medical information is never stored or transmitted externally.
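To make the idea of pseudonymisation concrete, here is a minimal sketch of one common way a hospital IT team might do it: drop direct identifiers and swap the patient ID for a keyed hash before handing records over for screening. This is an illustration under stated assumptions, not a description of the procedure Saventic Health or its partner hospitals actually use; the field names, the key, and the helper functions are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical list of fields treated as direct identifiers.
DIRECT_IDENTIFIERS = {"name", "national_id", "address"}


def pseudonymise_id(patient_id: str, secret_key: bytes) -> str:
    """Replace a patient ID with a keyed hash (HMAC-SHA256).

    Only someone holding secret_key can recompute the mapping,
    so the key would stay with the hospital.
    """
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def pseudonymise_record(record: dict, secret_key: bytes) -> dict:
    """Drop direct identifiers and swap the patient ID for its pseudonym."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["patient_id"] = pseudonymise_id(record["patient_id"], secret_key)
    return cleaned


# Assumed example record and key, for illustration only.
hospital_key = b"kept-inside-the-hospital"
record = {
    "patient_id": "H-000123",
    "name": "Jane Doe",
    "address": "1 Example Street",
    "lab_results": {"creatinine_mg_dl": 1.8},
}
safe_record = pseudonymise_record(record, hospital_key)
```

Because the same key always yields the same pseudonym, results for one patient can still be linked across documents, while the screening software never sees the name or address.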
“Patient privacy is our top priority,” Klein adds. “We don’t collect or store any data ourselves and our goal is to provide a powerful diagnostic tool while respecting the trust and security expectations of healthcare providers and their patients.”
Identifying higher risk of Fabry disease
Fabry disease is a rare lysosomal storage disorder—a genetic condition that causes a buildup of toxic materials in the body’s cells—that can lead to kidney failure, heart failure, and stroke, often at an early age. It can present with nonspecific symptoms such as fatigue, chronic pain, or kidney dysfunction, making timely diagnosis difficult. To address this challenge, Saventic Health developed a highly accurate NLP system that analyses lab results; this analysis supports better decision making for Fabry disease diagnosis.
The system extracts and analyses Fabry disease-specific characteristics from electronic health records, including laboratory results, ICD-10 codes (alphanumeric identifiers used globally to classify diseases and health conditions), and medical histories. NLP identifies clinical features associated with Fabry disease, which are then scored based on their relevance. The sum of these scores generates a ‘Fabry disease risk score’, flagging high-risk people for further review. Physicians can then review these flagged cases and decide whether additional testing, such as genetic assays (DNA-based analyses to detect disease-causing mutations), is necessary.
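The additive scoring described above can be sketched in a few lines. This is a simplified illustration, not Saventic Health's model: the feature names, the weights, and the review threshold are invented for the example, and a real system would derive features from clinical text with NLP rather than receive them as ready-made labels.

```python
# Hypothetical relevance weights for clinical features; illustrative values only.
FEATURE_WEIGHTS = {
    "fatigue": 0.5,
    "chronic_pain": 1.0,
    "kidney_dysfunction": 2.0,
    "early_stroke": 3.0,
}

# Assumed cut-off above which a chart is sent to a physician for review.
REVIEW_THRESHOLD = 3.0


def risk_score(features: set[str]) -> float:
    """Sum the weights of the features found in one patient's record."""
    return sum((FEATURE_WEIGHTS.get(f, 0.0) for f in features), 0.0)


def flag_for_review(patients: dict[str, set[str]]) -> list[tuple[str, float]]:
    """Return patients at or above the threshold, highest score first."""
    scored = [(pid, risk_score(feats)) for pid, feats in patients.items()]
    flagged = [(pid, score) for pid, score in scored if score >= REVIEW_THRESHOLD]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Made-up pseudonymised patients and the features extracted for each.
patients = {
    "P001": {"fatigue"},
    "P002": {"chronic_pain", "kidney_dysfunction", "early_stroke"},
    "P003": {"fatigue", "chronic_pain", "kidney_dysfunction"},
}
flagged = flag_for_review(patients)
```

In this toy run, P002 and P003 clear the threshold and P001 does not. The score only prioritises charts; as in the workflow described above, the decision to order further testing stays with the physician.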
In one case, the system identified a person in Poland with a high risk of Fabry disease. A DBS assay (dried blood spot test) confirmed diagnosis, leading to further screenings that identified additional undiagnosed cases within the person’s family. This ensured early treatment, which significantly improved outcomes for everyone affected.

Saventic Health’s work in Brazil has paved the way for further outreach in LMICs. Photo: Maciek Klein
This method has so far helped 10 people receive a confirmed Fabry disease diagnosis. “Fabry disease, like many rare diseases, often hides behind common symptoms, but our AI cuts through the noise to prioritise those in need of further attention,” Klein says.
Diagnosing Castleman disease in Brazil
Saventic Health’s work in Brazil has paved the way for further outreach in low- and middle-income countries (LMICs), where diagnostic resources are often limited. It has applied its AI expertise to diagnose idiopathic multicentric Castleman disease (iMCD), a condition that involves multiple regions of enlarged lymph nodes, inflammatory symptoms, and problems with organ function.
This ultra-rare lymphoproliferative disorder, with a prevalence of just 6.9–9.7 cases per million, according to a study conducted in 2017 and 2018, is notoriously difficult to diagnose due to its rarity and a lack of awareness among physicians. In Brazil, the median time between symptom onset and diagnosis for iMCD is approximately 18 months.
Saventic Health’s AI-driven screening tool analysed electronic health records from a tertiary hospital. The algorithm used advanced language processing tools to scan medical texts and identify phrases that suggested symptoms of iMCD. Critical markers, such as lymphadenopathy (swelling of the lymph nodes), were flagged, and patients were ranked based on their likelihood of having iMCD.
Out of 594,953 patient charts screened, the system identified 102 people with clinical or laboratory features suggestive of iMCD. Among these, three were flagged for further evaluation, potentially leading to a diagnosis.
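To put those counts in proportion, the short calculation below works out the screening yield from the figures quoted in this article. The last step, which applies the published prevalence range to the number of charts, is only a rough reference point and rests on an assumption: a tertiary hospital's records are not a random sample of the general population, so the two numbers are not directly comparable.

```python
# Figures quoted in the article.
charts_screened = 594_953
suggestive = 102          # charts with features suggestive of iMCD
flagged = 3               # charts flagged for further evaluation
prevalence_per_million = (6.9, 9.7)

# Share of screened charts that looked suggestive, per 100,000 charts.
suggestive_per_100k = suggestive / charts_screened * 100_000

# Share of suggestive charts that went on to further evaluation.
flagged_share_pct = flagged / suggestive * 100

# Rough reference only: cases expected if the charts mirrored the
# general-population prevalence range (an assumption, see lead-in).
expected_cases = tuple(
    charts_screened * p / 1_000_000 for p in prevalence_per_million
)

print(f"Suggestive charts per 100,000 screened: {suggestive_per_100k:.1f}")
print(f"Share of suggestive charts flagged: {flagged_share_pct:.1f}%")
print(f"Cases expected at quoted prevalence: "
      f"{expected_cases[0]:.1f} to {expected_cases[1]:.1f}")
```

The funnel narrows from roughly 17 suggestive charts per 100,000 screened to about 3% of those being flagged, which shows how much manual review the ranking step saves.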
“This project highlights the potential of AI in identifying rare disease patients in LMICs,” Klein says, explaining that the startup offers its tools at no cost to hospitals and patients as it is financially backed by global pharmaceutical companies across Europe, North America, and Asia. “By integrating algorithms like this, we can support healthcare systems struggling with limited resources and improve diagnostic efficiency for conditions like iMCD.”
Creating rich datasets for African populations
Most rare disease research has excluded African populations. Yet Africa houses many consanguineous populations, and, in some cases, even higher incidences of certain rare diseases compared to other populations.
By tapping into Africa’s immense yet little-understood genetic diversity, a Nigeria-based startup, Syndicate Bio, is paving the way for new advancements in rare disease research. The company is building genetic, protein, and clinical datasets sourced from some of the most diverse populations in the world.
“Studying rare diseases across Africa’s genetically diverse population groups may provide novel insights into our understanding of the biology of some of these diseases and even help us find cures,” says CEO and founder Abasi Ene-Obong. “Our goal is to unlock the untapped potential of Africa’s genetic diversity, enabling breakthroughs that benefit people worldwide.”
Africa accounts for less than 2% of global genetic datasets, a disparity rooted in limited infrastructure and diagnostic capabilities. Syndicate Bio addresses this gap by combining data, including electronic health records and surveys, to create rich datasets. “We merge clinical [such as electronic health records] and genetic data to identify disease-causing variants, which can drive both diagnosis and treatment development,” Ene-Obong explains.
Syndicate Bio analyses these datasets using AI and machine learning to uncover patterns that might otherwise go undetected. “AI helps us navigate these barriers and find insights faster, enabling precision medicine and better outcomes,” Ene-Obong says.
While Syndicate Bio is currently focused on cancers and other non-communicable diseases (NCDs) with larger patient cohorts, he says the company is laying the foundation for rare disease research. “We’re building a rare disease focus by identifying variants and planning targeted initiatives.”
As the datasets, along with Epistemic AI and Saventic Health’s platforms, continue to grow, they bring hope for a future where people with rare diseases receive quicker diagnoses and more effective treatments, helping to bridge critical gaps in healthcare.