Embedding Cardiovascular Disease Patient Immune Cell Regulatory Network Information into Large Language Models
Xu Ouyang, Laboratory of Tom Hartvigsen, Computer Science
Kyubin Lee, Laboratory of Aakrosh Ratan and Yuh-Hwa Wang, Biochemistry and Molecular Genetics
Since the beginning of the genomics and functional genomics era, biomedical investigators have been generating vast amounts of data, much of which is accessible through repositories such as the Gene Expression Omnibus (GEO). Beyond investigator-led contributions, the NIH has also funded several large-scale data initiatives, including the Human Genome Project (HGP), the Encyclopedia of DNA Elements (ENCODE), and the Trans-Omics for Precision Medicine (TOPMed) program, which includes projects such as the Multi-Ethnic Study of Atherosclerosis (MESA). These initiatives provide a wealth of publicly available data that can be integrated for scientific discovery and hypothesis generation. However, these analyses are often time-consuming and labor-intensive. Even basic approaches, like differential expression analysis, can yield lists of hundreds to thousands of genes, far more than a person can interpret unaided. Summarizing these gene lists into pathways yields overly broad terms that omit critical molecular-level information contained in the original data. Constructing molecular networks is a common approach to capturing these relationships, but with thousands of nodes (e.g., genes and proteins) and potentially hundreds of thousands of edges (interactions), these networks become too complex to interpret and navigate easily. The recent development of large language models (LLMs) offers a promising solution by enabling the synthesis and summarization of complex information in a more interpretable form. By embedding molecular network information derived from datasets like MESA and ENCODE into LLMs, we propose a system where molecular-level information is easily queryable and interpretable. Specifically, we propose to embed data from immune cells in cardiovascular disease (CVD) patients, creating a powerful resource for investigators studying the immune system’s role in CVD.
Cracking the Cardiovagal Code: Novel Molecular Targets for Heart Disease Treatment
Maira Jalil, Laboratory of John Campbell, Biology
Sarah Goggin, Laboratory of Eli Zunder, Biomedical Engineering
Neurons of the parasympathetic nervous system generate cardiovagal tone and regulate resting heart rate. Our project aims to identify conserved markers and therapeutic targets for cardiovagal neurons in mice and humans. Ultimately, these targets could be used to increase cardiovagal tone in humans and improve clinical outcomes in cardiovascular diseases, including arrhythmia and hypertension.
Characterizing the regulatory role of TET2 in peritoneal cavity B-1 cell subtypes
Emily Dennis, Laboratory of Coleen McNamara, Cardiovascular Medicine
Maria Murach, Laboratory of Stefan Bekiranov, Biochemistry and Molecular Genetics
Emily Dennis of the McNamara Lab and Maria Murach of the Bekiranov Lab are interested in discovering the novel role TET2 plays in regulating B-1 cell immunoglobulin production and its potential for contributing to precision medicine in the treatment of atherosclerosis. They are using epigenetic and transcriptomic tools to assess differential gene expression, methylation, and BCR sequence clonality, and flow cytometry and ELISAs to measure immune cell population changes and immunoglobulin production. They plan to study whether loss of TET2 in B-1 cells reduces atherogenesis, which would indicate that TET2 CHIP should be treated on a cell-specific basis with regard to preventing the acceleration of atherosclerosis. They are grateful for the funding and support from iPRIME.
Machine learning tool for lineage tracing and phenotypic prediction for personalized therapeutics for atherosclerosis
Anita Salamon, Laboratory of Gary Owens, Molecular Physiology and Biological Physics
Victoria Milosek, Laboratory of Gary Owens, Molecular Physiology and Biological Physics
This project aims to develop a deep learning model that predicts the origin and trajectory of smooth muscle cells (SMCs) within atherosclerotic plaques, with the goal of promoting plaque stability in patients at high risk of myocardial infarction and stroke. A stable atherosclerotic plaque is characterized by a thick, SMC-rich fibrous cap. These SMCs undergo phenotypic transitions, downregulating classic SMC markers such as MYH11 and ACTA2 while upregulating markers of other phenotypic states. By leveraging large transcriptomic datasets and SMC-lineage tracing technology from our in vitro and in vivo models, we will create a neural network capable of identifying unique transcriptional signatures of SMC phenotypic states. This model will then be applied to predict cell state and origin in unannotated single-cell RNA-sequencing data from human atherosclerotic lesions, with the ultimate goal of informing personalized treatments for advanced atherosclerosis.
Mosaic Loss of the Y Chromosome in Heart Failure with Preserved Ejection Fraction
Jesse Cochran, Laboratory of Ken Walsh, Cardiovascular Medicine
Nivetha Jayakumar, Laboratory of Miaomiao Zhang, Electrical and Computer Engineering
Heart failure is a clinical syndrome characterized by an inability of one or both ventricles to pump a sufficient amount of blood to meet the body’s nutritional and energetic demands. Broadly, heart failure can be categorized into heart failure with reduced ejection fraction (HFrEF) or heart failure with preserved ejection fraction (HFpEF), which are classically associated with systolic dysfunction and diastolic dysfunction, respectively. Unlike that of HFrEF, the mortality rate of HFpEF has remained uncurbed, owing to the dearth of FDA-approved treatments. The reasons for the current state are multifold; however, they can largely be distilled into two looming clinical needs: (1) the need to accurately diagnose HFpEF and its underlying pheno-groups and (2) the need to further characterize the common molecular underpinnings of HFpEF. Leveraging the ‘HOO DATA’ study, this proposal will seek to address both of these needs. In the first aim, data from cardiac imaging, past medical history, and laboratory tests will be compiled to create a “digital twin” for each recruited patient. Then, using machine learning, we will identify key features of the “digital twins” that are specific to each HFpEF pheno-group. As the etiology of HFpEF is complex but unambiguously associated with advanced age and immune system dysfunction, in the second aim of this proposal, we will investigate the significance of loss of the Y chromosome (LOY), an age-associated immunologic phenomenon, in the HFpEF pheno-groups. Toward this end, we will leverage clinical samples from the ‘HOO DATA’ study and digital PCR analyses to determine the frequency of LOY in leukocytes and associate LOY with HFpEF characteristics.