Endometriosis is a complex gynecological disorder affecting 6-10% of reproductive-aged women, characterized by significant diagnostic delays of 7-12 years. This article explores the development and application of polygenic risk scores (PRS) for endometriosis subphenotypes to enable earlier detection and personalized treatment approaches. We review foundational genetic discoveries from genome-wide association studies that have identified multiple risk loci, particularly for moderate-to-severe disease. The content covers methodological advances in PRS construction, current challenges in clinical prediction, and emerging strategies that integrate epigenetic data such as methylation risk scores. For researchers and drug development professionals, we provide a comprehensive analysis of validation frameworks and comparative performance against traditional risk factors, highlighting future directions for implementing genetic risk stratification in clinical practice and therapeutic development.
Endometriosis is a complex, chronic inflammatory gynecological condition characterized by the presence of endometrial-like tissue outside the uterine cavity, affecting approximately 10% of women of reproductive age worldwide [1]. Its etiology involves a multifactorial interplay of genetic, hormonal, immune, and environmental factors. Establishing its heritability and genetic architecture is a critical foundation for developing polygenic risk scores (PRS) capable of stratifying disease risk and subphenotypes, ultimately advancing personalized medicine approaches for this heterogeneous condition [1] [2].
Family and twin studies provide the fundamental evidence for a significant genetic component in endometriosis. Family studies demonstrate a five- to seven-fold increased risk for first-degree relatives of affected individuals compared to the general population [2]. Twin studies reveal higher concordance rates in monozygotic twins compared to dizygotic twins, with estimated heritability reaching up to 50% based on genome-wide association studies (GWAS) and linkage analyses [1] [2]. Furthermore, familial cases often present with an earlier onset and more severe symptoms than sporadic cases, suggesting a potentially greater genetic burden in these families [2].
Table 1: Key Evidence of Heritability in Endometriosis
| Evidence Type | Key Finding | Implication for Genetics |
|---|---|---|
| Family Studies | 5-7x increased risk for first-degree relatives [2] | Strong evidence for inherited genetic components |
| Twin Studies | Higher concordance in identical twins; heritability ~50% [1] [2] | Indicates significant genetic contribution, separate from shared environment |
| Familial Case Presentation | Earlier onset and more severe symptoms [2] | Suggests a higher genetic burden or different genetic architecture |
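As a hedged illustration of how twin data translate into a heritability estimate, the sketch below applies Falconer's classical formula, h² = 2(r_MZ − r_DZ), where r_MZ and r_DZ are the twin-pair correlations in disease liability. The correlation values are hypothetical placeholders chosen for illustration; they are not taken from the cited studies, and the function name is our own.

```python
def falconer_heritability(r_mz: float, r_dz: float) -> float:
    """Falconer's estimate h^2 = 2 * (r_MZ - r_DZ), clamped to [0, 1]."""
    h2 = 2.0 * (r_mz - r_dz)
    return max(0.0, min(1.0, h2))


# Hypothetical liability correlations for monozygotic and dizygotic pairs.
h2 = falconer_heritability(r_mz=0.50, r_dz=0.25)
print(f"Estimated heritability: {h2:.2f}")
```

The formula assumes that monozygotic and dizygotic pairs share their environments to the same degree, which is why twin-based estimates are usually treated as approximate.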
GWAS have identified multiple common, low-penetrance genetic variants associated with endometriosis risk, including single nucleotide polymorphisms (SNPs) in or near genes often involved in sex steroid hormone pathways, such as WNT4, VEZT, GREB1, ESR1, and FSHB [1] [2]. Individually, these common variants confer modest risk increases, but in combination they account for a portion of the disease's heritability, supporting the polygenic nature of endometriosis.
Despite the success of GWAS, a substantial fraction of heritability remains unexplained, prompting investigations into the role of rare, higher-penetrance variants, particularly in multi-affected families. A recent exploratory whole-exome sequencing (WES) study of a multigenerational family with multiple affected members identified 36 co-segregating rare variants [2]. The top candidate genes from this study were LAMB4 (c.3319G>A, p.Gly1107Arg) and EGFL6 (c.1414G>A, p.Gly472Arg), which are associated with cancer growth and tissue remodeling. Variants in NAV3, ADAMTS18, SLIT1, and MLH1 were also identified as potential contributors, supporting a polygenic or oligogenic model where multiple rare variants act synergistically to increase disease susceptibility in familial cases [2].
Table 2: Summary of Key Genetic Findings in Endometriosis
| Genetic Element | Examples | Method of Discovery | Biological Implication |
|---|---|---|---|
| Common Variants (SNPs) | WNT4, VEZT, GREB1, ESR1, FSHB [1] [2] | GWAS | Hormone signaling, cellular growth and maintenance |
| Rare Variants (Candidate) | LAMB4, EGFL6 [2] | Whole-Exome Sequencing (Familial) | Cell adhesion, extracellular matrix formation, angiogenesis |
| Epigenetic Alterations | DNA methylation of estrogen metabolism genes; miRNA dysregulation [1] [2] | Epigenomic Studies | Altered gene expression contributing to estrogen dominance and progesterone resistance |
Objective: To identify rare, penetrant genetic variants that co-segregate with endometriosis in multi-affected families.
Workflow:
Objective: To identify common genetic variants associated with endometriosis risk and generate summary statistics for polygenic risk score calculation.
Workflow:
PRS = β₁SNP₁ + β₂SNP₂ + ... + βₙSNPₙ, where βᵢ is the effect size of the i-th SNP from the GWAS and SNPᵢ is the number of effect alleles (0, 1, or 2) carried at that SNP [3].
Robust PRS analysis requires stringent QC on both base (GWAS) and target datasets [3]:
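The additive score defined above can be sketched in a few lines. This is a minimal illustration, not a replacement for dedicated tools such as PLINK: the SNP identifiers, effect sizes, and dosages are hypothetical, and the function name `polygenic_risk_score` is our own.

```python
from typing import Dict


def polygenic_risk_score(betas: Dict[str, float],
                         dosages: Dict[str, float]) -> float:
    """Weighted sum of effect-allele dosages over SNPs present in both inputs.

    SNPs missing from the target genotypes are skipped, which mirrors the
    common practice of scoring only variants that pass QC in both datasets.
    """
    return sum(beta * dosages[snp]
               for snp, beta in betas.items()
               if snp in dosages)


# Hypothetical GWAS effect sizes (log odds ratios) and one person's dosages.
betas = {"snp1": 0.10, "snp2": 0.05, "snp3": -0.02, "snp4": 0.08}
dosages = {"snp1": 2, "snp2": 1, "snp3": 0}  # snp4 was not genotyped

score = polygenic_risk_score(betas, dosages)
print(f"PRS = {score:.3f}")
```

In practice the raw score is usually standardized against a reference population before it is interpreted.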
Table 3: Essential Research Materials and Reagents
| Item / Reagent | Function / Application | Example Use in Context |
|---|---|---|
| Illumina WGS/WES Platform | High-throughput sequencing to identify genetic variants. | Germline variant discovery in multi-affected families [2]. |
| SOMAscan Proteomic Platform | Multiplexed immunoaffinity assay to measure plasma protein levels (pQTLs). | Identifying protein biomarkers and therapeutic targets via Mendelian randomization [4]. |
| Human R-Spondin3 (RSPO3) ELISA Kit | Quantitatively measure protein concentration in plasma. | Validating RSPO3 as a potential therapeutic target in patient plasma samples [4]. |
| Galaxy Platform | Web-based platform for accessible, reproducible bioinformatic analysis. | Processing WES data (read mapping, duplicate removal, variant calling) [2]. |
| PLINK Software | Whole-genome association analysis toolset. | Performing LD clumping and basic QC for PRS calculation [3]. |
Objective: To assess causal relationships between putative risk factors (e.g., plasma proteins, metabolites) and endometriosis using genetic variants as instrumental variables.
Workflow:
Machine learning and deep learning models are increasingly applied to enhance genomic prediction of complex diseases like endometriosis. These models can capture non-linear effects and complex interactions between genetic variants that are missed by traditional linear PRS methods [5] [3]. For instance, a multi-variant deep neural network (DNN) approach has been explored to improve the genomic prediction of endometriosis, demonstrating the potential of AI to handle the high-dimensional nature of genomic data and integrate it with other clinical risk factors for more accurate risk stratification [5].
Evidence from twin and family studies unequivocally establishes a substantial genetic component in endometriosis, with a heritability estimate of approximately 50%. Its genetic architecture is complex, involving a spectrum of variants from common, low-penetrance SNPs identified by GWAS to rare, potentially higher-penetrance variants discovered in familial cases. Methodologies like family-based WES, large-scale GWAS, and advanced analytical frameworks such as Mendelian randomization and AI-driven modeling are critical for dissecting this burden. A comprehensive understanding of this genetic landscape is the essential foundation for developing next-generation polygenic risk scores that can stratify subphenotypes and drive forward personalized therapeutic strategies and preventive care for endometriosis.
Endometriosis is a common, estrogen-dependent inflammatory gynecological disorder that affects approximately 10% of women of reproductive age, representing over 190 million women worldwide [6] [7]. The disease is characterized by the presence of endometrial-like tissue outside the uterine cavity and is associated with chronic pelvic pain, reduced fertility, and decreased quality of life [8]. The heritability of endometriosis is estimated to be 47-52%, indicating a strong genetic component [8] [9]. Genome-wide association studies (GWAS) have emerged as a powerful hypothesis-free approach for identifying common genetic variants underlying complex diseases like endometriosis. This application note summarizes key GWAS discoveries of genetic loci associated with overall endometriosis risk, framed within the context of developing polygenic risk scores for endometriosis subphenotypes.
Over the past decade, multiple GWAS and meta-analyses have substantially expanded our understanding of the genetic architecture of endometriosis. The largest initial GWAS meta-analysis published in 2017 analyzed 17,045 cases and 191,596 controls of European and Japanese ancestry, identifying 19 independent single nucleotide polymorphisms (SNPs) robustly associated with endometriosis risk [10]. These SNPs together explained approximately 5.19% of the disease variance, highlighting the highly polygenic nature of endometriosis [10]. More recent combinatorial analytics approaches have identified additional multi-SNP disease signatures, comprising 2,957 unique SNPs in combinations of 2-5 SNPs, that were associated with increased prevalence of endometriosis [6].
Table 1: Key Endometriosis Risk Loci Identified through GWAS
| Genomic Region | Lead SNP | Nearest Gene(s) | Reported OR | P-value | Primary Biological Pathway |
|---|---|---|---|---|---|
| 1p36.12 | rs7521902 | WNT4 | 1.11 | 1.8 × 10⁻¹⁵ | Reproductive development, hormone signaling |
| 2p25.1 | rs13391619 | GREB1 | 1.09 | 4.5 × 10⁻⁸ | Estrogen regulation, cell proliferation |
| 6q25.1 | rs71575922 | SYNE1, ESR1 | 1.11 | 2.02 × 10⁻⁸ | Sex steroid hormone signaling |
| 7p15.2 | rs12700667 | Intergenic | 1.12 | 1.6 × 10⁻⁹ | Inflammatory response |
| 9p21.3 | rs1537377 | CDKN2B-AS1 | 1.14 | 1.5 × 10⁻⁸ | Cell cycle regulation |
| 12q22 | rs10859871 | VEZT | 1.12 | 4.7 × 10⁻¹⁵ | Cell adhesion |
| 11p14.1 | rs74485684 | FSHB | 1.11 | 2.00 × 10⁻⁸ | Gonadotropin hormone production |
| 2q35 | rs1250241 | FN1 | 1.23 | 2.99 × 10⁻⁹ | Tissue remodeling, fibrosis |
| 6q25.1 | rs1971256 | CCDC170 | 1.09 | 3.74 × 10⁻⁸ | Estrogen receptor signaling |
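To connect the table to PRS construction, the sketch below converts the reported odds ratios into log-odds weights and keeps only loci below the conventional genome-wide significance threshold of 5 × 10⁻⁸. The values are copied from the table; treating each OR as the per-allele effect of the risk allele is an assumption, since the table does not state the effect allele.

```python
import math

# (lead SNP, reported OR, P-value) as listed in the table above.
loci = [
    ("rs7521902", 1.11, 1.8e-15),
    ("rs13391619", 1.09, 4.5e-8),
    ("rs71575922", 1.11, 2.02e-8),
    ("rs12700667", 1.12, 1.6e-9),
    ("rs1537377", 1.14, 1.5e-8),
    ("rs10859871", 1.12, 4.7e-15),
    ("rs74485684", 1.11, 2.00e-8),
    ("rs1250241", 1.23, 2.99e-9),
    ("rs1971256", 1.09, 3.74e-8),
]

GENOME_WIDE_P = 5e-8  # conventional genome-wide significance threshold

# A PRS weight is the natural log of the per-allele odds ratio.
weights = {
    rsid: math.log(odds_ratio)
    for rsid, odds_ratio, p_value in loci
    if p_value < GENOME_WIDE_P
}

for rsid, beta in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{rsid}\tbeta = {beta:.4f}")
```

Every locus in the table passes the threshold, so all nine contribute a weight; the FN1 lead SNP carries the largest one because it has the largest reported OR.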
Notably, most endometriosis risk loci discovered through GWAS are located in non-coding regions of the genome, suggesting they likely influence gene regulation rather than protein structure [8]. Integration of GWAS findings with expression quantitative trait loci (eQTL) data from physiologically relevant tissues (uterus, ovary, vagina, colon, ileum, and peripheral blood) has provided insights into the functional consequences of these variants [7].
Objective: To identify genetic variants associated with endometriosis risk through a genome-wide case-control association study.
Materials:
Procedure:
Objective: To identify multi-SNP combinations associated with endometriosis risk using combinatorial analytics.
Materials:
Procedure:
Integration of GWAS findings with functional genomic data has elucidated key biological pathways involved in endometriosis pathogenesis. The diagram below illustrates the major signaling pathways through which GWAS-identified genetic loci contribute to endometriosis risk.
Diagram 1: Signaling pathways connecting GWAS-identified loci to endometriosis pathogenesis. Genetic variants influence disease risk through hormonal signaling, immune regulation, and cellular processes.
Table 2: Essential Research Reagents for Endometriosis Genetic Studies
| Reagent/Material | Function/Application | Example Specifications |
|---|---|---|
| High-Density SNP Arrays | Genome-wide genotyping | Illumina Global Screening Array (700,000+ markers) |
| Whole Genome Sequencing Kits | Comprehensive variant detection | Illumina NovaSeq, PacBio HiFi for structural variants |
| DNA Extraction Kits | High-quality DNA isolation from blood/tissue | QIAamp DNA Blood Maxi Kit (Qiagen) |
| eQTL Reference Datasets | Functional annotation of risk variants | GTEx v8 (uterus, ovary, blood tissues) |
| Pathway Analysis Software | Biological interpretation of GWAS hits | GSEA-MSigDB, Ingenuity Pathway Analysis |
| Genotype Imputation Services | Increased SNP coverage from array data | Michigan Imputation Server (TOPMed reference) |
| Cell Line Models | Functional validation of risk genes | Endometrial stromal cells, epithelial organoids |
| CRISPR-Cas9 Systems | Gene editing for functional studies | Lentiviral CRISPR libraries for high-throughput screening |
The GWAS discoveries summarized herein provide the foundation for developing polygenic risk scores (PRS) for endometriosis. A PRS derived from 14 genome-wide significant variants has demonstrated association with endometriosis in multiple cohorts, with odds ratios ranging from 1.28 to 1.59 per standard deviation increase in PRS [11]. Importantly, the PRS was associated with all major subtypes of endometriosis (ovarian, infiltrating, and peritoneal) but not with adenomyosis, suggesting specificity for endometriosis rather than general gynecological pathology [11].
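An odds ratio reported per standard deviation can be extrapolated to other points of the PRS distribution. The sketch below assumes a logistic model in which log-odds are linear in the standardized PRS, so the odds ratio for an individual z standard deviations above the mean, relative to an individual at the mean, is the per-SD odds ratio raised to the power z. The function name is illustrative; the two per-SD values are the bounds of the range quoted above.

```python
def odds_ratio_at(z: float, or_per_sd: float) -> float:
    """Odds ratio versus the population mean for a PRS z SDs above the mean.

    Assumes log-odds are linear in the standardized PRS (logistic model).
    """
    return or_per_sd ** z


# Lower and upper bounds of the per-SD odds ratios quoted in the text.
for or_per_sd in (1.28, 1.59):
    print(f"OR per SD = {or_per_sd}: "
          f"OR at +2 SD = {odds_ratio_at(2.0, or_per_sd):.2f}")
```

These are relative odds, not absolute risks; converting them into absolute risk would additionally require the baseline prevalence in the population of interest.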
Recent PRS-phenome-wide association studies have revealed pleiotropic effects of endometriosis genetic risk, including associations with lower testosterone levels, suggesting potential causal relationships [9]. Combinatorial analytics approaches have identified additional multi-SNP signatures that show high reproducibility (73-85%) across diverse ancestries, providing enhanced resolution for subtype-specific genetic architecture [6].
The continuing expansion of GWAS discoveries, including recent multi-ancestry studies encompassing ~1.4 million women, will further refine PRS development and enable more precise stratification of endometriosis subphenotypes [12]. Integration of functional genomic data with GWAS findings will facilitate the translation of genetic discoveries into pathogenic mechanisms and therapeutic targets, ultimately enabling precision medicine approaches for this complex disorder.
Endometriosis is a heterogeneous gynecological condition affecting approximately 10% of reproductive-aged women globally, characterized by the presence of endometrial-like tissue outside the uterine cavity [1] [13]. The disease manifests in distinct subphenotypes including ovarian endometriosis (endometriomas), deep infiltrating endometriosis (DIE), and superficial peritoneal endometriosis (SPE), each demonstrating unique clinical presentations and molecular characteristics [14]. Understanding the genetic architecture underlying these subphenotypes is crucial for developing polygenic risk scores (PRS) with improved predictive accuracy and clinical utility. This application note synthesizes current evidence on subphenotype-specific genetic associations and provides methodological frameworks for PRS development in endometriosis research.
Endometriosis subphenotypes are classified based on lesion location, invasiveness, and histological features [14]. Ovarian endometriosis presents as cystic lesions (endometriomas) containing dark, chocolate-colored fluid. Deep infiltrating endometriosis penetrates more than 5 mm beneath the peritoneal surface and can involve uterosacral ligaments, rectovaginal septum, bowel, bladder, and ureters. Superficial peritoneal endometriosis appears as superficial implants on pelvic peritoneum. A recent classification system stages genital endometriosis from minimal (Stage I) to severe (Stage IV) based on lesion number, penetration depth, adhesion presence, and concomitant adenomyosis [14].
Table 1: Clinical and Pathological Features of Endometriosis Subphenotypes
| Subphenotype | Lesion Characteristics | Common Locations | Invasiveness | Associated Symptoms |
|---|---|---|---|---|
| Ovarian Endometriosis | Cystic lesions (endometriomas) containing old blood | Ovaries | Non-infiltrating, expansive growth | Pelvic pain, dysmenorrhea, infertility |
| Deep Infiltrating Endometriosis (DIE) | Solid lesions with >5mm penetration depth | Rectovaginal septum, uterosacral ligaments, bowel, bladder, ureters | Highly infiltrative | Severe chronic pelvic pain, dyspareunia, dyschezia, infertility |
| Superficial Peritoneal Endometriosis (SPE) | Superficial implants, powder-burn lesions, red vesicles | Pelvic peritoneum, cul-de-sac | Superficial, non-infiltrating | Dysmenorrhea, mild pelvic pain, often asymptomatic |
Large-scale genetic studies have revealed significant differences in the genetic architecture of endometriosis subphenotypes. A landmark GWAS meta-analysis comprising 60,674 cases and 701,926 controls of European and East Asian ancestry identified 42 genome-wide significant loci comprising 49 distinct association signals [15]. Critically, this study demonstrated that ovarian endometriosis has a different genetic basis than superficial peritoneal disease, with distinct risk loci and effect sizes [15]. The identified signals explain up to 5.01% of disease variance, a threefold increase from previous studies, highlighting the importance of subphenotype stratification in genetic analyses.
The genetic heritability of endometriosis is estimated at approximately 50%, with common genetic variation accounting for approximately 26% of the variation in disease risk [15]. Key implicated genes include WNT4, VEZT, GREB1, FN1, CCDC170, SYNE1, and ESR1, which play roles in sex hormone signaling, cell adhesion, proliferation, and inflammation [1] [15]. Deep infiltrating endometriosis demonstrates stronger genetic correlations with pain-related conditions including migraine, back pain, and multi-site pain, suggesting genetic contributions to central nervous system sensitization in chronic pain development [15].
Table 2: Selected Genetic Loci Associated with Endometriosis Subphenotypes
| Gene/Locus | Reported Function | Ovarian Endometriosis | Deep Infiltrating Endometriosis | Superficial Peritoneal Endometriosis |
|---|---|---|---|---|
| WNT4 | Sex development, estrogen signaling | Strong association | Moderate association | Weak association |
| VEZT | Cell adhesion | Strong association | Strong association | Moderate association |
| GREB1 | Estrogen-regulated growth | Strong association | Moderate association | Weak association |
| FN1 | Extracellular matrix organization | Moderate association | Strong association | Limited data |
| ESR1 | Estrogen receptor | Moderate association | Strong association | Moderate association |
| CCDC170 | Nuclear envelope organization | Strong association | Limited data | Limited data |
Subphenotype-specific biomarker profiles reflect underlying genetic differences. Aromatase (CYP19A1) shows increased expression in endometriotic tissues with a diagnostic sensitivity of 79% and specificity of 89% [1]. Progesterone resistance, characterized by reduced progesterone receptor-B (PR-B) expression and disrupted signaling, is particularly prominent in deep infiltrating lesions [1] [13]. Inflammatory biomarkers including macrophage migration inhibitory factor (MIF), interleukin-1 (IL-1), MMP-1, MMP-2, and MMP-3 demonstrate subphenotype-specific expression patterns, with elevated levels in deep infiltrating lesions contributing to tissue remodeling and invasion [1] [16].
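Sensitivity and specificity alone do not indicate how informative a positive result is; that also depends on prevalence. The sketch below applies Bayes' rule to the aromatase figures quoted above (79% sensitivity, 89% specificity). Using the roughly 10% population prevalence cited elsewhere in this article as the pre-test probability is an assumption made for illustration; the tested population in a real study would likely differ.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple:
    """Return (PPV, NPV) from test characteristics and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity from the text; prevalence is an assumption.
ppv, npv = predictive_values(sensitivity=0.79, specificity=0.89,
                             prevalence=0.10)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Under these assumptions fewer than half of positive results would be true positives, which illustrates why a single biomarker is typically combined with other risk information.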
Matrix metalloproteinases (MMPs) show distinct activity across subphenotypes, with pro-MMP-2 activity significantly higher in endometriotic lesions compared to eutopic endometrium and control tissue [16]. MMP-1 and MMP-3 protein levels are similarly elevated in lesions, creating a tissue microenvironment conducive to ectopic implantation and lesion establishment through extracellular matrix remodeling [16].
Objective: To identify genetic variants associated with specific endometriosis subphenotypes through large-scale GWAS meta-analysis.
Materials:
Methodology:
Expected Outcomes: Identification of subphenotype-specific risk loci, calculation of subtype-specific heritability, and genetic correlation analyses between subphenotypes and related traits.
Objective: To construct and validate subphenotype-specific polygenic risk scores for endometriosis classification and risk prediction.
Materials:
Methodology:
Expected Outcomes: Subphenotype-specific PRS with improved predictive accuracy compared to general endometriosis PRS, assessment of clinical utility for risk stratification and early intervention.
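Predictive accuracy of a PRS is commonly summarized by the area under the ROC curve (AUC). The sketch below computes it directly from its probabilistic definition: the chance that a randomly chosen case has a higher score than a randomly chosen control, with ties counted as one half. The scores are hypothetical, and the quadratic loop is meant for clarity rather than for biobank-scale data.

```python
from typing import Sequence


def auc(case_scores: Sequence[float],
        control_scores: Sequence[float]) -> float:
    """AUC as P(case score > control score), counting ties as 0.5."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical standardized PRS values for a small validation set.
cases = [0.9, 0.4, 0.7]
controls = [0.1, 0.5, 0.3, 0.4]
print(f"AUC = {auc(cases, controls):.3f}")
```

Comparing the AUC of a subphenotype-specific PRS against that of a general endometriosis PRS in the same validation cohort is one simple way to quantify any gain in discrimination.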
Objective: To integrate genomic, transcriptomic, and epigenomic data for comprehensive molecular characterization of endometriosis subphenotypes.
Materials:
Methodology:
Expected Outcomes: Comprehensive molecular maps of endometriosis subphenotypes, identification of subtype-specific regulatory mechanisms, and functional validation of GWAS loci.
The pathophysiology of endometriosis subphenotypes involves dysregulation of multiple signaling pathways. Ovarian endometriosis demonstrates prominent abnormalities in estrogen biosynthesis with overexpression of aromatase (CYP19A1) and steroidogenic factor-1 (SF-1) [1]. Deep infiltrating endometriosis shows activation of invasion-promoting pathways including MMP-mediated extracellular matrix degradation, epithelial-mesenchymal transition, and neuroangiogenesis [16]. Progesterone resistance, characterized by reduced PR-B expression and altered FKBP4 signaling, is common across subphenotypes but most pronounced in deep infiltrating disease [1] [13].
Table 3: Essential Research Reagents for Endometriosis Subphenotype Studies
| Reagent/Category | Specific Examples | Function/Application | Subphenotype Relevance |
|---|---|---|---|
| Genotyping Arrays | Global Screening Array, UK Biobank Axiom Array | Genome-wide variant genotyping | All subphenotypes - genetic association studies |
| Sequencing Kits | Illumina NovaSeq, PacBio HiFi, Oxford Nanopore | Whole genome, transcriptome, epigenome sequencing | All subphenotypes - comprehensive molecular profiling |
| Antibodies for IHC | Anti-aromatase (CYP19A1), anti-PR-B, anti-MMP-2, anti-CD56 | Protein localization and quantification in tissues | Subphenotype-specific protein expression validation |
| Cell Culture Media | Stromal cell media, epithelial organoid culture systems | In vitro modeling of endometriosis lesions | Subphenotype-specific cellular behavior studies |
| Cytokine Assays | Luminex multiplex panels, ELISA kits for IL-1, MIF, IL-6 | Quantification of inflammatory biomarkers | Subphenotype-specific inflammatory microenvironment |
| DNA/RNA Extraction Kits | QIAamp DNA FFPE, RNeasy, MagMAX for blood | Nucleic acid isolation from various sample types | Multi-omics analyses across subphenotypes |
| qPCR Reagents | TaqMan assays, SYBR Green master mixes | Gene expression validation | Candidate gene verification in subphenotypes |
| Methylation Arrays | Infinium MethylationEPIC | Genome-wide DNA methylation profiling | Epigenetic regulation in subphenotypes |
Subphenotype-specific genetic analysis represents a paradigm shift in endometriosis research, moving beyond the traditional one-size-fits-all approach. The distinct genetic architectures of ovarian, deep infiltrating, and superficial peritoneal endometriosis underscore the necessity for stratified approaches in both basic research and clinical translation. Future directions should focus on: (1) expanding diverse ancestral representation in genetic studies, (2) integrating multi-omics data to functionalize genetic associations, (3) developing refined PRS with improved predictive accuracy across subphenotypes, and (4) translating genetic findings into subtype-specific therapeutic strategies. These advances will ultimately enable precision medicine approaches to endometriosis diagnosis, prevention, and treatment.
Endometriosis (EM) and adenomyosis (AM) are prevalent gynecological disorders that pose significant diagnostic and therapeutic challenges in clinical practice. While both conditions share common symptoms, including chronic pelvic pain and infertility, they are recognized as distinct pathological entities. Endometriosis is characterized by the presence of endometrial-like tissue outside the uterine cavity, whereas adenomyosis involves the invasion of endometrial tissue into the myometrium.
Understanding the genetic architecture of these conditions is crucial for developing precise diagnostic tools and targeted therapies. This application note explores the fundamental genetic distinctions between endometriosis and adenomyosis, with a specific focus on implications for polygenic risk score (PRS) development for endometriosis subphenotypes. We present comprehensive genetic association data, detailed experimental protocols for analysis, and visualization of key biological pathways to advance research in this field.
Recent advances in genetic research have revealed substantial differences in the genetic architecture of endometriosis and adenomyosis. A landmark multi-ancestry genome-wide association study (GWAS) of approximately 1.4 million women, including 105,869 cases, identified 80 genome-wide significant associations, with 37 representing novel discoveries [17]. Crucially, this study identified five loci representing the first genetic variants ever reported for adenomyosis, providing initial insights into its unique genetic underpinnings [17].
Table 1: Summary of Key Genetic Associations for Endometriosis and Adenomyosis
| Genetic Feature | Endometriosis | Adenomyosis |
|---|---|---|
| Number of GWAS loci | 80 (37 novel) in recent large study [17] | 5 first-ever variants reported [17] |
| Heritability | 47-51% [9] | Not well established |
| PRS performance | OR = 1.57-1.59 per SD increase [11] | Not associated with endometriosis PRS [11] |
| Key pathways | Immune regulation, tissue remodeling, cell differentiation [17] | Shared and distinct mechanisms from endometriosis [18] |
| Multi-omics integration | Transcriptomic, epigenetic, and proteomic regulation across tissues [17] | Limited data available |
Combinatorial analytics applied to UK Biobank and All of Us datasets have further elucidated these distinctions, revealing distinct mechanistic drivers for each condition, including multiple genes shared across both diseases and dozens of novel adenomyosis-associated genes not previously reported in endometriosis GWAS [18] [19]. This research supports the development of non-invasive differential diagnostic tools to improve patient triage across overlapping pelvic pain conditions [19].
The functional impact of endometriosis-associated genetic variants exhibits notable tissue specificity. Analysis of 465 endometriosis-associated GWAS variants using the GTEx v8 database revealed that regulatory effects differ significantly across tissues [7]. In reproductive tissues (ovary, uterus, vagina), endometriosis-associated variants predominantly regulate genes involved in hormonal response, tissue remodeling, and cell adhesion [7]. In contrast, in peripheral blood and intestinal tissues, these variants primarily influence immune and epithelial signaling genes [7].
Key regulators such as MICB, CLDN23, and GATA4 have been consistently linked to hallmark pathways including immune evasion, angiogenesis, and proliferative signaling in endometriosis [7]. The tissue-specific regulatory patterns of these variants provide crucial insights for understanding the pathophysiology of endometriosis and its distinction from adenomyosis.
Objective: To identify genetic variants associated with endometriosis and adenomyosis risk across diverse ancestries.
Materials:
Procedure:
Association Analysis
Meta-Analysis
Fine-Mapping and Colocalization
Validation: Replicate findings in independent cohorts; perform functional validation through in vitro and in vivo models.
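The meta-analysis step named in this protocol is often carried out with an inverse-variance-weighted fixed-effects model, the approach implemented by tools such as METAL. The sketch below shows that calculation for a single variant; it is an assumption about how the step could be done rather than the procedure of the cited study, and the cohort estimates are hypothetical.

```python
import math
from typing import List, Tuple


def fixed_effects_meta(
    estimates: List[Tuple[float, float]]
) -> Tuple[float, float, float, float]:
    """Pool per-cohort (beta, standard error) pairs for one variant.

    Returns the pooled beta, its standard error, the z statistic, and a
    two-sided P-value under a normal approximation.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p_value


# Hypothetical log-odds-ratio estimates from two ancestry-specific cohorts.
cohorts = [(0.10, 0.02), (0.12, 0.04)]
beta, se, z, p_value = fixed_effects_meta(cohorts)
print(f"beta = {beta:.4f}, se = {se:.4f}, z = {z:.2f}, p = {p_value:.2e}")
```

A fixed-effects model assumes one shared effect across cohorts; multi-ancestry analyses often also report heterogeneity statistics or random-effects estimates when that assumption is doubtful.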
Objective: To construct and validate polygenic risk scores for endometriosis subphenotypes.
Materials:
Procedure:
Bayesian Polygenic Scoring
PRS Calculation
Validation
Application: The endometriosis PRS demonstrates significant association with all disease subtypes (ovarian OR = 1.72, infiltrating OR = 1.66, peritoneal OR = 1.51) but shows no association with adenomyosis, supporting distinct genetic architectures [11].
Genetic research has revealed that endometriosis risk variants exert their effects through transcriptomic, epigenetic, and proteomic regulation across multiple tissues [17]. These mechanisms converge on pathways involved in immune regulation, tissue remodeling, and cell differentiation [17].
Table 2: Key Pathways and Biological Processes in Endometriosis and Adenomyosis
| Pathway Category | Specific Pathways | Implications |
|---|---|---|
| Immune regulation | Antigen processing and presentation, cytokine signaling | Altered immune surveillance, chronic inflammation [17] [7] |
| Tissue remodeling | Extracellular matrix organization, angiogenesis | Lesion establishment and growth [17] |
| Cell differentiation | Epithelial-mesenchymal transition, stem cell pathways | Tissue plasticity, invasive potential [17] |
| Metabolic pathways | Linoleic acid metabolism, glycerophospholipid metabolism | Shared alterations in EM and AM [20] |
| Hormone response | Estrogen receptor signaling, progesterone resistance | Hormone dependency of lesions [7] |
Multi-omics studies integrating metabolomic and microbiome profiling have identified distinct metabolic and microbial signatures in both conditions. Specific pathways, including linoleic acid metabolism and glycerophospholipid metabolism, show significant alterations in both endometriosis and adenomyosis [20]. Notably, metabolites such as phosphatidylcholine 40:8 [PC(40:8)] exhibit marked changes in both conditions, suggesting some shared pathological features despite distinct genetic architectures [20].
The following diagram illustrates the integrated multi-omics approach to understanding endometriosis pathogenesis:
Diagram 1: Multi-omics integration in endometriosis research. This workflow illustrates how different data types inform our understanding of biological processes and clinical applications.
A significant finding from PRS phenome-wide association studies is the association between genetic liability to endometriosis and lower testosterone levels, with Mendelian randomization analyses suggesting that lower testosterone may be causal for both endometriosis and clear cell ovarian cancer [9]. This highlights the importance of hormonal pathways in endometriosis pathogenesis and the potential for endocrine-focused interventions.
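The logic of Mendelian randomization can be sketched with per-variant Wald ratios combined by inverse-variance weighting (IVW). The code below is a simplified illustration that uses first-order weights and ignores uncertainty in the exposure estimates; the instrument values are hypothetical and are not taken from the cited analysis.

```python
from typing import List, Tuple

# Each instrument: (beta on exposure, beta on outcome, SE of outcome beta).
Instrument = Tuple[float, float, float]


def wald_ratio(beta_exposure: float, beta_outcome: float) -> float:
    """Causal estimate from a single variant."""
    return beta_outcome / beta_exposure


def ivw_estimate(instruments: List[Instrument]) -> float:
    """IVW estimate: sum(bx * by / se^2) / sum(bx^2 / se^2)."""
    numerator = sum(bx * by / se ** 2 for bx, by, se in instruments)
    denominator = sum(bx ** 2 / se ** 2 for bx, _, se in instruments)
    return numerator / denominator


# Hypothetical variants that raise the exposure (e.g., a hormone level)
# and lower the log-odds of the outcome.
instruments = [(0.2, -0.04, 0.01), (0.1, -0.03, 0.01)]

ratios = [wald_ratio(bx, by) for bx, by, _ in instruments]
causal_effect = ivw_estimate(instruments)
print("Wald ratios:", [round(r, 3) for r in ratios])
print(f"IVW estimate = {causal_effect:.3f}")
```

A negative estimate here would mean that higher genetically predicted exposure lowers the odds of the outcome, provided the instruments affect the outcome only through the exposure.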
The tissue-specific regulatory patterns of endometriosis-associated variants further emphasize the role of hormonal responses. In reproductive tissues, these variants predominantly regulate genes involved in hormonal response, creating a permissive environment for lesion establishment and growth [7].
Table 3: Essential Research Reagents for Genetic Studies of Endometriosis and Adenomyosis
| Reagent/Category | Specific Examples | Application and Function |
|---|---|---|
| Genotyping platforms | Illumina Global Screening Array, Affymetrix Axiom Biobank array | Genome-wide SNP genotyping for association studies |
| Bioinformatics tools | PLINK, METAL, GCTB, PRSice, SBayesR | Statistical genetics analysis, meta-analysis, PRS calculation |
| eQTL resources | GTEx v8 database, eQTLGen Consortium | Mapping genetic variants to gene expression regulation |
| Cohort data | UK Biobank, All of Us, FinnGen, International Endogene Consortium | Large-scale genetic and phenotypic data for discovery and validation |
| Metabolomics platforms | Untargeted LC-MS (Liquid Chromatography-Mass Spectrometry) | Comprehensive metabolic profiling of endometrial samples [20] |
| Microbiome analysis | 16S rRNA sequencing (5R approach) | Characterization of endometrial microbial communities [20] |
| Functional validation | CRISPR/Cas9 systems, organoid cultures, animal models | Mechanistic validation of genetic findings |
The genetic distinctions between endometriosis and adenomyosis are becoming increasingly clear through large-scale genetic studies and multi-omics approaches. While they share some clinical manifestations and pathological features, their genetic architectures demonstrate significant differences, with unique risk loci and distinct regulatory mechanisms. These findings have profound implications for the development of polygenic risk scores specifically for endometriosis subphenotypes.
The experimental protocols and analytical frameworks presented in this application note provide researchers with robust methodologies for advancing this field. Future research directions should include expanded trans-ancestry genetic studies, functional characterization of novel loci, and the integration of polygenic risk scores with clinical factors for improved diagnosis and personalized treatment strategies.
Endometriosis, a chronic inflammatory and estrogen-dependent condition, affects approximately 10% of women of reproductive age and is a leading cause of pelvic pain and infertility [13] [1]. The diagnostic journey for patients is often protracted, spanning 7 to 12 years from symptom onset, largely due to the invasive nature of the current diagnostic gold standard—laparoscopic surgery with histological confirmation [9] [1]. This substantial delay underscores the critical need for non-invasive diagnostic strategies and improved risk stratification tools. In this context, the development of polygenic risk scores (PRS) for endometriosis subphenotypes represents a promising frontier. A PRS aggregates the effects of numerous genetic variants, each with small effect sizes, into a single quantitative measure of an individual's genetic liability to a disease [21]. Research confirms that a PRS for endometriosis captures an increased risk for the condition and its major subtypes, including ovarian, infiltrating, and peritoneal disease [21]. This application note details how the integration of hormonal and inflammatory pathway biology is fundamental to refining these genetic tools, thereby offering insights for researchers and drug development professionals aiming to deconstruct the heterogeneity of endometriosis and develop targeted therapeutic and diagnostic solutions.
The hormonal landscape of endometriosis is characterized by two defining features: local estrogen dominance and progesterone resistance. In contrast to the systemic circulation, estrogen bioavailability is heightened locally within endometriotic lesions. This is driven by the overexpression of the enzyme aromatase (CYP19A1), which converts androgens into estrogens, and the downregulation of 17β-hydroxysteroid dehydrogenase type 2 (17β-HSD2), which inactivates estradiol [13] [1]. Together these changes create a self-sustaining, estrogen-rich microenvironment. Concurrently, progesterone resistance (a failure of target tissues to respond adequately to progesterone) perpetuates lesion survival. This resistance is marked by a significant reduction in the progesterone receptor-B (PR-B) isoform, attributed to promoter hypermethylation and microRNA dysregulation [13].
A pivotal recent discovery from a polygenic risk score phenome-wide association study (PRS-PheWAS) is the genetic association between a higher liability to endometriosis and lower testosterone levels [9]. Follow-up Mendelian randomization analysis suggested that lower testosterone may have a causal effect on endometriosis risk, revealing a previously underappreciated role for androgen signaling in disease etiology [9].
Protocol 1: Assessing Local Estrogen Biosynthesis in Eutopic Endometrium
Materials: Primer assays for CYP19A1 (aromatase), HSD17B2, and a reference gene (e.g., GAPDH); RNA extraction kit; reverse transcription and quantitative PCR (RT-qPCR) reagents.
Procedure: Quantify expression of CYP19A1 and HSD17B2 by RT-qPCR, normalized to the reference gene.
Interpretation: An elevated CYP19A1 to HSD17B2 expression ratio is a strong indicator of local estrogen dominance. One study reported that aromatase expression in menstrual blood achieved an Area Under the Curve (AUC) of 0.977 for discriminating endometriosis patients from controls [1].
Protocol 2: Evaluating Progesterone Resistance via PR-B Immunohistochemistry
Table 1: Key Hormonal Biomarkers in Endometriosis
| Biomarker | Molecular Function | Alteration in Endometriosis | Potential Diagnostic Utility |
|---|---|---|---|
| Aromatase (CYP19A1) | Converts androgens to estrogens | Overexpressed in lesions | High diagnostic accuracy (Sens: 79%, Spec: 89%) in meta-analysis [1] |
| Progesterone Receptor B (PR-B) | Mediates progesterone signaling | Significantly reduced in lesions | Indicator of progesterone resistance; correlates with infertility [13] |
| Testosterone | Androgen hormone | Genetically correlated with lower levels | Mendelian randomization suggests a causal, protective role [9] |
| 17β-HSD2 | Inactivates estradiol | Downregulated in lesions | Contributes to local estrogen dominance [13] |
| Nicotinamide N-methyltransferase (NNMT) | Modulates cell proliferation | Overexpressed, induced by estrogen | Potential new therapeutic target [1] |
Endometriosis is a state of pervasive immune dysfunction and chronic inflammation. The peritoneal fluid of affected women becomes a pro-inflammatory milieu, characterized by altered populations and functions of immune cells [13]. Key alterations include:
This inflammatory state is not isolated but is genetically intertwined with broader immune-related conditions. A recent study demonstrated significant genetic correlations between endometriosis and several immunological diseases, including rheumatoid arthritis (rg = 0.27), osteoarthritis (rg = 0.28), and multiple sclerosis (rg = 0.09). Mendelian randomization further suggested a potential causal relationship from endometriosis to rheumatoid arthritis (OR = 1.16) [22].
Protocol 3: Flow Cytometric Analysis of Peritoneal Immune Cell Populations
Protocol 4: Cytokine Profiling in Serum or Peritoneal Fluid
Table 2: Key Inflammatory and Immune Biomarkers in Endometriosis
| Biomarker / Cell Type | Function | Alteration in Endometriosis | Research/Clinical Implication |
|---|---|---|---|
| M1/M2 Macrophages | Phagocytosis, tissue repair, angiogenesis | M1 dominant in eutopic endometrium; M2 dominant in lesions [13] | Drives inflammation and supports lesion survival; therapeutic target |
| CD56dimCD16+ NK cells | Cytotoxic activity | Severely reduced cytotoxicity [13] | Enables immune escape of ectopic cells |
| Macrophage Migration Inhibitory Factor (MIF) | Regulates immune responses, angiogenesis | Upregulated [1] | Contributes to inflammation and estrogen production |
| M2 Macrophages / γδ T cells | Immunomodulation | Infiltration associated with disease [23] | Identified as key players in the shared pathogenesis of EMs and RIF [23] |
| Rheumatoid Arthritis (RA) | Systemic autoimmune disease | Genetically correlated (rg = 0.27) [22] | Suggests shared biological mechanisms and comorbidity risk |
The biological pathways of hormone metabolism and inflammation provide a functional context for the genetic variants incorporated into PRS. The integration of these multi-omics layers is crucial for moving beyond a general disease PRS to subphenotype-specific prediction.
Connecting Genetics to Biology: The genetic variants identified in GWAS for endometriosis are enriched in genes involved in sex steroid hormone signaling, inflammatory pathways, and oncogenesis [24]. For instance, a recent study identified 51 methylation quantitative trait loci (mQTLs)—genetic variants that regulate DNA methylation—that were also associated with endometriosis risk, highlighting candidate genes like GREB1 and KDR that contribute to disease risk through epigenetic mechanisms [24]. This functionally annotates GWAS hits and prioritizes them for inclusion in refined PRS models.
Informing Subphenotype Stratification: The distinct hormonal and inflammatory profiles of different disease manifestations (e.g., ovarian vs. deep infiltrating endometriosis) or comorbidities (e.g., infertility vs. pain) can be used to validate and refine subphenotype-specific PRS. For example, a PRS was shown to be associated with all major subtypes of endometriosis but not with adenomyosis, confirming that the latter is driven by different genetic risk variants [21]. Furthermore, multi-omics analysis has identified shared diagnostic genes (e.g., PDIA4 and PGBD5) and immune microenvironment alterations (involving M2 macrophages and γδ T cells) between endometriosis and recurrent implantation failure (RIF), offering a molecular basis for stratifying patients based on infertility risk [23].
Enhancing Predictive Power: While the discriminative accuracy of a 14-SNP PRS alone is not yet sufficient for standalone clinical use (OR = 1.28-1.59 per SD increase) [21], combining PRS with classical clinical risk factors, hormonal levels (e.g., testosterone), and inflammatory biomarkers represents a powerful strategy for developing urgently needed risk stratification tools [9] [21] [1].
Table 3: Essential Research Reagents for Endometriosis Pathway Analysis
| Item | Function/Application | Example Use Case |
|---|---|---|
| SBayesR Software | Bayesian method for adjusting GWAS summary statistics to calculate improved PRS weightings [9]. | Generating polygenic risk scores with optimized effect size estimates for association studies. |
| Illumina Infinium MethylationEPIC BeadChip | Genome-wide DNA methylation profiling of over 850,000 sites [24]. | Identifying differential methylation patterns associated with menstrual cycle phase or disease state. |
| Validated PR-B Antibody | Specific detection of the Progesterone Receptor-B isoform in tissue sections via IHC. | Confirming progesterone resistance in endometrial stromal cells. |
| Multiplex Cytokine Panel (Luminex/MSD) | Simultaneous quantification of multiple cytokine and chemokine proteins in biofluids. | Profiling the inflammatory milieu in serum or peritoneal fluid. |
| Fluorochrome-conjugated Antibody Panel (CD14, CD56, CD16, CD3) | Immunophenotyping of immune cells from peritoneal fluid or blood by flow cytometry. | Characterizing shifts in macrophage and NK cell populations. |
| Primer Assays for CYP19A1, HSD17B2 | Quantitative measurement of gene expression via RT-qPCR. | Assessing local estrogen biosynthesis activity in tissue or menstrual blood. |
Core Pathways in Endometriosis Pathogenesis
PRS-PheWAS Workflow for Comorbidity Discovery
Polygenic risk scores (PRS) have emerged as a powerful tool for quantifying an individual's genetic susceptibility to complex diseases. For endometriosis, a condition with a significant heritable component estimated at 47-52%, PRS represents a promising approach for risk prediction and stratification [25] [8]. The development of accurate PRS for endometriosis requires careful consideration of single nucleotide polymorphism (SNP) selection and weighting strategies, particularly when addressing the challenge of disease subphenotypes. This application note details standardized protocols for constructing and validating endometriosis PRS, with emphasis on translating genetic discoveries into biologically and clinically relevant tools.
Endometriosis affects approximately 10% of women of reproductive age and is characterized by the presence of endometrial-like tissue outside the uterine cavity [11]. The disease demonstrates substantial heterogeneity in clinical presentation, with different subtypes including ovarian, peritoneal, and infiltrating endometriosis [11]. Genome-wide association studies (GWAS) have identified numerous genetic loci associated with endometriosis risk, enabling the development of PRS that aggregate the effects of multiple variants into a single quantitative measure [8].
The genetic architecture of endometriosis is polygenic, with each variant contributing modestly to disease risk. Early GWAS identified 12 SNPs at 10 independent loci, while more recent studies have expanded this to 42 significant loci [25] [9]. These discoveries provide the foundation for PRS development, though careful methodological approaches are required to optimize their predictive power and clinical utility.
The most straightforward approach to SNP selection involves including variants that reach genome-wide significance (p < 5 × 10⁻⁸) in GWAS. This method was employed in several early endometriosis PRS studies, such as one utilizing 14 lead SNPs from a large-scale meta-analysis [11]. While this approach ensures the inclusion of robustly associated variants, it may exclude SNPs with smaller but genuine effects, potentially limiting predictive accuracy.
The clumping and thresholding method represents an evolution beyond simple significance thresholding. This iterative process selects the SNP with the lowest p-value in a genomic region, removes SNPs in linkage disequilibrium (LD) with it, and repeats this process across the genome [26]. This strategy was applied in a PRS-PheWAS study that revealed an association between endometriosis genetic liability and testosterone levels [9].
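The iterative selection described above can be sketched in a few lines. This is a minimal illustration, not the pipeline used in any cited study: the `Snp` record, the `lookup_r2` callback, the toy SNP identifiers, and the default thresholds (r² ≥ 0.1 within 250 kb) are all assumptions chosen for the example.

```python
from typing import Callable, Iterable, List, NamedTuple


class Snp(NamedTuple):
    snp_id: str      # hypothetical identifier
    chrom: int
    pos: int         # base-pair position
    p_value: float   # GWAS association p-value


def clump_and_threshold(
    snps: Iterable[Snp],
    r2: Callable[[str, str], float],
    p_threshold: float = 5e-8,    # assumed threshold
    r2_threshold: float = 0.1,    # assumed LD cutoff
    window_bp: int = 250_000,     # assumed clumping window
) -> List[Snp]:
    """Greedy clumping: keep the most significant SNP, drop its LD neighbours, repeat."""
    # Thresholding step: discard SNPs above the p-value cutoff, most significant first.
    candidates = sorted(
        (s for s in snps if s.p_value <= p_threshold), key=lambda s: s.p_value
    )
    kept: List[Snp] = []
    for snp in candidates:
        # A SNP is dropped if it is in LD with an index SNP that was already kept.
        in_ld = any(
            k.chrom == snp.chrom
            and abs(k.pos - snp.pos) <= window_bp
            and r2(k.snp_id, snp.snp_id) >= r2_threshold
            for k in kept
        )
        if not in_ld:
            kept.append(snp)
    return kept


# Toy data: snp2 is in LD with snp1; snp3 is far away; snp4 fails the threshold.
toy_snps = [
    Snp("snp1", 1, 100_000, 1e-12),
    Snp("snp2", 1, 120_000, 1e-9),
    Snp("snp3", 1, 900_000, 3e-8),
    Snp("snp4", 2, 50_000, 1e-3),
]
toy_r2 = {frozenset(("snp1", "snp2")): 0.8}


def lookup_r2(a: str, b: str) -> float:
    """Pairwise r² from the toy table; unlisted pairs are treated as unlinked."""
    return toy_r2.get(frozenset((a, b)), 0.0)


selected = clump_and_threshold(toy_snps, lookup_r2)
print([s.snp_id for s in selected])
```

In practice the pairwise r² values would come from an ancestry-matched reference panel, and the p-value threshold is usually tuned rather than fixed.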
Advanced Bayesian methods represent the current state-of-the-art in SNP selection for PRS construction. Methods such as PRS-CS and SBayesR utilize shrinkage priors to model the genetic architecture of complex traits, allowing for the inclusion of a larger number of SNPs while accounting for LD structure [26] [9]. These approaches have demonstrated superior performance in endometriosis PRS applications, particularly in cross-ancestry contexts [27].
Table 1: Comparison of SNP Selection Methods for Endometriosis PRS
| Method | Key Features | Advantages | Limitations | Representative Applications |
|---|---|---|---|---|
| Genome-Wide Significance | Includes SNPs with p < 5 × 10⁻⁸ | High specificity for true associations | Limited number of SNPs; may miss polygenic signal | 14-SNP PRS for endometriosis subtypes [11] |
| Clumping + Thresholding | LD-based pruning with p-value thresholds | Reduces redundancy; computationally efficient | Performance depends on threshold selection | PRS-PheWAS of endometriosis comorbidities [9] |
| Bayesian Methods | Shrinkage priors accounting for LD | Improved prediction accuracy; handles large SNP sets | Computationally intensive; requires careful prior specification | Cross-ancestry PRS in multi-ancestry GWAS [27] |
The most common approach to SNP weighting in PRS construction utilizes effect sizes (beta coefficients or odds ratios) derived from GWAS summary statistics. Each risk allele is weighted by its estimated effect size, with the overall PRS calculated as the weighted sum of risk alleles across all included SNPs [11]. This method was validated in a study of surgically confirmed endometriosis cases, where each standard deviation increase in PRS was associated with an odds ratio of 1.57 for endometriosis diagnosis [11].
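The weighted sum can be written out directly. The sketch below assumes risk-allele dosages coded 0, 1, or 2 and weights on the log-odds scale (odds ratios are log-transformed before summing). The SNP identifiers, odds ratios, and genotypes are invented for illustration.

```python
from math import log, sqrt
from typing import Dict, List


def polygenic_score(dosages: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of risk-allele dosages over SNPs present in both maps."""
    return sum(weights[snp] * dosages[snp] for snp in weights if snp in dosages)


def standardize(scores: List[float]) -> List[float]:
    """Convert raw scores to z-scores so effects can be reported per SD."""
    mean = sum(scores) / len(scores)
    variance = sum((s - mean) ** 2 for s in scores) / (len(scores) - 1)
    sd = sqrt(variance)
    return [(s - mean) / sd for s in scores]


# Hypothetical per-allele odds ratios, converted to log-odds weights.
weights = {"snp1": log(1.10), "snp2": log(1.05), "snp3": log(0.95)}

# Hypothetical genotypes for three individuals (risk-allele counts).
cohort = [
    {"snp1": 2, "snp2": 1, "snp3": 0},
    {"snp1": 0, "snp2": 1, "snp3": 2},
    {"snp1": 1, "snp2": 0, "snp3": 1},
]

raw_scores = [polygenic_score(person, weights) for person in cohort]
z_scores = standardize(raw_scores)
print(raw_scores, z_scores)
```

Standardizing within the target cohort is what makes an estimate such as "odds ratio per standard deviation increase in PRS" interpretable across studies.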
Bayesian shrinkage methods, such as those implemented in SBayesR, adjust SNP effect sizes based on prior assumptions about the genetic architecture of the trait [9]. These approaches help mitigate the winner's curse phenomenon and provide more accurate effect size estimates, particularly for SNPs with modest associations. In a recent PRS-PheWAS, this approach demonstrated significant associations between endometriosis PRS and multiple biomarkers, including testosterone levels [9].
For pharmacogenomic applications, novel weighting strategies have been developed that simultaneously model both prognostic (main) and predictive (interaction) effects. The PRS-PGx-Bayes method employs a Bayesian framework to estimate posterior distributions for both effect types, enabling the construction of separate prognostic and predictive PRS [28]. This approach has shown superior performance in predicting drug response compared to traditional disease PRS methods.
Objective: Identify genetic variants associated with endometriosis risk for inclusion in PRS.
Materials:
Procedure:
This protocol was successfully implemented in a cross-ancestry meta-analysis of ∼1.4 million women, identifying 80 genome-wide significant associations including 37 novel loci [27].
Objective: Construct PRS for endometriosis and validate its predictive performance.
Materials:
Procedure:
This methodology was applied in a study of Danish and UK Biobank cohorts, demonstrating significant association between PRS and all endometriosis subtypes [11].
Figure 1: PRS Development and Validation Workflow. This diagram illustrates the standardized pipeline for polygenic risk score construction, from initial quality control to final validation and application.
Objective: Evaluate whether PRS performance varies across endometriosis subphenotypes.
Materials:
Procedure:
This approach revealed that PRS was associated with all major subtypes of endometriosis, with the strongest association for ovarian endometriosis (OR = 1.72) [11].
Advanced PRS applications increasingly integrate multiple layers of genomic information. This includes transcriptomic data to identify expression quantitative trait loci (eQTLs), epigenetic data to map regulatory elements, and proteomic data to elucidate downstream pathways [27]. In a recent multi-ancestry study, integration of multi-omics data revealed that genetic variation influences endometriosis risk through transcriptomic, epigenetic, and proteomic regulation across multiple tissues [27].
A significant challenge in PRS development is ensuring transferability across diverse ancestral populations. Recent efforts have focused on developing cross-ancestry PRS frameworks that incorporate data from multiple population groups [27]. These approaches typically involve:
Table 2: Performance Metrics of Endometriosis PRS Across Studies
| Study Cohort | Sample Size (Cases/Controls) | Number of SNPs in PRS | Odds Ratio per SD | p-value | Subtype-Specific Effects |
|---|---|---|---|---|---|
| Danish Surgical Cohort [11] | 249/348 | 14 | 1.59 | 2.57×10⁻⁷ | Ovarian: OR=1.72; Infiltrating: OR=1.66 |
| Danish Twin Registry [11] | 140/316 | 14 | 1.50 | 0.0001 | Not reported |
| UK Biobank [11] | 2,967/256,222 | 14 | 1.28 | <2.2×10⁻¹⁶ | Not reported |
| UK Biobank PRS-PheWAS [9] | 188,221 females | SBayesR | N/A | N/A | Association with testosterone levels |
Table 3: Essential Research Reagents and Resources for Endometriosis PRS Studies
| Resource Category | Specific Tools/Platforms | Application in Endometriosis PRS |
|---|---|---|
| Genotyping Arrays | Illumina Global Screening Array [29] | Genome-wide SNP genotyping for PRS calculation |
| Imputation Resources | TOPMed Imputation Server [29], 1000 Genomes Project [25] | Inference of ungenotyped variants using reference panels |
| GWAS Software | PLINK [9] [29], METAL [9], SAIGE | Association analysis and meta-analysis |
| PRS Methods | PRS-CS [26], SBayesR [9], LDpred [26] | Polygenic risk score calculation with various weighting approaches |
| Biobanks | UK Biobank [11] [9], FinnGen [9], All of Us [27] | Large-scale cohorts for discovery and validation |
| Functional Annotation | ENCODE [8], GTEx, GWAS Catalog [29] | Biological interpretation of risk loci |
Figure 2: Biological Pathways Implicated by Endometriosis PRS. Genetic risk variants for endometriosis aggregate in key signaling pathways involved in disease pathogenesis, including immune/inflammatory responses, hormonal regulation, developmental processes, and cellular functions.
Robust quality control procedures are essential for reliable PRS construction. Standard protocols include:
Consistent phenotype definitions across cohorts are critical for PRS validation. The Endometriosis Phenome and Biobanking Harmonization Project (EPHect) has developed standardized protocols for endometriosis data collection, including surgical and clinical phenotypes [25]. Implementation of these standards enables more reliable cross-study comparisons and meta-analyses.
SNP selection and weighting strategies for endometriosis PRS have evolved significantly, from early approaches using a handful of genome-wide significant variants to contemporary methods incorporating thousands of SNPs with Bayesian shrinkage. The continued expansion of GWAS sample sizes, improved representation of diverse ancestries, and integration of multi-omics data will further enhance the precision and utility of endometriosis PRS. These advances hold promise for refining endometriosis subphenotype classification, elucidating biological mechanisms, and ultimately improving risk prediction and targeted interventions.
Endometriosis, a complex gynecological disorder affecting 5-10% of women of reproductive age, presents substantial diagnostic challenges, with average delays of 4-11 years from symptom onset to definitive surgical diagnosis [30]. The disease demonstrates strong heritability estimates of 47-51% from twin studies and 26% from common SNP-based heritability, highlighting the significant genetic component that makes it amenable to polygenic risk scoring approaches [30] [9]. Current diagnostic limitations, including the requirement for invasive laparoscopic confirmation and the heterogeneity of clinical presentations, have created an urgent need for improved risk stratification tools [11].
Polygenic risk scores (PRS) aggregate the effects of numerous genetic variants across the genome to quantify an individual's genetic predisposition to a trait or disease. In endometriosis research, PRS has emerged as a promising approach for identifying high-risk individuals, elucidating biological pathways, and potentially reducing diagnostic delays [11]. However, standard PRS methods face several limitations, including limited predictive power, sensitivity to genetic architecture, and challenges in modeling the complex genetic underpinnings of endometriosis subphenotypes.
Bayesian methods and machine learning approaches offer sophisticated solutions to these limitations by incorporating prior biological knowledge, accommodating complex genetic architectures, and integrating diverse data types. This application note provides detailed protocols and methodologies for implementing these advanced computational techniques to optimize PRS for endometriosis subphenotype research, specifically targeting researchers, scientists, and drug development professionals working in this field.
Bayesian methods for PRS construction fundamentally differ from traditional approaches by incorporating prior distributions over SNP effect sizes, allowing for more flexible modeling of genetic architecture. The core Bayesian linear regression framework is expressed as:
y = Xβ + ε
Where y is the vector of phenotypic measurements, X is the genotype matrix, β is the vector of effect sizes, and ε captures residual effects [31]. The Bayesian approach specifies prior distributions on the effect sizes β, which are then updated through the likelihood to obtain posterior distributions given the observed data [32].
The key advantage of Bayesian methods lies in their ability to model genetic architectures through specific prior distributions. The spike-and-slab prior implements a mixture distribution:
βj ~ πN(βj; 0, σβ²) + (1 - π)δ0
This formulation specifies that each SNP effect size βj follows a normal distribution with probability π (the fraction of causal variants) or is exactly zero with probability (1-π) [31]. Continuous shrinkage priors, such as those implemented in PRS-CS, provide an alternative approach that allows for marker-specific adaptive shrinkage, eliminating the need for discrete mixture distributions while effectively modeling varying genetic architectures [32].
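To make the effect of the spike-and-slab prior concrete, the sketch below computes the posterior for a single SNP from its marginal GWAS estimate. It assumes the SNP is independent of all others (no LD), which the real methods do not. The function names and numeric inputs are illustrative and are not taken from VIPRS, SBayesR, or PRS-CS.

```python
from math import exp, pi, sqrt
from typing import Tuple


def normal_pdf(x: float, variance: float) -> float:
    """Density of a zero-mean normal distribution with the given variance."""
    return exp(-x * x / (2.0 * variance)) / sqrt(2.0 * pi * variance)


def spike_slab_posterior(
    beta_hat: float,   # marginal effect estimate from GWAS
    se: float,         # its standard error
    pi_causal: float,  # prior fraction of causal variants (π)
    slab_var: float,   # prior variance of causal effects (σβ²)
) -> Tuple[float, float]:
    """Return (posterior inclusion probability, posterior mean effect)."""
    s2 = se * se
    # Marginal likelihood of beta_hat if the SNP is causal (slab) or null (spike).
    slab = pi_causal * normal_pdf(beta_hat, slab_var + s2)
    spike = (1.0 - pi_causal) * normal_pdf(beta_hat, s2)
    pip = slab / (slab + spike)
    # Given causality, the normal prior shrinks the estimate toward zero.
    shrinkage = slab_var / (slab_var + s2)
    return pip, pip * shrinkage * beta_hat


# Illustrative inputs: one strongly and one weakly associated SNP.
pip_strong, mean_strong = spike_slab_posterior(0.10, 0.01, pi_causal=0.01, slab_var=0.01)
pip_weak, mean_weak = spike_slab_posterior(0.01, 0.01, pi_causal=0.01, slab_var=0.01)
print(pip_strong, mean_strong, pip_weak, mean_weak)
```

The strongly associated SNP keeps most of its estimated effect, while the weak one is shrunk almost to zero. This is the behaviour that lets Bayesian scores include many SNPs without accumulating noise.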
VIPRS utilizes variational inference to approximate posterior distributions for effect sizes, offering computational advantages over traditional Markov Chain Monte Carlo (MCMC) methods [31].
Protocol Steps:
Key Advantages: VIPRS demonstrates competitive predictive accuracy while being more than twice as fast as MCMC-based approaches, with robust performance across diverse genetic architectures [31].
PRS-CS employs a continuous shrinkage prior that enables conjugate block updates of SNP effect sizes, providing accurate modeling of local LD patterns [32].
Protocol Steps:
Performance Characteristics: PRS-CS demonstrates substantial improvements in prediction accuracy across varying genetic architectures, particularly with large training sample sizes [32].
Figure 1: Bayesian PRS Optimization Workflow
Machine learning approaches enable the integration of polygenic risk scores with diverse clinical and demographic variables to improve endometriosis prediction. The gradient boosting algorithm CatBoost has demonstrated particularly strong performance in this domain, achieving an area under the ROC curve (AUC) of 0.81 when combining genetic, clinical, and lifestyle factors [30].
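The AUC used to report such models has a simple rank-based interpretation: it is the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control. The sketch below computes it directly. It illustrates the metric only, not the CatBoost pipeline, and the scores and labels are invented.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise case-control comparison (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical predicted risks for two cases (label 1) and three controls (label 0).
predicted = [0.90, 0.80, 0.40, 0.35, 0.10]
observed = [1, 0, 1, 0, 0]
auc = roc_auc(predicted, observed)
print(round(auc, 3))
```

The pairwise loop is quadratic in cohort size, so a biobank-scale evaluation would use a sorted-rank implementation instead. The value should be computed on held-out individuals rather than on the training set.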
Protocol: Integrated ML Pipeline for Endometriosis Risk Prediction
Feature Engineering and Selection
Model Training with CatBoost
Model Interpretation
Polygenic risk score phenome-wide association studies (PRS-PheWAS) enable the systematic investigation of genetic liability to endometriosis across diverse phenotypes and biomarkers, revealing important pleiotropic effects [9].
Protocol: PRS-PheWAS Implementation
Cohort Definition
Phenotype Processing
Association Testing
Table 1: Performance Comparison of PRS Methods for Endometriosis
| Method | Key Features | Genetic Architecture Handling | Computational Efficiency | Reported Performance (OR/AUC) |
|---|---|---|---|---|
| Traditional PRS | Clumping and thresholding | Limited | High | OR: 1.57-1.72 for subtypes [11] |
| VIPRS | Variational inference, spike-and-slab prior | Robust | High (2x faster than MCMC) | Competitive with state-of-art [31] |
| PRS-CS | Continuous shrinkage priors, LD modeling | Excellent | Moderate | Improved accuracy vs. alternatives [32] |
| CatBoost ML | Integrated PRS + clinical factors | N/A | Moderate | AUC: 0.81 [30] |
Bayesian approaches have demonstrated particular utility in dissecting the genetic architecture of endometriosis subphenotypes. Research has shown that PRS can capture increased risk across all types of endometriosis rather than specific locations, with odds ratios of 1.72 for ovarian endometriosis, 1.66 for infiltrating endometriosis, and 1.51 for peritoneal endometriosis [11]. Furthermore, these approaches have revealed that endometriosis PRS is not associated with adenomyosis, suggesting distinct genetic etiologies despite clinical similarities [11].
The application of Bayesian methods to gene identification has successfully prioritized high-confidence candidate genes, with studies identifying 24 genes with high-confidence scores including HLA-DQB1 and PPARA as central to the endometriosis network [34]. These findings provide biological insights that may inform future therapeutic development.
PRS-PheWAS approaches have revealed significant associations between genetic liability to endometriosis and multiple biomarkers, most notably identifying an association with lower testosterone levels that may be causal for both endometriosis and clear cell ovarian cancer [9]. This finding highlights the value of these methods for uncovering novel biological pathways and potential therapeutic targets.
Table 2: Key Endometriosis PRS Associations from PRS-PheWAS
| Category | Specific Association | Direction | Potential Clinical Relevance |
|---|---|---|---|
| Reproductive Factors | Menstrual cycle length | Positive | Early risk indicator |
| Comorbid Conditions | Irritable bowel syndrome | Positive | Diagnostic clarification |
| Biomarkers | Testosterone levels | Negative | Novel therapeutic target |
| Psychiatric Comorbidities | Depression | Positive | Comprehensive patient care |
Table 3: Essential Research Reagents and Computational Tools
| Resource Category | Specific Tool/Dataset | Application | Key Features |
|---|---|---|---|
| GWAS Summary Statistics | Sapkota et al. 2017 meta-analysis [9] | PRS weight derivation | 14,926 cases; 189,715 controls |
| LD Reference Panels | 1000 Genomes European sample [32] | LD adjustment | N = 503; Population-specific |
| Biobank Data | UK Biobank [30] [9] | Method validation | ~500,000 participants; Rich phenotyping |
| Software Tools | GCTB 2.02 (SBayesR) [9] | Bayesian PRS | Summary-based Bayesian analysis |
| Clinical Validation Cohorts | Western Danish endometriosis cohort [11] | Clinical translation | Surgically confirmed cases |
Figure 2: Endometriosis Subphenotype Research Workflow
Bayesian methods and machine learning approaches significantly advance PRS optimization for endometriosis research by improving predictive accuracy, enabling subphenotype stratification, and uncovering novel biological insights. The integration of these computational approaches with comprehensive phenotypic data from biobanks provides a powerful framework for addressing the diagnostic challenges in endometriosis and facilitating the development of targeted therapeutic strategies. Future directions should focus on increasing ancestral diversity in genetic studies, refining subphenotype definitions, and translating these computational advances into clinical tools for risk stratification and early intervention.
The diagnostic odyssey for endometriosis, often protracted by 7 to 12 years, underscores the critical need for innovative risk stratification tools [1]. Current reliance on laparoscopic surgery for definitive diagnosis creates a significant barrier to timely intervention. Polygenic risk scores (PRS), which aggregate the effects of numerous genetic variants, offer a promising avenue for understanding disease susceptibility. However, the discriminative accuracy of PRS alone remains insufficient for standalone clinical prediction, with odds ratios (OR) for endometriosis typically ranging from 1.28 to 1.59 per standard deviation increase in PRS [11]. This protocol details methodologies for integrating PRS with non-genetic risk factors to enhance predictive power and provide a more comprehensive framework for endometriosis research and potential future clinical application.
Empirical evidence from large-scale biobank studies provides a solid foundation for integrating PRS with clinical risk factors. The interactions between genetic susceptibility and clinical manifestations are complex and multidimensional, as summarized in the table below.
Table 1: Evidence for PRS-Clinical Factor Interactions in Endometriosis
| Factor Category | Specific Factor | Nature of Interaction with PRS | Key Findings | Source |
|---|---|---|---|---|
| Comorbidities | Uterine Fibroids, Heavy Menstrual Bleeding, Dysmenorrhea | Significant interaction | Absolute increase in endometriosis prevalence greater in individuals with high PRS vs. low PRS when comorbidity present. | [35] [36] |
| Comorbidity Burden | Overall diagnosed condition count | Negative correlation in cases | Comorbidity burden positively correlated with PRS in women without endometriosis but negatively correlated in diagnosed cases. | [35] [36] |
| Hormonal Biomarkers | Testosterone | Causal relationship suggested | Genetic liability to lower testosterone identified as potentially causal for endometriosis via Mendelian Randomisation. | [9] |
| Disease Subtypes | Ovarian, Infiltrating, Peritoneal | PRS association varies | PRS associated with all subtypes (ORs: Ovarian=1.72, Infiltrating=1.66, Peritoneal=1.51). | [11] |
| Clinical Presentation | Spread of disease, GI tract involvement | Inverse association | Higher PRS unexpectedly associated with less spread and fewer GI symptoms in one clinical cohort. | [37] |
Objective: To generate a standardized endometriosis PRS for research applications.
Workflow Overview: The following diagram outlines the core PRS calculation workflow.
Methodology:
Objective: To systematically collect and code non-genetic risk factors for integration with PRS.
Methodology:
Objective: To quantify the combined effect of PRS and clinical factors on endometriosis risk.
Methodology:
Main-effect model:
logit(P) = β₀ + β₁(PRS) + Σγ_i(Covariates_i)
where covariates include age, genetic principal components (PCs 1-10), and other relevant confounders [35] [9].
Interaction model:
logit(P) = β₀ + β₁(PRS) + β₂(Clinical Factor) + β₃(PRS × Clinical Factor) + Σγ_i(Covariates_i)
The coefficient β₃ tests for the interaction between genetic liability and the clinical factor.
Table 2: Essential Reagents and Resources for Integrated PRS-Endometriosis Research
| Item/Category | Specification/Example | Primary Function in Protocol |
|---|---|---|
| Genotyping Array | Illumina Global Screening Array | High-throughput genotyping of DNA samples to obtain raw SNP data. |
| Imputation Reference Panel | TOPMed (Version R2 on GRCh38) | Statistical inference of non-genotyped markers to increase SNP density, using a large, diverse reference panel. |
| GWAS Summary Statistics | Sapkota et al. (2017) meta-analysis + FinnGen R8 | Provides effect sizes (beta coefficients) for risk alleles used to weight the PRS. |
| Bioinformatics Software - QC | PLINK 1.9/2.0 | Data management and quality control of genotype data (filtering, relatedness checks). |
| Bioinformatics Software - PRS | GCTB (for SBayesR), PLINK --score | Refines SNP weights and calculates the polygenic risk score for each individual. |
| Phenotype Codification | ICD-10 to Phecode Mapping (v1.2) | Aggregates specific ICD-10 codes into broader, clinically meaningful phenotypes for association testing. |
| Statistical Software | R Statistical Environment | Data analysis, statistical modeling (logistic regression), and visualization. |
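To make the interaction model from the statistical methodology concrete, the following minimal sketch evaluates the logit equation for given coefficients and shows how β₃ changes the per-SD odds ratio of the PRS depending on whether the clinical factor is present. The coefficient values are hypothetical and chosen only for illustration; covariate terms are omitted for brevity, and the function names are not taken from any cited study.

```python
import math

def predicted_risk(prs, factor, b0, b1, b2, b3):
    """Probability from logit(P) = b0 + b1*PRS + b2*F + b3*PRS*F.

    Covariate terms (age, PCs) are left out to keep the sketch short.
    """
    logit = b0 + b1 * prs + b2 * factor + b3 * prs * factor
    return 1.0 / (1.0 + math.exp(-logit))

def prs_odds_ratio(b1, b3, factor):
    """Odds ratio per one-unit (1 SD) PRS increase at a given factor level."""
    return math.exp(b1 + b3 * factor)

# Hypothetical coefficients (assumptions, not estimates from the literature)
B0, B1, B2, B3 = -3.0, 0.25, 0.9, 0.15

or_absent = prs_odds_ratio(B1, B3, factor=0)   # exp(b1)
or_present = prs_odds_ratio(B1, B3, factor=1)  # exp(b1 + b3)
```

With a positive β₃, the PRS odds ratio is larger when the clinical factor is present than when it is absent, which is the pattern a significant interaction term would indicate.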
Emerging evidence suggests that integrating epigenetic markers can further improve risk prediction. A recent study developed a methylation risk score (MRS) for endometriosis using endometrial tissue methylation data from 908 samples [38].
Understanding the biological pathways connecting genetic risk to comorbidities can illuminate disease mechanisms.
Key Pathway Insight: A PRS-phenome-wide association study (PheWAS) revealed an association between genetic liability for endometriosis and lower testosterone levels. Mendelian randomization analysis suggested that lower testosterone may have a causal effect on endometriosis risk [9]. This finding points to specific hormonal pathways that may be therapeutic targets and should be considered when integrating hormonal profiles into risk models.
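As an illustration of the kind of calculation behind a Mendelian randomization result, the sketch below implements the standard fixed-effect inverse-variance-weighted (IVW) estimator from per-variant summary statistics. It is a generic textbook form, not the specific pipeline used in [9], and the input numbers are hypothetical.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate and its standard error.

    beta_exposure: SNP effects on the exposure (e.g. testosterone)
    beta_outcome:  SNP effects on the outcome (log odds of endometriosis)
    se_outcome:    standard errors of the outcome effects
    """
    num = sum(bx * by / se ** 2
              for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / se ** 2
              for bx, se in zip(beta_exposure, se_outcome))
    return num / den, math.sqrt(1.0 / den)

# Hypothetical instruments: outcome effects are -0.5 times exposure effects
bx = [0.10, 0.20, 0.15]
by = [-0.05, -0.10, -0.075]
se = [0.01, 0.02, 0.015]
estimate, std_err = ivw_estimate(bx, by, se)
```

A negative estimate here would correspond to higher genetically predicted exposure lowering disease risk; in practice, sensitivity analyses for pleiotropy would accompany such an estimate.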
Polygenic risk scores (PRS) have emerged as a valuable tool for quantifying an individual's genetic susceptibility to complex diseases like endometriosis. Traditionally, PRS are calculated from the cumulative effect of common single nucleotide polymorphisms (SNPs) identified through genome-wide association studies (GWAS) [11]. For endometriosis, these common variants explain only a portion of the known heritability, estimated at approximately 50% based on twin and family studies [39] [40]. This "missing heritability" problem has prompted increased investigation into the role of rare genetic variants and structural variations in endometriosis pathogenesis.
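The cumulative-effect calculation described above can be sketched as a weighted sum of effect-allele dosages, followed by standardization within the study sample. The SNP identifiers, weights, and genotypes below are hypothetical placeholders; real weights would come from GWAS summary statistics.

```python
from statistics import mean, stdev

# Hypothetical per-allele weights (log odds ratios), for illustration only
WEIGHTS = {"snp_a": 0.10, "snp_b": -0.05, "snp_c": 0.20}

def raw_prs(dosages, weights=WEIGHTS):
    """Sum of effect-allele dosages (0-2) weighted by per-allele effect size."""
    return sum(w * dosages[snp] for snp, w in weights.items())

def standardize(scores):
    """Convert raw scores to z-scores using the sample mean and SD."""
    mu, sd = mean(scores), stdev(scores)
    return [(s - mu) / sd for s in scores]

# Hypothetical genotype dosages for three individuals
cohort = [
    {"snp_a": 2, "snp_b": 1, "snp_c": 0},
    {"snp_a": 0, "snp_b": 2, "snp_c": 1},
    {"snp_a": 1, "snp_b": 0, "snp_c": 2},
]
raw_scores = [raw_prs(person) for person in cohort]
z_scores = standardize(raw_scores)
```

Standardizing makes effect estimates interpretable as odds ratios per standard deviation of the score, which is how the PRS associations in this article are reported.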
Endometriosis affects approximately 10% of reproductive-aged women globally and presents a substantial diagnostic challenge, with delays of 7-10 years from symptom onset to definitive diagnosis [41] [39]. The development of more accurate PRS that incorporate diverse genetic elements could significantly improve early detection and risk stratification, particularly for different endometriosis subphenotypes. This protocol outlines comprehensive methodologies for identifying, analyzing, and integrating rare variants and structural variations into endometriosis PRS frameworks.
Table 1: Types of Genetic Variations in Endometriosis
| Variant Category | Definition | Detection Method | Contribution to Endometriosis |
|---|---|---|---|
| Common Variants | SNPs with minor allele frequency (MAF) >5% | GWAS | ~26% of polygenic risk [29] |
| Rare Variants | SNPs with MAF <1% | Whole-exome sequencing, whole-genome sequencing | Increased risk in familial cases [2] |
| Copy Number Variants (CNVs) | Structural variations ≥1 kb | High-density microarrays, sequencing | Rare, large-effect deletions [42] |
| Regulatory Variants | Non-coding variants affecting gene expression | eQTL analysis, functional genomics | Tissue-specific effects [39] [43] |
Whole-exome sequencing (WES) provides an efficient approach for identifying rare coding variants with potentially significant functional impact in endometriosis patients.
Protocol: Family-Based WES Analysis
Sample Selection: Prioritize multi-generational families with multiple affected individuals to enhance detection of co-segregating rare variants [2]. The study should include at least three affected family members across different generations.
DNA Extraction and Library Preparation: Extract genomic DNA from peripheral blood leukocytes using standardized protocols. Prepare sequencing libraries with exome capture kits (e.g., Illumina Exome Panel) following manufacturer specifications.
Sequencing Parameters: Sequence on platforms such as Illumina NovaSeq with a minimum coverage of 100x to ensure reliable variant calling. Include both affected and unaffected family members for comparison.
Bioinformatic Analysis:
Variant Filtering Strategy:
Table 2: Candidate Rare Variants Identified in Familial Endometriosis
| Gene | Variant | Function | Evidence |
|---|---|---|---|
| LAMB4 | c.3319G>A (p.Gly1107Arg) | Extracellular matrix protein | Co-segregation in multigenerational family [2] |
| EGFL6 | c.1414G>A (p.Gly472Arg) | Angiogenesis regulator | Co-segregation in multigenerational family [2] |
| NAV3 | Rare variants | Neuronal development | Potential role in pain perception [2] |
| NPSR1 | High-penetrance variants | Neuropeptide signaling | Associated with endometriosis risk [2] |
After identifying candidate rare variants, functional validation is essential to establish pathogenicity:
Copy number variants (CNVs) are structural variations ≥1kb in length that contribute significantly to genomic diversity and disease susceptibility.
Protocol: Genome-Wide CNV Analysis
Sample Preparation and Quality Control:
CNV Calling and Filtering:
Statistical Analysis:
Technical Validation:
A recent CNV analysis in endometriosis identified three significant deletions associated with disease risk: a deletion at SGCZ on 8p22 (OR = 8.5, P = 7.3×10⁻⁴), a deletion in MALRD1 on 10p12.31 (OR = 14.1, P = 5.6×10⁻⁴), and a deletion at 11q14.1 (OR = 33.8, P = 5.7×10⁻⁴) [42]. These CNV loci were detected in 6.9% of affected women compared to 2.1% in the general population [42].
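For readers less familiar with how carrier frequencies translate into effect sizes, the sketch below computes an odds ratio and a Wald 95% confidence interval from a 2×2 carrier table. The counts are hypothetical: they assume 1,000 cases and 1,000 controls and apply the aggregate carrier frequencies quoted above (6.9% and 2.1%), so the result describes the combined loci rather than any single deletion.

```python
import math

def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Odds ratio with a Wald confidence interval on the log scale."""
    cells = [case_carriers, case_noncarriers,
             control_carriers, control_noncarriers]
    # Haldane-Anscombe correction avoids division by zero for empty cells
    if 0 in cells:
        cells = [x + 0.5 for x in cells]
    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical counts: 69/1000 case carriers, 21/1000 control carriers
cnv_or, cnv_lo, cnv_hi = odds_ratio_ci(69, 931, 21, 979)
```

Under these assumed counts the aggregate odds ratio is roughly 3.5, well below the locus-specific estimates, which is expected when individually rare deletions with large effects are pooled.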
Developing comprehensive PRS that incorporate both common and rare variants requires specialized statistical approaches:
Variant Weighting:
Integration Methods:
Subphenotype Stratification:
Research demonstrates that PRS based on 14 common variants shows differential association with endometriosis subtypes: ovarian (OR = 1.72), infiltrating (OR = 1.66), and peritoneal (OR = 1.51) [11]. This suggests that incorporating additional variant types may further improve subphenotype discrimination.
Enhance PRS interpretation by incorporating functional genomic data:
Regulatory variants in genes such as IL-6, CNR1, and IDO1 have been identified through eQTL analysis and may interact with environmental factors like endocrine-disrupting chemicals [39].
The following protocol outlines an integrated approach for detecting diverse variant types in endometriosis research studies.
This protocol details the identification and functional characterization of regulatory variants in endometriosis.
Table 3: Essential Research Reagents and Platforms for Endometriosis Genetic Studies
| Category | Item | Specifications | Application |
|---|---|---|---|
| Sequencing Platforms | Illumina NovaSeq | 100x coverage, 150bp paired-end | Whole genome/exome sequencing [2] |
| Genotyping Arrays | Illumina HumanOmniExpress | ~720,000 markers | CNV detection, common variants [42] |
| CNV Calling Software | PennCNV | Minimum 10 probes, LRR-SD filter | Structural variant identification [42] |
| Variant Annotation | Ensembl VEP | GRCh37/GRCh38, population frequencies | Functional consequence prediction [43] |
| eQTL Resources | GTEx Portal v8 | Multiple tissues, FDR <0.05 | Regulatory variant mapping [43] |
| Statistical Analysis | PLINK, PRSice | QC filters, clumping, weighting | Polygenic risk score calculation [11] [29] |
| Functional Validation | CRISPR-Cas9 | Gene editing, reporter assays | Mechanistic studies of priority variants [39] |
Incorporating rare genetic variants and structural variations into polygenic risk models represents a promising frontier in endometriosis research. The protocols outlined here provide a comprehensive framework for detecting, validating, and integrating these diverse genetic elements to enhance PRS accuracy and clinical utility. As research in this area advances, multi-ancestry studies and standardized bioinformatic pipelines will be essential for developing PRS that effectively stratify risk across diverse populations and endometriosis subphenotypes. This integrated approach ultimately promises to improve early detection, personalized treatment strategies, and our fundamental understanding of endometriosis pathogenesis.
The Area Under the Receiver Operating Characteristic Curve (AUC) serves as the most prevalent metric for evaluating the discriminative ability of polygenic risk models, quantifying how well a model distinguishes between individuals who will or will not develop a disease [44]. In the context of endometriosis subphenotype research, accurate prediction is particularly challenging due to the disease's heterogeneity, multifactorial etiology, and the subtle contribution of individual genetic variants [13] [1]. While AUC provides a valuable overview of model performance, reliance on this single metric presents significant limitations, especially when evaluating incremental improvements offered by polygenic risk scores (PRS) for complex traits.
Recent research highlights that AUC values often fail to detect clinically relevant improvements when new genetic markers are added to existing models, a phenomenon particularly problematic in endometriosis research where the goal is to stratify risk across diverse disease manifestations [44] [45]. This application note examines the inherent limitations of AUC, explores complementary metrics and methodological refinements, and provides structured experimental protocols to enhance the predictive power and clinical utility of PRS models for endometriosis subphenotypes.
The AUC metric possesses several inherent limitations that constrain its utility for comprehensive model evaluation:
Insensitivity to Clinically Meaningful Improvements: The AUC can remain virtually unchanged even when risk predictions improve meaningfully for substantial portions of the population, particularly when models already demonstrate moderate discriminative ability (AUC > 0.70) [44]. This insensitivity stems from AUC's focus on ranking rather than the magnitude of risk differences.
Failure to Capture Risk Reclassification: AUC does not measure whether individuals move across clinically relevant risk thresholds when new predictors are added to models, a critical consideration for stratified screening and prevention strategies [44].
Dependence on Overall Model Performance: The interpretability of AUC changes (ΔAUC) diminishes as baseline model performance increases, making it difficult to evaluate PRS value when added to strong clinical predictors [44].
Table 1: Comparison of Metrics for Evaluating Polygenic Risk Model Improvements
| Metric | What It Measures | Interpretation | Advantages | Limitations |
|---|---|---|---|---|
| ΔAUC | Improvement in discrimination between cases and controls | Higher values indicate better separation | Intuitive, widely understood | Insensitive to clinically important improvements |
| NRI (Net Reclassification Improvement) | Proportion of individuals reclassified into more appropriate risk categories | Positive values indicate improved reclassification | Captures movement across risk thresholds | Depends on predefined risk categories |
| IDI (Integrated Discrimination Improvement) | Improvement in average predicted risks between events and non-events | Positive values indicate better risk separation | Sensitive to risk magnitude changes | Less familiar to researchers; no universal benchmarks |
| Predictive R² | Proportion of variance explained by the model | Higher values indicate better fit | Direct interpretation; useful for power calculations | Depends on disease prevalence; not a discrimination measure |
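The point that AUC reflects ranking rather than the magnitude of risk differences can be seen in a minimal sketch. The function below computes AUC as the proportion of case-control pairs in which the case has the higher predicted risk; the risk values are hypothetical.

```python
def auc(case_scores, control_scores):
    """Probability that a random case outranks a random control (ties = 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical predicted risks
cases = [0.30, 0.45, 0.60, 0.20]
controls = [0.10, 0.25, 0.35, 0.15]

auc_original = auc(cases, controls)

# Squaring changes every predicted risk but preserves the ordering,
# so the AUC is unchanged even though absolute risks differ.
auc_transformed = auc([r ** 2 for r in cases], [r ** 2 for r in controls])
```

Because any order-preserving change in predicted risks leaves AUC untouched, a model update that moves individuals across a clinical threshold without reordering them would register no ΔAUC at all.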
Application of AUC to endometriosis subphenotype prediction faces specific methodological challenges:
Heterogeneous Disease Manifestations: Endometriosis encompasses multiple subphenotypes (peritoneal, ovarian, deep infiltrating) with potentially distinct genetic architectures, complicating the interpretation of aggregate AUC values [46]. A model might demonstrate excellent discrimination for one subphenotype while performing poorly for others, yet report a deceptively adequate overall AUC.
Sample Size Requirements: Detection of statistically significant AUC improvements requires large sample sizes, a particular challenge for rare endometriosis subphenotypes. For instance, one study predicting severe endometriosis achieved an AUC of 0.744 using a random forest model, but required 308 patients with surgical confirmation for development [47].
Stage-Dependent Genetic Effects: Advanced-stage endometriosis (rASRM stage III/IV) demonstrates stronger genetic effects than earlier stages, suggesting that PRS performance may vary substantially across disease severity spectra [24]. AUC comparisons across studies that enroll different disease severity distributions can be misleading.
To address AUC limitations, researchers should incorporate complementary metrics that capture different dimensions of predictive performance:
Net Reclassification Improvement (NRI): Quantifies the proportion of individuals correctly reclassified into higher or lower risk categories after adding PRS to a baseline model [44]. In practice, NRI calculation requires defining clinically meaningful risk thresholds specific to endometriosis subphenotypes (e.g., thresholds for surgical intervention, fertility preservation, or targeted medical therapy).
Integrated Discrimination Improvement (IDI): Measures the average improvement in predicted risks between cases and controls, capturing increases in separation between event and non-event distributions [44]. IDI is particularly valuable for detecting improvements when risk distributions shift in ways not reflected in AUC changes.
Calibration Metrics: Assessment of how well predicted probabilities match observed risks is crucial for clinical implementation. This can be evaluated using Hosmer-Lemeshow goodness-of-fit tests or calibration plots [48].
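A minimal sketch of the two reclassification metrics described above is given below, using a single risk threshold for the NRI. The predicted risks, outcomes, and the 0.20 threshold are hypothetical; clinically meaningful thresholds for endometriosis subphenotypes would need to be defined separately.

```python
from statistics import mean

def categorical_nri(old_risk, new_risk, outcome, threshold):
    """Two-category NRI: net correct reclassification in events plus non-events."""
    up_ev = down_ev = up_non = down_non = 0
    for old, new, y in zip(old_risk, new_risk, outcome):
        moved_up = old < threshold <= new
        moved_down = new < threshold <= old
        if y:
            up_ev += moved_up
            down_ev += moved_down
        else:
            up_non += moved_up
            down_non += moved_down
    n_events = sum(outcome)
    n_nonevents = len(outcome) - n_events
    return (up_ev - down_ev) / n_events + (down_non - up_non) / n_nonevents

def idi(old_risk, new_risk, outcome):
    """Change in the mean risk difference between events and non-events."""
    def discrimination_slope(risk):
        events = [r for r, y in zip(risk, outcome) if y]
        nonevents = [r for r, y in zip(risk, outcome) if not y]
        return mean(events) - mean(nonevents)
    return discrimination_slope(new_risk) - discrimination_slope(old_risk)

# Hypothetical data: baseline model vs. baseline + PRS
outcome = [1, 1, 1, 0, 0, 0]
baseline = [0.15, 0.25, 0.10, 0.22, 0.12, 0.08]
with_prs = [0.25, 0.30, 0.12, 0.18, 0.10, 0.08]

nri_value = categorical_nri(baseline, with_prs, outcome, threshold=0.20)
idi_value = idi(baseline, with_prs, outcome)
```

In this toy example one case moves above the threshold and one control moves below it, so both components of the NRI are positive; the IDI is positive because the gap in mean predicted risk between cases and controls widens.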
Table 2: Multimarker Assessment Strategies for Enhanced Endometriosis Prediction
| Assessment Approach | Application in Endometriosis | Implementation Considerations |
|---|---|---|
| Multimarker Panels | Combining genetic, epigenetic, and protein biomarkers | Machine learning approaches (RF, XGBoost) effectively handle high-dimensional data [47] [1] |
| Clinical-Genetic Integration | Adding PRS to clinical risk factors (symptoms, imaging) | Requires standardized collection of clinical metadata [45] [46] |
| Disease Subphenotyping | Developing subtype-specific prediction models | Necessitates precise phenotypic characterization and sufficient sample sizes [24] [46] |
| Longitudinal Performance | Monitoring model performance across disease progression | Demands well-annotated longitudinal cohorts with repeated measures |
Substantial improvements in predictive performance can be achieved through methodological innovations in both PRS construction and model development:
Advanced PRS Methods: Novel approaches like EB-PRS that leverage effect size distributions across markers have demonstrated substantial improvements over standard PRS methods, with relative improvements in predictive R² ranging from 3.1% to 307.1% across various complex diseases [49]. These methods do not require external linkage disequilibrium reference panels or parameter tuning.
Machine Learning Integration: Ensemble methods like random forest can capture non-linear relationships and complex interactions between genetic and clinical predictors. One study predicting severe pelvic endometriosis found that a random forest model incorporating both clinical and ultrasound features achieved the best performance (AUC = 0.744) among seven machine learning algorithms tested [47].
Multi-ancestry PRS Development: Current PRS models predominantly reflect European-ancestry genetics, limiting generalizability. Developing multi-ancestry PRS (MA-PRS) that incorporate both disease-associated and ancestry-informative SNPs represents a critical direction for improving predictive power across diverse populations [45].
Objective: To implement a multidimensional evaluation strategy for polygenic risk models of endometriosis subphenotypes that moves beyond sole reliance on AUC.
Materials:
Procedure:
Expected Outcomes: This comprehensive protocol will determine whether PRS provides value beyond current prediction methods across multiple dimensions, not just discrimination.
Objective: To implement advanced PRS methods that leverage effect size distributions for improved prediction of endometriosis subphenotypes.
Materials:
Procedure:
Expected Outcomes: Implementation of this protocol typically yields PRS with improved predictive accuracy compared to standard methods, particularly for complex subphenotypes with heterogeneous genetic architectures.
Table 3: Research Reagent Solutions for Enhanced Predictive Modeling
| Category | Specific Resource | Function | Implementation Considerations |
|---|---|---|---|
| Genetic Data | GWAS summary statistics | Discovery of variant-trait associations | Ensure ancestral diversity; large sample sizes [45] |
| Genotyping Arrays | Illumina Global Screening Array | Genome-wide variant genotyping | Consider custom content for endometriosis-relevant loci |
| PRS Software | PRSice-2, LDpred, LDpred2 | Polygenic risk score calculation | LDpred requires LD reference panel [49] |
| Machine Learning Libraries | scikit-learn, mlr3, XGBoost | Advanced predictive modeling | Effective for clinical-genetic integration [47] |
| Bioinformatics Tools | PLINK, QCTOOL, R/Bioconductor | Genetic data processing and analysis | Standardized pipelines enhance reproducibility |
| Validation Cohorts | Deeply phenotyped endometriosis cohorts | Model validation and calibration | Must include relevant subphenotypes [24] |
Moving beyond the limitations of AUC requires a fundamental shift in how we evaluate polygenic risk models for endometriosis subphenotypes. By implementing multidimensional assessment frameworks that incorporate reclassification metrics, calibration measures, and clinical utility analyses, researchers can more accurately characterize the value of genetic information for risk prediction. Methodological innovations in PRS construction, particularly approaches that leverage effect size distributions and incorporate functional genomic data, offer promising pathways for enhanced predictive power.
Future directions should prioritize multi-ancestry model development, integration of multi-omics data (epigenetic, transcriptomic, proteomic), and application of sophisticated machine learning approaches capable of capturing complex interactions. Furthermore, establishing standardized evaluation frameworks specific to endometriosis subphenotypes will enable more meaningful comparisons across studies and accelerate progress toward clinically implementable risk prediction tools. Through these coordinated advances, the field can overcome current limitations in predictive power and deliver on the promise of personalized risk assessment for this complex gynecological disorder.
Endometriosis is a multifaceted inflammatory disease with significant heterogeneity in its clinical presentation, encompassing three recognized lesion phenotypes (peritoneal, ovarian endometrioma, and deep infiltrating endometriosis) and diverse symptom profiles ranging from chronic pelvic pain to infertility [13] [50]. This clinical diversity presents substantial challenges for developing effective polygenic risk scores (PRS), as general PRS constructed for endometriosis as a single entity often fail to capture the genetic architecture underlying specific subphenotypes. The disease's complex pathophysiology involves interconnected mechanisms including hormonal dysregulation, immune dysfunction, oxidative stress, genetic and epigenetic alterations, and microbiome imbalances [13]. While PRS aggregates the effects of multiple genetic variants into a single risk measure, current evidence demonstrates that existing endometriosis PRS show limited utility in predicting specific clinical presentations, disease severity, or anatomical localization [29]. This application note examines the technical limitations of general PRS for endometriosis subphenotype prediction and provides detailed experimental protocols for developing more refined, phenotype-specific genetic risk tools.
Table 1: Discriminatory Performance of Endometriosis PRS in Validation Studies
| Cohort Description | Sample Size (Cases/Controls) | PRS Construction | OR per SD Increase | p-value | Subtype Analysis | Citation |
|---|---|---|---|---|---|---|
| Surgically confirmed cases (Western Danish Center) | 249/348 | 14-SNP PRS | 1.59 | 2.57×10⁻⁷ | Ovarian: OR=1.72 (p=6.7×10⁻⁵); Infiltrating: OR=1.66 (p=2.7×10⁻⁹); Peritoneal: OR=1.51 (p=2.6×10⁻³) | [11] |
| Danish Twin Registry | 140/316 | 14-SNP PRS | 1.50 | 0.0001 | Not reported | [11] |
| UK Biobank | 2,967/256,222 | 14-SNP PRS | 1.28 | <2.2×10⁻¹⁶ | Limited subtype differentiation | [11] |
| Swedish clinical cohort | 172 (cases only) | 13-SNP weighted PRS | Not significant | >0.05 | Inverse association with spread (p-trend not significant) | [29] |
Table 2: Association Between PRS and Endometriosis Clinical Presentations
| Clinical Characteristic | Association with PRS | Statistical Significance | Cohort | Implication | Citation |
|---|---|---|---|---|---|
| Spread of endometriosis | Inverse association | Lost significance when calculated as p for trend | Swedish cohort (N=172) | PRS not predictive of disease severity | [29] |
| Gastrointestinal tract involvement | Inverse association | Not significant | Swedish cohort (N=172) | Limited utility for predicting bowel endometriosis | [29] |
| Hormone treatment | Inverse association | Not significant | Swedish cohort (N=172) | Treatment response not genetically predicted | [29] |
| Ovarian endometriosis | Positive association | OR=1.72, p=6.7×10⁻⁵ | Danish surgical cohort | Moderate predictive value for specific subtype | [11] |
| Infiltrating endometriosis | Positive association | OR=1.66, p=2.7×10⁻⁹ | Danish surgical cohort | Better performance for infiltrating disease | [11] |
| Peritoneal endometriosis | Weakest association | OR=1.51, p=2.6×10⁻³ | Danish surgical cohort | Limited utility for peritoneal disease | [11] |
The limited performance of general PRS for specific presentations stems from the diverse molecular mechanisms driving different endometriosis subphenotypes. Recent multi-omics analyses reveal distinct pathways contribute to the disease heterogeneity:
Endometriosis exhibits local estrogen dominance despite normal circulating levels, driven by overexpression of aromatase (CYP19A1) and downregulation of 17β-hydroxysteroid dehydrogenase type 2 in ectopic lesions. Concurrent progesterone resistance results from reduced PR-B isoform expression due to promoter hypermethylation and microRNA dysregulation (e.g., miR-26a, miR-181) [13]. These hormonal variations differ across subphenotypes, contributing to PRS inaccuracies.
Pervasive immune dysregulation characterizes endometriosis, with macrophages constituting over 50% of immune cells in peritoneal fluid and exhibiting impaired phagocytic activity due to downregulated CD36 expression. Alterations in natural killer (NK) cell cytotoxicity and T-cell subset dysregulation (increased Th2, Th17, and Treg cells) vary across disease presentations [13]. This immunological heterogeneity is not captured by general PRS.
Beyond common variants included in PRS, epigenetic modifications such as N6-methyladenosine (m6A) methylation regulators (HNRNPA2B1 and HNRNPC) serve as potential biomarkers for endometriosis-related infertility [51]. These epigenetic mechanisms contribute to subphenotype specificity but are not incorporated into current PRS models.
Background: Standard PRS includes variants that affect disease risk independent of environmental factors, diluting signals from variants involved in specific pathways. Pathway PRS (pPRS) focuses on biologically relevant variant subsets to improve subphenotype prediction [52].
Materials:
Procedure:
Validation: Assess pPRS discrimination for specific subphenotypes using ROC analysis and calculate net reclassification improvement compared to general PRS.
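The core idea of a pathway PRS, restricting the weighted sum to variants mapped to a biologically defined gene set, can be sketched as follows. The SNP identifiers, weights, SNP-to-gene mapping, and the composition of the gene set are all hypothetical and serve only to show the mechanics; they are not taken from [52] or from any annotation resource.

```python
# Hypothetical weights and SNP-to-gene assignments (illustration only)
WEIGHTS = {"snp1": 0.10, "snp2": 0.08, "snp3": 0.05, "snp4": 0.12}
SNP_TO_GENE = {"snp1": "WNT4", "snp2": "GREB1", "snp3": "VEZT", "snp4": "CYP19A1"}

# Hypothetical gene set standing in for a curated pathway definition
HORMONE_PATHWAY = {"GREB1", "CYP19A1"}

def general_prs(dosages, weights=WEIGHTS):
    """Standard PRS over all scored variants."""
    return sum(w * dosages[snp] for snp, w in weights.items())

def pathway_prs(dosages, pathway_genes, weights=WEIGHTS, snp_to_gene=SNP_TO_GENE):
    """PRS restricted to variants mapped to genes in the given pathway."""
    return sum(w * dosages[snp]
               for snp, w in weights.items()
               if snp_to_gene.get(snp) in pathway_genes)

# Hypothetical genotype dosages for one individual
person = {"snp1": 1, "snp2": 2, "snp3": 0, "snp4": 1}
score_all = general_prs(person)
score_pathway = pathway_prs(person, HORMONE_PATHWAY)
```

Each pathway score is then tested against the subphenotype of interest in the same way as a general PRS, which allows the two to be compared directly on discrimination and reclassification.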
Background: Integrating genetic risk with transcriptomic, epigenomic, and proteomic data can capture the molecular diversity of endometriosis subphenotypes [13] [51].
Materials:
Procedure:
Validation: Perform cross-validation within cohort and external validation in independent population. Compare subphenotype classification accuracy against clinical assessment.
Background: Traditional PRS methods assume linear additive effects, potentially missing non-linear relationships between genotypes and subphenotypes. Neural network approaches can learn complex annotation-function relationships [53].
Materials:
Procedure:
Validation: Benchmark against other PRS methods (LDpred, PRS-CS) using out-of-sample validation. Evaluate improvement in subphenotype classification accuracy.
Table 3: Key Research Reagents and Platforms for Endometriosis Subphenotype PRS Development
| Category | Specific Tool/Reagent | Application in PRS Research | Key Features | Representative Use |
|---|---|---|---|---|
| Genotyping Platforms | Illumina Global Screening Array | Genome-wide variant detection | ~650,000 markers optimized for imputation | Initial genotyping in PRS studies [29] |
| Genotyping Platforms | Color Health NGS panel | Targeted sequencing for PRS | Customizable SNP content (75-126 SNPs) | PRS implementation in WISDOM trial [54] |
| Functional Genomics | Proseek Multiplex Inflammation panel (Olink) | Inflammation protein quantification | 92 inflammatory proteins, high sensitivity | Mapping molecular subphenotypes [29] |
| Functional Genomics | ENCODE4 database | Transcription factor binding annotation | 1,600+ experiments across cell types | Functional annotation for neural network PRS [53] |
| Bioinformatics Tools | PRSice-2 | PRS calculation and validation | Fast, efficient, clumping and thresholding | Standard PRS development [11] |
| Bioinformatics Tools | SBayesR (GCTB) | Bayesian PRS optimization | Sparse effects modeling, improves prediction | Enhanced effect size estimation [9] |
| Bioinformatics Tools | FlashPCA | Population stratification control | Efficient principal component analysis | Ancestry adjustment in PRS [29] |
| Validation Assays | Visual Analog Scale for IBS (VAS-IBS) | Bowel symptom quantification | Patient-reported outcome measure | GI subphenotype characterization [29] |
| Validation Assays | ENDOGRAM tissue classification | Molecular subtyping | Histopathological and molecular analysis | Linking pathology to genetics [50] |
The development of subphenotype-specific PRS for endometriosis requires a paradigm shift from general genetic risk assessment to integrated multi-omics approaches. Current evidence demonstrates that general endometriosis PRS, while informative for overall disease risk, lack precision for predicting specific clinical presentations, anatomical localizations, or treatment responses. The future of endometriosis PRS development lies in pathway-specific approaches, neural network methods incorporating functional annotations, and sophisticated integration of genetic data with transcriptomic, epigenomic, and proteomic profiles. These advanced methodologies promise to bridge the current subphenotype prediction gaps, ultimately enabling personalized risk prediction and targeted interventions for this heterogeneous disease. Implementation of the detailed experimental protocols outlined in this application note will accelerate progress toward clinically useful subphenotype prediction tools for endometriosis management.
The development of polygenic risk scores (PRS) for endometriosis represents a significant advance in understanding the genetic architecture of this complex condition. However, a substantial limitation impedes their broader application: current PRS models are predominantly derived from genome-wide association studies (GWAS) conducted in populations of European ancestry (EUR) [55] [56]. This creates a critical global imbalance in precision medicine, as PRS generated from GWAS in one population typically provide attenuated predictive accuracy when applied to other populations [57] [56]. The transferability challenge arises from multiple factors, including differences in linkage disequilibrium (LD) patterns between populations, allele frequency differences, SNP array design biases toward European variants, and the limited representation of diverse populations in genetic research cohorts [57] [55]. Until recently, over 80% of participants in genetic studies were of European descent, with only approximately 4% of GWAS participants representing East Asian ancestry [57] [55]. This bias risks exacerbating health disparities if clinically implemented, as PRS may misestimate genetic risk for individuals of non-European ancestry [57]. This Application Note provides a comprehensive framework for addressing population-specific biases in endometriosis PRS development, enabling more equitable precision medicine approaches across diverse populations.
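One of the mechanisms named above, allele frequency differences, can be illustrated with a short calculation. Assuming Hardy-Weinberg equilibrium and independent variants (simplifying assumptions), the expected PRS in a population is Σ 2pᵢβᵢ and its variance is Σ 2pᵢ(1−pᵢ)βᵢ². The weights and frequencies below are hypothetical.

```python
import math

def prs_moments(weights, freqs):
    """Expected PRS mean and SD given effect-allele frequencies.

    Assumes Hardy-Weinberg equilibrium and no LD between variants.
    """
    mean = sum(2 * freqs[snp] * w for snp, w in weights.items())
    var = sum(2 * freqs[snp] * (1 - freqs[snp]) * w ** 2
              for snp, w in weights.items())
    return mean, math.sqrt(var)

# Hypothetical weights and population-specific allele frequencies
weights = {"snp_a": 0.1, "snp_b": 0.2}
freqs_pop1 = {"snp_a": 0.50, "snp_b": 0.25}
freqs_pop2 = {"snp_a": 0.20, "snp_b": 0.10}

mean1, sd1 = prs_moments(weights, freqs_pop1)
mean2, sd2 = prs_moments(weights, freqs_pop2)

# The same raw score sits at a different point of each distribution
raw_score = 0.2
z_in_pop1 = (raw_score - mean1) / sd1
z_in_pop2 = (raw_score - mean2) / sd2
```

Here an identical raw score is average in one population and more than one standard deviation above the mean in the other, which is why percentile cut-offs derived in one ancestry group cannot be applied to another without recalibration.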
Table 1: PRS Performance Comparison Across Ancestral Groups
| Ancestral Group | GWAS Representation | PRS Transfer Performance | Key Limiting Factors |
|---|---|---|---|
| European | ~80% of participants [57] | Reference standard (OR = 1.28-1.59 for endometriosis) [11] | Baseline reference |
| East Asian | ~4% of participants [55] | Moderately reduced | LD differences, allele frequency spectra |
| African | <3% of participants [56] | Severely reduced | Greater genetic diversity, limited reference panels |
| Admixed Populations | Highly underrepresented | Unpredictable biases [57] | Differential genetic drift, complex ancestry |
Table 2: Endometriosis-Specific Genetic Discovery by Ancestry
| Parameter | European Ancestry | East Asian Ancestry | African Ancestry |
|---|---|---|---|
| Sample Size in Largest GWAS | ~60,674 cases [15] | Limited representation in international consortia | Minimal representation |
| Number of Identified Loci | 42 genome-wide significant loci [15] | Data insufficient for comparison | Data unavailable |
| Variance Explained | Up to 5.01% [15] | Expected reduction in transferred PRS | Expected significant reduction |
| Population-Specific Variants | 14-SNP PRS developed [11] | Potential undiscovered variants | Likely numerous undiscovered variants |
Purpose: To identify population-specific and shared genetic risk variants for endometriosis across diverse ancestral groups.
Materials:
Procedure:
Troubleshooting:
Purpose: To develop and optimize PRS for endometriosis in underrepresented populations.
Materials:
Procedure:
Troubleshooting:
Diagram 1: Comprehensive workflow for developing ancestry-aware polygenic risk scores for endometriosis, highlighting parallel analysis pathways across diverse populations.
Diagram 2: Key sources of population-specific bias in PRS development and corresponding mitigation strategies.
Table 3: Essential Research Reagents and Computational Tools
| Tool/Reagent | Specification | Application in Endometriosis PRS |
|---|---|---|
| Cosmopolitan SNP Arrays | Designed with variants informative across diverse populations | Reduces genotyping ascertainment bias in multi-ancestry cohorts [57] |
| GTEx Database v8 | Tissue-specific eQTL data from uterus, ovary, blood [7] | Functional characterization of endometriosis risk variants across ancestries |
| LD Reference Panels | Population-specific (AFR, EAS, EUR) from 1000 Genomes | Improves PRS accuracy in target populations [56] |
| GCTB Software | Implements SBayesR method | Bayesian approach for PRS construction with improved cross-population performance [9] |
| PRSice-2 | Clumping and thresholding algorithm | Computes and validates PRS across multiple p-value thresholds [56] |
| METAL Software | Trans-ancestry meta-analysis | Combines GWAS results across diverse cohorts with heterogeneity testing [8] |
The Taiwan Precision Medicine Initiative (TPMI) demonstrates the substantial benefits of population-specific genomic research, having developed PRS for Han Chinese ancestry that account for up to 10.3% of health variation in that cohort [55]. This initiative identified 95 new genetic associations that were previously undetected in European-focused studies, primarily due to allele frequency differences and population-specific genetic effects [55]. Similarly, for endometriosis research, expanding GWAS in diverse populations may reveal ancestry-specific variants in genes like WNT4, VEZT, and GREB1, which have established roles in endometriosis pathogenesis [8] [1].
Future directions should prioritize the development of "polyethnic" scores that optimally combine trans-ethnic and ethnic-specific information [56]. Methods like XP-BLUP and multi-ethnic PRS are showing promise in improving predictive accuracy across diverse populations [56]. Additionally, integrating functional genomics data, including endometrial DNA methylation quantitative trait loci (mQTLs) and tissue-specific regulatory elements, will enhance the biological interpretation of population-specific risk variants [24]. As these approaches mature, researchers must simultaneously address non-genetic sources of health disparities, including healthcare access, environmental exposures, and social determinants of health, to ensure equitable benefits from precision medicine advances in endometriosis care [57].
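One way to read the "polyethnic" idea is as a linear mix of two scores: one weighted by a large trans-ancestry (or European-dominated) GWAS and one weighted by a smaller ancestry-matched GWAS, with the mixing weight tuned in a held-out sample from the target population. The sketch below is a minimal illustration under that assumption. The function names, the grid search, and the toy numbers are hypothetical and are not taken from XP-BLUP or any cited method.

```python
from statistics import mean, pstdev


def standardize(values):
    """Scale a list of scores to mean 0 and SD 1."""
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        raise ValueError("scores have zero variance")
    return [(v - mu) / sd for v in values]


def correlation(x, y):
    """Pearson correlation between two equal-length lists."""
    zx, zy = standardize(x), standardize(y)
    return mean(a * b for a, b in zip(zx, zy))


def best_mixing_weight(prs_trans, prs_specific, phenotype, steps=20):
    """Grid-search alpha in [0, 1] for alpha * trans + (1 - alpha) * specific."""
    zt, zs = standardize(prs_trans), standardize(prs_specific)
    best_alpha, best_r = 0.0, float("-inf")
    for k in range(steps + 1):
        alpha = k / steps
        combined = [alpha * t + (1 - alpha) * s for t, s in zip(zt, zs)]
        r = correlation(combined, phenotype)
        if r > best_r:
            best_alpha, best_r = alpha, r
    return best_alpha, best_r


# Hypothetical tuning sample: two candidate scores and case status (1 = case).
prs_trans = [0.1, 0.4, -0.2, 0.9, -0.5, 0.3]
prs_specific = [0.0, 0.6, -0.1, 0.5, -0.7, 0.8]
phenotype = [0, 1, 0, 1, 0, 1]
alpha, r = best_mixing_weight(prs_trans, prs_specific, phenotype)
print(f"alpha={alpha:.2f}, correlation={r:.3f}")
```

Because the weight is fitted to the tuning sample, the combined score should be evaluated in a separate test sample from the same population to avoid optimistic estimates.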
Endometriosis is a complex, chronic inflammatory gynecological disease affecting approximately 10% of women of reproductive age worldwide and is found in 30-50% of women undergoing infertility evaluation [58] [13]. Its pathogenesis involves a multifactorial etiology with an estimated 50% heritable genetic component and 50% contribution from environmental factors, which often manifest through epigenetic modifications such as DNA methylation [59]. The disease demonstrates remarkable heterogeneity in clinical presentation, lesion distribution, and molecular profiles, complicating both diagnosis and treatment [14].
The integration of multi-omics data represents a transformative approach for deciphering this complexity. By combining methylation risk scores (MRS) with protein biomarker signatures, researchers can now develop more precise classification systems that correlate molecular subphenotypes with clinical manifestations and therapeutic responses [60] [14]. This application note provides detailed protocols for generating and validating these multi-omics signatures within the context of polygenic risk score development for endometriosis subphenotypes.
Endometriosis pathophysiology involves several interconnected mechanisms that create a hostile reproductive environment:
The molecular heterogeneity of endometriosis is evident across different lesion types—superficial peritoneal endometriosis, ovarian endometriomas, and deep infiltrating endometriosis—each demonstrating distinct transcriptional and epigenetic profiles [14].
Current endometriosis diagnosis relies on surgical visualization with histologic confirmation, resulting in an average diagnostic delay of 7-11 years from symptom onset [14] [59]. This diagnostic lag allows disease progression and potentially irreversible damage to reproductive organs. The development of non-invasive biomarkers using multi-omics approaches represents an urgent unmet clinical need that could enable earlier intervention and personalized treatment strategies [14] [61].
The following diagram illustrates the comprehensive workflow for integrating multi-omics data to develop refined endometriosis subphenotypes:
This integrated approach enables researchers to move beyond traditional classification systems (rASRM, ENZIAN) toward molecularly-defined subphenotypes with distinct clinical trajectories and therapeutic responses [60] [14].
The pathophysiology of endometriosis involves dysregulation of several key signaling pathways, many influenced by epigenetic modifications:
DNA methylation modifications in endometriosis affect genes involved in these critical pathways, contributing to disease establishment and progression [59]. Notably, hypomethylation of ESR2 (encoding ERβ) and aromatase promoters enhances local estrogen production, while hypermethylation of progesterone receptor promoters drives progesterone resistance [58] [13] [59].
Objective: Develop a methylation risk score (MRS) for endometriosis classification using endometrial tissue methylation data.
Sample Requirements:
Experimental Workflow:
Detailed Methodology:
DNA Extraction and Bisulfite Conversion
Methylation Profiling
Quality Control and Normalization
MRS Model Construction
Performance Metrics:
Table 1: Methylation Risk Score Performance Characteristics
| Parameter | Value | Description |
|---|---|---|
| Sample Size | 908 individuals | 590 cases, 318 controls |
| Optimal CpG Panel | 746 sites | Selected via elastic net regression |
| Area Under Curve (AUC) | 0.6748 | Classification performance |
| Variance Captured | 12-19.58% | Independent of common genetic variants |
| Covariates | Age, institution, genetic ancestry | Included in final model |
| Validation Approach | Train-test split by institution | Guards against overfitting to site-specific effects |
The MRS demonstrates that DNA methylation profiles in endometrial tissue provide significant predictive value for endometriosis classification beyond genetic factors alone [60]. This epigenetic component likely reflects the environmental contributions to endometriosis risk and progression.
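To make the two quantities in the table concrete, the sketch below computes a methylation risk score as a weighted sum of CpG beta values and then estimates the AUC as the probability that a randomly chosen case scores higher than a randomly chosen control. This is an illustration only: the CpG identifiers, weights, and samples are hypothetical, and in practice the weights would come from a fitted penalized regression (such as the elastic net named in the table) rather than being set by hand.

```python
def methylation_risk_score(beta_values, weights, intercept=0.0):
    """Weighted sum of methylation beta values (0-1) over the CpGs in `weights`."""
    return intercept + sum(w * beta_values[cpg] for cpg, w in weights.items())


def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical weights for three CpG sites (a real panel would hold hundreds).
weights = {"cg_example_1": 1.2, "cg_example_2": -0.8, "cg_example_3": 0.5}
cases = [
    {"cg_example_1": 0.70, "cg_example_2": 0.20, "cg_example_3": 0.55},
    {"cg_example_1": 0.62, "cg_example_2": 0.35, "cg_example_3": 0.60},
]
controls = [
    {"cg_example_1": 0.50, "cg_example_2": 0.40, "cg_example_3": 0.50},
    {"cg_example_1": 0.66, "cg_example_2": 0.30, "cg_example_3": 0.45},
]
case_scores = [methylation_risk_score(s, weights) for s in cases]
control_scores = [methylation_risk_score(s, weights) for s in controls]
print(auc(case_scores, control_scores))
```

The pairwise AUC is quadratic in sample size, which is fine for a sketch; a rank-based implementation would be the usual choice for cohorts of several hundred samples or more.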
Objective: Identify and validate protein biomarkers in serum/plasma that complement MRS for endometriosis subphenotyping.
Sample Requirements:
Experimental Workflow:
Detailed Methodology:
Sample Preparation
LC-MS/MS Analysis
Data Processing
Biomarker Validation
Cell-Free DNA Quantification Protocol:
Key Findings:
Table 2: Protein and cf-DNA Biomarker Performance
| Biomarker Type | Analytical Platform | Key Findings | Performance |
|---|---|---|---|
| Cell-Free DNA | QIAamp Circulating Nucleic Acid Kit | 3.9x higher in endometriosis vs controls | Sensitivity 70%, Specificity 87% |
| Methylation Signature | Targeted bisulfite sequencing | 9 genes with differential methylation | Improved classification when combined with cf-DNA |
| Proteomic Profile | LC-MS/MS | Multiple inflammatory markers elevated | Complementary to epigenetic markers |
Objective: Integrate MRS, PRS, and protein biomarkers to define molecular subphenotypes of endometriosis.
Computational Workflow:
Data Preprocessing
Dimension Reduction
Subphenotype Identification
Clinical Correlation
Validation Approach:
Table 3: Essential Research Reagents for Multi-Omics Endometriosis Studies
| Category | Specific Product | Application | Key Features |
|---|---|---|---|
| DNA Methylation | Illumina Infinium MethylationEPIC Kit | Genome-wide methylation profiling | >850,000 CpG sites, comprehensive coverage |
| Bisulfite Conversion | Zymo Research EZ-96 DNA Methylation-Lightning MagPrep | DNA treatment for methylation analysis | Rapid 90 min protocol, >99% conversion efficiency |
| cf-DNA Extraction | QIAamp Circulating Nucleic Acid Kit (Qiagen) | Isolation from serum/plasma | Optimized for low-abundance circulating DNA |
| Protein Depletion | ProteoPrep Immunoaffinity Albumin and IgG Depletion Kit | Serum proteome simplification | Removes >95% abundant proteins |
| Mass Spectrometry | Orbitrap Fusion Lumos Tribrid Mass Spectrometer | Proteomic quantification | High sensitivity and resolution |
| Data Analysis | MOFA+ (Multi-Omics Factor Analysis) | Multi-omics integration | Identifies latent factors across data types |
The integration of MRS and protein biomarkers offers significant promise for enriching clinical trials and personalizing therapeutic approaches:
Recent analyses of FDA submissions demonstrate increasing use of polygenic risk scores in early-phase clinical trials, particularly in neurology, oncology, and psychiatry [62]. This trend is now extending to endometriosis with the development of validated MRS and protein signatures.
The integration of methylation risk scores and protein biomarkers represents a powerful approach for deciphering endometriosis heterogeneity and advancing personalized medicine. The protocols detailed in this application note provide a roadmap for generating validated multi-omics signatures that can classify disease subphenotypes, predict treatment responses, and identify novel therapeutic targets.
As these technologies mature, multi-omics profiling is poised to transform endometriosis management—reducing diagnostic delays, enabling targeted interventions, and ultimately improving reproductive outcomes for the millions of women affected by this complex condition.
Endometriosis demonstrates profound clinical heterogeneity, with presentation varying from superficial peritoneal lesions to deeply infiltrating disease and ovarian endometriomas [63] [46]. This heterogeneity represents a significant challenge in genetic studies, as traditional genome-wide association studies (GWAS) that treat endometriosis as a single entity have explained only approximately 2.2% of disease variance despite an estimated heritability of ~50% [63] [64]. The limited observed heritability in large genetic association studies may be attributable to underlying heterogeneity of disease mechanisms, creating critical statistical power constraints that necessitate sophisticated approaches to subphenotype characterization and sample size determination [65] [64].
Emerging evidence suggests that different endometriosis subtypes likely have distinct genetic architectures. Interim results from a large meta-analysis identified 27 genome-wide significant loci, with 78% demonstrating greater effect sizes in stage III/IV disease compared to stage I/II, and 63% showing greater effect sizes in endometriosis with infertility [63]. This genetic heterogeneity underscores the necessity of well-powered subphenotype studies to uncover the full spectrum of endometriosis risk variants and facilitate meaningful polygenic risk score development.
Table 1: Sample Size Benchmarks in Recent Endometriosis Genetic Studies
| Study Focus | Total Sample Size | Cases | Controls | Key Findings | Reference |
|---|---|---|---|---|---|
| Multi-ancestry GWAS | ~1.4 million participants | 105,869 | ~1.3 million | 80 genome-wide significant associations (37 novel) | [27] |
| Indian population study | 4,000 | 2,000 | 2,000 | Aimed to address representation gap in global consortia | [63] |
| Clinical subphenotype clustering | 478,611 | 12,350 | 466,261 | 5 distinct subphenotype clusters with specific genetic associations | [65] [64] |
| DNA methylation analysis | 984 | 637 | 347 | 15.4% of endometriosis variation captured by DNA methylation | [24] |
The sample size requirements for endometriosis subphenotype studies are substantially influenced by several key factors:
Recent research indicates that for well-powered subphenotype analyses, individual clusters should ideally contain at least 1,000 cases to detect moderate genetic effects (OR > 1.3) for common variants [65] [64]. The identification of five clinical subphenotype clusters in electronic health record data demonstrates how subphenotype stratification can enhance genetic discovery, with each cluster showing distinct genetic associations including PDLIM5 for pain comorbidities, GREB1 for uterine disorders, and WNT4 for pregnancy complications [64].
Table 2: Endometriosis Subphenotype Classification Systems
| Classification System | Subphenotypes Identified | Application in Genetic Studies | Strengths |
|---|---|---|---|
| rASRM Surgical Staging | Stages I-IV based on lesion appearance and extent | GWAS stratification by disease severity | Widely adopted, standardized scoring |
| Lesion Type Classification | Superficial Peritoneal (SUP), Ovarian Endometrioma (OMA), Deep Infiltrating Endometriosis (DIE) | Differential genetic effect sizes across subtypes | Direct mapping to pathological processes |
| Clinical Symptom Clustering | Pain comorbidities, Uterine disorders, Pregnancy complications, Cardiometabolic comorbidities, Asymptomatic | EHR-based clustering for genetic association | Captures clinical heterogeneity beyond surgical findings |
| Genital vs. Extragenital | Reproductive organ involvement vs. non-reproductive organ involvement | Understanding somatic mutation patterns | Accounts for lesion location heterogeneity |
Objective: To identify clinically meaningful endometriosis subphenotypes from electronic health record data for enhanced genetic association power.
Materials:
Procedure:
Expected Outcomes: Identification of 5 distinct subphenotype clusters with characteristic clinical profiles and differential genetic associations [64].
Subphenotype Study Design Workflow: This diagram illustrates the comprehensive approach from patient recruitment through genetic discovery, emphasizing the iterative process of phenotypic characterization and genetic validation necessary for well-powered subphenotype studies.
Objective: To identify epigenetic regulation of endometriosis risk through methylation quantitative trait loci analysis.
Materials:
Procedure:
Expected Outcomes: Identification of approximately 118,185 independent cis-mQTLs including 51 associated with endometriosis risk, as demonstrated in recent large-scale analyses [24].
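At its core, a cis-mQTL test asks whether methylation at a CpG site changes with the number of copies of a nearby allele. A minimal sketch of that single-pair test, using ordinary least squares on allele dosage, is shown below. The data are hypothetical, and the sketch leaves out the covariates (for example age, cell composition, ancestry components) and the multiple-testing correction that a genome-wide analysis would need.

```python
from math import sqrt


def mqtl_slope(dosages, methylation):
    """OLS regression of methylation on allele dosage; returns (slope, t_statistic)."""
    n = len(dosages)
    if n < 3 or n != len(methylation):
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(dosages) / n
    mean_y = sum(methylation) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("genotype is monomorphic in this sample")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, methylation))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance with n - 2 degrees of freedom gives the slope's standard error.
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(dosages, methylation))
    se = sqrt(rss / (n - 2) / sxx)
    t_stat = slope / se if se > 0 else float("inf")
    return slope, t_stat


# Hypothetical SNP dosages (0, 1, 2) and methylation beta values at one CpG.
dosages = [0, 0, 1, 1, 2, 2]
methylation = [0.30, 0.34, 0.41, 0.45, 0.52, 0.56]
slope, t_stat = mqtl_slope(dosages, methylation)
print(f"slope per allele = {slope:.3f}, t = {t_stat:.2f}")
```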
Table 3: Essential Research Reagents for Endometriosis Subphenotype Studies
| Reagent/Category | Specific Examples | Application | Function in Research |
|---|---|---|---|
| DNA Collection | EDTA blood collection tubes, DNA extraction kits (Qiagen, Thermo Fisher) | Genetic variant identification | High-quality DNA for genotyping and sequencing |
| Methylation Analysis | Illumina Infinium MethylationEPIC BeadChip | Epigenetic profiling | Genome-wide DNA methylation quantification at >850,000 sites |
| Protein Biomarkers | ELISA kits (Human R-Spondin3 ELISA Kit) | Protein quantification | Validation of proteomic findings from pQTL studies |
| Tissue Processing | TRIzol reagent, RNAlater | Transcriptomic analysis | RNA preservation and extraction for gene expression studies |
| Single-Cell Technologies | 10x Genomics Chromium System, dissociation enzymes | Cellular heterogeneity characterization | Resolution of cell-type specific signatures in lesions |
| Immunoassays | Multiplexed immunoaffinity assays (SOMAscan) | Plasma protein measurement | High-throughput proteomic profiling for pQTL studies |
Adequate statistical power for endometriosis subphenotype studies requires careful consideration of multiple factors:
For rare subphenotypes (prevalence < 5% in endometriosis population), sample sizes exceeding 50,000 total cases may be required to detect moderate genetic effects (OR > 1.5). The recent identification of five endometriosis subphenotype clusters through EHR data mining demonstrated that cluster-specific genetic associations could be detected with cluster sizes ranging from 441 to 1,151 in a discovery cohort of 4,078 cases [64].
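The dependence of power on cluster size can be explored with a back-of-envelope calculation. The sketch below approximates the power of a two-sided allelic test by comparing risk-allele frequencies in cases and controls with a normal approximation. It assumes the conventional genome-wide significance threshold of 5×10⁻⁸, two alleles per person, and no covariates or relatedness; it is not the power method used in the cited studies, and the example inputs are hypothetical.

```python
from statistics import NormalDist


def case_allele_freq(control_freq, odds_ratio):
    """Risk-allele frequency in cases implied by an allelic odds ratio."""
    odds = odds_ratio * control_freq / (1 - control_freq)
    return odds / (1 + odds)


def approx_power(n_cases, n_controls, control_freq, odds_ratio, alpha=5e-8):
    """Approximate power of a two-sided allelic z-test at significance level alpha."""
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    # Each person contributes two alleles to the frequency estimate.
    se = (p1 * (1 - p1) / (2 * n_cases) + p0 * (1 - p0) / (2 * n_controls)) ** 0.5
    z_effect = abs(p1 - p0) / se
    normal = NormalDist()
    z_crit = normal.inv_cdf(1 - alpha / 2)
    return normal.cdf(z_effect - z_crit) + normal.cdf(-z_effect - z_crit)


# Hypothetical scenario: common variant (frequency 0.30), OR = 1.3.
for n_cases in (500, 1000, 2000, 5000):
    power = approx_power(n_cases, 10_000, control_freq=0.30, odds_ratio=1.3)
    print(n_cases, round(power, 3))
```

Varying the allele frequency and odds ratio shows how quickly power falls for rarer variants or smaller effects, which is the practical reason small subphenotype clusters are usually pooled across cohorts.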
Addressing statistical power constraints in endometriosis subphenotype studies requires multi-faceted strategies including international collaborations to achieve sufficient sample sizes, standardized phenotyping protocols to reduce heterogeneity, innovative clustering approaches to identify biologically meaningful subgroups, and integration of multi-omic data to enhance discovery power. The development of robust polygenic risk scores for endometriosis subphenotypes depends on overcoming these power constraints through carefully designed studies that acknowledge and account for the substantial clinical and genetic heterogeneity of this complex disease.
Future directions should prioritize diverse population inclusion, longitudinal phenotype assessment, and integration of functional genomic data to further refine subphenotype definitions and enhance the translational potential of genetic discoveries for personalized endometriosis management.
Within endometriosis research, accurate case definition is a fundamental prerequisite for valid genetic and epidemiological studies. The development of polygenic risk scores (PRS) for disease subphenotypes is particularly sensitive to how endometriosis cohorts are ascertained. This document outlines application notes and protocols for validating two primary cohort definitions: those based on surgical confirmation (the clinical gold standard) and those derived from administrative health data (e.g., ICD codes).
A critical understanding of the operating characteristics—including sensitivity, specificity, and agreement metrics—between these two definitions is essential. It ensures that PRS models are trained on reliably classified phenotypes, thereby enhancing the predictive accuracy and clinical utility of the resulting scores for specific endometriosis manifestations [66] [11].
The table below summarizes key validation metrics from recent studies comparing surgically confirmed endometriosis with cases identified through administrative health data.
Table 1: Validation Metrics of Administrative Data Against Surgical Confirmation for Endometriosis
| Endometriosis Phenotype | Sensitivity (Range) | Specificity (Range) | Agreement (Kappa Statistic) | Key Findings and Implications for PRS |
|---|---|---|---|---|
| Overall Endometriosis [66] | 0.86 - 0.88 | 0.83 - 0.87 | 0.65 - 0.74 (Substantial) | Administrative data shows high validity for etiologic studies of general endometriosis risk. Suitable for initial PRS development. |
| Superficial Peritoneal Disease [66] | ~0.86 | ~0.83 | ~0.65 (Substantial) | Reasonably well-captured, allowing for genetic studies of this common subphenotype. |
| Ovarian Endometrioma [66] | ~0.82 | ~0.92 | ~0.58 (Moderate) | High specificity is valuable for case-control genetic studies focusing on ovarian disease. |
| Deep Infiltrating Endometriosis [66] | ~0.12 (Very Low) | ~0.99 (Very High) | ~0.17 (Slight) | Poorly captured by codes. Low sensitivity undermines statistical power for subtype-specific PRS; high specificity means coded cases are likely true cases, but individuals without a code cannot be assumed free of deep disease. |
| Self-Reported Endometriosis [67] | Variable (Literature: 32-89%) | N/A | N/A | Concordance varies widely by population and questionnaire. Requires rigorous validation against clinical records before use in genetic studies. |
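The three metrics in Table 1 all derive from the same 2×2 cross-tabulation of administrative coding against surgical findings. The sketch below shows the arithmetic. The counts are hypothetical and were chosen only to resemble the deep infiltrating pattern (very low sensitivity, very high specificity); they are not the data from the cited validation study.

```python
def validation_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, and Cohen's kappa from a 2x2 table.

    tp: coded as case and surgically confirmed
    fp: coded as case but not confirmed at surgery
    fn: confirmed at surgery but not coded
    tn: neither coded nor confirmed
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    observed = (tp + tn) / n
    # Chance agreement from the marginal totals of each data source.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (observed - expected) / (1 - expected)
    return sensitivity, specificity, kappa


# Hypothetical counts: 100 surgically confirmed cases, 400 surgical non-cases.
sens, spec, kappa = validation_metrics(tp=12, fp=4, fn=88, tn=396)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, kappa={kappa:.2f}")
```

The example illustrates why kappa can be low even when specificity is near 1: most of the observed agreement comes from the large group of true negatives, which kappa discounts as agreement expected by chance.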
Table 2: Characteristics of Cohort Types for Endometriosis PRS Research
| Cohort Definition | Gold Standard Status | Primary Advantages | Primary Limitations | Recommended Use in PRS Pipeline |
|---|---|---|---|---|
| Surgically Confirmed | Yes | High diagnostic certainty; allows for precise subphenotyping (rASRM stage, lesion location) [66] [47]. | Invasive; expensive; cohort sizes may be limited; potential selection bias towards symptomatic cases. | Ideal for discovery and training of subphenotype-specific PRS models. |
| Administrative Health Data (ICD Codes) | No | Large sample sizes; population-based; cost-effective for very large studies [66] [11]. | Misclassification bias (see Table 1); limited clinical detail; heterogeneity in coding practices. | Best for initial testing and validation in large, independent cohorts, or for studies of broad endometriosis risk. |
| Self-Reported | No | Easy to collect via questionnaire; can reach very large numbers. | High potential for misclassification; recall bias; cannot distinguish subtypes [67]. | Use with extreme caution; requires internal validation substudy against clinical data. |
Objective: To quantify the agreement between endometriosis diagnoses recorded in administrative health databases (e.g., using ICD-9/10 codes) and surgically confirmed diagnoses in a cohort of individuals who underwent laparoscopy/laparotomy.
Materials: Cohort with linked surgical and administrative data (e.g., Utah Population Database, ENDO Study cohort) [66].
Procedure:
Objective: To develop a PRS for endometriosis and test its performance in cohorts defined by surgical confirmation and administrative codes.
Materials: Genotyped cohorts: 1) Surgically confirmed cases and controls from a clinical referral center, 2) Cases and controls identified from a population biobank using ICD-10 codes (e.g., UK Biobank, Danish registries) [11] [68].
Procedure:
PRS = Σ_i (β_i × G_i), where β_i is the effect size (log odds ratio) of the i-th risk allele from the GWAS summary statistics, and G_i is the individual's count of that allele (0, 1, or 2) [11] [9].
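The weighted sum above can be sketched in a few lines. The SNP identifiers and effect sizes below are hypothetical placeholders, and the function is an illustration of the formula rather than a replacement for established tools, which also handle allele alignment, missing genotypes, and imputed dosages.

```python
def polygenic_risk_score(effect_sizes, allele_counts):
    """Sum of beta_i * G_i over every SNP listed in `effect_sizes`."""
    missing = set(effect_sizes) - set(allele_counts)
    if missing:
        raise KeyError(f"genotypes missing for: {sorted(missing)}")
    score = 0.0
    for snp, beta in effect_sizes.items():
        count = allele_counts[snp]
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: allele count must be 0, 1, or 2")
        score += beta * count
    return score


# Hypothetical per-allele effect sizes (log odds ratios) and one person's genotypes.
effect_sizes = {"snp_a": 0.10, "snp_b": -0.05, "snp_c": 0.20}
allele_counts = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
print(polygenic_risk_score(effect_sizes, allele_counts))
```

Raw scores are usually standardized within the study cohort before analysis, so that associations can be reported per standard deviation of the PRS.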
Table 3: Essential Research Reagent Solutions for Cohort Validation & PRS Studies
| Item / Resource | Function / Application | Specific Examples / Notes |
|---|---|---|
| Linked Biobanks & Health Registries | Provides genotype data linked to longitudinal health records for validation and large-scale genetic studies. | Utah Population Database (UPDB) [66], UK Biobank [11] [9], Danish National Patient Registry [11]. |
| Standardized Surgical Forms | Ensures consistent and comprehensive intraoperative data collection for precise phenotyping. | Revised ASRM (rASRM) operative form [66] [47]. |
| Genotyping Arrays & Imputation | Provides genome-wide SNP data for PRS calculation. | Commercial arrays (e.g., Illumina Global Screening Array) followed by imputation to reference panels (e.g., 1000 Genomes). |
| PRS Calculation Software | Tools to compute polygenic risk scores from genotype data using external GWAS summary statistics. | PLINK 1.9/2.0 [9], PRSice-2, LDpred2. |
| GWAS Summary Statistics | The base data containing SNP effect sizes and p-values used to weight SNPs in the PRS. | Publicly available from largest endometriosis GWAS meta-analyses [11] [69]. |
| Statistical Analysis Software | Platform for performing validation statistics and genetic association analyses. | R, Python, SAS. |
This application note provides a detailed examination of the performance metrics and discriminatory accuracy of polygenic risk scores (PRS) across different subtypes of endometriosis. Endometriosis is a complex gynecological disorder affecting 6-10% of reproductive-aged women, characterized by the presence of endometrial-like tissue outside the uterine cavity [70] [1]. The disease demonstrates significant heterogeneity in its clinical presentation and localization, necessitating subtype-specific diagnostic and risk assessment approaches.
The current gold standard for diagnosis—laparoscopic surgery with histological confirmation—presents significant clinical challenges, with diagnostic delays typically ranging from 7 to 11 years [9] [1]. Polygenic risk scores, which aggregate the effects of multiple genetic risk variants into a single measure, offer promising avenues for non-invasive risk stratification and early detection. However, their performance varies considerably across different endometriosis subtypes, necessitating careful evaluation of their discriminatory accuracy for each major disease manifestation.
This document provides researchers and drug development professionals with comprehensive experimental protocols, performance metrics, and methodological frameworks for assessing PRS utility across the endometriosis subtype spectrum, with particular focus on ovarian, infiltrating, peritoneal, and deep infiltrating disease subtypes.
Table 1: PRS Discriminatory Performance Across Endometriosis Subtypes
| Endometriosis Subtype | Cohort | Odds Ratio (OR) per SD PRS Increase | P-value | Sample Size (Cases/Controls) |
|---|---|---|---|---|
| Overall Endometriosis | Danish Combined | 1.57 | 2.5×10⁻¹¹ | 389/664 |
| Overall Endometriosis | UK Biobank | 1.28 | <2.2×10⁻¹⁶ | 2,967/256,222 |
| Ovarian (N80.1) | Danish Combined | 1.72 | 6.7×10⁻⁵ | 75/NR |
| Infiltrating (N80.4, N80.5) | Danish Combined | 1.66 | 2.7×10⁻⁹ | 210/NR |
| Peritoneal (N80.2, N80.3) | Danish Combined | 1.51 | 2.6×10⁻³ | 60/NR |
| Superficial | Utah ENDO Study | - | - | 143/412 |
| Ovarian Endometriomas | Utah ENDO Study | - | - | 38/412 |
| Deep Infiltrating | Utah ENDO Study | - | - | 58/412 |
The strength of the PRS association varies across endometriosis subtypes, with the largest point estimate observed for ovarian endometriosis (OR=1.72 per SD) and the smallest for peritoneal disease (OR=1.51 per SD) [11]. Notably, the discriminative accuracy is not yet sufficient for standalone clinical utility but may add significant value when combined with classical clinical risk factors and symptoms [11].
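An odds ratio per standard deviation can be translated into an approximate AUC, which helps explain why associations of this size do not yet support standalone use. The sketch below relies on a simplifying assumption that is not stated in the source: the standardized PRS is normally distributed with equal variance in cases and controls, so that the case mean is shifted by ln(OR) standard deviations and AUC = Φ(ln(OR)/√2). Real cohorts will deviate from this, so the output is a rough guide only.

```python
from math import erf, log, sqrt


def auc_from_or_per_sd(odds_ratio):
    """AUC implied by an OR per SD, assuming a normal PRS with equal variance in both groups."""
    shift = log(odds_ratio)  # case-control mean difference in SD units
    z = shift / sqrt(2)
    # Standard normal CDF written with the error function.
    return 0.5 * (1 + erf(z / sqrt(2)))


for label, odds_ratio in [("overall", 1.57), ("ovarian", 1.72), ("peritoneal", 1.51)]:
    print(label, round(auc_from_or_per_sd(odds_ratio), 3))
```

Under this assumption, odds ratios in the range reported in Table 1 correspond to AUC values in the low-to-mid 0.6 range, well short of what a standalone diagnostic test would need.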
Table 2: Validation Metrics for Administrative Health Data vs. Surgical Diagnosis
| Endometriosis Subtype | Sensitivity | Specificity | Kappa (κ) Agreement |
|---|---|---|---|
| Overall Endometriosis | 0.88 | 0.87 | 0.74 |
| Superficial Endometriosis | 0.86 | 0.83 | 0.65 |
| Ovarian Endometriomas | 0.82 | 0.92 | 0.58 |
| Deep Infiltrating Endometriosis | 0.12 | 0.99 | 0.17 |
Deep infiltrating endometriosis shows notably low sensitivity (0.12) in administrative health data, indicating this subtype is not reliably annotated in healthcare records and may require specialized detection approaches [66]. This has significant implications for PRS validation studies that rely on diagnostically coded cohorts.
Table 3: Global Prevalence of Endometriosis and Adenomyosis Subtypes
| Condition/Subtype | Population | Prevalence % (95% CI) | Number of Studies |
|---|---|---|---|
| Adenomyosis (focal) | General | 17% (7-30) | 59 |
| Adenomyosis (diffuse) | General | 15% (9-23) | 59 |
| Peritoneal Endometriosis | General | 6% (1-15) | 68 |
| Ovarian Endometriosis | General | 13% (5-24) | 68 |
| Deep Endometriosis | General | 10% (2-24) | 68 |
| Endometriosis (any) | Infertile women | 38% (25-51) | 68 |
| Adenomyosis (any) | Infertile women | 31% (10-58) | 59 |
Recent systematic reviews indicate that endometriosis affects approximately 38% of women experiencing infertility, with ovarian endometriosis being the most prevalent specific subtype (13%) in the general population [71]. The PRS for endometriosis shows no significant association with adenomyosis, suggesting these conditions are driven by different genetic risk variants despite shared clinical features [11].
Protocol 1: Polygenic Risk Score Development for Endometriosis Subtyping
4.1.1 Study Design and Cohort Identification
4.1.2 Genotyping and Quality Control
4.1.3 PRS Calculation
4.1.4 Statistical Analysis
4.2.1 Surgical Confirmation Protocol
4.2.2 Administrative Data Validation
Figure 1: PRS Development and Validation Workflow for Endometriosis Subtypes
Figure 2: Pathophysiological Pathways in Endometriosis Subtypes
Table 4: Essential Research Reagents and Platforms for Endometriosis PRS Studies
| Category | Specific Product/Platform | Application in Endometriosis PRS Research |
|---|---|---|
| Genotyping Platforms | Illumina Global Screening Array | High-density genotyping for GWAS and PRS derivation [29] |
| Imputation Resources | TOPMed Imputation Server (Version R2) | Genotype imputation using diverse reference panels [29] |
| Analysis Software | PLINK (v1.9/v2.0) | PRS calculation, basic QC, and association testing [9] [29] |
| Analysis Software | FlashPCA | Principal component analysis for population stratification control [29] |
| Analysis Software | METAL | GWAS meta-analysis for variant effect size estimation [9] |
| Biomarker Assays | Proseek Multiplex Inflammation I | Inflammation panel (92 proteins) for subtype characterization [29] |
| Biomarker Assays | Olink Target 96 Platform | High-sensitivity protein biomarker detection [29] |
| Cell Type Characterization | CIBERSORTx Algorithm | Deconvolution of bulk transcriptomic data to estimate cell type proportions [70] |
| Single-Cell Reference | Marečková et al. Endometriosis Atlas | Reference scRNA-seq data for cell type annotation [70] |
The discriminatory accuracy of polygenic risk scores across endometriosis subtypes demonstrates significant variability, with the strongest associations observed for ovarian and infiltrating subtypes compared to peritoneal disease [11]. This heterogeneity likely reflects underlying differences in the genetic architecture and pathophysiological mechanisms driving distinct disease manifestations.
Notably, the combination of PRS with classical clinical risk factors and symptoms represents a promising approach for risk stratification tools [11]. However, researchers must account for the substantial differences in validity of subtype classification across data sources, particularly the poor sensitivity of administrative health data for deep infiltrating disease [66].
Future research directions should include the development of subtype-specific PRS using larger GWAS datasets with well-characterized surgical phenotypes, integration of multi-omics data to enhance predictive power, and investigation of gene-environment interactions across different endometriosis manifestations. The emerging understanding of cellular heterogeneity in endometriosis, particularly the role of MUC5B+ epithelial cells and dStromal late mesenchymal cells, provides new opportunities for refining subtype classification and understanding the biological mechanisms underlying genetic risk [70].
Furthermore, the association between genetic liability to endometriosis and hormonal factors, particularly the causal relationship with lower testosterone levels identified through Mendelian randomization approaches, suggests promising avenues for integrating endocrine biomarkers with genetic risk profiles for improved subtype discrimination [9].
Endometriosis, a complex gynecological disorder affecting approximately 10% of reproductive-aged women, presents significant diagnostic challenges, with average delays of 7-10 years between symptom onset and definitive diagnosis [72] [73]. The gold standard for diagnosis remains laparoscopic surgery with histological confirmation, an invasive approach with inherent risks and limited patient acceptability [73]. This application note provides a comprehensive comparison between emerging polygenic risk score (PRS) methodologies and established diagnostic markers for endometriosis, contextualized within research frameworks for subphenotype investigation and therapeutic development.
Table 1: Comparative performance metrics of PRS versus traditional diagnostic markers for endometriosis
| Parameter | Polygenic Risk Score (PRS) | Traditional Biomarker CA-125 | Laparoscopy (Gold Standard) |
|---|---|---|---|
| Area Under the Curve (AUC) | 0.744 for severe endometriosis (ML model) [47] | Limited standalone diagnostic value; often elevated but nonspecific | Not applicable (definitive diagnosis) |
| Odds Ratio (OR) | OR=1.57-1.72 per SD increase for various subtypes [21] | Not applicable | Not applicable |
| Sensitivity | Varies by model and population | 36-84% (high variability) [47] | High (visual confirmation) |
| Specificity | Varies by model and population | 41-95% (high variability) [47] | High (histological confirmation) |
| Sample Requirement | DNA from blood or saliva | Serum | Tissue biopsy |
| Key Advantages | Captures genetic predisposition; applicable pre-symptomatically; quantifiable risk stratification | Minimally invasive; low cost | Definitive diagnosis; allows concurrent treatment |
| Key Limitations | Currently limited predictive power as standalone tool; population-specific performance | Poor specificity; influenced by menstrual cycle, pregnancy, other pathologies | Invasive procedure; surgical risks; cost |
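Sensitivity and specificity alone do not show how informative a positive result is, because that also depends on how common the disease is in the tested population. The sketch below applies Bayes' rule to obtain positive and negative predictive values. The 10% prevalence is taken from the figure quoted in the introduction, and the sensitivity and specificity pairings are illustrative assumptions, since Table 1 reports ranges for CA-125 rather than matched pairs.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive value from test accuracy and prevalence."""
    tp = sensitivity * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Illustrative pairings drawn from the ends of the CA-125 ranges in Table 1.
for sens, spec in [(0.36, 0.95), (0.84, 0.41)]:
    ppv, npv = predictive_values(sens, spec, prevalence=0.10)
    print(f"sens={sens}, spec={spec}: PPV={ppv:.2f}, NPV={npv:.2f}")
```

The same calculation applies to any binary rule derived from a PRS threshold, which makes it a convenient common scale for comparing genetic and serum markers.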
Purpose: To derive SNP effect sizes for PRS calculation through large-scale genetic association studies.
Detailed Protocol:
Purpose: To quantify CA-125 levels in serum for endometriosis assessment.
Detailed Protocol:
Purpose: To classify endometriosis molecular subtypes for stratified genetic analysis.
Detailed Protocol:
Diagram 1: Integrated workflow for endometriosis diagnostics combining PRS, traditional biomarkers, and clinical assessment
Diagram 2: Molecular pathways in endometriosis showing potential intervention points for PRS and biomarker applications
Table 2: Essential research reagents and computational tools for endometriosis biomarker studies
| Category | Specific Tool/Reagent | Application in Endometriosis Research | Key Features |
|---|---|---|---|
| Genotyping Arrays | Illumina Global Screening Array [29] | PRS variant genotyping | High-throughput SNP coverage |
| Imputation Reference | TOPMed Version R2 [29] | Genotype imputation | Diverse population representation |
| PRS Software | PLINK 1.9 [9], SBayesR [9], GCTB 2.02 [9] | PRS calculation and weighting | Bayesian approaches for improved prediction |
| Biomarker Assays | Proseek Multiplex Inflammation I [29] | Inflammatory protein profiling | 92 inflammation-related proteins |
| Immunoassays | Electrochemiluminescence immunoassay (ECLIA) [29] | Autoantibody detection (e.g., TRAb) | High sensitivity detection |
| Transcriptomic Tools | Illumina NextSeq NGS technology [75] | RNA-seq for molecular subtyping | High-throughput gene expression |
| Clustering Algorithms | ConsensusClusterPlus [74] | Molecular subtype identification | Unsupervised pattern discovery |
| Machine Learning | randomForest, XGBoost, LASSO [47] | Predictive model development | Feature selection and classification |
The integration of PRS with traditional diagnostic markers represents a promising avenue for advancing endometriosis research and clinical management. Current evidence demonstrates that PRS captures a distinct dimension of endometriosis risk—genetic predisposition—that complements the pathophysiological information provided by traditional biomarkers. Notably, PRS shows association with all endometriosis subtypes (ovarian: OR=1.72, infiltrating: OR=1.66, peritoneal: OR=1.51) [21], suggesting broad applicability across disease manifestations.
Critical research gaps remain in optimizing PRS for diverse populations, understanding the genetic factors underlying disease subphenotypes, and integrating multimodal data sources. The identification of distinct molecular subtypes (stroma-enriched S1 and immune-enriched S2) with differential responses to hormone therapy [74] highlights the potential for PRS to guide personalized treatment approaches. Future studies should focus on developing subphenotype-specific PRS models and validating their utility in prospective clinical cohorts.
For researchers and drug development professionals, the protocols and frameworks presented herein provide a foundation for advancing precision medicine approaches in endometriosis, potentially reducing diagnostic delays and improving therapeutic outcomes for this complex condition.
The integration of polygenic risk scores (PRS) into endometriosis care represents a paradigm shift with the potential to redefine early detection and intervention strategies for this complex gynecological condition. Endometriosis, affecting an estimated 10% of women of reproductive age, is characterized by a substantial diagnostic delay of 6-11 years, during which disease progression and pain sensitization may occur [29] [76]. Clinical utility assessment provides a critical framework for evaluating how PRS—as a quantitative measure of genetic susceptibility—can impact patient outcomes and healthcare efficiency when applied to endometriosis subphenotypes. This protocol outlines comprehensive methodologies for establishing the clinical utility of PRS in expediting diagnosis, personalizing interventions, and ultimately improving quality of life for affected individuals.
The clinical landscape for endometriosis diagnosis remains challenging due to non-specific symptoms, the invasiveness of definitive laparoscopic diagnosis, and the absence of reliable non-invasive biomarkers [77] [76]. This diagnostic dilemma creates significant barriers to early intervention:
These limitations underscore the urgent need for innovative risk stratification tools like PRS that can identify candidates for targeted diagnostic interventions earlier in the disease course.
Polygenic risk scores aggregate the effects of numerous genetic variants into a single measure of genetic susceptibility [56]. The standard approach calculates PRS as the sum of risk alleles weighted by their effect sizes derived from genome-wide association studies (GWAS) [78].
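The weighted sum described above can be written as PRS_i = Σ_j β_j · G_ij, where G_ij is the effect-allele dosage (0, 1, or 2) of individual i at variant j and β_j is the GWAS effect size (log odds ratio). The following is a minimal sketch of that calculation, not the pipeline used in the cited studies; the weights, dosages, and function names are hypothetical.

```python
from statistics import mean, pstdev


def polygenic_risk_score(dosages, weights):
    """Sum effect-allele dosages (0, 1, 2) weighted by GWAS effect sizes."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must cover the same variants")
    return sum(d * w for d, w in zip(dosages, weights))


def standardize(scores):
    """Rescale raw scores to mean 0 and SD 1 so effects read as 'per SD'."""
    mu = mean(scores)
    sd = pstdev(scores)
    return [(s - mu) / sd for s in scores]


# Hypothetical log-odds-ratio weights for three variants
weights = [0.12, -0.05, 0.20]

# Hypothetical effect-allele dosages for three individuals
cohort = {
    "A": [0, 1, 2],
    "B": [2, 0, 1],
    "C": [1, 1, 0],
}

raw = {pid: polygenic_risk_score(g, weights) for pid, g in cohort.items()}
z_scores = dict(zip(raw, standardize(list(raw.values()))))

for pid in raw:
    print(pid, round(raw[pid], 3), round(z_scores[pid], 3))
```

Standardizing within the target cohort is what allows associations to be reported as an odds ratio per standard deviation increase in PRS.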
Table 1: Essential Quality Control Steps for PRS Analysis
| Data Component | QC Parameter | Threshold | Rationale |
|---|---|---|---|
| Base Data (GWAS) | Heritability (h²snps) | >0.05 | Ensures sufficient genetic signal |
| Base Data (GWAS) | Effect allele specification | Must be clearly defined | Prevents direction errors in association |
| Target Data | Sample missingness | <0.02 | Reduces genotyping error |
| Target Data | Minor allele frequency | >0.01 | Filters rare variants |
| Target Data | Imputation quality | INFO score >0.8 | Ensures reliable imputed genotypes |
| Target Data | Heterozygosity | P > 1×10⁻⁶ | Identifies sample contamination |
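As an illustration of how the target-data thresholds in the table might be applied, the sketch below filters variants and samples in plain Python. It covers only the missingness, minor allele frequency, and INFO thresholds. The record types, field names, and example values are hypothetical, and in practice these filters are normally applied with dedicated tools such as PLINK.

```python
from dataclasses import dataclass

# Thresholds taken from the QC table above
SAMPLE_MISSING_MAX = 0.02
MAF_MIN = 0.01
INFO_MIN = 0.8


@dataclass
class Variant:
    rsid: str
    maf: float   # minor allele frequency in the target data
    info: float  # imputation INFO score


@dataclass
class Sample:
    sample_id: str
    missing_rate: float  # fraction of genotypes missing for this sample


def passes_variant_qc(variant):
    """Keep common, well-imputed variants."""
    return variant.maf > MAF_MIN and variant.info > INFO_MIN


def passes_sample_qc(sample):
    """Keep samples with a low genotype missingness rate."""
    return sample.missing_rate < SAMPLE_MISSING_MAX


# Hypothetical records for illustration only
variants = [
    Variant("rs_example_1", maf=0.25, info=0.97),
    Variant("rs_example_2", maf=0.004, info=0.99),  # fails MAF
    Variant("rs_example_3", maf=0.10, info=0.62),   # fails INFO
]
samples = [
    Sample("S1", missing_rate=0.001),
    Sample("S2", missing_rate=0.050),               # fails missingness
]

kept_variants = [v.rsid for v in variants if passes_variant_qc(v)]
kept_samples = [s.sample_id for s in samples if passes_sample_qc(s)]
print(kept_variants, kept_samples)
```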
Critical quality control measures must be implemented to ensure PRS validity [78]:
The following diagram illustrates the standard workflow for PRS development and validation:
Recent studies demonstrate the discriminative ability of PRS for endometriosis across diverse cohorts:
Table 2: Performance Metrics of Endometriosis PRS Across Studies
| Cohort | Sample Size | Odds Ratio per SD | P-value | Clinical Implications |
|---|---|---|---|---|
| Surgically Confirmed Cases [79] | 249 cases, 348 controls | 1.59 | 2.57×10⁻⁷ | Strong association with confirmed disease |
| Danish Twin Registry [79] | 140 cases, 316 controls | 1.50 | 0.0001 | Validates genetic component |
| Combined Danish Cohorts [79] | 389 cases, 664 controls | 1.57 | 2.5×10⁻¹¹ | Consistent effect across recruitment strategies |
| UK Biobank [79] | 2,967 cases, 256,222 controls | 1.28 | <2.2×10⁻¹⁶ | Confirmation in large-scale biobank |
| Subtype Analysis (Combined) [79] | | | | |
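An odds ratio per SD can be translated into an approximate discrimination measure. The sketch below assumes that the standardized PRS is normally distributed with equal variance in cases and controls, in which case the mean difference equals ln(OR) and AUC = Φ(ln(OR)/√2). This is a textbook approximation applied to the tabulated odds ratios, not a result reported in [79], and the function name is hypothetical.

```python
import math


def auc_from_or_per_sd(odds_ratio):
    """Approximate AUC implied by an odds ratio per SD of a normal score.

    Assumes equal-variance normal score distributions in cases and controls,
    so that AUC = Phi(ln(OR) / sqrt(2)) = 0.5 * (1 + erf(ln(OR) / 2)).
    """
    d = math.log(odds_ratio)
    return 0.5 * (1.0 + math.erf(d / 2.0))


# Odds ratios per SD taken from the table above
odds_ratios = {
    "Surgically Confirmed Cases": 1.59,
    "Danish Twin Registry": 1.50,
    "Combined Danish Cohorts": 1.57,
    "UK Biobank": 1.28,
}

aucs = {cohort: auc_from_or_per_sd(o) for cohort, o in odds_ratios.items()}
for cohort, auc in aucs.items():
    print(f"{cohort}: approximate AUC = {auc:.3f}")
```

Under these assumptions the tabulated odds ratios correspond to AUC values of roughly 0.57 to 0.63, consistent with the view that the PRS is informative for stratification but limited as a standalone predictor.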
Key findings from these studies indicate:
Robust phenotyping is fundamental to PRS clinical utility assessment for endometriosis subphenotypes. The World Endometriosis Research Foundation Endometriosis Phenome and Biobanking Harmonisation Project (EPHect) has developed standardized instruments for surgical data collection [80]:
Objective: To evaluate the association between PRS and specific endometriosis clinical presentations.
Methodology:
Objective: To determine whether PRS-guided triage reduces time to diagnosis.
Methodology:
Table 3: Essential Research Reagents and Platforms for PRS Investigations
| Reagent/Platform | Specification | Research Function | Considerations |
|---|---|---|---|
| Genotyping Array | Illumina Global Screening Array | Genome-wide variant detection | Coverage of endometriosis-associated SNPs [29] |
| Imputation Reference | TOPMed Panel R2 on GRCh38 | Enhances variant coverage | INFO score >0.8 recommended for QC [29] |
| PRS Calculation Software | PLINK (v1.9+) | Clumping/thresholding method | Industry standard for PRS computation [78] [29] |
| Alternative PRS Tools | PRSice, LDpred | Advanced scoring methods | PRSice automates clumping and thresholding; LDpred uses a Bayesian approach for improved prediction [56] |
| Inflammatory Protein Panel | Olink Multiplex Inflammation I | Analyzes 92 inflammatory proteins | Identifies protein correlates of PRS [29] |
| Quality Control Tools | FlashPCA, standard GWAS QC | Population stratification control | Essential for confounding reduction [78] |
The clinical utility of PRS must be evaluated through multidimensional assessment:
Despite promising associations, several challenges remain for clinical implementation:
Future research should prioritize:
Polygenic risk scores represent a promising tool for enhancing early detection and intervention timing in endometriosis. Current evidence demonstrates significant association between PRS and endometriosis risk across multiple cohorts, with potential for stratifying women based on genetic susceptibility. However, clinical implementation requires careful attention to standardized phenotyping, methodological rigor in PRS calculation, and comprehensive assessment of clinical utility across multiple domains. As research advances, PRS-guided strategies may ultimately reduce the protracted diagnostic journey that currently characterizes endometriosis, enabling earlier intervention and improved quality of life for affected individuals.
The integration of polygenic risk scores (PRS) into endometriosis care represents a promising paradigm shift towards precision medicine. Current evidence, primarily from modeling studies, indicates a positive trend toward the cost-effectiveness of PRS-based strategies. These approaches largely focus on optimizing screening programs and refining eligibility for preventive therapies [83]. However, the field faces significant challenges, including limited real-world evidence, questions concerning the generalizability of findings across diverse populations, and a need to fully account for implementation costs and long-term benefits [83]. The following analysis provides a structured overview of the economic landscape, detailed protocols for evaluation, and essential research tools to advance the cost-benefit understanding of PRS implementation for endometriosis.
Table 1: Summary of Economic Evaluation Evidence for PRS-based Approaches
| Evaluation Aspect | Current Evidence Status |
|---|---|
| Overall Trend | Positive trend towards cost-effectiveness identified in systematic review [83]. |
| Primary Applications | (1) Optimization of cancer screening programs (16 of 24 studies) [83]; (2) refinement of eligibility for preventive therapies (especially in cardiovascular and other diseases) [83]. |
| Analysis Quality | Generally high quality among 24 included cost-utility analyses [83]. |
| Key Methodological Limitations | Reliance on hypothetical cohorts; limited generalizability; insufficient attention to implementation costs and delivery models; focus on clinical benefits only [83]. |
| Evidence Gaps | Limited use of real-world data; issues of population representativeness; gaps in accounting for long-term health and non-health benefits [83]. |
Table 2: Performance Metrics of an Endometriosis-Specific PRS. Data are based on a PRS derived from 14 genetic variants, validated across multiple cohorts [21] [11] [68].
| Cohort | Case Definition | Odds Ratio (OR) per SD increase in PRS | P-value |
|---|---|---|---|
| Danish Clinical Cohort | Surgically confirmed | 1.59 | 2.57 × 10⁻⁷ |
| Danish Registry Cohort | ICD-10 codes | 1.50 | 0.0001 |
| UK Biobank | ICD-10 codes | 1.28 | < 2.2 × 10⁻¹⁶ |
This protocol outlines a methodology for evaluating the long-term economic and health impacts of implementing a PRS for endometriosis risk stratification.
1. Study Design and Model Framework
2. Model Parameters and Data Inputs
3. Outcome Measures
4. Analysis
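The protocol steps above can be illustrated with a deliberately small state-transition sketch: a two-state cohort model (undiagnosed, diagnosed) in which PRS-guided triage raises the annual probability of diagnosis, followed by an incremental cost-effectiveness ratio (ICER). Every parameter value, the two-state structure, and the function names are hypothetical placeholders chosen for illustration and are not estimates from [83] or any other source.

```python
# All values below are hypothetical placeholders, not literature estimates.
COST_UNDIAGNOSED = 1000.0    # annual cost per person while undiagnosed
COST_DIAGNOSED = 1200.0      # annual cost per person once diagnosed and treated
UTILITY_UNDIAGNOSED = 0.70   # annual quality-of-life weight while undiagnosed
UTILITY_DIAGNOSED = 0.80     # annual quality-of-life weight once diagnosed


def run_cohort(p_diagnosis, test_cost=0.0, years=20, discount=0.03):
    """Return discounted (cost, QALYs) per person for a two-state cohort model."""
    undiagnosed, diagnosed = 1.0, 0.0
    total_cost, total_qaly = test_cost, 0.0  # one-off test cost at entry
    for year in range(years):
        weight = 1.0 / (1.0 + discount) ** year
        total_cost += weight * (
            undiagnosed * COST_UNDIAGNOSED + diagnosed * COST_DIAGNOSED
        )
        total_qaly += weight * (
            undiagnosed * UTILITY_UNDIAGNOSED + diagnosed * UTILITY_DIAGNOSED
        )
        # Transition at the end of the cycle: some undiagnosed become diagnosed
        newly_diagnosed = undiagnosed * p_diagnosis
        undiagnosed -= newly_diagnosed
        diagnosed += newly_diagnosed
    return total_cost, total_qaly


def icer(new, old):
    """Incremental cost per QALY gained for the new versus the old strategy."""
    delta_cost = new[0] - old[0]
    delta_qaly = new[1] - old[1]
    if delta_qaly == 0:
        raise ValueError("strategies have identical QALYs; ICER is undefined")
    return delta_cost / delta_qaly


standard = run_cohort(p_diagnosis=0.10)
prs_guided = run_cohort(p_diagnosis=0.15, test_cost=200.0)
result = icer(prs_guided, standard)
print(f"ICER: {result:,.0f} per QALY gained")
```

A full evaluation would replace these placeholders with sourced inputs, add further health states, and explore uncertainty through sensitivity analysis.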
The workflow for this economic evaluation is outlined below.
This protocol describes how to conduct a Phenome-Wide Association Study using an endometriosis PRS to uncover genetic correlations with comorbid conditions, which can inform a more complete assessment of the economic impact of PRS implementation [9].
1. Polygenic Risk Score Calculation
2. Phenotype Data Preparation
3. Statistical Analysis
4. Interpretation and Economic Implications
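Because a PRS-PheWAS tests one score against many phenotypes, the statistical analysis step needs a multiple-testing correction. The sketch below applies a Bonferroni threshold, which is an assumption made here for illustration; the protocol as given does not specify the correction method. The trait labels, p-values, and function name are hypothetical.

```python
def bonferroni_hits(p_values, alpha=0.05):
    """Return the Bonferroni threshold and the traits that pass it."""
    threshold = alpha / len(p_values)
    hits = {trait: p for trait, p in p_values.items() if p < threshold}
    return threshold, hits


# Hypothetical PRS-phenotype association p-values, one per tested phecode
phewas_p = {
    "trait_A": 1e-6,
    "trait_B": 0.004,
    "trait_C": 0.03,
    "trait_D": 0.6,
}

threshold, hits = bonferroni_hits(phewas_p)
print(f"Threshold: {threshold}")
print("Significant traits:", sorted(hits))
```

With hundreds or thousands of phecodes the threshold becomes correspondingly stricter, so less conservative procedures such as false discovery rate control are a common alternative.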
The workflow for this analysis is as follows.
Table 3: Essential Research Materials and Tools for Endometriosis PRS and Economic Research
| Item / Tool | Function / Application | Example / Note |
|---|---|---|
| Genotyping Array | Genome-wide genotyping to generate raw genetic data for PRS calculation. | Illumina Global Screening Array [29]. |
| Imputation Reference Panel | Increases genomic coverage by predicting ungenotyped variants. | TOPMed Imputation Server (Version R2 on GRCh38) [29]. |
| GWAS Summary Statistics | Source of SNP effect sizes (weights) for PRS calculation. | Published large-scale endometriosis GWAS (e.g., GCST004549) [29]. |
| Statistical Software (PLINK) | A core tool for genomic data management, quality control, and PRS calculation. | The --score option is used to calculate PRS [9]. |
| Bayesian Analysis Tool (GCTB) | Implements advanced methods for PRS weighting to improve predictive performance. | SBayesR method for adjusting GWAS summary statistics [9]. |
| Phecode Catalog | Provides the mapping system to convert ICD codes into research-ready phenotype groups (phecodes). | Essential for PRS-PheWAS to define traits for association testing [9]. |
| Health Economic Modeling Software | Platform for building and running state-transition models for cost-effectiveness analysis. | R, TreeAge, SAS, or Excel with specialized add-ins. |
The development of polygenic risk scores for endometriosis subphenotypes represents a transformative approach to addressing the significant diagnostic delays and heterogeneity that have long challenged clinical management. Current evidence demonstrates that PRS can stratify risk for all major endometriosis subtypes, though standalone predictive power remains insufficient for direct clinical implementation. Future research must prioritize the development of subphenotype-specific PRS, integration of multi-omics data including methylation risk scores, and diversification of genetic studies across ancestral populations. For drug development, these tools offer unprecedented opportunities for patient stratification in clinical trials and identification of novel therapeutic targets based on genetic pathways. The convergence of PRS with artificial intelligence and comprehensive biomarker panels will ultimately enable the precision medicine paradigm that endometriosis patients urgently require, potentially reducing diagnostic delays and improving therapeutic outcomes through genetically-informed care pathways.