
MSc Thesis Defense: Ceren Yıldırım, PHYLOGENY-AWARE INFERENCE OF NUCLEOTIDE VARIANT TOLERANCE ACROSS THE GENOME, Date & Time: June 23, 2026 – 1:30 PM, Place: FENS L027

PHYLOGENY-AWARE INFERENCE OF NUCLEOTIDE VARIANT TOLERANCE ACROSS THE GENOME

 

Ceren Yıldırım
Molecular Biology, Genetics, and Bioengineering, MSc Thesis, 2026

 

Thesis Jury

  Assoc. Prof. Ogün Adebali (Thesis Advisor)

  Assoc. Prof. Öznur Taştan

  Prof. Dr. Uğur Özbek

 

  

Date & Time: June 23, 2026 – 1:30 PM

Place: FENS L027

Zoom: https://sabanciuniv.zoom.us/j/2054490808



Keywords: phylogenetics, Mendelian diseases, pathogenicity scoring, single nucleotide variant effect prediction

 

Abstract

 

Accurate classification of single-nucleotide variants remains a challenge in genomics. We present PHACTn, a phylogeny-aware probabilistic method for scoring variant tolerability across the human genome. Using a 470-way mammalian alignment, PHACTn derives scores from an explicit model of nucleotide substitution histories across the phylogenetic tree, requiring no training data, no learned parameters, and no specialised hardware. Evaluated on a dataset comprising pathogenic variants from ClinVar and benign variants from gnomAD, PHACTn outperformed all classical conservation scores, including phyloP, phastCons, and GERP. In non-coding variant prediction, it outperformed all tools on a curated set of variants associated with Mendelian diseases. On the hard-case subset, comprising clinically ambiguous variants where established tools disagree, it ranked first in AUROC, F1, and MCC, suggesting that phylogenetic independence captures complementary information that existing methods largely miss. Despite using only four interpretable parameters, PHACTn remains competitive with or superior to large foundation models such as GPN-MSA and Evo2-7B across multiple variant categories, while requiring a fraction of the computational resources. Each prediction can be traced directly through the phylogenetic tree, offering a level of transparency that sequence-based deep learning models cannot provide. PHACTn thus offers a principled, accessible, and interpretable framework for variant effect prediction, with particular strength in non-coding and clinically ambiguous genomic regions.
