FRONTIERS IN GENETICS
Isak Falk, Millie Zhao, Juba Nait Saada, Qi Guo
Abstract
Compared to Genome-Wide Association Studies (GWAS) for common variants, single-marker association analysis for rare variants is underpowered. Set-based association analyses for rare variants are powerful tools that capture some of the missing heritability in trait association studies. We extend the convex-optimized SKAT (cSKAT) test set procedure, which learns the optimal convex combination of kernels from data, to the full Generalised Linear Model (GLM) setting with arbitrary non-genetic covariates. We call this extended cSKAT (ecSKAT) and show that the resulting optimization problem is a quadratic programming problem that can be solved at no additional cost compared to cSKAT. ecSKAT enables correcting for important confounders in association studies, such as age, sex or population structure, for both quantitative and binary traits. We show that a modified objective upper bounds the p-value through a decreasing exponential term in the objective, indicating that optimizing this objective is a principled way of learning the combination of kernels. We evaluate the performance of the proposed method on continuous and binary traits using simulation studies and illustrate its application to rare variant association analysis of hand grip strength and systemic lupus erythematosus using UK Biobank Whole Exome Sequencing (WES) data.
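To make the idea of a convex combination of kernels with covariate adjustment concrete, the following is a minimal sketch and not the ecSKAT procedure itself. It assumes a quantitative trait, an ordinary least squares null model, two illustrative kernels (linear and burden), and fixed, hand-picked weights; the quadratic program that learns the weights is not reproduced here. All function names and the simulated data are hypothetical.

```python
import numpy as np

def null_residuals(y, X):
    """Residuals of y after least-squares regression on covariates X plus an intercept.
    Assumption: a linear null model for a quantitative trait."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return y - X1 @ beta

def combined_kernel(kernels, weights):
    """Convex combination sum_i w_i * K_i with w_i >= 0 and sum_i w_i = 1."""
    w = np.asarray(weights, dtype=float)
    if np.any(w < 0) or not np.isclose(w.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(wi * K for wi, K in zip(w, kernels))

def score_statistic(residuals, K):
    """SKAT-style quadratic form Q = r^T K r in the null-model residuals."""
    return float(residuals @ K @ residuals)

# Simulated data (purely illustrative, not from the study).
rng = np.random.default_rng(0)
n, m = 200, 10
G = rng.binomial(2, 0.01, size=(n, m)).astype(float)  # rare-variant genotypes
X = np.column_stack([rng.normal(50, 10, n),           # age-like covariate
                     rng.integers(0, 2, n)])          # sex-like covariate
y = 0.02 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)

K_linear = G @ G.T                  # linear kernel over the variant set
burden = G.sum(axis=1)
K_burden = np.outer(burden, burden)  # burden kernel (rank one)

r = null_residuals(y, X)
weights = [0.7, 0.3]                # hand-picked here; the method learns these from data
Q = score_statistic(r, combined_kernel([K_linear, K_burden], weights))
```

Because the statistic is linear in the weights, `Q` equals the same weighted sum of the per-kernel statistics, which is one reason the weight search can be posed over the simplex of non-negative weights summing to one.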