1. Devise algorithms that detect population-structure confounders and then select SNP pairs that are strongly associated with complex diseases in genome-wide association studies.
2. Devise a statistics-based machine learning framework for constructing classification/prediction models. These models will be used to predict disease risk from genetic variation profiles.
3. Implement scalable parallel computing tools that speed up the proposed algorithms, addressing the high computational complexity of genome-wide association studies.
4. Identify predictive SNPs for classifying 1) seven common complex diseases from the WTCCC dataset, 2) β-thalassemia/HbE disease, and 3) major depressive disorder.