This is a master's thesis by Songyuan Ji at Memorial University of Newfoundland, Canada, 109 pages in total.
The study of Single Nucleotide Polymorphisms (SNPs) associated with human diseases is important for identifying pathogenic genetic variants and illuminating the genetic architecture of complex diseases. A genome-wide association study (GWAS) examines genetic variation across individuals and detects disease-related SNPs. Traditional machine learning methods typically treat SNP data as a sequence, and may therefore overlook the complex interactions among multiple genetic factors. In this thesis, we propose a new hybrid deep learning approach to identify susceptibility SNPs associated with colorectal cancer. A set of SNP variants was first selected by a hybrid feature selection algorithm and then organized as 3D images using a selection of space-filling curve models. A multi-layer deep Convolutional Neural Network was constructed and trained on those images. We found that images generated with space-filling curve models that preserve the original SNP locations in the genome yield the best classification performance. We also report a set of high-risk SNPs associated with colorectal cancer identified by the deep neural network model.
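The image-construction step in the abstract can be illustrated with a minimal sketch. It assumes a Hilbert curve as the space-filling curve, genotypes coded as 0/1/2 (minor-allele counts), and SNPs already sorted by genomic position; the abstract confirms none of these choices, and the function names `hilbert_d2xy` and `snps_to_image` are hypothetical. The sketch builds a single 2D grid, whereas the thesis describes 3D images.

```python
def hilbert_d2xy(side, d):
    """Map index d along a Hilbert curve to (x, y) on a side x side grid.

    `side` must be a power of two. This is the standard iterative
    index-to-coordinate conversion for the Hilbert curve.
    """
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/flip the quadrant so sub-curves connect end to end.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def snps_to_image(genotypes, side):
    """Place a position-ordered genotype list onto a 2D grid along a Hilbert curve.

    Assumption: `genotypes` holds one integer per SNP (e.g. 0, 1, 2),
    ordered by genomic position. Unused cells stay 0.
    """
    if side < 1 or side & (side - 1) != 0:
        raise ValueError("side must be a power of two")
    if len(genotypes) > side * side:
        raise ValueError("too many SNPs for this grid size")
    grid = [[0] * side for _ in range(side)]
    for d, g in enumerate(genotypes):
        x, y = hilbert_d2xy(side, d)
        grid[y][x] = g
    return grid


if __name__ == "__main__":
    # Toy example: 16 made-up genotype values on a 4 x 4 grid.
    toy = [0, 1, 2, 1, 0, 0, 2, 1, 1, 0, 2, 2, 0, 1, 0, 1]
    for row in snps_to_image(toy, 4):
        print(row)
```

Because consecutive points on a Hilbert curve are always adjacent grid cells, SNPs that are neighbours in the genome stay neighbours in the image, which is one way to read the abstract's remark about preserving original SNP locations.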
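- Project background
- Research methods
- Research results
- Discussion and conclusion
Appendix A.1 Python code for the discriminator
Appendix A.2 Genomic information of the final results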