Importance of GWAS Risk Loci and Clinical Data in Predicting Asthma Using Machine-learning Approaches
- Authors: Qin Z.¹, Liang S.², Long J.³, Deng J.², Wei X.², Yang M.², Tang S.⁴, Li H.²
- Affiliations:
- Department of Respiratory and Critical Care Medicine, First Affiliated Hospital of Guangxi Medical University
- Department of Respiratory and Critical Care Medicine, First Affiliated Hospital of Guangxi Medical University
- Department of Epidemiology and Health Statistics, School of Public Health of Guangxi Medical University
- School of Automation, Xi'an University of Posts and Telecommunications
- Issue: Vol 27, No 3 (2024)
- Pages: 400-407
- Section: Chemistry
- URL: https://kazanmedjournal.ru/1386-2073/article/view/644674
- DOI: https://doi.org/10.2174/1386207326666230602161939
- ID: 644674
Abstract
Introduction: To understand the risk factors for asthma, we combined genome-wide association study (GWAS) risk loci and clinical data to predict asthma using machine-learning approaches.
Methods: A case-control study with 123 asthmatics and 100 controls was conducted in the Zhuang population in Guangxi. GWAS risk loci were detected using polymerase chain reaction, and clinical data were collected. Machine-learning approaches were used to identify the major factors that contribute to asthma.
Results: A total of 14 GWAS risk loci were analyzed together with clinical data, and all machine-learning models were evaluated using 10 repetitions of 10-fold cross-validation. Using GWAS risk loci alone or clinical data alone, the best models achieved area under the curve (AUC) values of 64.3% and 71.4%, respectively. Combining GWAS risk loci and clinical data, XGBoost produced the best model, with an AUC of 79.7%, indicating that combining genetic and clinical data improves performance. We then ranked the features by importance and found the top six risk factors for predicting asthma to be rs3117098, rs7775228, family history, rs2305480, rs4833095, and body mass index.
Conclusion: Asthma-prediction models based on GWAS risk loci and clinical data can accurately predict asthma and thus provide insights into the disease pathogenesis.
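The evaluation scheme named in the Results (10 repetitions of 10-fold cross-validation, scored by AUC) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data are synthetic, a one-feature toy scorer stands in for XGBoost, stratification by class is assumed (the abstract does not say whether folds were stratified), and the helper names `auc` and `repeated_stratified_kfold` are hypothetical.

```python
import random
from statistics import mean


def auc(labels, scores):
    """AUC as the chance that a random case outscores a random control (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def repeated_stratified_kfold(labels, n_splits=10, n_repeats=10, seed=0):
    """Yield (train_idx, test_idx) pairs; every repeat reshuffles within each class."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        for cls in sorted(set(labels)):
            idx = [i for i, y in enumerate(labels) if y == cls]
            rng.shuffle(idx)
            # Deal indices round-robin so each fold keeps the class ratio.
            for position, i in enumerate(idx):
                folds[position % n_splits].append(i)
        for k in range(n_splits):
            train = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield train, folds[k]


# Synthetic stand-in data (NOT the study data): 123 cases, 100 controls, one feature.
rng = random.Random(42)
y = [1] * 123 + [0] * 100
x = [rng.gauss(1.0 if label == 1 else 0.0, 1.0) for label in y]

splits = list(repeated_stratified_kfold(y))
fold_aucs = []
for train, test in splits:
    # Toy "model": learn only the direction of the case/control difference
    # on the training fold, then score the held-out fold with it.
    case_mean = mean(x[i] for i in train if y[i] == 1)
    control_mean = mean(x[i] for i in train if y[i] == 0)
    sign = 1.0 if case_mean >= control_mean else -1.0
    fold_aucs.append(auc([y[i] for i in test], [sign * x[i] for i in test]))

mean_auc = mean(fold_aucs)
print(f"mean AUC over {len(splits)} folds: {mean_auc:.3f}")
```

Averaging over 100 held-out folds rather than a single split reduces the variance of the AUC estimate, which matters for a sample of 223 participants. Swapping the toy scorer for a real classifier would only change the body of the loop.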
About the authors
Zan-Mei Qin
Department of Respiratory and Critical Care Medicine, First Affiliated Hospital of Guangxi Medical University
Email: info@benthamscience.net
Si-Qiao Liang
Department of Respiratory and Critical Care Medicine, First Affiliated Hospital of Guangxi Medical University
Email: info@benthamscience.net
Jian-Xiong Long
Department of Epidemiology and Health Statistics, School of Public Health of Guangxi Medical University
Email: info@benthamscience.net
Jing-Min Deng
Department of Respiratory and Critical Care Medicine, First Affiliated Hospital of Guangxi Medical University
Author for correspondence.
Email: info@benthamscience.net
Xuan Wei
Department of Respiratory and Critical Care Medicine, First Affiliated Hospital of Guangxi Medical University
Email: info@benthamscience.net
Mei-Ling Yang
Department of Respiratory and Critical Care Medicine, First Affiliated Hospital of Guangxi Medical University
Email: info@benthamscience.net
Shao-Jie Tang
School of Automation, Xi'an University of Posts and Telecommunications
Email: info@benthamscience.net
Hai-Li Li
Department of Respiratory and Critical Care Medicine, First Affiliated Hospital of Guangxi Medical University
Email: info@benthamscience.net
