Enhancing Drug-Target Binding Affinity Prediction through Deep Learning and Protein Secondary Structure Integration

Runhua Zhang; Baozhong Zhu; Tengsheng Jiang; Zhiming Cui; Hongjie Wu

doi:10.2174/0115748936285519240110070209

Authors: Zhang R.¹, Zhu B.², Jiang T.³, Cui Z.², Wu H.²
Affiliations:
1. School of Electronic and Information Engineering, Suzhou University of Science and Technology
2. School of Electronic and Information Engineering, Suzhou University of Science and Technology
3. Gusu School, Nanjing Medical University
Issue: Vol. 19, No. 10 (2024)
Pages: 943-952
Section: Life Sciences
URL: https://gynecology.orscience.ru/1574-8936/article/view/643756
DOI: https://doi.org/10.2174/0115748936285519240110070209
ID: 643756

Abstract

Background: Conventional drug discovery is slow and costly. To speed up the discovery of new drugs, artificial intelligence (AI) is increasingly used to predict drug-target binding affinity (DTA). Although many deep learning methods for DTA prediction have been proposed, most rely primarily on the amino acid sequence of the protein. However, drugs interact with their targets at specific local segments of the protein structure, whereas the primary sequence mainly captures global protein features. Sequence-only models therefore fall short of fully describing the relationship between drugs and their targets.

Objective: This study aims to predict DTA with deep learning techniques while incorporating information about the secondary structure of proteins.

Methods: Both the primary sequence and the secondary structure of the protein were used for protein representation. The primary sequence served as the global feature, and the secondary structure served as the local feature. Convolutional neural networks and graph neural networks were used to model the features of target proteins and drug compounds independently. This approach improved the model's ability to capture drug-target interactions.
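As a rough illustration of the two protein channels described in the Methods, the sketch below label-encodes a primary sequence and an aligned secondary-structure string into fixed-length integer lists, the kind of input a convolutional encoder typically consumes. The abstract does not state the encoding actually used, so the 20-letter residue alphabet, the 3-state secondary-structure alphabet (H, E, C), the padding scheme, and every function name here are assumptions made for illustration only.

```python
# Hypothetical alphabets; the abstract does not say which ones the authors used.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues
SS_STATES = "HEC"                      # helix, strand, coil (3-state, assumed)


def build_vocab(symbols):
    """Map each symbol to a positive integer; 0 is reserved for padding/unknown."""
    return {s: i + 1 for i, s in enumerate(symbols)}


AA_VOCAB = build_vocab(AMINO_ACIDS)
SS_VOCAB = build_vocab(SS_STATES)


def encode(text, vocab, max_len):
    """Label-encode a string, truncate it to max_len, and right-pad with zeros."""
    ids = [vocab.get(ch, 0) for ch in text[:max_len]]
    return ids + [0] * (max_len - len(ids))


def encode_protein(sequence, secondary, max_len=10):
    """Return two aligned index lists: primary-sequence and secondary-structure channels."""
    if len(sequence) != len(secondary):
        raise ValueError("sequence and secondary-structure strings must align")
    return encode(sequence, AA_VOCAB, max_len), encode(secondary, SS_VOCAB, max_len)
```

For example, `encode_protein("MKV", "CHH", max_len=5)` yields one index list per channel, each of length 5. In a full model each list would usually pass through an embedding layer before convolution; that part is omitted here.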

Results: We introduce a novel method for predicting DTA. Compared with DeepDTA, our approach achieves a 3.9% increase in the Concordance Index (CI) and a 34% reduction in Mean Squared Error (MSE) on the KIBA dataset.
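For readers unfamiliar with the two metrics reported in the Results, the following is a minimal sketch of how CI and MSE are commonly computed for affinity regression. It uses the usual pairwise definition of CI (pairs with equal true affinity are skipped, and tied predictions count as half-concordant). The function names are illustrative and the code is not taken from the paper.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between measured and predicted affinities."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must be of equal length")
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue  # pairs with equal true affinity are not comparable
            comparable += 1
            # hi indexes the sample with the larger true affinity
            hi, lo = (i, j) if y_true[i] > y_true[j] else (j, i)
            if y_pred[hi] > y_pred[lo]:
                concordant += 1.0
            elif y_pred[hi] == y_pred[lo]:
                concordant += 0.5  # tied predictions count as half-concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A CI of 1.0 means every comparable pair is ranked correctly and 0.5 corresponds to random ranking; for MSE, lower is better. The pairwise loop is quadratic in the number of samples, so large benchmark sets typically call for a vectorized or sorted implementation.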

Conclusion: Our results demonstrate that adding the protein's secondary structure as a local feature yields significantly more accurate DTA prediction than relying on the primary structure alone.

Keywords

Drug-target binding affinity, deep learning, convolutional neural network, graph neural network, protein secondary structure, protein primary sequence.
