Biological Language Model. Qiwen Dong
References
[1]Liu B., Wang X., Lin L., Dong Q., Wang X. A discriminative method for protein remote homology detection and fold recognition combining Top-n-grams and latent semantic analysis. BMC Bioinfo, 2008, 9(1): 510.
[2]Liu B., Liu F., Wang X., Chen J., Fang L., Chou K.-C. Pse-in-One: A web server for generating various modes of pseudo components of DNA, RNA, and protein sequences. Nucleic Acids Res, 2015, 43(W1): W65–W71.
[3]Liu B. BioSeq-Analysis: a platform for DNA, RNA and protein sequence analysis based on machine learning approaches. Briefings in Bioinformatics, 2019, 20(4): 1280–1294.
[4]Zamani M., Kremer S.C. Amino acid encoding schemes for machine learning methods. In the 2011 IEEE International Conference on Bioinformatics and Biomedicine Workshops (BIBMW), 2011, pp. 327–333.
[5]Yoo P.D., Zhou B.B., Zomaya A.Y. Machine learning techniques for protein secondary structure prediction: An overview and evaluation. Curr Bioinfo, 2008, 3(2): 74–86.
[6]Hu H.-J., Pan Y., Harrison R., Tai P.C. Improved protein secondary structure prediction using support vector machine with a new encoding scheme and an advanced tertiary classifier. IEEE Trans NanoBiosci, 2004, 3(4): 265–271.
[7]Miyazawa S., Jernigan R.L. Self-consistent estimation of inter-residue protein contact energies based on an equilibrium mixture approximation of residues. Proteins, 1999, 34(1): 49–68.
[8]Lin K., May A.C.W., Taylor W.R. Amino acid encoding schemes from protein structure alignments: Multi-dimensional vectors to describe residue types. J Theor Biol, 2002, 216(3): 361–365.
[9]Asgari E., Mofrad M.R.K. Continuous distributed representation of biological sequences for deep proteomics and genomics. PLoS ONE, 2015, 10(11): e0141287.
[10]Kawashima S., Pokarowski P., Pokarowska M., Kolinski A., Katayama T., Kanehisa M. AAindex: Amino acid index database, progress report 2008. Nucleic Acids Res, 2008, 36(suppl 1): D202–D205.
[11]Wang S., Peng J., Ma J., Xu J. Protein secondary structure prediction using deep convolutional neural fields. Sci Rep, 2016, 6.
[12]Wang J.T.L., Ma Q., Shasha D., Wu C.H. New techniques for extracting features from protein sequences. IBM Syst J, 2001, 40(2): 426–441.
[13]Dayhoff M.O. A model of evolutionary change in proteins. Atlas Prot Seq Struct, 1978, 5: 89–99.
[14]White G., Seffens W. Using a neural network to backtranslate amino acid sequences. Electronic J Biotechnol, 1998, 1(3): 17–18.
[15]Atchley W.R., Zhao J., Fernandes A.D., Drüke T. Solving the protein sequence metric problem. Proc Natl Acad Sci USA, 2005, 102(18): 6395–6400.
[16]Rose G., Geselowitz A., Lesser G., Lee R., Zehfus M. Hydrophobicity of amino acid residues in globular proteins. Science, 1985, 229(4716): 834–838.
[17]Betts M.J., Russell R.B. Amino acid properties and consequences of substitutions. Bioinfo Genet, 2003, 317: 289.
[18]Fauchère J.-L., Charton M., Kier L.B., Verloop A., Pliska V. Amino acid side chain parameters for correlation studies in biology and pharmacology. Chem Biol Drug Design, 1988, 32(4): 269–278.
[19]Radzicka A., Wolfenden R. Comparing the polarities of the amino acids: side-chain distribution coefficients between the vapor phase, cyclohexane, 1-octanol, and neutral aqueous solution. Biochemistry, 1988, 27(5): 1664–1670.
[20]Lohmann R., Schneider G., Behrens D., Wrede P. A neural network model for the prediction of membrane spanning amino acid sequences. Prot Sci, 1994, 3(9): 1597–1601.
[21]Elofsson A. A study on protein sequence alignment quality. Proteins, 2002, 46(3): 330–339.
[22]Oren E.E., Tamerler C., Sahin D., Hnilova M., Seker U.O.S., Sarikaya M., Samudrala R. A novel knowledge-based approach to design inorganic-binding peptides. Bioinformatics, 2007, 23(21): 2816–2822.
[23]Henikoff S., Henikoff J.G. Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci USA, 1992, 89(22): 10915–10919.
[24]Henikoff S., Henikoff J.G. Automated assembly of protein blocks for database searching. Nucleic Acids Res, 1991, 19(23): 6565–6572.
[25]Stormo G.D., Schneider T.D., Gold L., Ehrenfeucht A. Use of the ‘Perceptron’ algorithm to distinguish translational initiation sites in E. coli. Nucleic Acids Res, 1982, 10(9): 2997–3011.
[26]Altschul S.F., Koonin E.V. Iterated profile searches with PSI-BLAST — A tool for discovery in protein databases. Trends Biochem Sci, 1998, 23(11): 444–447.
[27]Remmert M., Biegert A., Hauser A., Söding J. HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nat Meth, 2012, 9(2): 173.
[28]Tanaka S., Scheraga H.A. Medium- and long-range interaction parameters between amino acids for predicting three-dimensional structures of proteins. Macromolecules, 1976, 9(6): 945–950.
[29]Miyazawa S., Jernigan R.L. Estimation of effective interresidue contact energies from protein crystal structures: Quasi-chemical approximation. Macromolecules, 1985, 18(3): 534–552.
[30]Miyazawa S., Jernigan R.L. Residue–residue potentials with a favorable contact pair term and an unfavorable high packing density term, for simulation and threading. J Mol Biol, 1996, 256(3): 623–644.
[31]Skolnick J., Godzik A., Jaroszewski L., Kolinski A. Derivation and testing of pair potentials for protein folding. When is the quasichemical approximation correct? Prot Sci, 1997, 6(3): 676–688.
[32]Simons K.T., Ruczinski I., Kooperberg C., Fox B.A., Bystroff C., Baker D. Improved recognition of nativelike protein structures using