Algorithms in Bioinformatics. Paul A. Gagniuc
Although this comparison is informative, an undersized proteome does not rule out alternative splicing or protein splicing in any of these kingdoms. Animals and plants show the most diverse proteomes, with protein counts well above the average number of genes (Table 2.6). Individual species may show a particularly high proteome diversity compared to these averages. For instance, in plants, Triticum durum (macaroni wheat) contains ∼63k genes and a proteome of ∼190k proteins. Following the same reasoning as above, the proteome of T. durum is ∼197% more diverse than its gene count. A significant difference can also be found in animals. Current NCBI data show that the human genome contains ∼60k genes (the list of annotated features includes protein-coding genes, noncoding genes, and pseudogenes) and a proteome of ∼120k proteins (H. sapiens GRCh38.p13). The proteome of H. sapiens is thus ∼95% more diverse than its gene count.
Note: For the human genome and proteome, however, any stated figure risks becoming outdated over time. In the literature, the number of genes and proteins reported for H. sapiens varies depending on different conventions and/or advances in bioinformatics [234–236]. But why all this uncertainty about the number of genes or proteins? All genes are predicted by bioinformatic means. Many predictions are then verified by aligning sequenced mRNAs against a reference genome. However, many genes are expressed only under special conditions, over certain periods of time, or only once in a lifetime. Their mRNAs may therefore go undetected and unsequenced, which leaves the bioinformatic predictions unconfirmed. In addition, many genes may overlap, and gene promoters can often show bidirectional activity [237–239]. It stands to reason that such elusive genes are difficult to locate with certainty, and that other genes will prove difficult to detect in the future.
Moreover, many results derived from large-scale experiments (e.g. genome studies) fall directly under the umbrella of chaos theory: small changes in the initial parameters of different algorithms can lead to huge variations in the final predictions. This has already been evident over time in the case of the human genome [234–236].
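The percentage comparisons between gene counts and proteome sizes quoted earlier in this section can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the function name `relative_diversity` and the formula (proteins minus genes, divided by genes) are assumptions about the reasoning the text refers to, and the inputs are the rounded counts quoted in the text. Because of that rounding, the results come out close to, but not exactly equal to, the ∼197% and ∼95% stated above, which were presumably computed from the unrounded counts.

```python
def relative_diversity(genes: float, proteins: float) -> float:
    """Percent by which the number of proteins exceeds the number of genes.

    Assumed formula (not given explicitly in this section):
        (proteins - genes) / genes * 100
    """
    if genes <= 0:
        raise ValueError("gene count must be positive")
    return (proteins - genes) / genes * 100.0


# Rounded counts as quoted in the text (∼63k/∼190k and ∼60k/∼120k).
rounded_counts = {
    "Triticum durum": (63_000, 190_000),
    "Homo sapiens": (60_000, 120_000),
}

for species, (genes, proteins) in rounded_counts.items():
    percent = relative_diversity(genes, proteins)
    print(f"{species}: proteome ~{percent:.0f}% larger than the gene count")
```

With unrounded counts taken from the annotation source, the same function should yield values closer to those reported in the text.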
2.9 Conclusions
Bioinformatics is perhaps the field that will lead to a clearer understanding of both the origins and the current mechanisms of life. This chapter provided the context needed to understand the different approaches used in bioinformatics and, possibly, to recognize new ideas “just waiting” to be implemented in the future. First, the chapter described the units of measurement used throughout the book and presented a series of useful conversions, followed by a discussion of the organisms with the largest and smallest genomes. Second, the average genome sizes in the different kingdoms of life were calculated and discussed. In this context, the global features of viral genomes, plasmid DNA, and the various genome-containing organelles found in different eukaryotic organisms were briefly described. Viroids were then mentioned in connection with the properties of catalytic RNAs. Toward the end of the chapter, the discussion gradually shifted from catalytic RNAs to RNA splicing. The frequency of RNA splicing was further highlighted by comparing the average number of genes and proteins in the main kingdoms of life.