Bioinformatics. Group of authors

the index on the left side of the Gene tab view, the Comparative Genomics → Orthologues link lists the computationally predicted orthologs of the selected gene that Ensembl has identified among the available genome assemblies (Herrero et al. 2016; Figure 4.15). The Location tab provides a graphical view of the genomic context of the gene, similar to the view available at UCSC. The link to the Location tab is at the top of the Gene tab view in Figure 4.14. The Location tab view is shown in Figure 4.16 and depicts, at three different zoom levels, the genomic context of the PAH gene on the GRCh38 genome assembly. The PAH gene has been mapped to chromosome 12, and the top panel shows a cartoon of that chromosome, with the region surrounding the PAH gene outlined in a red box. This red box is expanded in the middle panel of the figure, which shows ∼1 Mb of chromosome 12 around the PAH gene. The genes are shown as colored blocks, with their identifiers noted below them. The region outlined in red in this middle section is further expanded in the large bottom panel, which zooms in on the PAH gene itself. Individual tracks are visible in this view. Note the track called Contigs, a blue bar that represents the underlying assembled contigs. By convention, any transcripts shown above this track are transcribed from left to right. Transcripts drawn below the Contigs track, such as the PAH transcripts, are transcribed on the opposite strand, from right to left.

Figure 4.15 The computationally predicted orthologs of the human PAH gene, from the Comparative Genomics → Orthologues link. Ensembl provides a detailed analysis of the orthologs calculated for each gene.

Figure 4.16 The Location tab for the human PAH gene. The Location tab is divided into three sections.

      Box 4.4 Ensembl Stable IDs

      Ensembl assigns accession numbers to many data types in its database. Each identifier begins with the organism prefix; for human, the prefix is ENS; for mouse, it is ENSMUS; and for anole lizard, it is ENSACA. Next comes an abbreviation for the feature type: G for gene, T for transcript, P for protein, R for regulatory, and so forth. This is followed by a series of digits, and an optional version. The version number increments when there is a change in the underlying data. The gene version changes when the underlying transcripts are updated, and the transcript and protein versions increment when the sequence changes.

      For example, the human PAH gene has the following identifiers:

       ENSG00000171759.9: the identifier of the human PAH gene

       ENST00000553106.5: the identifier of one transcript of the human PAH gene, transcript PAH-215

       ENSP00000448059.1: the identifier of the protein translation of transcript PAH-215, ENST00000553106.5

       ENSR00000056420: the identifier of a promoter of several PAH transcripts
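The identifier layout described in Box 4.4 can be sketched as a small parser. This is a minimal illustration, not Ensembl's own tooling: the names `parse_stable_id`, `StableId`, and `FEATURE_TYPES` are hypothetical, only the four feature letters named in the box are recognized, and the regular expression assumes the prefix is `ENS` followed by optional uppercase species letters.

```python
import re
from typing import NamedTuple, Optional

# Feature-type letters named in Box 4.4. Ensembl uses others as well
# ("and so forth"); this sketch deliberately does not enumerate them.
FEATURE_TYPES = {"G": "gene", "T": "transcript", "P": "protein", "R": "regulatory"}

# Assumed layout: organism prefix (ENS plus optional species letters),
# one feature-type letter, a run of digits, and an optional ".version".
STABLE_ID = re.compile(r"(ENS[A-Z]*)([GTPR])(\d+)(?:\.(\d+))?")


class StableId(NamedTuple):
    prefix: str              # e.g. "ENS" for human, "ENSMUS" for mouse
    feature: str             # spelled-out feature type
    number: str              # digit run, kept as text to preserve leading zeros
    version: Optional[int]   # None when the identifier carries no version


def parse_stable_id(text: str) -> StableId:
    """Split a stable ID into prefix, feature type, digits, and version."""
    match = STABLE_ID.fullmatch(text)
    if match is None:
        raise ValueError(f"not a recognized stable ID: {text!r}")
    prefix, letter, number, version = match.groups()
    return StableId(prefix, FEATURE_TYPES[letter], number,
                    int(version) if version is not None else None)


# The four PAH identifiers listed in Box 4.4.
for example in ("ENSG00000171759.9", "ENST00000553106.5",
                "ENSP00000448059.1", "ENSR00000056420"):
    print(example, "->", parse_stable_id(example))
```

Because the feature letter must be followed directly by digits, the pattern separates a species prefix such as ENSMUS from the feature letter without needing a list of species codes. The digit run is kept as a string so that the identifier can be rebuilt exactly.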
