Bioinformatics. Group of authors


categories: Mapping and Sequencing, Genes and Gene Predictions, Phenotype and Literature, mRNA and Expressed Sequence Tag (EST), Expression, Regulation, Comparative Genomics, Variation, and Repeats. Clicking on a track name opens the Track Settings page for that track, providing a description of the data displayed in that track. Most tracks can be displayed in one of the following five modes.

      1. Hide: the track is not displayed at all.

      2. Dense: all features are collapsed into a single line; features are not labeled.

      3. Squish: each feature is shown separately, but at 50% of the height of full mode; features are not labeled.

      4. Pack: each feature is shown separately, but not necessarily on separate lines; features are labeled.

      5. Full: each feature is labeled and displayed on a separate line.
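The five display modes can also be set from a link rather than from the Track Settings page. The following is a minimal sketch, assuming the UCSC Genome Browser's URL convention in which each track is passed as a `trackName=mode` parameter alongside `db` and `position`. The function name `track_url`, the track name `knownGene`, and the coordinates are illustrative assumptions, not values taken from the text; check the track names against the assembly you are using.

```python
from urllib.parse import urlencode

# The five display modes described in the list above, in lowercase.
DISPLAY_MODES = ("hide", "dense", "squish", "pack", "full")


def track_url(db, position, track_modes,
              base="https://genome.ucsc.edu/cgi-bin/hgTracks"):
    """Build a browser link that sets a display mode for each named track.

    db          -- assembly identifier, e.g. "hg38"
    position    -- region string, e.g. "chr14:1-1000" (placeholder value)
    track_modes -- dict mapping a track name to one of DISPLAY_MODES
    """
    for track, mode in track_modes.items():
        if mode not in DISPLAY_MODES:
            raise ValueError(f"unknown display mode {mode!r} for track {track!r}")
    params = {"db": db, "position": position, **track_modes}
    return base + "?" + urlencode(params)


# Hypothetical usage: show an assumed gene track in pack mode.
example = track_url("hg38", "chr14:1-1000", {"knownGene": "pack"})
print(example)
```

Validating the mode before building the link keeps a typo such as "packed" from silently producing a URL that the browser would ignore or reject.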

Figure: The default view of the UCSC Genome Browser, showing the genomic context of the human HIF1A gene.

      Box 4.2 GENCODE

      The consortium makes available two types of GENCODE gene sets. The Comprehensive set encompasses all gene models, and may include many alternatively spliced transcripts (isoforms) for each gene. The Basic set includes a subset of representative transcripts for each gene that prioritizes full-length protein-coding transcripts over partial- or non-protein-coding transcripts. The Ensembl Genome Browser displays the Comprehensive set by default. Although the UCSC Genome Browser displays the Basic set by default, the Comprehensive set can be selected by changing the GENCODE track settings. At the time of this writing, Ensembl is displaying GENCODE v27, released in August 2017. The GENCODE version available by default at the UCSC Genome Browser is v24, from December 2015. More recent versions of GENCODE can be added to the browser by selecting them in the All GENCODE super-track.

      GENCODE and RefSeq both aim to provide a comprehensive gene set for mouse and human. Frankish et al. (2015) have shown that, in human, the RefSeq gene set is more similar to the GENCODE Basic set, while the GENCODE Comprehensive set contains more alternative splicing and exons, as well as more novel protein-coding sequences, thus covering more of the genome. They also sought to determine which gene set would provide the best reference transcriptome for annotating variants. They found that the GENCODE Comprehensive set, because of its better genomic coverage, was better for discovering new variants with functional potential, while the GENCODE Basic set may be better suited for applications where a less complex set of transcripts is needed. Similarly, Wu et al. (2013) compared the use of different gene sets to quantify RNA-seq reads and determine gene expression levels. Like Frankish et al., they recommend using less complex gene annotations (such as the RefSeq gene set) for gene expression estimates, but more complex gene annotations (such as GENCODE) for exploratory research on novel transcriptional or regulatory mechanisms.
