Principles of Virology, Volume 1. Jane Flint
Some computational pipelines are designed to define the abundance and types of viruses in a sample, such as Viral Informatics Resource for Metagenome Exploration (VIROME), the Viral MetaGenome Annotation Pipeline (VMGAP), and Basic Local Alignment Search Tool (BLAST). Other virus discovery programs (MePIC, READSCAN, CaPSID, VirusFinder, and SRSA) rely on nucleotide sequence alignment and will work only for the detection of viruses with high sequence similarity to known viruses. PathSeq, SURPI, VirFind, and VirusHunter identify viruses by amino acid searches, a computationally demanding exercise that is critical for new virus identification. VirusSeeker-Virome (VS-Virome) is a computational pipeline designed for defining both the type and abundance of known and novel viral sequences in metagenomic data sets (Fig. 2.17).
Genome sequences can provide considerable insight into the evolutionary relationships among viruses. Such information can be used to understand the origin of viruses and how selection pressures change viral genomes and to assist in epidemiological investigations of viral outbreaks. When few viral genome sequences were available, pairwise homologies were often displayed in simple tables. As sequence databases increased in size, tables of multiple alignments were created, but these were still based only on pairwise comparisons. Today, phylogenetic trees are used to illustrate the relationships among numerous viruses or viral proteins (Box 2.10). Not only are such trees important tools for understanding evolutionary relationships, but they may allow conclusions to be drawn about biological functions: examination of a phylogenetic tree may allow determination of how closely or distantly a sequence relates to one of known function. Software programs such as AdaPatch, AntiPatch, and AntigenicTree have been developed to produce phylogenetic trees. However, these approaches do not account for horizontal gene transfer, recombination, or the evolutionary relationships between viruses and their hosts, which will require unconventional computational methods to resolve.
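The pairwise comparisons that underlie such tables, and that often serve as the starting point for distance-based trees, can be illustrated with a minimal sketch. It assumes the sequences have already been aligned to equal length with "-" marking gaps. The function names and the toy sequences are hypothetical, and real analyses use dedicated alignment and phylogenetics software rather than anything this simple.

```python
from itertools import combinations


def percent_identity(seq_a, seq_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns in which either sequence has a gap ("-") are ignored
    (an assumption; other conventions count gaps as mismatches).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    matches = sum(1 for x, y in compared if x == y)
    return 100.0 * matches / len(compared)


def identity_table(aligned):
    """Return {(name_1, name_2): percent identity} for every pair of names."""
    return {
        (m, n): percent_identity(aligned[m], aligned[n])
        for m, n in combinations(sorted(aligned), 2)
    }


if __name__ == "__main__":
    # Toy, made-up sequences used only to show the shape of the output.
    toy_alignment = {
        "virus_A": "ATGCATGC",
        "virus_B": "ATGCATGA",
        "virus_C": "ATG-TTGA",
    }
    for pair, identity in identity_table(toy_alignment).items():
        print(pair, round(identity, 1))
```

A table like this grows with the square of the number of sequences, which is one reason trees became the preferred way to summarize relationships among many viruses.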
Algorithms have also been written to apply high-throughput sequencing methods to a variety of genome-wide analyses, including detection of single-nucleotide polymorphisms (SNPs), RNA-seq, ChIP-seq, CLIP, and more (see below).
Viral Reproduction: the Burst Concept
A fundamental and important principle is that viruses are reproduced via the assembly of preformed components into particles: the parts are first made in cells and then assembled into the final product. This simple build-and-assemble strategy is unique to viruses, but the details of how this process transpires are astonishingly diverse among members of different virus families. There are many ways to build a virus particle, and each one tells us something new about virus structure and assembly.
Modern investigations of viral reproduction strategies have their origins in the work of Max Delbrück and colleagues, who studied the T-even bacteriophages starting in 1937. Delbrück believed that these bacteriophages were perfect models for understanding the basis of heredity. He focused his attention on the fact that one bacterial cell usually makes hundreds of progeny virus particles. The yield from one cell is one viral generation; it was called the burst because the viruses that he studied literally burst from the infected cell. Under carefully controlled laboratory conditions, most cells make, on average, about the same number of bacteriophages per cell. For example, in one of Delbrück’s experiments, the average number of bacteriophage T4 particles produced from individual single-cell bursts from Escherichia coli cells was 150 particles per cell.
Another important implication of the burst is that a cell has a finite capacity to produce virus. Multiple parameters limit the number of particles produced per cell. These include metabolic resources, the number of sites for genome replication in the cell, the regulation of release of virus particles, and host defenses. In general, larger cells (e.g., eukaryotic cells) produce more virus particles per cell: yields of 1,000 to 10,000 virions per eukaryotic cell are not uncommon.
A burst occurs for viruses that kill the cell after infection, namely, cytopathic viruses. However, some viruses do not kill their host cells, and virus particles are produced as long as the cell is alive. Examples include filamentous bacteriophages, most retroviruses, and hepatitis viruses.
The One-Step Growth Cycle
The idea that one-step growth analysis can be used to study the single-cell reproductive cycle of viruses originated from the work on bacteriophages by Emory Ellis and Delbrück. In their classic experiment, they added virus particles to a culture of rapidly growing E. coli. These particles adsorbed quickly to the cells. The infected culture was then diluted, preventing further adsorption of unbound particles. This simple dilution step is the key to the experiment: it reduces further binding of virus to cells and effectively synchronizes the infection. Samples of the diluted culture were then taken every few minutes and analyzed for the number of infectious bacteriophages.
When the results of this experiment were plotted, several key observations emerged. The graphs were surprising in that they did not resemble the growth curves of bacteria or cultured cells. After a short lag, bacterial cell growth becomes exponential (i.e., each progeny cell is capable of dividing into two cells) and follows a straight line when plotted on a logarithmic scale (Fig. 2.18A). Exponential growth continues until the nutrients in the medium are exhausted. In contrast, numbers of new viruses do not increase in a linear fashion from the start of the infection (Fig. 2.18B, left). There is an initial lag period in which no infectious viruses can be detected. This lag period is followed by a rapid increase in the number of infectious particles, which then plateaus. The single cycle of virus reproduction produces this “burst” of virus progeny. If the experiment is repeated, such that only a few cells are initially infected, the graph looks different (Fig. 2.18B, right). Instead of a single cycle, there is a stepwise increase in numbers of new viruses with time. Each step represents one cycle of virus infection.
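The contrast among these three curves can be made concrete with a deliberately simplified numerical sketch. The burst size of 150 is taken from the T4 example above. Every other number (lag, doubling time, latent period, rise time, cell numbers) is an assumed, illustrative value, and the function names are hypothetical. The multicycle model also assumes that each released particle infects one previously uninfected cell and that uninfected cells do not divide.

```python
def bacterial_count(t_min, n0=1_000_000, lag_min=20, doubling_min=20):
    """Cell number: constant during the lag, then doubling every doubling_min."""
    if t_min <= lag_min:
        return float(n0)
    return n0 * 2.0 ** ((t_min - lag_min) / doubling_min)


def one_step_yield(t_min, infected_cells=1_000_000, burst_size=150,
                   latent_min=25, rise_min=10):
    """Progeny from a synchronized, single-cycle infection.

    Nothing is detected during the latent period; the yield then rises
    over rise_min minutes and plateaus at infected_cells * burst_size.
    """
    if t_min <= latent_min:
        return 0.0
    fraction_lysed = min(1.0, (t_min - latent_min) / rise_min)
    return infected_cells * burst_size * fraction_lysed


def multi_cycle_yield(t_min, total_cells=1_000_000, initially_infected=10,
                      burst_size=150, cycle_min=35):
    """Cumulative progeny when only a few cells are infected at the start.

    Each completed cycle adds one step: infected cells burst and the
    released particles infect as many remaining cells as are available.
    """
    completed_cycles = int(t_min // cycle_min)
    infected = initially_infected
    uninfected = total_cells - initially_infected
    produced = 0
    for _ in range(completed_cycles):
        released = infected * burst_size
        produced += released
        newly_infected = min(uninfected, released)
        uninfected -= newly_infected
        infected = newly_infected
    return produced


if __name__ == "__main__":
    print("min  bacteria    one-step    multicycle")
    for t in range(0, 176, 35):
        print(t, int(bacterial_count(t)), int(one_step_yield(t)),
              multi_cycle_yield(t))
```

In this sketch the one-step curve reaches its plateau in a single rise, whereas the multicycle curve climbs in discrete steps until no uninfected cells remain.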
Figure 2.17 Workflow for VS-Virome. Shown is the computational pipeline designed for defining the type and abundance of known and novel viral sequences in metagenomic data sets. VS-Virome first preprocesses the sequences (left): it removes adapter sequences (these are added to every DNA fragment in the sample and contain barcoding sequences, primer binding sites, and sequences for immobilizing the DNA), joins paired-end reads if they overlap, performs quality control on sequences, and identifies low-complexity sequences and host sequences. All the sequences are then subjected to BLAST (right) to detect viral sequences. Because integrated prophages are found in bacterial genomes, alignment to comprehensive databases could lead to removal of bona fide bacteriophage sequences. Bacteriophage hits are therefore placed into a separate output file. Candidate eukaryotic viral sequences are filtered to remove sequences that have high identity to bacterial genomes. Remaining reads are then aligned to the more comprehensive GenBank NT and NR databases to identify reads or contigs that have greater similarity to nonviral sequences than to viral sequences (i.e., increased likelihood of being a false positive). To have a high degree of confidence in viral classification, sequences that have significant hits to both viral and any nonviral reference sequence are placed in an “ambiguous” bin. Sequences in the viral bin have significant alignments only to viral sequences.
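The binning rules described in this legend can be summarized in a short sketch. This is not VS-Virome code: the Hit record, the category labels, the "unassigned" bin, and the E-value cutoff of 1e-5 are all assumptions made for illustration. The sketch only encodes the order of decisions given in the legend: bacteriophage hits are set aside first, and remaining candidates with any significant nonviral hit are treated as ambiguous.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One alignment hit for a read or contig (hypothetical record)."""
    subject_kind: str  # "eukaryotic_virus", "phage", or "nonviral"
    evalue: float


def assign_bin(hits, max_evalue=1e-5):
    """Assign a sequence to a bin from its alignment hits.

    max_evalue is an assumed significance cutoff, not a documented default.
    """
    significant = {h.subject_kind for h in hits if h.evalue <= max_evalue}

    # Phage hits go to their own output before the comprehensive search,
    # so that prophage regions in bacterial genomes do not remove them.
    if "phage" in significant:
        return "phage"
    if "eukaryotic_virus" not in significant:
        return "unassigned"
    # Significant hits to both viral and nonviral references: ambiguous.
    if "nonviral" in significant:
        return "ambiguous"
    return "viral"


if __name__ == "__main__":
    examples = {
        "read_1": [Hit("eukaryotic_virus", 1e-20)],
        "read_2": [Hit("eukaryotic_virus", 1e-20), Hit("nonviral", 1e-30)],
        "read_3": [Hit("phage", 1e-10), Hit("nonviral", 1e-50)],
        "read_4": [Hit("nonviral", 1e-50)],
    }
    for name, hits in examples.items():
        print(name, assign_bin(hits))
```

Keeping an explicit ambiguous bin trades sensitivity for confidence: a sequence is called viral only when nothing nonviral matches it significantly.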
Once