Genome Editing in Drug Discovery. Group of authors
3.3 The Diversity of CRISPR Systems
Thanks to the greater availability of prokaryotic genome sequences, bioinformatic analysis has revealed that CRISPR systems are extremely abundant in prokaryotes, with roughly 40% of bacterial and over 85% of archaeal species harboring these systems (Makarova et al. 2019). They are also highly diverse: the remarkable variety of Cas protein sequences, gene composition, and locus architecture underpins the differences in how each of the three phases of adaptive immunity is performed. CRISPR systems have evolved not only to use different types of nucleic acids (DNA, RNA, or both) as a substrate (Marraffini and Sontheimer 2008; Hale et al. 2009; Kazlauskiene et al. 2016), but also to target different forms of those substrates (i.e. single‐ or double‐stranded) (Ma et al. 2015; Strutt et al. 2018) and a wide spectrum of genomic sequences, thanks to diverse PAM requirements (Mojica et al. 2009; Gasiunas et al. 2020).
The origin and evolution of CRISPR systems are complex, but they seem to have originated through the domestication of various components of mobile genetic elements and their subsequent integration with the “antediluvian” prokaryotic toxin–antitoxin defense systems (Koonin and Makarova 2019). The module of genes responsible for the adaptation phase is thought to originate from a class of transposons carrying a homologue of the cas1 gene, collectively named casposons, while the CRISPR array is thought to originate from the inverted repeats flanking the casposon. The effector module seems to have evolved from a transposon‐encoded nuclease able to target DNA and RNA, which, through a putative series of duplications and subsequent deletions, led to the current collection of effector enzymatic activities (Makarova et al. 2019; Koonin and Makarova 2019).
Importantly, the diversity of Cas systems across species also translates into flexibility in how these systems can be used as tools: one can choose the most suitable CRISPR system for a given target (DNA or RNA), sequence (by choosing a Cas protein with a compatible PAM), or application (by choosing a Cas system with the desired outcome). To fully explore this untapped potential of the microbial CRISPR systems, significant efforts have been made over the past decade to establish a robust classification of CRISPR‐Cas systems. As there are no universally present cas genes that could act as an identifying trait, CRISPR classifications have been based on multiple factors, mainly on comparison of the genomic locus organization and the gene repertoire of a particular system. The most up‐to‐date classification is used in this chapter (Makarova et al. 2019).
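As a rough illustration of choosing a Cas protein by its PAM, the sketch below scans one strand of a DNA sequence for candidate protospacers next to a given PAM. It is a minimal sketch under stated assumptions: the function names, the `pam_side` flag, and the spacer length parameter are hypothetical, only the forward strand is searched, and only the IUPAC codes needed for the example PAMs are supported.

```python
import re

# Subset of IUPAC nucleotide codes; extend as needed (assumption: DNA alphabet only).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "V": "[ACG]"}


def pam_to_regex(pam):
    """Translate a PAM written with IUPAC codes into a regular expression."""
    return "".join(IUPAC[base] for base in pam.upper())


def find_sites(seq, pam, spacer_len, pam_side):
    """Return (start, protospacer) pairs found on the given strand.

    pam_side is "3prime" when the PAM follows the protospacer and
    "5prime" when the PAM precedes it (hypothetical flag names).
    """
    if pam_side not in ("3prime", "5prime"):
        raise ValueError("pam_side must be '3prime' or '5prime'")
    seq = seq.upper()
    pattern = pam_to_regex(pam)
    sites = []
    # A lookahead is used so that overlapping PAM occurrences are all reported.
    for match in re.finditer(f"(?=({pattern}))", seq):
        pam_start = match.start()
        if pam_side == "3prime":
            start = pam_start - spacer_len
        else:
            start = pam_start + len(pam)
        end = start + spacer_len
        # Skip PAMs that sit too close to either end of the sequence.
        if start >= 0 and end <= len(seq):
            sites.append((start, seq[start:end]))
    return sites
```

For example, `find_sites(target, "NGG", 20, "3prime")` would list 20-nt candidates for a Cas9 that reads an NGG PAM on the 3' side, and `find_sites(target, "TTTV", 20, "5prime")` would do the same for a Cas12a that reads a TTTV PAM on the 5' side. These PAM strings are commonly cited values for particular orthologues and are not taken from this chapter.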
All CRISPR‐Cas systems can be divided into two distinct classes based on the organization of the effector complex (Figure 3.2). In class 1 systems, the effector complex is composed of multiple Cas proteins, where individual subunits are needed for binding crRNA and recognition of the target nucleic acids, unwinding, and nucleolytic degradation. In contrast, class 2 systems contain a single multidomain protein, which catalyzes all of the activities necessary for the interference phase. The two classes are further divided into six types (I–VI), with types I, III, and IV belonging to class 1 and types II, V, and VI to class 2, based on the organization of individual cas genes into functional modules, and then further classified into subtypes. The cas gene nomenclature also reflects this classification: the cas genes involved in adaptation or crRNA biogenesis are currently denoted cas1‐11 and are shared across different types. In contrast, the names of effector cas genes are reserved for specific types; for example, type II uses Cas9, type V uses Cas12, and type VI uses Cas13. Subtypes are further specified by suffixes, such as Cas12a, Cas12b, Cas12c, and so on; however, as many of the genes also have older familiar names (e.g. Cpf1 is the original name for Cas12a), these will be mentioned where appropriate as a point of reference. The ancillary genes continue to be referred to by their legacy names (e.g. Csm6).
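The class, type, and effector relationships described above can be captured in a small lookup table. This is only an illustrative sketch: the dictionary and function names are hypothetical, the assignment of types I, III, and IV to class 1 and types II, V, and VI to class 2 follows the Makarova et al. (2019) scheme, and the class 2 effector and legacy names are the ones mentioned in the text.

```python
# Type-to-class assignment per the Makarova et al. (2019) classification.
TYPE_TO_CLASS = {"I": 1, "III": 1, "IV": 1, "II": 2, "V": 2, "VI": 2}

# Signature single-protein effectors of the class 2 types named in the text.
CLASS2_EFFECTOR = {"II": "Cas9", "V": "Cas12", "VI": "Cas13"}

# Legacy names mentioned in the text, keyed by current name.
LEGACY_NAME = {"Cas12a": "Cpf1", "Cas12b": "C2c1"}


def describe_type(cas_type):
    """Summarize the class and effector organization of a CRISPR-Cas type."""
    cas_class = TYPE_TO_CLASS[cas_type]
    if cas_class == 1:
        effector = "multi-subunit complex"
    else:
        effector = CLASS2_EFFECTOR[cas_type]
    return {"type": cas_type, "class": cas_class, "effector": effector}


def with_legacy_name(protein):
    """Append the legacy name in parentheses when one is recorded."""
    legacy = LEGACY_NAME.get(protein)
    return f"{protein} ({legacy})" if legacy else protein
```

For instance, `describe_type("V")` reports class 2 with Cas12 as the effector, and `with_legacy_name("Cas12a")` yields the form "Cas12a (Cpf1)" used later in this chapter.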
While the interference module is the prominent feature in the classification of CRISPR systems, the classes and types also differ mechanistically in how crRNAs are generated and how spacers are acquired. The traits of the main CRISPR‐Cas systems will be discussed, with some differences between subtypes touched upon as well; however, the finer points of further classification are beyond the scope of this chapter, and the reader is invited to consult recent excellent reviews on the topic (Hille et al. 2018; Koonin and Makarova 2019; Makarova et al. 2019; Nussenzweig and Marraffini 2020).
3.3.1 crRNA Biogenesis
Synthesis of crRNAs starts with the transcription of the CRISPR array from a promoter usually located within the leader sequence (Pul et al. 2010; Pougach et al. 2010). Processing of the pre‐crRNA is specific to each CRISPR class: class 1 pre‐crRNAs are cleaved by a dedicated endoribonuclease, whereas class 2 systems employ the same machinery that performs target destruction (Figure 3.3).
In class 1 systems, the processing is performed by either Cas6 or Cas5d ribonucleases (Nam et al. 2012; Carte et al. 2008). Both of these proteins recognize and bind to the hairpin structure formed by the palindromic sequences of the pre‐crRNA, and introduce a cut immediately downstream of it (Figure 3.3a), releasing mature crRNAs (Carte et al. 2010; Haurwitz et al. 2010; Ozcan et al. 2019). Intriguingly, in CRISPR systems whose repeats are not thermodynamically likely to form hairpin structures (namely type I‐A and ‐B, and type III‐A and ‐B) and which, hence, lack inherent discriminatory borders between spacers, Cas6 seems to be able to identify repeat regions by restructuring them to favor the formation of a hairpin or hairpin‐like structure compatible with precise cleavage that will lead to productive crRNAs (Shao et al. 2016; Sefcikova et al. 2017). Mature crRNAs of most type I systems contain part of the repeat sequence at the 5’ end of the spacer and the hairpin at the 3’ end; these do not participate in recognition of the target sequence but seem to be important for the assembly of the effector complex (Jore et al. 2011). Type III crRNA, on the other hand, undergoes additional trimming that removes the hairpin structure (Hale et al. 2008). How these mature crRNAs are paired to the cognate effector complex remains an open question.
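The class 1 processing step described above can be sketched as a string operation: one cut inside every repeat, just downstream of the hairpin, so that each fragment carries a short repeat‐derived 5’ handle, the spacer, and the 3’ hairpin from the next repeat. This is a simplified toy model, not a description of any particular enzyme; the function name, the `handle_len` parameter, and the sequences used with it are hypothetical.

```python
def process_pre_crrna(pre_crrna, repeat, handle_len):
    """Cut once inside every repeat and return the internal fragments.

    The cut is placed handle_len nucleotides before the 3' end of each
    repeat, i.e. immediately downstream of the (assumed) hairpin, so each
    fragment reads: 5' handle + spacer + 3' hairpin portion of the next repeat.
    """
    cut_offset = len(repeat) - handle_len
    cuts = []
    pos = pre_crrna.find(repeat)
    while pos != -1:
        cuts.append(pos + cut_offset)
        # Continue searching after the repeat that was just found.
        pos = pre_crrna.find(repeat, pos + len(repeat))
    # Only fragments bounded by two cuts are returned as mature crRNAs.
    return [pre_crrna[a:b] for a, b in zip(cuts, cuts[1:])]


# Toy repeat (hypothetical): a CCGG/CCGG stem with an AAA loop, then a 4-nt handle.
TOY_REPEAT = "CCGGAAACCGG" + "AUGA"
TOY_PRE_CRRNA = TOY_REPEAT + "UUUUUU" + TOY_REPEAT + "ACACAC" + TOY_REPEAT
```

Calling `process_pre_crrna(TOY_PRE_CRRNA, TOY_REPEAT, 4)` returns two fragments, one per spacer, each beginning with the AUGA handle and ending with the hairpin‐forming part of the following repeat. The additional 3’ trimming seen in type III systems is not modeled here.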
Class 2 systems employ two different strategies to generate mature crRNAs. The first strategy, employed by type V and VI systems, is in principle similar to class 1 crRNA biogenesis (Figure 3.3b). Here, the effector nucleases, such as Cas12a (Cpf1) and Cas13, recognize the repeat hairpin structure within the pre‐crRNA and cleave the RNA within or upstream of it (East‐Seletsky et al. 2016; Fonfara et al. 2016).
A more elaborate strategy is used to generate mature crRNAs in type II and type V‐B systems (Figure 3.3c). These two systems, exemplified by Cas9 and Cas12b (C2c1), require a second noncoding RNA, the trans‐activating CRISPR RNA (tracrRNA), which pairs with the repeat regions within the pre‐crRNA and forms an intermediary between the crRNA and the effector protein (Deltcheva et al. 2011; Shmakov et al. 2015). The stem‐loops of the tracrRNA act as a recruitment site for Cas9 and Cas12b, permitting them to form a ternary pre‐crRNA:tracrRNA:Cas effector complex. The binding of the effector protein not only further stabilizes the interaction between pre‐crRNA and tracrRNA, but also recruits cellular RNase III, which cleaves the RNA:RNA duplex formed by the repeat sequences of the pre‐crRNA and the tracrRNA, releasing the 3’ end of the crRNA (Deltcheva et al. 2011). The 5’ end of the crRNA is processed further by removing the remaining repeat sequence, but the protein involved has remained elusive (Hille et al. 2018; Nussenzweig and Marraffini 2020). Once fully processed, the crRNA paired with its cognate effector protein can patrol the cytosol and confer immunity against invading DNA.
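The pairing between the tracrRNA and the pre‐crRNA repeat can be illustrated by searching a tracrRNA for the longest stretch that is perfectly complementary to the repeat. This is a deliberately simplified sketch: it considers only Watson–Crick pairs in a gap‐free antiparallel duplex (real duplexes may contain mismatches, bulges, or G‐U wobble pairs), it does not model where RNase III cuts, and the function names and sequences are hypothetical.

```python
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return rna.translate(RNA_COMPLEMENT)[::-1]


def longest_duplex(repeat, tracr):
    """Return the longest tracrRNA segment (5'->3') that can pair with the repeat.

    The segment is the longest substring shared by the tracrRNA and the
    reverse complement of the repeat; an empty string means no pairing.
    """
    target = reverse_complement(repeat)
    best = ""
    for i in range(len(target)):
        # Only candidates longer than the current best are worth testing.
        for j in range(i + len(best) + 1, len(target) + 1):
            if target[i:j] in tracr:
                best = target[i:j]
            else:
                break  # extending a non-matching candidate cannot help
    return best


# Toy sequences (hypothetical, for illustration only).
TOY_REPEAT_RNA = "GUUUUAGAGCUA"
TOY_TRACR = "GG" + "UAGCUCUAAAAC" + "AAGGCU"
```

With these toy inputs, `longest_duplex(TOY_REPEAT_RNA, TOY_TRACR)` recovers the 12‐nt anti‐repeat segment embedded in the toy tracrRNA. A brute‐force substring search is used because it is easy to follow; for long sequences a dynamic‐programming or alignment‐based approach would be more appropriate.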
Figure 3.2 Overview of class 1 and class 2 CRISPR