Computational Prediction of Protein Complexes from Protein Interaction Networks. Sriganesh Srihari

Computational Prediction of Protein Complexes from Protein Interaction Networks - Sriganesh Srihari ACM Books


cells have characterized 1,500–1,880 (or 8–10%) “core” protein-coding genes as essential in human cells [Marcotte et al. 2016, Silva et al. 2008, Wang et al. 2014, Hart et al. 2015, Hart et al. 2014, Wang et al. 2015, Blomen et al. 2015].


      Comparative analyses of proteomes from different species have revealed interesting insights into the evolution and conservation of proteins. For example, it is estimated that the genomes (proteomes) of human and budding yeast diverged about 1 billion years ago from a common ancestor [Douzery et al. 2014], and that the two species share several thousand genes, accounting for more than one-third of the yeast genome [O’Brien et al. 2005, Östlund et al. 2010]. Yeast and human orthologs are highly diverged; the amino-acid sequence similarity between human and yeast proteins ranges from 9% to 92%, with a genome-wide average of 32%. However, sequence similarity reveals only part of the picture [Sun et al. 2016]. Recent studies [Kachroo et al. 2015, Laurent et al. 2015] have reported that nearly half of the 414 essential protein-coding yeast genes tested could be “replaced” by their human orthologs, with replaceability depending on gene (protein) assemblies: genes in the same process tend to be similarly replaceable (e.g., sterol biosynthesis) or not replaceable (e.g., DNA replication initiation).

      Whether in a lower-order model organism or a higher-order complex organism, a protein has to physically interact with other proteins and biomolecules to remain functional. Estimates in humans suggest that over 80% of proteins do not function alone, but instead interact to function as macromolecular assemblies [Berggård et al. 2007]. This organization of individual proteins into assemblies is tightly regulated in cellular space and time, and is supported by protein conformational changes, posttranslational modifications, and competitive binding [Gibson and Goldberg 2009]. On the basis of stability (area of the interaction surface and duration of the interaction) and partner specificity, interactions between proteins are classified as homo- or hetero-oligomeric, obligate or non-obligate, and permanent or transient [Zhang 2009, Nooren and Thornton 2003]. Proteins in obligate interactions cannot exist as stable structures on their own and are frequently bound to their partners upon translation and folding, whereas proteins in non-obligate interactions can exist as stable structures in both bound and unbound states. Obligate interactions are generally permanent (constitutive): once formed, they persist for the entire lifetime of the proteins. Non-obligate interactions may be permanent or transient; in a transient interaction, the protein interacts with its partners for a brief period and then dissociates. Depending on the functional, spatial, and temporal context of the interactions, protein assemblies are classified as protein complexes, functional modules, and biochemical (metabolic) and signaling pathways.

      Protein complexes are the most basic forms of protein assemblies and constitute fundamental functional units within cells. Complexes are stoichiometrically stable structures formed from physical interactions between proteins that come together at a specific time and place. Complexes are responsible for a wide range of functions within cells, including formation of the cytoskeleton, transportation of cargo, metabolism of substrates for the production of energy, replication of DNA, protection and maintenance of the genome, transcription and translation of genes to gene products, maintenance of protein turnover, and protection of cells from internal and external damaging agents. Complexes can be permanent, i.e., once assembled they function for the entire lifetime of cells (e.g., ribosomes), or transient, i.e., assembled temporarily to perform a specific function and disassembled after that (e.g., cell-cycle kinase-substrate complexes formed in a cell-cycle dependent manner).

      Functional modules are formed when two or more protein complexes interact with each other, and often with other biomolecules (viz. nucleic acids, sugars, lipids, small molecules, and individual proteins), at a specific time and place to perform a particular function, and dissociate after that. This molecular organization has been termed “protein sociology” [Robinson et al. 2007]. For example, the DNA replication machinery, highlighted earlier, is formed by a tightly coordinated assembly of DNA polymerases, DNA helicase, DNA primase, the sliding clamp, and other complexes within the nucleus to ensure error-free replication of the DNA during cell division.

      Pathways are formed when sets of complexes and individual proteins interact via an ordered sequence of interactions to transduce signals (signaling pathways) or to metabolize substrates from one form to another (metabolic pathways). For example, the MAPK pathway is composed of a sequence of mitogen-activated protein kinases (MAPKs) that transduce signals from the cell membrane to the nucleus, where they induce the transcription of specific genes. Unlike complexes and functional modules, pathways do not require all components to co-localize in time and space.

      Physical interactions between proteins are fundamental to the formation of protein complexes. Therefore, mapping the entire complement of protein interactions (the “interactome”) occurring within cells (in vivo) is crucial for identifying and characterizing complexes. However, inferring all interactions occurring during the entire lifetime of cells in an organism is challenging, and this challenge increases multifold as the complexity of the organism increases—e.g., for multicellular organisms made up of multiple cell types.
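      As an illustration only (this representation is an assumption of the sketch, not something the text prescribes), an interactome can be stored as an undirected graph in which proteins are nodes and binary physical interactions are edges. The function name `build_interactome` and the protein labels `P1` to `P4` are hypothetical placeholders.

```python
from collections import defaultdict


def build_interactome(pairs):
    """Build an undirected adjacency map from binary protein interactions.

    `pairs` is an iterable of (protein_a, protein_b) tuples. Duplicate and
    reversed pairs collapse into a single edge because neighbors are sets.
    """
    graph = defaultdict(set)
    for a, b in pairs:
        if a == b:
            # Simplification for this sketch: self-interactions
            # (homo-oligomers) are ignored.
            continue
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


# Hypothetical toy data; these are not real proteins or real interactions.
toy_pairs = [("P1", "P2"), ("P2", "P3"), ("P1", "P3"), ("P3", "P4"), ("P2", "P1")]
interactome = build_interactome(toy_pairs)

# Each undirected edge is stored twice (once per endpoint), so halve the total.
num_edges = sum(len(neighbors) for neighbors in interactome.values()) // 2
```

      An adjacency map of sets keeps neighbor lookups cheap and removes repeated observations of the same pair, which is convenient when interactions are pooled from several screens.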

      The development of high-throughput proteomics technologies, including yeast two-hybrid (Y2H) [Fields and Song 1989], co-immunoprecipitation (Co-IP) [Golemis and Adams 2002], and affinity-purification (AP)-based [Rigaut et al. 1999] screens, has revolutionized our ability to interrogate protein interactions on a massive scale, and has enabled global surveys of interactomes from a number of organisms. In particular, up to 70% of the interactions from model organisms including yeast [Ito et al. 2000, Uetz et al. 2000, Ho et al. 2002, Gavin et al. 2002, Gavin et al. 2006, Krogan et al. 2006], fly [Guruharsha et al. 2011], and nematode [Butland et al. 2005, Li et al. 2004] have been mapped. The identification of interactions from higher-order multicellular organisms, including the flowering plant Arabidopsis, the fish Danio (zebrafish), and several mammals (Mus musculus (house mouse), Rattus norvegicus (Norwegian rat), and humans), is rapidly underway; the interactions are cataloged in large public databases [Stark et al. 2011, Rolland et al. 2014].

      The earliest and most widely used experimental technique for capturing binary protein interactions on a high-throughput scale was yeast two-hybrid (Y2H) [Fields and Song 1989]. However, datasets of protein interactions inferred from Y2H screens were found to contain significant numbers of spurious interactions [Von Mering et al. 2002, Bader and Hogue 2002, Bader et al. 2004]. This is attributed in part to the nature of the Y2H protocol, in which all potential interactors are tested within the same compartment (the nucleus), even though some of them never meet during their lifetimes because of compartmentalization (different subcellular localizations) within living cells.
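      The compartmentalization argument above can be made concrete with a small sketch. It flags reported pairs whose annotated subcellular localizations do not overlap. This is an illustrative assumption and not a procedure described in the text; the function name `filter_by_colocalization`, the annotation dictionary, and the labels `P1` to `P4` are all hypothetical.

```python
def filter_by_colocalization(pairs, localization):
    """Split interaction pairs by whether the partners share a compartment.

    `localization` maps a protein to a set of compartment names.
    Pairs with a missing or empty annotation are kept, because the
    absence of an annotation is not evidence against the interaction.
    """
    kept, flagged = [], []
    for a, b in pairs:
        loc_a = localization.get(a)
        loc_b = localization.get(b)
        if not loc_a or not loc_b:
            kept.append((a, b))      # unannotated: cannot judge
        elif loc_a & loc_b:
            kept.append((a, b))      # at least one shared compartment
        else:
            flagged.append((a, b))   # no shared compartment: suspicious
    return kept, flagged


# Hypothetical toy annotations and Y2H-style pairs, for illustration only.
toy_localization = {
    "P1": {"nucleus"},
    "P2": {"nucleus", "cytoplasm"},
    "P3": {"mitochondrion"},
}
toy_pairs = [("P1", "P2"), ("P1", "P3"), ("P2", "P4")]
kept, flagged = filter_by_colocalization(toy_pairs, toy_localization)
```

      Flagged pairs are candidates for closer inspection rather than proven false positives: localization annotations are themselves incomplete, and some proteins move between compartments.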

      Co-immunoprecipitation and affinity-purification (Co-IP/AP) techniques were introduced later, and these are more specific in detecting interactions between co-complexed proteins [Golemis and Adams 2002, Rigaut et al. 1999, Köcher and Superti-Furga 2007]. In these protocols, cohesive groups or complexes of proteins are “pulled down,” and the binary interactions between the proteins are then inferred from the pulled-down groups. Because this inference is indirect, it can over- or under-estimate the number of protein interactions. In the tandem affinity purification (TAP) procedure [Rigaut et al. 1999, Puig et al. 2001], proteins of interest (“baits”) are TAP-tagged and purified in an affinity column together with their potential interaction partners (“preys”). The pulled-down complexes are then subjected to mass spectrometric (MS) analysis to identify their individual components. Although more reliable than Y2H, the TAP/MS procedure is elaborate and, with the inclusion of MS, can also be expensive. The exhaustiveness
