Genetic Analysis of Complex Disease. Group of authors


      Identification of Datasets

      It is helpful to identify early on what potential datasets exist or can be collected. Do large families exist or are most cases apparently sporadic? Are large cohort or case–control studies available? Are there repositories of multiplex families with associated clinical data? Are there existing clinical networks or large specialty clinics? Are the necessary phenotype data available in a biobank linked to an existing electronic health record? The answers to these questions determine what study designs are feasible for the trait under study, as discussed in Chapters 3 and 4.

      Develop Study Design

      The exact approach to the disease gene discovery process should be outlined as completely as possible before the project gets underway. With the clinical phenotype in hand, it is possible to determine the best strategy for defining what type of dataset to collect. Participant recruitment is perhaps the longest and most labor‐intensive step in the entire process. It is imperative that the enrollment of participants (particularly when studying multiple members of the same family) proceed with careful consideration of the wishes and norms of the participating individuals, families, and communities. The right of individuals to participate or to refuse participation should receive careful consideration. The informed consent process should explain the study adequately and answer any questions. Critically, confidentiality must be carefully protected. These issues are outlined in detail in Chapter 5.

      Determination of the study design (case–control, cohort, case series, family‐based) is based on the characteristics of the phenotype, the estimated genetic model, and the research objective. For example, the existence of large families with apparent Mendelian segregation suggests that a single major gene could be detected, and a family‐based study would be appropriate. A phenotype with weaker estimated heritability, a pattern of recurrence risks suggesting many genes of small effect, and little familial aggregation would suggest that a case–control study design is most feasible. The process of selecting a study design to answer a research question is reviewed in Chapter 4.

      It is also important to have some sense of the sample size required to identify the genes being sought. When pedigree structures are already available in family‐based studies of single‐gene disorders, power is easily calculated with high confidence for specific genetic models using computer simulation programs. For complex traits, however, genetic models are not as easily specified in advance, and computer simulations often must consider a range of parameter values for the genetic model to describe the power across several competing alternatives. Chapter 12 provides an overview of the available approaches and tools for sample size, power estimation, and genetic simulations.
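
      The idea of describing power across several competing genetic models can be illustrated with a small simulation. The sketch below is not one of the tools reviewed in Chapter 12; it is a minimal example under stated assumptions: a case–control design, a single biallelic marker, a multiplicative allelic model summarized by one odds ratio, control allele frequencies standing in for the population frequencies, and a two‐sided allelic z‐test. All function names and parameter values are hypothetical and chosen only for illustration.

```python
import random
from statistics import NormalDist


def case_allele_freq(control_freq, odds_ratio):
    """Risk-allele frequency in cases implied by the control frequency
    and an allelic odds ratio (assumed multiplicative model)."""
    odds = odds_ratio * control_freq / (1.0 - control_freq)
    return odds / (1.0 + odds)


def draw_allele_count(n_alleles, freq, rng):
    """Count risk alleles among n_alleles independent draws."""
    return sum(rng.random() < freq for _ in range(n_alleles))


def simulate_power(n_cases, n_controls, control_freq, odds_ratio,
                   alpha=0.05, n_reps=500, seed=1):
    """Fraction of simulated studies in which a two-sided
    allelic z-test rejects the null at level alpha."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    p_case = case_allele_freq(control_freq, odds_ratio)
    a_cases, a_controls = 2 * n_cases, 2 * n_controls  # two alleles per person
    rejections = 0
    for _ in range(n_reps):
        x_case = draw_allele_count(a_cases, p_case, rng)
        x_ctrl = draw_allele_count(a_controls, control_freq, rng)
        pooled = (x_case + x_ctrl) / (a_cases + a_controls)
        var = pooled * (1.0 - pooled) * (1.0 / a_cases + 1.0 / a_controls)
        if var == 0.0:
            continue  # no allelic variation in this replicate; cannot reject
        z = (x_case / a_cases - x_ctrl / a_controls) / var ** 0.5
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_reps


def power_grid(n_cases, n_controls, freqs, odds_ratios, **kwargs):
    """Estimated power for every (control_freq, odds_ratio) pair."""
    return {(f, r): simulate_power(n_cases, n_controls, f, r, **kwargs)
            for f in freqs for r in odds_ratios}


if __name__ == "__main__":
    # Hypothetical grid of competing genetic models for 500 cases and 500 controls.
    grid = power_grid(500, 500, freqs=[0.1, 0.3],
                      odds_ratios=[1.2, 1.5], n_reps=200)
    for (freq, oratio), power in sorted(grid.items()):
        print(f"freq={freq:.2f} OR={oratio:.1f} power={power:.2f}")
```

      Each cell of the grid corresponds to one candidate genetic model, so the output shows how sensitive the planned sample size is to the unknown allele frequency and effect size. A real study would also need a significance level adjusted for the number of markers tested, which this sketch leaves at a nominal 0.05.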

      Family‐Based Studies

      Family‐based studies include large extended families, smaller multi‐case families (often affected sibpairs or other affected relative pairs), and discordant sibpair studies. Depending on family structure and the number of individuals collected, these families may be used in linkage analyses (as discussed in Chapter 6) or association studies (Chapter 8). Depending on the genetic architecture of the trait and the frequency of the disease‐associated alleles being sought, this design may offer increased power over population‐based designs.

      Population‐Based Studies

      Approaches for Gene Discovery

      There are two general, but not mutually exclusive, ways to approach gene discovery for complex traits. The first is to take a genome‐wide screening approach. Genomic screening can aim to identify areas of genetic linkage in family‐based designs (Chapter 6) or areas of association in either family‐ or population‐based designs (Chapters 8 and 9). A good genomic screen will attempt to cover the entire human genome using markers evenly spaced across the genome. Current high‐throughput genotyping technologies enable genotyping of hundreds of thousands to millions of single nucleotide polymorphisms in a rapid, inexpensive manner for use in linkage or association studies. More recently, high‐throughput sequencing technology has been used to screen the entire coding sequence of the genome (whole‐exome sequencing, WES) or the entire genome (whole‐genome sequencing, WGS) for trait‐associated variants, without first conducting genome‐wide linkage or association studies. As sequencing costs continue to decline, a shift to “genotyping by sequencing” is likely, in which results from WGS might be used to conduct a genome‐wide screen and follow‐up in a single molecular experiment. These same high‐throughput genotyping and sequencing technologies allow large‐scale examination of gene expression (through gene expression microarrays or RNA‐Seq) and epigenetic changes (through methylation arrays or Methyl‐Seq) in trait‐relevant tissues. The results of such experiments are often used in conjunction with genome‐wide screens to identify high‐priority candidate genes for follow‐up studies. These technologies and their application to genomic studies are discussed in Chapter 10.

      In contrast to the genomic screening approach, a directed screening approach may be used. This approach, sometimes termed a “candidate‐gene” approach, focuses on an area of the genome selected for examination based on prior information. The additional information could come from many sources, including results from a previous genome‐wide screen, results from gene expression studies, genes suggested by pathophysiology, or candidate genes identified in model systems. For example, multiple sclerosis is an autoimmune disease in which the myelin sheaths around nerves are attacked and often destroyed.
