Williams Hematology

Part IV: Molecular and Cellular Hematology. Chapter 11: Genomics




subclones can be defined by their somatic mutational landscape from high depth NGS, where the digital nature of the NGS data is exploited by algorithmic clustering of mutations that share the same variant allele fraction (VAF). In particular, the VAF of any mutation is defined as the fraction of sequencing reads that contain the somatic variant (as compared to the germline or inherited nucleotide at that locus). Changes in the heterogeneity of cancer cell populations can be studied by comparing data from temporal sampling of a patient, such as at diagnosis and disease relapse.

NEXT-GENERATION SEQUENCING–BASED COMPREHENSIVE GENOMICS: FROM STUDIES OF THE TRANSCRIPTOME TO DNA METHYLATION TO CHROMATIN ACCESSIBILITY AND MODIFICATIONS

The study of modern genomics by NGS methods is not limited to the sequencing of genomic DNA but also can include (1) the characterization of RNA transcripts, (2) the physical structure of genomes including chromatin organization and protein–DNA interactions, and (3) the identification of specific chemical modifications to nucleotides and histones.³⁷

Analysis of the Transcriptome: RNA Sequencing

RNA sequencing (RNA-seq) involves the conversion of RNA into complementary DNA (cDNA) by reverse transcription followed by NGS library construction.³⁸ RNA-seq uses the digital nature of NGS technology to quantify levels of RNA transcripts. Previously, microarrays (designed with a fixed content of gene-specific probes) were used to assay gene expression by hybridization to reverse-transcribed RNA isolates. By contrast, RNA-seq offers the advantages of comprehensive and less-biased data analysis, with a broader dynamic range for detection of high and low abundance transcripts. With the single base resolution provided by RNA-seq, one can determine the expression of specific mutant alleles present in the germline or in cancer samples, which may be highly relevant for implementing a small molecule or immunotherapy-based targeted therapeutic. RNA-seq data can be analyzed to detect the expression of alternatively spliced isoforms of transcribed genes or to detect the transcriptional product(s) of gene fusions in cancer cells. RNA-seq can be produced as either single- or paired-end reads, where the latter are better suited to detect alternative splicing and gene fusions. Additionally, RNA-seq data can identify strand specificity of the DNA template, wherein RNA derived from the antisense strand may play an important role in regulating gene expression. Finally, the insert size of the RNA-seq libraries can be targeted to enrich for different subsets of the transcriptome. Small fragment size libraries (approximately 15 to 70 bp) enrich for microRNA (miRNA), short-interfering RNA (siRNA) and PIWI-interacting RNA (piRNA), intermediate size libraries (approximately 70 to 200 bp) enrich for small nuclear (snRNA) and small nucleolar RNA (snoRNA), and larger fragment libraries (excluding fragments less than 200 bp) enrich for messenger RNA (mRNA) and long noncoding RNA (lncRNA).

There are many protocols for RNA-seq, including different commercially available kits that exploit the aforementioned experimental focus areas. For example, protocols to study the “transcriptome,” which is defined as all the expressed RNA from a given cell or cell population, are often optimized to preferentially target one (or more) types of RNA that are pertinent to a particular area of clinical or research interest. Thus, a researcher interested only in detecting gene expression of annotated mRNA transcripts would choose either an RNA-seq protocol that included ribosomal RNA (rRNA) depletion (rRNA may represent up to 60 percent of transcripts in a cell) or one that used an initial poly-A enrichment step (as rRNAs are not polyadenylated). By comparison, noncoding RNAs play a role in many cellular processes but are not polyadenylated, so even though poly-A enrichment would not be applied, a protocol that preserves strand specificity should be.

RNA is a less-stable molecule than DNA and hence assessing the quality of the isolated RNA prior to creating a sequencing library is of paramount importance. The source for the RNA may be fresh tissue, fresh-frozen tissue, or formalin-fixed, paraffin-embedded (FFPE) tissue, and each of these sources may influence the quality of the resulting RNA. RNA derived from FFPE tissue is often at least partially degraded because of formalin crosslinks with the RNA backbone that result in breakage. Similarly, the amount of RNA available from clinical specimens is often quite limited, making necessary the use of RNA amplification prior to library construction, or the use of hybrid capture probes to enrich the on-gene yield of sequencing data from low input sources.³⁹

As the analysis of RNA-seq data is distinct in many ways compared to DNA sequencing data analysis, multiple software tools are available to characterize differential gene expression, differential splicing, gene fusion detection, and allele-specific expression.⁴⁰,⁴¹ In regard to cancer-specific analyses of RNA, a paired “normal” comparator from adjacent nonmalignant cells is often not available (or even understood), which complicates the analysis and interpretation of RNA-seq data. However, efforts are now cataloguing expression in normal human tissues and providing these results in public databases for comparison purposes.

Next-Generation Sequencing–Based Studies of Chromatin Modifications

Chromatin immunoprecipitation followed by NGS-based whole-genome sequencing is known as ChIP-seq.⁴² When studying chromatin modifications (Chap. 12), the targets are often transcription factors or specific histone modifications (such as methylation or acetylation) that may be important for regulation of gene expression. In brief, ChIP-seq begins with standard chromatin immunoprecipitation: protein and DNA are crosslinked in growing cell culture, the fixed and crosslinked DNA–protein complexes are fragmented, immunoprecipitated with an antibody specific for the protein of interest, and the DNA isolated from the precipitated material. After DNA isolation, a standard NGS library is prepared by adapter ligation and sizing, and the DNA is sequenced by standard NGS methods. Given the digital nature of NGS, the number of reads aligning to a particular area of the genome is directly proportional to the amount of input DNA from that region. Thus, one can determine “peaks” with a statistically significant increased number of aligned reads and infer that the genomic regions underlying the peaks are the specific areas where the protein of interest was bound to the DNA.⁴³,⁴⁴ Antibody specificity and avidity remain key determinants for the validity of ChIP-seq data, as does identifying the appropriate coverage cutoff value that determines a “peak.”

Next-Generation Sequencing–Based Studies of Chromatin Accessibility

The interaction of DNA and proteins to form chromatin plays an increasingly recognized role in the study of genomics and epigenomics (Chap. 12). Several methods using NGS-based approaches can interrogate the physical structure of DNA. These methods, which fragment DNA based on the accessibility of chromatin, allow for the determination of nucleosome positioning and inferred protein–DNA binding sites. Although these studies are not a direct method for determining specific protein–DNA binding sites, one can use sequence from the inferred protein–DNA binding sites as an indirect method for assaying
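
As an illustration of the variant allele fraction (VAF) defined earlier in this section, the following minimal Python sketch computes the VAF as the fraction of reads carrying the somatic variant and then groups mutations with similar VAFs into candidate subclones. The `Variant` record, the read counts, and the `tolerance` parameter are hypothetical, and the simple one-dimensional grouping is only a stand-in for the algorithmic clustering methods used in practice, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    name: str       # hypothetical label for the mutation
    alt_reads: int  # reads that contain the somatic variant
    ref_reads: int  # reads that contain the germline nucleotide


def vaf(v: Variant) -> float:
    """Variant allele fraction: variant-bearing reads / all reads at the locus."""
    total = v.alt_reads + v.ref_reads
    if total == 0:
        raise ValueError(f"no reads cover {v.name}")
    return v.alt_reads / total


def cluster_by_vaf(variants, tolerance=0.05):
    """Group variants whose sorted VAFs differ by at most `tolerance`.

    Assumption: a greedy chain over the sorted VAFs, used here only to
    illustrate the idea of clustering mutations that share a VAF.
    """
    clusters = []
    for v in sorted(variants, key=vaf):
        if clusters and vaf(v) - vaf(clusters[-1][-1]) <= tolerance:
            clusters[-1].append(v)
        else:
            clusters.append([v])
    return clusters


# Made-up read counts at roughly 100x depth, for illustration only.
example = [
    Variant("A", 48, 52),
    Variant("B", 45, 55),
    Variant("C", 12, 88),
    Variant("D", 10, 90),
]
example_clusters = cluster_by_vaf(example)
```

In this made-up example, variants A and B fall into one group with a VAF near 0.45 to 0.48 and variants C and D into a second group near 0.10 to 0.12, which would be read as two candidate subclones. Repeating the same calculation on a later sample from the same patient is one way to follow how these groups change between diagnosis and relapse.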

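
The idea of calling ChIP-seq “peaks” from aligned read counts, described above, can be sketched in a few lines of Python. The sketch assumes that the genome has already been divided into equal-width bins with one read count per bin, and it models background counts as Poisson distributed with a rate equal to the mean bin count. Both the Poisson background and the `alpha` cutoff are assumptions made for illustration; the text states only that peaks have a statistically significant excess of reads and that the choice of cutoff matters.

```python
import math


def poisson_tail(k: int, lam: float) -> float:
    """P(X >= k) for X ~ Poisson(lam), with each term computed in log space."""
    cdf = sum(
        math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))
        for i in range(k)
    )
    return max(0.0, 1.0 - cdf)


def call_peaks(bin_counts, alpha=1e-3):
    """Return indices of bins whose read count is unexpectedly high.

    Assumptions: the background rate is the mean count over all bins
    (peaks themselves inflate it slightly), and `alpha` is an arbitrary
    significance cutoff chosen for this sketch.
    """
    if not bin_counts:
        return []
    lam = sum(bin_counts) / len(bin_counts)
    if lam == 0:
        return []  # no reads anywhere, so nothing can be called
    return [
        i for i, count in enumerate(bin_counts)
        if poisson_tail(count, lam) < alpha
    ]


# Made-up counts of aligned reads in ten consecutive bins.
counts = [4, 5, 3, 6, 40, 5, 4, 2, 5, 6]
peak_bins = call_peaks(counts)
```

With these made-up counts only the bin holding 40 reads is reported. Raising or lowering `alpha` changes which bins qualify, which is the coverage cutoff issue noted in the text. An unenriched input control would normally give a better background estimate than the sample mean used here.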





