Elsevier

Genomics

Volume 79, Issue 1, January 2002, Pages 104-113
Regular Article
Characterization of Variability in Large-Scale Gene Expression Data: Implications for Study Design

https://doi.org/10.1006/geno.2001.6675

Abstract

Large-scale gene expression measurement techniques provide a unique opportunity to gain insight into biological processes under normal and pathological conditions. To interpret the changes in expression profiles for thousands of genes, we face the nontrivial problem of understanding the significance of these changes. In practice, the sources of background variability in expression data can be divided into three categories: technical, physiological, and sampling. To assess the relative importance of these sources of background variation, we generated replicate gene expression profiles on high-density Affymetrix GeneChip oligonucleotide arrays, using either identical RNA samples or RNA samples obtained under similar biological states. We derived a novel measure of dispersion in two-way comparisons, using a linear characteristic function. When comparing expression profiles from replicate tests using the same RNA sample (a test for technical variability), we observed a level of dispersion similar to that obtained with RNA samples from replicate cultures of the same cell line (a test for physiological variability). On the other hand, a higher level of dispersion was observed when tissue samples from different animals were compared (an example of sampling variability). This implies that, in experiments in which samples from different subjects are used, the variation induced by the stimulus may be masked by differences in the subjects' biological state that are unrelated to the stimulus. These analyses underscore the need for replicate experiments to reliably interpret large-scale expression data sets, even with simple microarray experiments.
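The idea of comparing dispersion across two-way comparisons can be illustrated with a minimal sketch. The abstract does not specify the linear characteristic function, so the code below swaps in a generic stand-in: the standard deviation of per-gene log2 ratios between two profiles. The function name `log_ratio_dispersion`, the `floor` parameter, and all intensity values are hypothetical and are not taken from the article.

```python
import math
import statistics


def log_ratio_dispersion(profile_a, profile_b, floor=1.0):
    """Population standard deviation of per-gene log2 ratios.

    This is a generic stand-in, NOT the article's measure.
    Intensities below `floor` are raised to `floor` so the log is defined.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must list the same genes in the same order")
    ratios = [
        math.log2(max(a, floor) / max(b, floor))
        for a, b in zip(profile_a, profile_b)
    ]
    return statistics.pstdev(ratios)


# Made-up intensities for five genes (illustration only, not data from the paper).
same_rna_rep1 = [100.0, 200.0, 400.0, 800.0, 1600.0]
same_rna_rep2 = [110.0, 190.0, 400.0, 820.0, 1580.0]  # same RNA, re-hybridized
other_animal = [200.0, 100.0, 400.0, 1600.0, 800.0]   # different subject

technical = log_ratio_dispersion(same_rna_rep1, same_rna_rep2)
sampling = log_ratio_dispersion(same_rna_rep1, other_animal)
print(f"technical: {technical:.3f}  sampling: {sampling:.3f}")
```

With these invented numbers the between-animal comparison yields the larger dispersion, which mirrors the qualitative pattern the abstract reports; the magnitudes themselves carry no meaning.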

Supplementary Files

The supplementary file, HudsonGENO2001-0291Sup.doc, is a Microsoft Word document that contains Appendix A: Description of Analytic Methods.

* To whom correspondence and requests for reprints should be addressed. Fax: (514) 933-7146. E-mail: [email protected].
