Lewin, A., Richardson, S., Marshall, C., Glazier, A. and Aitman, T. (2005) Bayesian Modelling of Differential Gene Expression. Biometrics (in press).
Abstract: We present a Bayesian hierarchical model for detecting
differentially expressing genes that includes simultaneous
estimation of array effects, and show how to use the output for
choosing lists of genes for further investigation. We give empirical
evidence that expression-level dependent array effects are needed,
and explore different non-linear functions as part of our
model-based approach to normalisation. The model includes
gene-specific variances but imposes some necessary shrinkage through
a hierarchical structure. Model criticism via posterior predictive
checks is discussed. Modelling the array effects (normalisation)
simultaneously with differential expression gives fewer false
positive results. To choose a list of genes, we propose to combine
various criteria (for instance, fold change and overall expression)
into a single indicator variable for each gene. The posterior
distribution of these variables is used to pick the list of genes,
thereby taking into account uncertainty in parameter estimates.
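The gene-selection step described above can be illustrated with a minimal sketch. It assumes posterior draws (for example from MCMC output) are already available for each gene's log fold change and overall expression level. The function names, thresholds and toy draws below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def posterior_indicator_prob(fold_draws, expr_draws, fold_cut, expr_cut):
    """Fraction of posterior draws in which a gene meets both criteria.

    The indicator for one draw is 1 when the absolute log fold change
    exceeds fold_cut AND the overall expression exceeds expr_cut.
    Averaging the indicator over draws estimates its posterior probability.
    """
    hits = sum(
        1
        for d, a in zip(fold_draws, expr_draws)
        if abs(d) > fold_cut and a > expr_cut
    )
    return hits / len(fold_draws)


def select_genes(draws, fold_cut=1.0, expr_cut=4.0, prob_cut=0.75):
    """Return genes whose posterior indicator probability is at least prob_cut."""
    return [
        gene
        for gene, (fold_draws, expr_draws) in draws.items()
        if posterior_indicator_prob(fold_draws, expr_draws, fold_cut, expr_cut)
        >= prob_cut
    ]


# Hypothetical toy draws: gene -> (log fold change draws, expression draws)
draws = {
    "geneA": ([1.2, 1.5, 0.9, 1.4, 1.3], [5.0, 5.1, 4.9, 5.2, 5.0]),
    "geneB": ([0.1, -0.2, 1.1, 0.3, 0.0], [6.0, 6.1, 5.9, 6.2, 6.0]),
    "geneC": ([2.0, 2.1, 1.9, 2.2, 2.0], [3.0, 3.5, 4.1, 3.2, 3.9]),
}

selected = select_genes(draws)
```

Because the indicator is evaluated draw by draw, a gene with a large but poorly determined fold change receives a lower posterior probability than one whose fold change is consistently large, which is how uncertainty in the parameter estimates enters the list.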
Broët, P., Lewin, A., Richardson, S., Dalmasso, C. and Magdelenat, H. (2004)
A mixture model based strategy for selecting sets of genes in multiclass response microarray experiments.
Bioinformatics 2004 20(16):2562-2571;
doi:10.1093/bioinformatics/bth285.
Abstract: Multiclass response (MCR) experiments are those in
which there are more than two classes to be compared. In these experiments,
though the null hypothesis is simple, there are typically many patterns of
gene expression changes across the different classes that lead to complex
alternatives.
In this paper, we propose a new
strategy for selecting genes in MCR based on a
flexible mixture model for the marginal distribution of a modified
F statistic.
Using this model, false positive and negative discovery
rates can be estimated and combined to produce a rule for
selecting a subset of genes. Moreover, the method proposed allows
calculation of these rates for any predefined subset of genes.
We illustrate the performance of our approach using
simulated datasets and a real breast cancer microarray dataset from
Hedenfalk et al. (2001). In this latter study, we investigate
predefined subsets of genes and point out interesting differences between
three distinct biological pathways.
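As a rough illustration of how such rates can be computed once a mixture model has been fitted, the sketch below assumes each gene already has an estimated posterior probability of belonging to the null component. The estimators shown (averaging null probabilities over the selected genes, and non-null probabilities over the remaining genes) are a common construction for mixture models and are an assumption here, not necessarily the exact estimators of the paper. All names and numbers are hypothetical.

```python
def estimated_fdr(null_probs, selected):
    """Estimated false discovery rate: mean null probability among selected genes."""
    if not selected:
        return 0.0
    return sum(null_probs[g] for g in selected) / len(selected)


def estimated_fnr(null_probs, selected):
    """Estimated false negative rate: mean non-null probability among unselected genes."""
    chosen = set(selected)
    unselected = [g for g in null_probs if g not in chosen]
    if not unselected:
        return 0.0
    return sum(1.0 - null_probs[g] for g in unselected) / len(unselected)


# Hypothetical posterior probabilities of the null component for five genes
null_probs = {"g1": 0.05, "g2": 0.10, "g3": 0.60, "g4": 0.90, "g5": 0.95}

# Any predefined subset can be evaluated; here, genes with null probability below 0.5
subset = [g for g, p in null_probs.items() if p < 0.5]

fdr = estimated_fdr(null_probs, subset)
fnr = estimated_fnr(null_probs, subset)
```

The same two functions apply unchanged to a subset defined in advance, such as the genes of a given biological pathway, since they only need the list of gene identifiers.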