Alex Lewin

Official Imperial home page here


I am a postdoc in the Biostatistics Group in the Department of Epidemiology and Public Health at Imperial and a member of the BGX collaboration between our department, the Statistics group in Bristol and the Imperial College Microarray Centre, which is developing flexible Bayesian models for gene expression (microarray) data.

Most of my work is on models for differential gene expression and gene expression profiles, focussing on Bayesian mixture models with variable numbers of components. I have developed a Bayesian hierarchical model for differential expression, incorporating expression-level-dependent normalisation and stabilisation of gene variances. Our collaboration is integrating this work with the model for Affymetrix perfect match and mismatch data being developed in our group. I have also worked on Bayesian mixture models for classifying gene profiles and estimating the false discovery rate. Current work is to extend and combine these models.

Prior to this I worked on the Landfill project as part of SAHSU, analysing cancer risks in populations living near landfill sites, and developing a Bayesian hierarchical model for the analysis of risk of birth defects around landfill sites.

For my PhD (in the Theoretical Physics Group at Imperial) I worked with Andy Albrecht and Joao Magueijo on detecting non-Gaussianity in the cosmic microwave background, and on the observability of oscillations in the microwave background and large scale structure power spectra. Whilst visiting Andy Albrecht's Cosmology Group at UC Davis as part of my PhD I worked with the Supernova Cosmology Project in Berkeley, comparing different methods of analysing Type Ia supernova light curves.

Papers and Presentations

Microarray Papers:

Lewin, A., Richardson, S., Marshall, C., Glazier, A. and Aitman, T. (2004) Bayesian Modelling of Differential Gene Expression. (submitted)   pdf of paper   supplementary material, including WinBUGS code

NEW This paper has changed (17 September 2004). The main change is the addition of a section which shows the effect on the false positive rate of carrying out normalisation in a pre-processing step rather than as part of an integrated model.

Abstract: We present a Bayesian hierarchical model for detecting differentially expressing genes that includes simultaneous estimation of array effects, and show how to use the output for choosing lists of genes for further investigation. We give empirical evidence that expression-level dependent array effects are needed, and explore different non-linear functions as part of our model-based approach to normalisation. The model includes gene-specific variances but imposes some necessary shrinkage through a hierarchical structure. Model criticism via posterior predictive checks is discussed. Modelling the array effects (normalisation) simultaneously with differential expression gives fewer false positive results. To choose a list of genes, we propose to combine various criteria (for instance, fold change and overall expression) into a single indicator variable for each gene. The posterior distribution of these variables is used to pick the list of genes, thereby taking into account uncertainty in parameter estimates.
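As a rough illustration of the gene-selection idea in the abstract above (not the WinBUGS code from the paper), the sketch below combines two criteria into one indicator per posterior draw and estimates the posterior probability of that indicator for each gene. The function names, the input layout (one list of posterior draws per gene) and the cut-off values are assumptions made for this example only.

```python
from typing import Dict, List


def indicator_probabilities(
    fold_draws: Dict[str, List[float]],
    expr_draws: Dict[str, List[float]],
    fold_cut: float,
    expr_cut: float,
) -> Dict[str, float]:
    """Estimate, per gene, the posterior probability that both criteria hold.

    fold_draws / expr_draws: hypothetical posterior draws (e.g. from MCMC) of
    the log fold change and the overall expression level of each gene.
    """
    probs = {}
    for gene, f_draws in fold_draws.items():
        e_draws = expr_draws[gene]
        # The indicator is 1 for a draw when |fold change| and overall
        # expression both exceed their cut-offs, 0 otherwise.
        hits = sum(
            1
            for f, e in zip(f_draws, e_draws)
            if abs(f) > fold_cut and e > expr_cut
        )
        # The mean of the indicator over draws estimates its posterior probability.
        probs[gene] = hits / len(f_draws)
    return probs


def select_genes(probs: Dict[str, float], threshold: float) -> List[str]:
    """Return the genes whose indicator probability reaches the threshold."""
    return sorted(g for g, p in probs.items() if p >= threshold)
```

Because the probability is averaged over posterior draws rather than computed from point estimates, uncertainty in the parameter estimates carries through to the gene list.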

Broët, P., Lewin, A., Richardson, S., Dalmasso, C. and Magdelenat, H. (2004) A mixture model based strategy for selecting sets of genes in multiclass response microarray experiments. Bioinformatics 2004 20(16):2562-2571; doi:10.1093/bioinformatics/bth285. journal page   journal TOC   software

Abstract: Multiclass response (MCR) experiments are those in which there are more than two classes to be compared. In these experiments, though the null hypothesis is simple, there are typically many patterns of gene expression changes across the different classes that lead to complex alternatives. In this paper, we propose a new strategy for selecting genes in MCR based on a flexible mixture model for the marginal distribution of a modified F statistic. Using this model, false positive and negative discovery rates can be estimated and combined to produce a rule for selecting a subset of genes. Moreover, the method proposed allows calculation of these rates for any predefined subset of genes. We illustrate the performance of our approach using simulated datasets and a real breast cancer microarray dataset from Hedenfalk et al. (2001). In this latter study, we investigate predefined subsets of genes and point out interesting differences between three distinct biological pathways.
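A minimal sketch of how error rates for an arbitrary subset of genes can be read off a fitted mixture model, assuming the model supplies for each gene a posterior probability of belonging to the null component. This is a generic construction rather than the selection rule from the paper; the function names and the `null_probs` input are hypothetical.

```python
from typing import Dict, Set


def estimated_fdr(null_probs: Dict[str, float], selected: Set[str]) -> float:
    """Average posterior null probability over the selected genes."""
    if not selected:
        return 0.0
    return sum(null_probs[g] for g in selected) / len(selected)


def estimated_fnr(null_probs: Dict[str, float], selected: Set[str]) -> float:
    """Average posterior non-null probability over the genes left out."""
    rest = [g for g in null_probs if g not in selected]
    if not rest:
        return 0.0
    return sum(1.0 - null_probs[g] for g in rest) / len(rest)
```

Since both functions take any set of gene names, the same calculation applies to a list chosen by thresholding and to a predefined subset such as the genes of one pathway.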

Microarray Presentations:

Seminar on A Bayesian Model for Simultaneous Normalisation and Differential Expression (Dept. of Statistics, NTNU, Trondheim, November 2004). ppt

Talk on Simultaneous Normalisation and Differential Expression (BBSRC ExGen Grantholders Meeting, Windsor, October 2004). ppt

Seminar on Normalisation, Differential Expression and a Bayesian Mixture Model for estimating the False Discovery Rate (similar talks given in the School of Applied Statistics, Reading, February 2004 and in the Institut Pasteur in Lille, March 2004). pdf ppt

Poster on A Bayesian Mixture Model for estimating the False Discovery Rate (presented at 'Mathematical and Statistical Aspects of Molecular Biology XIV', Cambridge, March 2004 and 'Statistics in Functional Genomics' Workshop, Ascona July 2004). ppt

Poster on Gene-expression index, Normalisation and Differential Expression (joint poster with Anne-Mette Hein, presented at the Royal Statistical Society Conference in Leuven, July 2003). pdf

Talk on Normalisation and Differential Expression (Royal Statistical Society Workshop, Wye, July 2003). pdf

Talk on Gene-expression index, Normalisation and Differential Expression (EPSRC Network 'Systems Theory and Genomics' Workshop, UMIST, September 2002). pdf


Jarup, L., Briggs, D., de Hoogh, C., Morris, S., Hurt, C., Lewin, A., Maitland, I., Richardson, S., Wakefield, J. and Elliott, P. (2002) Cancer risks in populations living near landfill sites in Great Britain. British Journal of Cancer 86, 1732-1736 journal page

Publications from my PhD:

Lewin, A. and Albrecht, A. (2001) Can inflationary models of cosmic perturbations evade the secondary oscillation test? Physical Review D 64 023514 ps/pdf at arXiv

Lewin, A., Albrecht, A. and Magueijo, J. (1999) A new statistic for picking out Non-Gaussianity in the CMB. Monthly Notices of the Royal Astronomical Society 302 131-138 ps/pdf at arXiv

Magueijo, J. and Lewin, A. (1997) Non-Gaussian spectra and the search for cosmic strings. Contribution to the proceedings of "Topological defects and CMB", Rome, October 96 ps/pdf at arXiv

So you want to see the world

Something funny

Gorgeous cats

Gorgeous bookmarks