
Program for multiple linear regression with permutation test. The program includes regression through the origin.
Pierre Legendre
December 1999
Département de Sciences Biologiques
Université de Montréal
This program computes a multiple linear regression and performs
tests of significance of the equation parameters using permutations. In
this new version, the regression line can be forced through the origin.
Permutation testing is recommended when the residuals of the regression
equation are not normally distributed; it does not solve the problems
caused by heteroscedasticity. Two permutation methods are available in the
program:
- Permutation of the raw data.
- Permutation of residuals of the full regression model (ter Braak 1990, 1992).
Details about these permutation methods are given in Legendre & Legendre (1998, pp. 606-612) and in Anderson & Legendre (1999).
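The two permutation methods can be illustrated with a minimal Python sketch. It is not a port of the Fortran program: the function names (solve, ols, permutation_test), the two-tailed comparison of |t|, the default of 999 permutations, and the restriction to models with an intercept are all assumptions made for illustration. For permutation of residuals of the full model, the sketch centres each permuted coefficient on the original estimate before dividing by its standard error, which is how the ter Braak method is usually described.

```python
import random


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(y, x):
    """Least squares with an intercept.

    Returns (coefficients, standard errors, fitted values, residuals);
    coefficient 0 is the intercept.
    """
    rows = [[1.0] + list(r) for r in x]
    n, p = len(rows), len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    b = solve(xtx, xty)
    fitted = [sum(bi * ri for bi, ri in zip(b, r)) for r in rows]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    s2 = sum(e * e for e in resid) / (n - p)  # residual variance
    # Diagonal of (X'X)^-1, obtained by solving against each unit vector.
    diag = [solve(xtx, [float(i == j) for i in range(p)])[j] for j in range(p)]
    se = [(s2 * d) ** 0.5 for d in diag]
    return b, se, fitted, resid


def permutation_test(y, x, method="residuals", n_perm=999, seed=0):
    """Permutation test of each partial regression coefficient (slopes only).

    method="raw":       permute the response values among the observations.
    method="residuals": permute the residuals of the full model and add them
                        back to the fitted values (assumed ter Braak scheme).
    Returns (coefficients, reference t statistics, permutational p-values).
    """
    rng = random.Random(seed)
    b, se, fitted, resid = ols(y, x)
    slopes = range(1, len(b))  # index 0 is the intercept, not tested here
    t_ref = [b[j] / se[j] for j in slopes]
    hits = [1] * len(t_ref)  # the reference value counts as one permutation
    for _ in range(n_perm):
        if method == "raw":
            y_star = list(y)
            rng.shuffle(y_star)
            centre = [0.0] * len(b)
        elif method == "residuals":
            e_star = list(resid)
            rng.shuffle(e_star)
            y_star = [f + e for f, e in zip(fitted, e_star)]
            centre = b  # permuted estimates vary around the original ones
        else:
            raise ValueError("method must be 'raw' or 'residuals'")
        b_s, se_s, _, _ = ols(y_star, x)
        for k, j in enumerate(slopes):
            t_star = (b_s[j] - centre[j]) / se_s[j]
            if abs(t_star) >= abs(t_ref[k]):
                hits[k] += 1
    return b, t_ref, [h / (n_perm + 1) for h in hits]
```

Calling permutation_test(y, x, method="raw") or permutation_test(y, x, method="residuals"), with y a list of responses and x a list of predictor rows, returns one p-value per predictor. The smallest attainable p-value is 1/(n_perm + 1), because the reference statistic is included in the count.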
Monte Carlo simulations conducted by Anderson & Legendre (1999) showed that:
- For data with non-normal error structure, permutation tests had type I error closer to the nominal significance level alpha and greater power than the normal-theory t-test.
- Permutation of raw data and permutation of residuals gave asymptotically equivalent results in most situations and provided good approximate tests for partial regression coefficients.
- Permutation of raw data resulted in unstable (and often inflated) type I error when the covariable contained an extreme outlier, regardless of whether the predictor variables were collinear and whether the data were normal or non-normal. Increasing the sample size did not correct this problem.
- The presence of outliers in the covariable did not adversely affect the tests based on permutation of residuals.
Thus, permutation of raw data should not be used when the covariables contain (or may contain) outliers; permutation of residuals should be used in that case.
Program availability
- Macintosh version
  - Fortran source code for Macintosh (file Regression_test.f), which can be compiled using a Fortran compiler. The user may modify the Parameter statement at the beginning of the program, which fixes the size (nmax, mmax) of the largest data matrix that can be analysed.
  - Compiled versions of the program for any Macintosh computer, including MacOS X
  - Program documentation, in Adobe Acrobat format
  - Sample files
- 32-bit DOS version (The executable file is a Win32 "console" executable, not a DOS executable. It therefore cannot run under plain DOS or in a DOS window under Windows 3.x; it runs only in Windows 95/98 or Windows NT consoles.)
  - Fortran source code (file regressn.f)
  - Compiled version of the program for Win32 compatible computers
  - Program documentation, in Adobe Acrobat format
  - Sample files
References
- Anderson, M. J. & P. Legendre. 1999. An empirical comparison of permutation methods for tests of partial regression coefficients in a linear model. Journal of Statistical Computation and Simulation 62: 271-303.
- Legendre, P. & L. Legendre. 1998. Numerical Ecology, 2nd English edition. Elsevier Science BV, Amsterdam. xv + 853 pages.
- "Legendre_Desdevises_JTB_in_press.pdf"
- Sokal, R. R. & F. J. Rohlf. 1995. Biometry - The principles and practice of statistics in biological research. 3rd edition. W. H. Freeman, New York.
- ter Braak, C. J. F. 1990. Update notes: CANOCO version 3.10. Agricultural Mathematics Group, Wageningen.
- ter Braak, C. J. F. 1992. Permutation versus bootstrap significance tests in multiple regression and ANOVA. Pp. 79-86 in: K.-H. Jöckel, G. Rothe & W. Sendler [eds.] Bootstrapping and related techniques. Springer-Verlag, Berlin.
Last updated on Sunday, August 01, 2010 by Philippe Casgrain
Created on Thursday, December 09, 1999