
It's in the works!
I apologize to my French-speaking colleagues. Though this page is not available in French, all the programs are fully bilingual. When the development cycle is completed, I will make sure this website is bilingual; for now, as things change rapidly, it's just not worth the effort.
Distribution of R 4.0 preview releases has started. I am working full-time on Version 4.0 and a more complete alpha release. Version 4.0 is now an integrated package, not a collection of separate modules. Up-to-the-minute progress information is found below. Stay tuned by subscribing to the mailing list and bookmarking this page.
You may now download the latest build directly from our ftp site:
Current progress (as of 01/08/10; 14:16:39) on version 4.0d10:
(modifications are in bold)
Currently being worked on:
- MacOS X version
- Cocopan
- Windows version (preliminary tests)
- Integration of StoneTable to replace the List Manager
- >16k cells displayed
- Non-scrolling row/col headers
- Easy in-cell editing
- Row/col headers
- Values within cells
- Add/delete rows and columns (see Copy and Paste)
- The User's manual is being written as the program is being developed, and is distributed with the program
What works and what's new in this version
- Import of CANOCO files of any format
- Modules
- SIMIL (Tue 09 Mar 2004 at 23:14:37 by PhC)
- Similarity coefficients (between objects) tested against version 3
- S01 — (a+d)/(a+b+c+d) Simple matching coeff. (Sokal & Michener)
- S02 — (a+d)/(a+2b+2c+d) (Rogers & Tanimoto)
- S03 — (2a+2d)/(2a+b+c+2d)
- S04 — (a+d)/(b+c)
- S05 — (1/4) [a/(a+b) + a/(a+c) + d/(b+d) + d/(c+d)]
- S06 — ad/√[(a+b)(a+c)(b+d)(c+d)]
- S07 — a/(a+b+c) Coefficient of community (Jaccard)
- S08 — 2a/(2a+b+c) (Sørensen, Dice)
- S09 — 3a/(3a+b+c)
- S10 — a/(a+2b+2c)
- S11 — a/(a+b+c+d) (Russell & Rao)
- S12 — a/(b+c) (Kulczynski)
- S13 — (1/2) [a/(a+b)+a/(a+c)] (Kulczynski)
- S14 — a/√[(a+b)(a+c)] (Ochiai)
- S15 — Σ(w[i] s[i])/Σ(w[i]) (Gower, symmetrical)
- S16 — Σ(w[i] s'[i])/Σ(w[i]) (Estabrook & Rogers)
- S17 — 2W/(A+B) (Steinhaus)
- S18 — (1/2) [(W/A) + (W/B)] (Kulczynski)
- S19 — Σ(w[i] s[i])/Σ(w[i]) (Gower, asymmetrical)
- S20 — Σ(w[i] s'[i])/Σ(w[i]) (Legendre & Chodorowski)
- S21 — Chi-square similarity (Roux & Reyssac)
- S22 — Chi-square probabilistic similarity
- S23 — Goodall's probabilistic coefficient
- S24 — [a/√((a+b)(a+c))] - 0.5√(a+c) (Fager & McGowan)
- S25 — 1 - p(Chi-square) (Krylov)
- S26 — [ a + (d/2) ]/(a+b+c+d) (Faith)
- NEI — Nei's genetic similarity (bounded between 0 and 1)
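The binary coefficients above are all functions of four counts. A minimal sketch follows, assuming the conventional definitions (a = joint presences, b and c = presences in only one of the two objects, d = joint absences), which this page does not spell out. The function names are hypothetical and are not taken from SIMIL itself.

```python
import math

def binary_counts(x1, x2):
    """Return (a, b, c, d) for two presence/absence vectors of equal length.

    Assumed convention: a = both present, b = present in x1 only,
    c = present in x2 only, d = both absent.
    """
    a = sum(1 for u, v in zip(x1, x2) if u and v)
    b = sum(1 for u, v in zip(x1, x2) if u and not v)
    c = sum(1 for u, v in zip(x1, x2) if not u and v)
    d = sum(1 for u, v in zip(x1, x2) if not u and not v)
    return a, b, c, d

def s01_simple_matching(a, b, c, d):
    # S01: (a+d)/(a+b+c+d)
    return (a + d) / (a + b + c + d)

def s07_jaccard(a, b, c, d):
    # S07: a/(a+b+c); double absences (d) are ignored
    return a / (a + b + c)

def s08_sorensen(a, b, c, d):
    # S08: 2a/(2a+b+c)
    return 2 * a / (2 * a + b + c)

def s14_ochiai(a, b, c, d):
    # S14: a/sqrt((a+b)(a+c))
    return a / math.sqrt((a + b) * (a + c))

counts = binary_counts([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
print(counts, s01_simple_matching(*counts), s07_jaccard(*counts))
```

S07, S08 and S14 are undefined when both objects are empty (the denominator is zero); the sketch does not guard against that case.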
- Distance coefficients (between objects) tested against version 3
- D01 — Euclidean distance
- D02 — Taxonomic, or average distance
- D03 — Chord distance
- D04 — Geodesic metric
- D05 — Mahalanobis generalized distance (among groups)
- D06 — Minkowski metric
- D07 — Manhattan metric
- D08 — Mean character difference (Czekanowski)
- D09 — Index of association (Whittaker)
- D10 — Canberra metric (Lance & Williams)
- D11 — Coefficient of divergence (Clark)
- D13 — Nonmetric coefficient (Watson, Williams & Lance)
- D14 — Percentage difference (Odum; Bray & Curtis)
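As an illustration of three of the distance coefficients listed above, here is a minimal sketch using their usual textbook forms. The function names are hypothetical, and the code is not derived from the SIMIL source.

```python
import math

def d01_euclidean(y1, y2):
    # D01: square root of the sum of squared differences
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(y1, y2)))

def d07_manhattan(y1, y2):
    # D07: sum of absolute differences
    return sum(abs(u - v) for u, v in zip(y1, y2))

def d14_percentage_difference(y1, y2):
    # D14: sum of absolute differences divided by the sum of all values
    # (assumes non-negative data with a non-zero total)
    return d07_manhattan(y1, y2) / sum(u + v for u, v in zip(y1, y2))

y1, y2 = [1, 2, 3], [4, 2, 1]
print(d01_euclidean(y1, y2), d07_manhattan(y1, y2), d14_percentage_difference(y1, y2))
```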
- R Mode coefficients (between variables) tested against version 3
- TAU — Kendall's Tau (new algorithm)
- RP — Pearson's r
- RS — Spearman's r
- CHI — G statistic (Wilks' chi-square)
- HT — Tschuproff's contingency coefficient
- HS0 — (reciprocal information; Estabrook)
- HS1 — (Rajski's coherence coefficient)
- HS2 — (symmetric uncertainty coefficient)
- HD — (Rajski's metric)
- New coefficients
- S27 — 1 - p(a): Raup & Crick's modified probabilistic association coefficient (Raup & Crick)
- D15 — Chi-square metric (introduced in Simil 3.02)
- D16 — Chi-square distance (introduced in Simil 3.02)
- D17 — Hellinger distance
- Untested/Not yet implemented
- D12 — Coefficient of racial likeness (among groups; Pearson)
- Correspondence Analysis
- Periodograph
- Chrono
- Biogeo (spatially-constrained clustering) (Fri 29 Jun 2001 at 09:59:04 by PhC)
- PnComp (Principal components) (Mon 07 May 2001 at 14:25:48 by PhC)
- Works on any-sized matrix, as long as you have enough memory
- Still missing
- Supplementary objects and descriptors
- Pairwise deletion of missing values
- K-Means (Mon 12 Mar 2001 at 15:46:52 by PhC)
- Mantel (Wed 14 Feb 2001 at 16:18:21 by PhC)
- Partial Mantel correlation rewritten
- No longer uses Smouse, Long & Sokal 1986
- Works with both Mantel's r and Spearman's r
- Limited permutations with group selection
- Current correlation coefficients:
- Mantel's Z
- Mantel's r
- Mantel's t (permutation-free probability approximation, useful for large datasets of more than 30 objects)
- Added correlation coefficients:
- Spearman's r
- Kendall's Tau
- Left to do
- PROTEST: Procrustes statistic sensu Jackson, 1995.
- Mantel's correlogram is now a part of module Autocor
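For readers unfamiliar with the statistics named in the Mantel list, the following is a minimal sketch of Mantel's Z, Mantel's r and a simple permutation test. It assumes the usual definitions: Z is the sum of cross-products of corresponding off-diagonal values, and r is the Pearson correlation between the two unfolded matrices. All names are hypothetical, and it does not reproduce the module's partial Mantel or group-restricted permutations.

```python
import math
import random
from itertools import combinations

def unfold(m):
    """Upper-triangular values (without diagonal) of a square matrix."""
    return [m[i][j] for i, j in combinations(range(len(m)), 2)]

def mantel_z(a, b):
    # Sum of cross-products of corresponding distances
    return sum(x * y for x, y in zip(unfold(a), unfold(b)))

def mantel_r(a, b):
    # Pearson correlation between the two unfolded matrices
    x, y = unfold(a), unfold(b)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / math.sqrt(sxx * syy)

def permute(m, order):
    # Reorder rows and columns of m with the same permutation
    return [[m[i][j] for j in order] for i in order]

def mantel_test(a, b, n_perm=999, seed=0):
    """Return (observed r, one-tailed permutation probability)."""
    rng = random.Random(seed)
    observed = mantel_r(a, b)
    hits = 1  # the observed value counts as one of the permutations
    order = list(range(len(a)))
    for _ in range(n_perm):
        rng.shuffle(order)
        if mantel_r(permute(a, order), b) >= observed:
            hits += 1
    return observed, hits / (n_perm + 1)
```

Objects are permuted as whole rows and columns, never as individual cells, so that each permuted matrix is still a valid distance matrix among the same objects.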
- Links (Wed 14 Feb 2001 at 15:16:05 by PhC)
- Autocor (Wed 29 Mar 2000 at 11:29:57 by PhC)
- Cluster (Wed 08 Mar 2000 at 16:43:00 by PhC)
- VerNorm (Thu 08 Apr 1999 at 12:20:36 by PhC)
- Transpose any rectangular matrix (bugfix on Thu 08 Apr 1999)
- Make data positive
- Simple translation of the data to a user-defined (positive) minimum
- Useful for Box-Cox and Taylor
- Kolmogorov-Smirnov test of normality
- Box-Cox and Box-Cox-Bartlett normalizations
- Ranging and standardization
- Your choice of transformation
- Y' = a + bY
- Y' = Y * exp(a)
- Y' = exp(Y)
- Y' = ln(a + bY)
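The "Make data positive" translation and the four transformations above can be sketched as follows. This is an illustration under stated assumptions, not VerNorm's code: the function names are hypothetical, and "translation to a user-defined minimum" is read here as shifting every value so that the smallest one equals the chosen minimum.

```python
import math

def make_positive(values, minimum):
    """Shift all values so that the smallest equals `minimum` (assumed > 0)."""
    shift = minimum - min(values)
    return [v + shift for v in values]

def linear(y, a, b):
    # Y' = a + bY
    return a + b * y

def scale_exp(y, a):
    # Y' = Y * exp(a)
    return y * math.exp(a)

def exponential(y):
    # Y' = exp(Y)
    return math.exp(y)

def log_linear(y, a, b):
    # Y' = ln(a + bY); requires a + bY > 0
    return math.log(a + b * y)

shifted = make_positive([-2, 0, 3], 1)
print(shifted, [log_linear(y, 0, 1) for y in shifted])
```

Shifting the data first is what makes the logarithmic transformation safe to apply to variables that contain zeros or negative values.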
- Multinormality test
- Left to do
- Convert (Tue 23 Feb 1999 at 15:09:34 by PhC)
- Similarity <-> Distance matrix conversions
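This page does not say which conversion formulas Convert offers. As an assumption, the sketch below shows two conversions commonly used for similarities bounded between 0 and 1, D = 1 - S and D = sqrt(1 - S), with hypothetical function names.

```python
import math

def similarity_to_distance(s):
    # D = 1 - S (assumed conversion, valid for 0 <= S <= 1)
    return 1 - s

def similarity_to_distance_sqrt(s):
    # D = sqrt(1 - S) (assumed alternative conversion)
    return math.sqrt(1 - s)

def distance_to_similarity(d):
    # Inverse of D = 1 - S
    return 1 - d

print(similarity_to_distance(0.75), similarity_to_distance_sqrt(0.75))
```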
- PCoord: Principal COORDinates (Mon 21 Sep 1998 at 15:35:45 by PhC)
- Standard reduced-space ordinations for distance matrices (similarity matrices must be converted to distances first)
- Lingoes (1971) and Cailliez (1983) corrections for negative eigenvalues; see DistPCoA for more information
- Left to do
- Iterative eigenvalue algorithm, useful for large datasets with no correction for negative eigenvalues
- GeoDistances (Thu 02 Apr 1998 at 11:27:13 by PhC)
- Requires data already formatted as decimal degrees (conversion from other formats will happen when importing a file)
- Use of WASTE instead of TextEdit
- >32k of text in Text windows (limited by available memory)
- Diagram types in the R Package
- X-Y scatterplots with optional Z variable
- Types of Z variable (all documented in the manual):
- Group symbol, including "none" if you want no display of the points (e.g. you only want the ellipses)
- Symbols are now highly contrasting and can be re-ordered
- Label (number)
- 95% ellipse-grouping criteria (draw ellipses around a group of points)
- Proportional "bubble"
- Arrow: draw an arrow from the origin [0, 0] to the point itself. Very useful in CCA/RDA biplots.
- Color (8 colors to choose from)
- You can also use row names (characters) as labels
- Axes can be constrained to be proportional: if one centimeter equals 50 units horizontally, one centimeter will equal 50 units vertically. Very useful in ordination diagrams.
- Bounds can be set on both axes
- Settings are remembered when you double-click on a graph
- Link map (see Links)
- Autocorrelograms
- Now including regular or progressive Bonferroni correction
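The two corrections can be illustrated with a short sketch. It assumes the usual definitions, which this page does not state: the regular Bonferroni correction tests all k coefficients of a correlogram at alpha/k, while the progressive correction tests the coefficient of the i-th distance class at alpha/i. The function names are hypothetical.

```python
def bonferroni_regular(alpha, k):
    """Corrected significance level for each of k distance classes."""
    return [alpha / k for _ in range(k)]

def bonferroni_progressive(alpha, k):
    """Assumed progressive form: the i-th class (1-based) is tested at alpha / i."""
    return [alpha / i for i in range(1, k + 1)]

print(bonferroni_regular(0.05, 4))
print(bonferroni_progressive(0.05, 4))
```

Under this reading, the progressive form leaves the first distance classes, usually the ones of most interest, with a less severe correction than the regular form.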
- Histograms
- Growth diagrams are now functional
- They are a special kind of histogram for population dynamics
- Saved diagrams can be re-opened and re-sized at will
- All diagram windows can be set to an arbitrary size regardless of your screen size (e.g. 1600x1600 on a Mac Plus)
- PICT image: paste any PICT from the clipboard
- Distance comparison diagrams
- Useful in reduced-space ordination for reduced vs. original space distance comparison (called a Shepard diagram in this case)
- One can also compare arbitrary distance matrices
- The User's manual is coming along nicely, over 115 pages!
- I try to document every feature in the manual, e.g. "Press the option key while clicking to do this..."
- Note: the self-reading (.srd) version was dropped because it was getting too big (4.2 MB vs. 247k for the pdf version)
- Printing has been implemented!
- Better import/export code
- Bug fix when importing unfolded triangulars
- Export matrices in many formats
- Change language from French to English and back on the fly
- Now uses stand-alone "dictionary" files
- Import/Export text data file
- Export of SIMIL-type files (resemblance matrices) is no longer restricted to square matrices. You can export as:
- Lower triangular (with or without diagonal)
- Upper triangular (with or without diagonal)
- Unfolded lower triangular (without diagonal)
- Unfolded upper triangular (without diagonal)
(This is the format required by Permute! version 3.4 alpha)
- Square
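To make the export shapes concrete, here is a minimal sketch that derives each of them from a square matrix. The function names are hypothetical, and the row-by-row ordering of the unfolded forms is an assumption; the exact order written by the R Package is not stated on this page.

```python
def lower_triangular(m, diagonal=True):
    """Rows of the lower triangle, with or without the diagonal."""
    offset = 1 if diagonal else 0
    return [row[: i + offset] for i, row in enumerate(m)]

def upper_triangular(m, diagonal=True):
    """Rows of the upper triangle, with or without the diagonal."""
    offset = 0 if diagonal else 1
    return [row[i + offset:] for i, row in enumerate(m)]

def unfolded_lower(m):
    # Lower triangle without diagonal, flattened row by row (assumed order)
    return [v for row in lower_triangular(m, diagonal=False) for v in row]

def unfolded_upper(m):
    # Upper triangle without diagonal, flattened row by row (assumed order)
    return [v for row in upper_triangular(m, diagonal=False) for v in row]

# Cell values encode their (row, column) position to make the order visible
m = [[0, 12, 13],
     [21, 0, 23],
     [31, 32, 0]]
print(unfolded_lower(m))
print(unfolded_upper(m))
```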
- Import from text files
- Import DD MM SS-type data (i.e. GeoDist)
- Progress thermometers
- Faster scanning
- Can import via drag-and-drop (user preference), also works for multiple imports if option key is held down!
- Direct importation of DOS or UNIX text files
- Rectangular (nxp) matrices
- Resemblance matrices
- Square with diagonal
- Unfolded upper triangular without diagonal
- Unfolded lower triangular without diagonal
- Other formats coming soon
- Parser is now faster and more robust; also used for Copy/Paste
- Text format is compatible with spreadsheet programs
- Columns in files can be delimited with tabs, spaces or commas
- Row and column headers
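A text parser that accepts tabs, spaces or commas as column delimiters, and DOS or UNIX line endings, might look like the sketch below. This is not the R Package's parser; the names are hypothetical, headers and missing-value indicators are ignored, and runs of consecutive delimiters are assumed to count as a single separator.

```python
import re

# One or more commas, tabs or spaces separate two fields (assumption)
_DELIMITERS = re.compile(r"[,\t ]+")

def parse_matrix(text):
    """Parse delimited numeric text into a list of rows of floats."""
    rows = []
    # splitlines() handles "\r\n" (DOS), "\n" (UNIX) and "\r" (classic Mac OS)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        rows.append([float(field) for field in _DELIMITERS.split(line)])
    return rows

print(parse_matrix("1 2 3\r\n4, 5, 6\n7\t8\t9"))
```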
- Persistent user preferences
- Import text files via drag-and-drop
- Language
- Missing value indicator
- Matrix display font
- Number of decimals in matrix display
- Binary computations threshold (i.e. above this value, everything is 1 and below, 0)
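The binary computations threshold can be sketched in one function. The name is hypothetical, and the treatment of a value exactly equal to the threshold (coded 0 here) is an assumption, since the description above only covers values above and below it.

```python
def binarize(values, threshold):
    """Code values above the threshold as 1 and all others as 0."""
    return [1 if v > threshold else 0 for v in values]

print(binarize([0, 0.5, 2, 1], 1))
```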
- Multithreading using Thread Manager (System 7.5 and up, there is an extension for previous systems)
- If the Thread Manager is present, one is able to run multiple, concurrent tasks and continue to use one's computer while computations are being made
- One can also turn the multithreading off to use the full power of one's computer
- Open and save
- Binary data files for accuracy and faster opening
- Matrix data (rows/cols)
- SIMIL-type file (internally stored as a one row matrix)
- Text data files (computation results) in text display windows
- Copy and paste
- To clipboard
- Data matrix selection to clipboard, tab-delimited
- Internally kept as binary (private clipboard) to prevent loss of precision when pasting between two data matrices
- Text window to clipboard
- Picture window to clipboard
- From clipboard
- Tab-delimited matrix text to a matrix (uses the import parser)
- Append to text window from clipboard
- Picture from clipboard (non-dynamic, though)
- On-screen display of rectangular matrices and SIMIL-type files
- Current data matrices limits:
- Rectangular matrices: 32,000 rows x 32,000 columns
- SIMIL matrices: 32,000 values, which means 256 objects (256*255/2 = 32640)
- I use 32,000 because it makes debugging easier!
- When the Package is released, the limits will be raised to 2,147,483,647 * 2,147,483,647
- Everything is stored internally as Long Integers
- The theoretical limit is not 32,000 (~ 2^15 - 1) but 2,147,483,647 (2^31 - 1)
- Due to a design constraint in early versions of the R package, SIMIL matrices will never be able to handle more than 32,767 objects
- That is still 536,821,761 values!
- And such a matrix would require over 5,110 Megabytes of memory and storage...
- The theoretical limit of the internal format for SIMIL matrices is 46,341 objects, but to achieve it I would have to hack backward compatibility
- Note that there is an internal limit to the List Manager
- It cannot display more than 16,000 cells (e.g. a 400x40 matrix)
- Larger matrices will only show the first 16,000 values
- However, all values will be accessible for computation
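The figures quoted in the limits above can be checked with a few lines of integer arithmetic. The helper name is hypothetical; the only formula used is the number of distinct pairs among n objects, n(n-1)/2, which is the count a SIMIL matrix has to store.

```python
def pair_count(n):
    """Number of distinct pairs among n objects: n(n-1)/2."""
    return n * (n - 1) // 2

print(pair_count(256))      # values needed for 256 objects
print(pair_count(32767))    # values needed for 32,767 objects
print(2 ** 15 - 1)          # largest signed 16-bit integer
print(2 ** 31 - 1)          # largest signed 32-bit integer
print(400 * 40)             # cells in the 400x40 example matrix
```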
- Error reporting
- Use alert if Macsbug is not present
- Distinguish between fatal and non-fatal errors
- Create empty files
- Text
- Data matrices
- Diagrams (pictures)
- Labels, arrows, ellipses, symbols, bubbles
- Re-computed from original data upon window resize
- PowerPC-native (fat binary)
- Balloon help throughout
- Though not completely translated...
- This list changes often... please bookmark it!
Left to do (in no particular order)
- Diagram types
- Dendrogram or tree
- Chronological clustering
- Import/Export text data file
- Specify number of rows and columns
- When exporting, allow the user to specify the number of columns to wrap the data
- Data processing
- Fold/unfold SIMIL-type matrices (distance or similarity matrices) when copying and pasting
- Picture processing
- Help
- AppleGuide (dependent on OSA support)
- OSA support
- OSA-compliant: scriptable and recordable
Last updated on Sunday, August 01, 2010 by Philippe Casgrain
Created on Saturday, April 05, 1997