Version 5 of random forests contains some modifications and major additions to Version 4. The additions are:
- Better replacement of missing values.
- Proximities computed rapidly in small memory.
- Proximities computed for test sets.
- Use of "prototypes" to give understandable data pictures.
- The ability to detect interactions between variables.
- Two-stage runs where the second run uses only the variables found most important in the first run.

For running new data down a saved forest, Version 5 adds:
- The capability of replacing missing values in the new data.
- The capability of deriving outlier measures for the new data.
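
Several of the additions above rest on proximities. As a rough illustration only, the sketch below assumes the usual definition: the proximity between two cases is the fraction of trees in which they land in the same terminal node. The function name proximity_matrix and the leaf_ids input layout are hypothetical, not part of the Version 5 code, and this is the plain quadratic computation rather than the fast, small-memory method that Version 5 introduces.

```python
from typing import List, Sequence


def proximity_matrix(leaf_ids: Sequence[Sequence[int]]) -> List[List[float]]:
    """Naive proximity matrix (hypothetical helper, not the Version 5 routine).

    leaf_ids[t][i] is the terminal node reached by case i in tree t.
    """
    n_trees = len(leaf_ids)
    n_cases = len(leaf_ids[0])
    prox = [[0.0] * n_cases for _ in range(n_cases)]
    for tree in leaf_ids:
        for i in range(n_cases):
            for j in range(i, n_cases):
                # Two cases gain one unit of proximity per shared terminal node.
                if tree[i] == tree[j]:
                    prox[i][j] += 1.0
                    if i != j:
                        prox[j][i] += 1.0
    # Normalize by the number of trees so every entry lies in [0, 1].
    return [[value / n_trees for value in row] for row in prox]


# Made-up terminal-node assignments for 4 cases in 3 trees.
leaf_ids = [
    [0, 0, 1, 1],
    [2, 2, 2, 3],
    [5, 4, 4, 4],
]
prox = proximity_matrix(leaf_ids)
```

Here cases 0 and 1 share a terminal node in two of the three trees, so their proximity is 2/3, while cases 0 and 3 never share one and have proximity 0. The same idea extends to test sets by running the new cases down the saved trees and comparing their terminal nodes with those of the training cases.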