rfsrc.fast {randomForestSRC} | R Documentation |
Fast approximate random forests using subsampling with forest options set to encourage computational speed. Applies to all families.
rfsrc.fast(formula, data,
  ntree = 500,
  nsplit = 10,
  bootstrap = "by.root",
  ensemble = "oob",
  sampsize = function(x){min(x * .632, max(150, x ^ (3/4)))},
  samptype = "swor",
  samp = NULL,
  ntime = 50,
  forest = FALSE,
  terminal.qualts = FALSE,
  ...)
formula |
A symbolic description of the model to be fit. If missing, unsupervised splitting is implemented. |
data |
Data frame containing the y-outcome and x-variables. |
ntree |
Number of trees. |
nsplit |
Non-negative integer value specifying number of random split points used to split a node (deterministic splitting corresponds to the value zero and is much slower). |
bootstrap |
Bootstrap protocol used in growing a tree. |
ensemble |
Specifies the type of ensemble. By default only the out-of-sample ensemble is requested, which corresponds to "oob". |
sampsize |
Function specifying size of subsampled data. Can also be a number. |
samptype |
Type of bootstrap used. |
samp |
Bootstrap specification when |
ntime |
Integer value used for survival to constrain ensemble calculations to a grid of |
forest |
Should the forest object be returned? Turn this on if you want prediction on test data; note that for big data the forest object can be large. |
terminal.qualts |
Should terminal node membership information be returned? Ensure this is off in the presence of big data, as memory warnings can occur otherwise. In either case, this parameter does not affect the ability to restore the model. When this is |
... |
Further arguments to be passed to |
Calls rfsrc under various options (including subsampling) to encourage computational speed. This will provide a good approximation, but the result will not be as good as a forest grown under the default settings of rfsrc.
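To make the subsampling concrete, the sketch below evaluates the default sampsize rule from the usage line at a few sample sizes. It uses base R only; the sample sizes in n are arbitrary illustrative values, and how the returned value is rounded internally is not stated here, so the raw function output is shown.

```r
## default subsample-size rule, copied from the usage line
sampsize <- function(x) { min(x * .632, max(150, x ^ (3/4))) }

## arbitrary sample sizes, chosen for illustration only
n <- c(100, 500, 1000, 10000, 1e6)

## subsample size and the fraction of the data it represents
subsample <- sapply(n, sampsize)
print(data.frame(n = n,
                 subsample = round(subsample, 1),
                 fraction = round(subsample / n, 3)))
```

Under this rule a small data set is subsampled at the rate .632, a mid-sized one is capped at 150 cases, and a large one is subsampled at n^(3/4), so the fraction of data used per tree shrinks as n grows.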
An object of class (rfsrc, grow).
Hemant Ishwaran and Udaya B. Kogalur
## ------------------------------------------------------------
## Iowa housing regression example
## ------------------------------------------------------------

## load the Iowa housing data
data(housing, package = "randomForestSRC")

## do quick and *dirty* imputation
housing <- impute(SalePrice ~ ., housing,
                  ntree = 50, nimpute = 1, splitrule = "random")

## grow a fast forest
o1 <- rfsrc.fast(SalePrice ~ ., housing)
o2 <- rfsrc.fast(SalePrice ~ ., housing, nodesize = 1)
print(o1)
print(o2)

## grow a fast bivariate forest
o3 <- rfsrc.fast(cbind(SalePrice,Overall.Qual) ~ ., housing)
print(o3)

## ------------------------------------------------------------
## White wine classification example
## ------------------------------------------------------------

data(wine, package = "randomForestSRC")
wine$quality <- factor(wine$quality)
o <- rfsrc.fast(quality ~ ., wine)
print(o)

## ------------------------------------------------------------
## pbc survival example
## ------------------------------------------------------------

data(pbc, package = "randomForestSRC")
o <- rfsrc.fast(Surv(days, status) ~ ., pbc)
print(o)

## ------------------------------------------------------------
## WIHS competing risk example
## ------------------------------------------------------------

data(wihs, package = "randomForestSRC")
o <- rfsrc.fast(Surv(time, status) ~ ., wihs)
print(o)
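
The forest argument notes that the forest object must be returned if prediction on test data is wanted. A minimal sketch of that workflow follows, assuming the standard predict method for rfsrc objects; the 80/20 split and the name trn are hypothetical choices made for illustration and are not part of the original examples.

```r
library(randomForestSRC)

## ------------------------------------------------------------
## White wine classification: prediction on held-out data
## ------------------------------------------------------------

data(wine, package = "randomForestSRC")
wine$quality <- factor(wine$quality)

## hypothetical 80/20 train/test split, for illustration only
set.seed(1)
trn <- sample(nrow(wine), size = round(0.8 * nrow(wine)))

## forest = TRUE keeps the forest so the object can be used for prediction
o <- rfsrc.fast(quality ~ ., wine[trn, ], forest = TRUE)

## predict on the held-out rows
p <- predict(o, wine[-trn, ])
print(p)

## predicted class labels for the first few test cases
head(p$class)
```

With the default forest = FALSE the returned object is smaller, but it cannot be used this way.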