multiFit {MultiFit} | R Documentation |
Perform multiscale test of independence for multivariate vectors. See vignettes for further examples.
multiFit(xy, x = NULL, y = NULL, p_star = NULL, R_max = NULL, R_star = 1, rank.transform = TRUE, test.method = "Fisher", correct = TRUE, min.tbl.tot = 25L, min.row.tot = 10L, min.col.tot = 10L, p.adjust.methods = c("H", "Hcorrected", "MH"), compute.all.holm = TRUE, cutoff = 0.05, top.max.ps = 4L, return.all.pvs = TRUE, save.all.pvs = FALSE, all.pvs.fname = NULL, uv.approx.null = FALSE, uv.exact.null = FALSE, uv.null.sim = 10000L, plot.marginals = FALSE, rk = FALSE, M = 10, verbose = FALSE)
xy |
A list, whose first element corresponds to the matrix x as below, and
its second element corresponds to the matrix y as below. If |
x |
A matrix, number of columns = dimension of random vector, number of rows = number of observations. |
y |
A matrix, number of columns = dimension of random vector, number of rows = number of observations. |
p_star |
Numeric, cuboids associated with tests whose p-value is below |
R_max |
A positive integer (or Inf), the maximal number of
resolutions to scan (algorithm will stop at a lower resolution if
all tables in it do not meet the criteria specified at |
R_star |
A positive integer, if set to an integer
between 0 and |
rank.transform |
Logical, if |
test.method |
String, choose "Fisher" for Fisher's exact test (slowest), "chi.sq" for Chi-squared test, "LR" for likelihood-ratio test and "norm.approx" for approximating the hypergeometric distribution with a normal distribution (fastest). |
correct |
Logical, if |
min.tbl.tot |
Non-negative integer, the minimal number of observations
per table below which a |
min.row.tot |
Non-negative integer, the minimal number of observations for row totals in the 2x2 contingency tables below which a contingency table will not be tested. |
min.col.tot |
Non-negative integer, the minimal number of observations for column totals in the 2x2 contingency tables below which a contingency table will not be tested. |
p.adjust.methods |
String, choose between "H" for Holm, "Hcorrected" for Holm with
the correction as specified in |
compute.all.holm |
Logical, if |
cutoff |
Numerical between 0 and 1, an upper limit for the |
top.max.ps |
Positive integer, report the mean of the top |
return.all.pvs |
Logical, if TRUE, a data frame with all |
save.all.pvs |
Logical, if |
all.pvs.fname |
String, file name to which all |
uv.approx.null |
Logical, in a univariate case, if |
uv.exact.null |
Logical, in a univariate case, if |
uv.null.sim |
Positive integer, the number of simulated values to be computed in a univariate case when an exact or approximate null distribution is simulated. |
plot.marginals |
Logical, if |
rk |
Logical, if |
M |
A positive integer (or Inf), the number of top ranking tests to continue to split at each resolution. FWER control not guaranteed for this method. |
verbose |
Logical. |
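To make the roles of min.tbl.tot, min.row.tot and min.col.tot and of the test.method choices more concrete, here is a minimal base R sketch on a single 2x2 contingency table. The helper is_testable and the toy tables are hypothetical and are not part of MultiFit; the package's internal implementation may differ.

```r
# Hypothetical helper (not part of MultiFit): a table is tested only if its
# total, row totals and column totals all reach the given minimums.
is_testable <- function(tbl, min.tbl.tot = 25L, min.row.tot = 10L,
                        min.col.tot = 10L) {
  sum(tbl) >= min.tbl.tot &&
    all(rowSums(tbl) >= min.row.tot) &&
    all(colSums(tbl) >= min.col.tot)
}

# Toy 2x2 tables (illustrative counts only)
tbl_ok    <- matrix(c(20L, 5L, 6L, 19L), nrow = 2)  # total 50
tbl_small <- matrix(c(3L, 2L, 4L, 1L), nrow = 2)    # total 10

testable <- c(ok = is_testable(tbl_ok), small = is_testable(tbl_small))

# Base R counterparts of two of the test.method options
p_fisher <- fisher.test(tbl_ok)$p.value                  # exact test
p_chisq  <- chisq.test(tbl_ok, correct = FALSE)$p.value  # Chi-squared test

print(testable)
print(c(fisher = p_fisher, chisq = p_chisq))
```

With the default thresholds, the first table would be tested and the second would be skipped because its total is below 25.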
test.stats, a named numerical vector containing the test statistics for the global null hypothesis (i.e. x independent of y).
p.values, a named numerical vector containing the p-values for the global null hypothesis (i.e. x independent of y). These are not computed if p.adjust.methods is NULL.
pvs, a data frame that contains all p-values and adjusted p-values that are computed. Returned if return.all.pvs is TRUE.
all, a nested list. Each entry is named and contains data about a resolution that was tested. Each resolution is itself a list with the following elements:
  cuboids, a summary of all tested cuboids in a resolution.
  tables, a summary of all 2x2 contingency tables in a resolution.
  pv, a numerical vector containing the p-values from the tests of independence on the 2x2 contingency tables in tables that meet the criteria defined by min.tbl.tot, min.row.tot and min.col.tot. The length of pv is equal to the number of rows of tables.
  pv.correct, similar to pv above; corrected p-values are computed and returned when correct is TRUE.
  rank.tests, a logical vector that indicates whether or not a test was ranked among the top M tests in a resolution. The length of rank.tests is equal to the number of rows of tables.
  parent.cuboids, an integer vector indicating which cuboids in a resolution are associated with the ranked tests and will be further halved in the next higher resolution.
  parent.tests, a logical vector of the same length as the number of rows of tables, indicating whether or not a test was chosen as a parent test (the same test may have multiple children).
approx.nulls, in a univariate case, a list of numerical vectors whose values are the simulated approximate null values.
exact.nulls, in a univariate case, a list of numerical vectors whose values are the simulated theoretical null values.
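As a rough illustration of the adjusted p-values reported in pvs, the following base R sketch applies Holm's adjustment (the "H" option of p.adjust.methods) to a handful of made-up p-values. The values and the significance level alpha are assumptions for illustration only, and this is not a reproduction of how MultiFit combines p-values across resolutions.

```r
# Toy p-values (illustrative only, not output of multiFit)
pv <- c(0.001, 0.02, 0.03, 0.5)

# Holm's step-down adjustment from base R
pv_holm <- p.adjust(pv, method = "holm")
print(pv_holm)

# A global null would be rejected if the smallest adjusted p-value
# falls at or below the chosen level (alpha is an assumed value here).
alpha <- 0.05
reject_global <- min(pv_holm) <= alpha
print(reject_global)
```

Holm's method multiplies the smallest p-value by the number of tests, the next smallest by one fewer, and so on, while enforcing monotonicity of the adjusted values.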
library(MultiFit)
set.seed(1)
n = 300
Dx = Dy = 2
x = matrix(0, nrow=n, ncol=Dx)
y = matrix(0, nrow=n, ncol=Dy)
# x[,1] and y[,1] are independent noise
x[,1] = rnorm(n)
x[,2] = runif(n)
y[,1] = rnorm(n)
# y[,2] depends on x[,2] through a sine curve plus Gaussian noise
y[,2] = sin(5*pi*x[,2]) + 1/5*rnorm(n)
fit = multiFit(x=x, y=y, verbose=TRUE)
w = multiSummary(x=x, y=y, fit=fit, alpha=0.0001)
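The same kind of data can also be passed through the xy argument as a list whose first element is x and second element is y, and the returned components described above can then be inspected. The sketch below assumes the MultiFit package is installed and skips the fit otherwise; the printed output is not shown because it depends on the installed version.

```r
set.seed(1)
n <- 300
x <- cbind(rnorm(n), runif(n))
y <- cbind(rnorm(n), sin(5 * pi * x[, 2]) + rnorm(n) / 5)
xy <- list(x, y)  # first element is x, second element is y

fit <- NULL
if (requireNamespace("MultiFit", quietly = TRUE)) {
  fit <- MultiFit::multiFit(xy = xy)
  print(fit$test.stats)   # test statistics for the global null
  print(fit$p.values)     # p-values for the global null
  print(head(fit$pvs))    # all p-values (return.all.pvs = TRUE by default)
  print(names(fit$all))   # one named entry per tested resolution
}
```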