irlba {irlba}R Documentation

Find a few approximate largest singular values and corresponding singular vectors of a matrix.

Description

The augmented implicitly restarted Lanczos bidiagonalization algorithm (IRLBA) finds a few approximate largest singular values and corresponding singular vectors of a sparse or dense matrix using a method of Baglama and Reichel. It is a fast and memory-efficient way to compute a partial SVD.

Usage

irlba(A, nv = 5, nu, maxit = 1000, work = nv + 7, reorth = TRUE,
  tol = 1e-05, v = NULL, right_only = FALSE, verbose = FALSE, scale,
  center, du, ds, dv, shift, mult, fastpath = TRUE)

Arguments

A

numeric real- or complex-valued matrix or real-valued sparse matrix.

nv

number of right singular vectors to estimate.

nu

number of left singular vectors to estimate (defaults to nv).

maxit

maximum number of iterations.

work

working subspace dimension, larger values can speed convergence at the cost of more memory use.

reorth

if TRUE, apply full reorthogonalization to both SVD bases, otherwise only apply reorthogonalization to the right SVD basis vectors; the latter case is cheaper per iteration but, overall, may require more iterations for convergence. Always set to TRUE when fastpath=TRUE (see below).

tol

convergence is determined when ||AV - US|| < tol*||A||, where the spectral norm ||A|| is approximated by the largest estimated singular value, and U, V, S are the matrices corresponding to the estimated left and right singular vectors, and diagonal matrix of estimated singular values, respectively.

v

optional starting vector or output from a previous run of irlba used to restart the algorithm from where it left off (see the notes).

right_only

logical value indicating return only the right singular vectors (TRUE) or both sets of vectors (FALSE). The right_only option can be cheaper to compute and use much less memory when nrow(A) >> ncol(A) but note that right_only = TRUE sets fastpath = FALSE (only use this option for really large problems that run out of memory).

verbose

logical value that when TRUE prints status messages during the computation.

scale

optional column scaling vector whose values divide each column of A; must be as long as the number of columns of A (see notes).

center

optional column centering vector whose values are subtracted from each column of A; must be as long as the number of columns of A and may

du

DEPRECATED optional subspace deflation vector (see notes).

ds

DEPRECATED optional subspace deflation scalar (see notes).

dv

DEPRECATED optional subspace deflation vector (see notes).

shift

optional shift value (square matrices only, see notes).

mult

optional custom matrix multiplication function (default is %*%, see notes).

fastpath

try a fast C algorithm implementation if possible; set fastpath=FALSE to use the reference R implementation. See notes.

Value

Returns a list with entries:

d:

max(nu, nv) approximate singular values

u:

nu approximate left singular vectors (only when right_only=FALSE)

v:

nv approximate right singular vectors

iter:

The number of Lanczos iterations carried out

mprod:

The total number of matrix vector products carried out

Note

The syntax of irlba partially follows svd, with an important exception. The usual R svd function always returns a complete set of singular values, even if the number of singular vectors nu or nv is set less than the maximum. The irlba function returns a number of estimated singular values equal to the maximum of the number of specified singular vectors nu and nv.

Use the optional scale parameter to implicitly scale each column of the matrix A by the values in the scale vector, computing the truncated SVD of the column-scaled sweep(A, 2, scale, FUN=`/`), or equivalently, A %*% diag(1 / scale), without explicitly forming the scaled matrix. scale must be a non-zero vector of length equal to the number of columns of A.

Use the optional center parameter to implicitly subtract the values in the center vector from each column of A, computing the truncated SVD of sweep(A, 2, center, FUN=`-`), without explicitly forming the centered matrix. This option may not be used together with the general rank 1 deflation options. center must be a vector of length equal to the number of columns of A. This option may be used to efficiently compute principal components without explicitly forming the centered matrix (which can, importantly, preserve sparsity in the matrix). See the examples.

The optional deflation parameters are deprecated and will be removed in a future version. They could be used to compute the rank-one deflated SVD of A - ds*du %*% t(dv), where t(du) %*% A - ds * t(dv) == 0. For example, the triple ds, du, dv may be a known singular value and corresponding singular vectors. Or ds=m and dv and du represent a vector of column means of A and of ones, respectively, where m is the number of rows of A. This functionality can be effectively replaced with custom matrix product functions.

Specify an optional alternative matrix multiplication operator in the mult parameter. mult must be a function of two arguments, and must handle both cases where one argument is a vector and the other a matrix. See the examples and demo("custom_matrix_multiply", package="irlba") for an alternative approach.

Use the v option to supply a starting vector for the iterative method. A random vector is used by default (precede with set.seed() to for reproducibility). Optionally set v to the output of a previous run of irlba to restart the method, adding additional singular values/vectors without recomputing the solution subspace. See the examples.

The function may generate the following warnings:

The function might return an error for several reasons including a situation when the starting vector v is near the null space of the matrix. In that case, try a different v.

The fastpath=TRUE option only supports real-valued matrices and sparse matrices of type dgCMatrix (for now). Other problems fall back to the reference R implementation.

References

Augmented Implicitly Restarted Lanczos Bidiagonalization Methods, J. Baglama and L. Reichel, SIAM J. Sci. Comput. 2005.

See Also

svd, prcomp, partial_eigen

Examples

set.seed(1)

A <- matrix(runif(400), nrow=20)
S <- irlba(A, 3)
S$d

# Compare with svd
svd(A)$d[1:3]

# Restart the algorithm to compute more singular values
# (starting with an existing solution S)
S1 <- irlba(A, 5, v=S)

# Principal components (see also prcomp_irlba)
P <- irlba(A, nv=1, center=colMeans(A))

# Compare with prcomp and prcomp_irlba (might vary up to sign)
cbind(P$v,
      prcomp(A)$rotation[, 1],
      prcomp_irlba(A)$rotation[, 1])

# A custom matrix multiplication function that scales the columns of A
# (cf the scale option). This function scales the columns of A to unit norm.
col_scale <- sqrt(apply(A, 2, crossprod))
mult <- function(x, y)
        {
          # check if x is a  vector
          if (is.vector(x))
          {
            return((x %*% y) / col_scale)
          }
          # else x is the matrix
          x %*% (y / col_scale)
        }
irlba(A, 3, mult=mult)$d

# Compare with:
irlba(A, 3, scale=col_scale)$d

# Compare with:
svd(sweep(A, 2, col_scale, FUN=`/`))$d[1:3]


[Package irlba version 2.1.2 Index]