pylimma.mrlm
- pylimma.mrlm(M, design=None, ndups=1, spacing=1, weights=None, method='huber', maxit=20, acc=0.0001, k=1.345)[source]
Robustly fit linear model for each gene using M-estimation.
Uses iteratively reweighted least squares with a robust loss function (Huber’s T by default) to downweight outliers. Matches R’s MASS::rlm.
- Parameters:
M (ndarray) – Expression matrix, shape (n_genes, n_samples).
design (ndarray, optional) – Design matrix, shape (n_samples, n_coefficients). If None, uses intercept-only model.
ndups (int, default 1) – Number of within-array duplicate spots.
spacing (int, default 1) – Spacing between duplicate spots.
weights (ndarray, optional) – Prior weights for observations.
method (str, default "huber") – Robust estimation method. Options: “huber”, “bisquare”.
maxit (int, default 20) – Maximum iterations for IRLS (matches R’s default).
acc (float, default 1e-4) – Convergence tolerance (matches R’s default).
k (float, default 1.345) – Tuning constant for Huber estimator (matches R’s default).
- Returns:
coefficients : ndarray, shape (n_genes, n_coefs) stdev_unscaled : ndarray, shape (n_genes, n_coefs) sigma : ndarray, shape (n_genes,) df_residual : ndarray, shape (n_genes,) cov_coefficients : ndarray, shape (n_coefs, n_coefs)
- Return type:
Notes
This function implements R’s MASS::rlm algorithm exactly: - MAD scale estimation:
median(abs(resid)) / 0.6745- Huber weights: psi(u)/u where psi is the Huber function - Convergence on residuals: sqrt(sum((old - new)^2) / sum(old^2))References
Huber, P. J. (1981). Robust Statistics. Wiley. Venables, W. N. & Ripley, B. D. (2002). Modern Applied Statistics with S.