pylimma.array_weights

pylimma.array_weights(object, design=None, weights=None, var_design=None, var_group=None, prior_n=10, method='auto', maxiter=50, tol=1e-05, trace=False, *, layer=None, weights_layer=None)

Estimate relative quality weights for each array/sample.

Estimates the relative reliability of each sample in a gene expression experiment. Samples with higher variability get lower weights.

Parameters:
  • object (dict) – Dict with ‘E’ (expression matrix) and optionally ‘weights’. Typically the output from voom().

  • design (ndarray, optional) – Design matrix for the linear model. If None, taken from object or defaults to intercept-only.

  • weights (ndarray, optional) – Prior observation weights. If None, taken from object.

  • var_design (ndarray, optional) – Design matrix for the variance model. Columns should sum to zero.

  • var_group (ndarray, optional) – Factor defining variance groups. Takes precedence over var_design.

  • prior_n (float, default 10) – Prior sample size for regularization. Higher values give more stable but less responsive estimates.

  • method (str, default "auto") – Estimation method:
      - “auto”: choose automatically (genebygene if weights or NAs are present)
      - “genebygene”: gene-by-gene update algorithm
      - “reml”: REML estimation (faster when there are no weights or NAs)

  • maxiter (int, default 50) – Maximum iterations for REML method.

  • tol (float, default 1e-5) – Convergence tolerance for REML method.

  • trace (bool, default False) – If True, print iteration progress (array weight range) to stdout.

  • layer (str | None)

  • weights_layer (str | None)
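
The requirement that the columns of var_design sum to zero can be illustrated with a small NumPy sketch. The helper sum_to_zero_design is hypothetical and is not part of pylimma. How the function expands var_group internally is not stated here, so the group expansion and the log-linear variance model below are assumptions made purely for illustration.

```python
import numpy as np

def sum_to_zero_design(n_levels):
    """Hypothetical helper: (n_levels, n_levels - 1) sum-to-zero coding."""
    # The first n_levels - 1 rows form an identity matrix and the last row
    # is all -1, so every column sums to zero.
    top = np.eye(n_levels - 1)
    bottom = -np.ones((1, n_levels - 1))
    return np.vstack([top, bottom])

# One variance level per sample, for an illustrative 4-sample experiment.
Z = sum_to_zero_design(4)
column_sums = Z.sum(axis=0)

# Assumed expansion of a balanced group factor (3 groups, 2 samples each):
# each sample takes the coding row of its group.
group_codes = np.array([0, 0, 1, 1, 2, 2])
Z_group = sum_to_zero_design(3)[group_codes]
group_column_sums = Z_group.sum(axis=0)

# Assumption: log-variances are linear in the columns of Z, and weights are
# inverse variances. gamma holds made-up coefficients.
gamma = np.array([0.5, -0.2, 0.1])
log_weights = -(Z @ gamma)
weights = np.exp(log_weights)

# Because the columns of Z sum to zero, the log-weights also sum to zero,
# so the geometric mean of the weights is 1.
geometric_mean = np.exp(log_weights.mean())
print(column_sums, group_column_sums, geometric_mean)
```

Under these assumptions the sum-to-zero constraint fixes the overall scale of the weights, so only their relative sizes carry information.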

Returns:

Quality weights for each sample, shape (n_samples,). Higher weights indicate more reliable samples.

Return type:

ndarray

Notes

Array weights are useful when some samples have higher technical variability than others. By downweighting noisy samples, the analysis gains power while remaining valid.
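
The effect of downweighting can be sketched with plain weighted least squares in NumPy. This is an illustration only: the data are simulated, the helper wls_coef is hypothetical, and the weights are set to the true inverse variances rather than estimated, which is the job array_weights would do in practice.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_samples = 2000, 6

# Two-group design: intercept plus a group indicator.
design = np.column_stack([np.ones(n_samples), [0, 0, 0, 1, 1, 1]])
beta = np.array([5.0, 1.0])  # true coefficients, shared by all genes

# The last sample is three times noisier than the rest.
true_sd = np.array([1.0, 1.0, 1.0, 1.0, 1.0, 3.0])
E = design @ beta + rng.normal(size=(n_genes, n_samples)) * true_sd

def wls_coef(E, X, w):
    """Hypothetical helper: per-gene weighted least squares coefficients."""
    # Scaling rows by sqrt(w) turns weighted least squares into ordinary
    # least squares on the rescaled problem.
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], (E * sw).T, rcond=None)
    return coef.T  # shape (n_genes, n_coefficients)

ols = wls_coef(E, design, np.ones(n_samples))   # all samples equal
wls = wls_coef(E, design, 1.0 / true_sd**2)     # noisy sample downweighted

# Spread of the estimated group effect across genes: smaller is better.
ols_spread = ols[:, 1].std()
wls_spread = wls[:, 1].std()
print(ols_spread, wls_spread)
```

Both fits are unbiased for the group effect, but the weighted fit has a smaller spread across genes, which is the sense in which downweighting a noisy sample gains power without discarding it.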

The weights are relative (their geometric mean is approximately 1) and can be incorporated into downstream analysis via lm_fit().
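
What "relative" means can be shown with a deliberately naive NumPy sketch. It is not the genebygene or REML algorithm and it applies no prior_n regularization. It takes inverse residual variances per sample under an intercept-only design and rescales them to a geometric mean of 1. The data are simulated.

```python
import numpy as np

rng = np.random.default_rng(1)
n_genes, n_samples = 5000, 5

# Simulated expression matrix (genes x samples); the last sample is noisier.
sd = np.array([1.0, 1.0, 1.0, 1.0, 2.5])
E = rng.normal(size=(n_genes, n_samples)) * sd

# Intercept-only design: residuals are deviations from each gene's mean.
resid = E - E.mean(axis=1, keepdims=True)

# Naive per-sample variance estimate, pooled over genes.
raw_var = (resid**2).mean(axis=0)
raw_weights = 1.0 / raw_var

# Rescale so the geometric mean is 1; only the ratios between samples matter.
weights = raw_weights / np.exp(np.mean(np.log(raw_weights)))
geometric_mean = np.exp(np.mean(np.log(weights)))
print(weights, geometric_mean)
```

A weight below 1 marks a sample as noisier than typical for the experiment and a weight above 1 as cleaner than typical. Multiplying all weights by a constant would not change a weighted fit.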