pylimma.gls_series

pylimma.gls_series(M, design=None, ndups=2, spacing=1, block=None, correlation=None, weights=None)

Fit linear model for each gene using generalized least squares.

Allows for correlation between samples, either through duplicate spots within arrays or through blocking factors.

Parameters:
  • M (ndarray) – Expression matrix, shape (n_genes, n_samples).

  • design (ndarray, optional) – Design matrix, shape (n_samples, n_coefficients). If None, uses intercept-only model.

  • ndups (int, default 2) – Number of within-array duplicate spots.

  • spacing (int, default 1) – Spacing between duplicate spots.

  • block (array_like, optional) – Block indicator for correlated samples. If provided, ndups and spacing are ignored.

  • correlation (float, optional) – Intra-block correlation. If None, it must be estimated externally (e.g., via duplicate_correlation()).

  • weights (ndarray, optional) – Observation weights.
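To make the block and correlation parameters concrete, the following minimal sketch builds the sample correlation matrix that a block indicator and a single intra-block correlation imply. The helper name block_correlation_matrix is hypothetical and is not part of pylimma; the function's internals may differ.

```python
import numpy as np

def block_correlation_matrix(block, correlation):
    """Illustrative helper (not part of pylimma): correlation matrix implied
    by a block indicator and one intra-block correlation."""
    block = np.asarray(block)
    # True wherever two samples carry the same block label.
    same_block = block[:, None] == block[None, :]
    R = np.where(same_block, float(correlation), 0.0)
    # Every sample is perfectly correlated with itself.
    np.fill_diagonal(R, 1.0)
    return R

# Five samples: two pairs of correlated samples and one singleton.
block = ["a", "a", "b", "b", "c"]
R = block_correlation_matrix(block, 0.3)
print(R)
```

Samples in different blocks get a correlation of zero, so the matrix is block diagonal once the samples are ordered by block.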

Returns:

  • coefficients : ndarray, shape (n_genes, n_coefs)

  • stdev_unscaled : ndarray, shape (n_genes, n_coefs)

  • sigma : ndarray, shape (n_genes,)

  • df_residual : ndarray, shape (n_genes,)

  • cov_coefficients : ndarray, shape (n_coefs, n_coefs)

  • correlation : float

  • block : ndarray or None

  • ndups : int

  • spacing : int

Return type:

dict
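As a sketch of how the returned arrays fit together, ordinary t-statistics can be formed by dividing each coefficient by its standard error, which is the unscaled standard deviation times the gene's residual standard deviation. The fit dict below is a hand-made stand-in with invented numbers, and it assumes the names listed under Returns are the dict keys; it is not output from gls_series.

```python
import numpy as np

# Hypothetical stand-in for a gls_series result: 2 genes, 2 coefficients.
fit = {
    "coefficients": np.array([[1.0, 0.5], [2.0, -1.0]]),
    "stdev_unscaled": np.array([[0.5, 0.5], [0.5, 0.25]]),
    "sigma": np.array([1.0, 2.0]),
    "df_residual": np.array([4.0, 4.0]),
}

# Standard error = unscaled standard deviation * per-gene residual sigma.
# sigma has shape (n_genes,), so add an axis to broadcast across coefficients.
std_error = fit["stdev_unscaled"] * fit["sigma"][:, None]

# Ordinary t-statistic for every gene and coefficient.
t_stat = fit["coefficients"] / std_error
print(t_stat)
```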

Notes

This function uses Cholesky decomposition to transform the GLS problem to an equivalent OLS problem. The correlation structure is either:

  • Within-array duplicates (ndups > 1): duplicate spots within the same array are correlated with the specified correlation.

  • Between-sample blocks (block is not None): samples within the same block are correlated with the specified correlation.
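The Cholesky transformation described in these notes can be illustrated for a single gene with plain NumPy. This is a minimal sketch under assumed toy data (three blocks of two samples, a made-up design and random expression values), not pylimma's actual implementation. It whitens the data with the Cholesky factor of the correlation matrix, fits ordinary least squares, and checks the result against the textbook GLS formula.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (assumed for illustration): 6 samples in 3 blocks of 2,
# an intercept plus one two-level covariate, and one gene's expression values.
block = np.array([0, 0, 1, 1, 2, 2])
correlation = 0.4
X = np.column_stack([np.ones(6), np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0])])
y = rng.normal(size=6)

# Correlation matrix: 1 on the diagonal, `correlation` within a block, 0 elsewhere.
V = np.where(block[:, None] == block[None, :], correlation, 0.0)
np.fill_diagonal(V, 1.0)

# Cholesky factor V = L @ L.T; premultiplying by inv(L) decorrelates the errors.
L = np.linalg.cholesky(V)
X_w = np.linalg.solve(L, X)
y_w = np.linalg.solve(L, y)

# Ordinary least squares on the whitened data gives the GLS estimate.
beta_chol, *_ = np.linalg.lstsq(X_w, y_w, rcond=None)

# Direct GLS estimate for comparison: (X' V^-1 X)^-1 X' V^-1 y.
Vinv = np.linalg.inv(V)
beta_gls = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)

# Quantities analogous to the documented outputs, from the whitened fit.
resid = y_w - X_w @ beta_chol
df_residual = X.shape[0] - X.shape[1]
sigma = np.sqrt(resid @ resid / df_residual)
cov_coefficients = np.linalg.inv(X_w.T @ X_w)
stdev_unscaled = np.sqrt(np.diag(cov_coefficients))

print(np.allclose(beta_chol, beta_gls))
```

Solving against the triangular factor avoids forming the inverse of the correlation matrix for the fit itself; the explicit inverse appears here only to cross-check the estimate.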

References

Smyth, G. K., Michaud, J. and Scott, H. S. (2005). Use of within-array replicate spots for assessing differential expression in microarray experiments. Bioinformatics, 21, 2067-2075.