pylimma.e_bayes

pylimma.e_bayes(data, proportion=0.01, stdev_coef_lim=(0.1, 4.0), trend=False, span=None, robust=False, winsor_tail_p=(0.05, 0.1), legacy=None, key='pylimma')

Empirical Bayes moderation of t-statistics.

Computes moderated t-statistics, p-values, and B-statistics (log-odds of differential expression) by empirical Bayes shrinkage of the gene-wise sample variances towards a common prior.
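The shrinkage step can be illustrated for a single gene. The following is a minimal sketch of the posterior variance and moderated t-statistic as defined in Smyth (2004), not pylimma's actual implementation; the function names and all numbers are hypothetical, and refinements such as an infinite prior df are ignored.

```python
import math


def moderate_variance(s2, df_residual, s2_prior, df_prior):
    """Shrink one sample variance towards the prior variance."""
    df_total = df_prior + df_residual
    # Weighted average of prior and sample variance, weighted by their df.
    s2_post = (df_prior * s2_prior + df_residual * s2) / df_total
    return s2_post, df_total


def moderated_t(coef, stdev_unscaled, s2_post):
    """Ordinary t-statistic with the posterior variance substituted in."""
    return coef / (stdev_unscaled * math.sqrt(s2_post))


# Hypothetical values for one gene with an unusually small sample variance.
s2_post, df_total = moderate_variance(s2=0.04, df_residual=4, s2_prior=0.10, df_prior=6)
t_moderated = moderated_t(coef=1.2, stdev_unscaled=0.5, s2_post=s2_post)
t_ordinary = moderated_t(coef=1.2, stdev_unscaled=0.5, s2_post=0.04)

print(s2_post, df_total)        # posterior variance lies between 0.04 and 0.10
print(t_ordinary, t_moderated)  # the moderated statistic is less extreme here
```

Because the sample variance is pulled up towards the prior, the moderated t-statistic in this example is smaller in magnitude than the ordinary one, while its reference distribution gains the prior degrees of freedom.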

Parameters:
  • data (AnnData or dict) – Either an AnnData object with fit results in adata.uns[key], or a dict returned by lm_fit() or contrasts_fit().

  • proportion (float, default 0.01) – Expected proportion of differentially expressed genes.

  • stdev_coef_lim (tuple, default (0.1, 4.0)) – Limits for the prior standard deviation of coefficients.

  • trend (bool or array_like, default False) – If True, allow prior variance to depend on mean expression (Amean from the fit). If a numeric array, use that as the covariate for the mean-variance trend.

  • span (float, optional) – Span for lowess smoothing when fitting the mean-variance trend. Only used when trend is True. If None, an appropriate span is chosen automatically.

  • robust (bool, default False) – If True, use robust estimation of hyperparameters. Outlier variances are Winsorized and the prior df is estimated robustly.

  • winsor_tail_p (tuple, default (0.05, 0.1)) – Winsorization proportions for robust estimation. The first value is for the lower tail, the second for the upper tail.

  • legacy (bool, optional) – If True, use the original limma hyperparameter estimation method. If False, use the newer method, which handles unequal residual df better. If None (default), the method is chosen automatically: the original method when all residual df are equal, the newer method otherwise.

  • key (str, default "pylimma") – Key for fit results in adata.uns (AnnData input only).
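To make the role of winsor_tail_p concrete, here is a generic sketch of Winsorization with separate lower and upper tail proportions. It only illustrates the idea: the helper name, the count-based cut-offs, and the data are assumptions, and the robust estimation in pylimma may work on a transformed scale and choose cut-offs differently.

```python
def winsorize(values, tail_p=(0.05, 0.1)):
    """Clamp the lowest and highest tails of `values` to the nearest kept value."""
    ordered = sorted(values)
    n = len(ordered)
    lower_p, upper_p = tail_p
    # Number of observations treated as outliers in each tail (simple count-based rule).
    k_low = int(lower_p * n)
    k_high = int(upper_p * n)
    low_cut = ordered[k_low]
    high_cut = ordered[n - 1 - k_high]
    return [min(max(v, low_cut), high_cut) for v in values]


# Hypothetical gene-wise variances with one tiny and two very large outliers.
variances = [0.001, 0.08, 0.09, 0.10, 0.10, 0.11, 0.11, 0.12, 0.12, 0.13,
             0.13, 0.14, 0.14, 0.15, 0.15, 0.16, 0.17, 0.18, 2.5, 9.0]
clipped = winsorize(variances, tail_p=(0.05, 0.1))
print(min(clipped), max(clipped))
```

With 20 values, the lower proportion of 0.05 clamps one value and the upper proportion of 0.1 clamps two, so extreme variances no longer dominate the hyperparameter estimates.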

Returns:

If the input is a dict, returns the updated dict with the moderated statistics added. If the input is an AnnData object, updates adata.uns[key] in place and returns None.

Return type:

dict or None
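Because the return value depends on the input type, calling code may want one way to get at the fit results. The sketch below wraps the documented convention in a small hypothetical helper, fit_after_e_bayes, which is not part of pylimma. The function to call is passed in as an argument, and the demonstration uses a stand-in (_stub_e_bayes) that only mimics the return convention; in real use one would pass pylimma.e_bayes instead.

```python
from types import SimpleNamespace


def fit_after_e_bayes(e_bayes, data, key="pylimma", **kwargs):
    """Run `e_bayes` and always hand back the dict of fit results."""
    if isinstance(data, dict):
        # Dict input: the updated dict is the return value.
        return e_bayes(data, **kwargs)
    # AnnData-like input: results are written to data.uns[key], return value is None.
    e_bayes(data, key=key, **kwargs)
    return data.uns[key]


def _stub_e_bayes(data, key="pylimma", **kwargs):
    """Stand-in for illustration only; it computes nothing statistical."""
    if isinstance(data, dict):
        return {**data, "t": "moderated"}
    data.uns[key]["t"] = "moderated"
    return None


fit_from_dict = fit_after_e_bayes(_stub_e_bayes, {"coefficients": [1.2]})

# A minimal object with an `uns` mapping stands in for an AnnData instance.
adata_like = SimpleNamespace(uns={"pylimma": {"coefficients": [1.2]}})
fit_from_adata = fit_after_e_bayes(_stub_e_bayes, adata_like)

print(sorted(fit_from_dict), sorted(fit_from_adata))
```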

Notes

The moderated statistics added to the fit are:

  • t: moderated t-statistics

  • p_value: two-sided p-values

  • lods: B-statistics (log-odds of differential expression)

  • s2_prior: prior variance

  • df_prior: prior degrees of freedom

  • s2_post: posterior variance

  • df_total: total degrees of freedom

  • F: moderated F-statistic (if multiple contrasts)

  • F_p_value: F-statistic p-value (if multiple contrasts)
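The relationship between t, df_total, and lods can be sketched for one gene using the log-odds formula from Smyth (2004). This is an illustration under assumptions rather than pylimma's code: b_statistic is a hypothetical helper, var_prior is a made-up value for the prior variance of the coefficient (the quantity that stdev_coef_lim constrains), and the finite-df case is assumed.

```python
import math


def b_statistic(t, df_total, stdev_unscaled, var_prior, proportion=0.01):
    """Log-odds of differential expression for one coefficient (finite df)."""
    # Ratio of the coefficient variance under the alternative to that under the null.
    r = (stdev_unscaled ** 2 + var_prior) / stdev_unscaled ** 2
    t2 = t * t
    kernel = (1 + df_total) / 2 * math.log((t2 + df_total) / (t2 / r + df_total))
    prior_log_odds = math.log(proportion / (1 - proportion))
    return prior_log_odds - math.log(r) / 2 + kernel


# Hypothetical inputs: larger |t| gives larger log-odds.
for t in (0.0, 2.0, 8.7):
    print(t, b_statistic(t, df_total=10, stdev_unscaled=0.5, var_prior=0.75))
```

At t = 0 the statistic reduces to the prior log-odds minus a penalty that depends only on r, so the proportion argument sets the baseline from which the evidence in t is measured.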

References

Smyth, G. K. (2004). Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology, 3(1), Article 3.