pylimma.fit_f_dist_robustly

pylimma.fit_f_dist_robustly(x, df1, covariate=None, winsor_tail_p=(0.05, 0.1), trace=False)

Robust estimation of scaled F-distribution parameters.

Estimates the scale factor and denominator degrees of freedom using Winsorized moments of log(F) values, which provides robustness to outlier variances.
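To make the underlying model concrete, the sketch below simulates variances from a scaled F-distribution, i.e. a scale factor times an F(df1, df2) variate. It uses only the Python standard library and does not call pylimma; the helper name simulate_scaled_f and the chosen values (scale 0.5, df1 = 4, df2 = 10) are illustrative assumptions, not part of the library.

```python
import random

def simulate_scaled_f(n, scale, df1, df2, seed=0):
    """Draw n variances from scale * F(df1, df2) using chi-square ratios."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        # A chi-square with k degrees of freedom is Gamma(shape=k/2, scale=2).
        num = rng.gammavariate(df1 / 2, 2.0) / df1
        den = rng.gammavariate(df2 / 2, 2.0) / df2
        draws.append(scale * num / den)
    return draws

# Hypothetical values: scale (prior variance) 0.5, df1 = 4, df2 (prior df) 10.
x = simulate_scaled_f(n=20000, scale=0.5, df1=4, df2=10)

# For df2 > 2 the mean of F(df1, df2) is df2 / (df2 - 2), so the sample
# mean of x should sit near scale * df2 / (df2 - 2).
sample_mean = sum(x) / len(x)
expected_mean = 0.5 * 10 / (10 - 2)
```

Variances generated this way have the form expected for x, with df1 as the numerator degrees of freedom; the scale and df2 are the quantities the function estimates.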

Parameters:
  • x (array_like) – Sample variances. Should be positive.

  • df1 (array_like or float) – Numerator degrees of freedom for each variance.

  • covariate (array_like, optional) – If provided, allows the scale to vary as a function of the covariate. Not yet fully implemented.

  • winsor_tail_p (tuple of float, default (0.05, 0.1)) – Lower and upper tail proportions for Winsorization.

  • trace (bool, default False)

Returns:

  • scale (float or ndarray) – Estimated prior variance (s0^2).

  • df2 (float) – Estimated prior degrees of freedom (d0).

  • df2_shrunk (ndarray) – Gene-wise shrunken prior df, accounting for outliers.

Return type:

dict

Notes

This function is more robust than fit_f_dist() when there are outlier variances. It uses Winsorization to limit the influence of extreme values on the moment estimates.

The df2_shrunk values are shrunk towards the pooled df for genes identified as potential outliers (having unusually large variances).
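As a rough illustration of why Winsorization limits the influence of extreme values, the sketch below clamps the tails of a small set of log-variances and compares the mean before and after. It is a minimal standard-library example, not pylimma's internal implementation: the winsorize helper and its simple order-statistic cut points are assumptions made for illustration, with the tail proportions taken from the default winsor_tail_p.

```python
import statistics

def winsorize(values, tail_p=(0.05, 0.1)):
    """Clamp the lowest and highest tails to the nearest retained value."""
    ordered = sorted(values)
    n = len(ordered)
    k_lo = round(tail_p[0] * n)  # number of values clamped in the lower tail
    k_hi = round(tail_p[1] * n)  # number of values clamped in the upper tail
    lo, hi = ordered[k_lo], ordered[n - 1 - k_hi]
    return [min(max(v, lo), hi) for v in values]

# Made-up log-variances: 19 typical values plus one hypervariable outlier.
log_var = [0.1 * i for i in range(-9, 10)] + [8.0]

raw_mean = statistics.mean(log_var)             # roughly 0.4, pulled up by 8.0
win_mean = statistics.mean(winsorize(log_var))  # roughly 0.04 after clamping
```

With 20 values and tail proportions (0.05, 0.1), one value is clamped in the lower tail and two in the upper tail, so the outlier at 8.0 is replaced by 0.8 and the mean moves back towards the bulk of the data.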

References

Phipson, B. and Smyth, G. K. (2016). Robust hyperparameter estimation protects against hypervariable genes and improves power to detect differential expression. Annals of Applied Statistics, 10(2), 946-963.