pylimma.fit_f_dist

pylimma.fit_f_dist(x, df1, covariate=None)

Fit a scaled F-distribution to sample variances.

Estimates the scale factor (prior variance) and denominator degrees of freedom (prior df) by the method of moments on log(F). Port of R limma’s fitFDist function (Gordon Smyth).

Parameters:

x (array_like) – Sample variances. Should be positive.
df1 (array_like or float) – Numerator degrees of freedom for each variance (residual df).
covariate (array_like, optional) – If provided, allows the scale to vary as a function of the covariate (e.g., mean expression). Uses natural cubic splines to fit the trend.

Returns:

scale (float or ndarray) – Estimated prior variance (s0^2). If covariate is provided, this is an array of gene-specific prior variances.
df2 (float) – Estimated prior degrees of freedom (d0).

Return type:

dict

Notes

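Uses the following relationship: if s^2 ~ s0^2 * F(d1, d0), where d1 is the residual degrees of freedom (df1) and d0 is the prior degrees of freedom (df2), then E[log(s^2)] = log(s0^2) + digamma(d1/2) - digamma(d0/2) + log(d0/d1) and Var[log(s^2)] = trigamma(d1/2) + trigamma(d0/2).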
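
The two moment identities above can be turned into an estimator by matching the sample mean and variance of log(s^2). The following is a minimal sketch of that idea for a scalar df1 and no covariate. It is not pylimma's implementation: the names moment_fit, trigamma_inverse and simulate_scaled_f are hypothetical, the digamma and trigamma helpers are pure-Python approximations (scipy.special.digamma and scipy.special.polygamma would be the usual choice), the trigamma inverse uses plain bisection, and returning an infinite df2 when the excess variance is not positive is an assumption of this sketch.

```python
import math
import random


def digamma(x):
    """Digamma for x > 0: upward recurrence, then an asymptotic series."""
    acc = 0.0
    while x < 6.0:
        acc -= 1.0 / x          # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return acc + math.log(x) - 0.5 / x - series


def trigamma(x):
    """Trigamma for x > 0: upward recurrence, then an asymptotic series."""
    acc = 0.0
    while x < 6.0:
        acc += 1.0 / (x * x)    # psi1(x) = psi1(x + 1) + 1/x^2
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    series = inv * (1.0 + inv2 * (1.0 / 6 - inv2 * (1.0 / 30 - inv2 / 42)))
    return acc + series + 0.5 * inv2


def trigamma_inverse(v):
    """Solve trigamma(y) = v for y > 0 by bisection on a log scale."""
    lo, hi = 1e-8, 1e8
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if trigamma(mid) > v:
            lo = mid            # trigamma is decreasing, so the root lies to the right
        else:
            hi = mid
    return math.sqrt(lo * hi)


def moment_fit(variances, df1):
    """Hypothetical helper: moment estimates of s0^2 and d0 (scalar df1, no covariate)."""
    half = df1 / 2.0
    # Remove the known d1 terms, so E[e] = log(s0^2) - digamma(d0/2) + log(d0/2).
    e = [math.log(s2) - digamma(half) + math.log(half) for s2 in variances]
    n = len(e)
    emean = sum(e) / n
    evar = sum((ei - emean) ** 2 for ei in e) / (n - 1)
    # Var[log(s^2)] = trigamma(d1/2) + trigamma(d0/2), so subtract the d1 part.
    excess = evar - trigamma(half)
    if excess <= 0:
        # Assumption of this sketch: no detectable prior variability.
        return {"scale": math.exp(emean), "df2": math.inf}
    df2 = 2.0 * trigamma_inverse(excess)
    scale = math.exp(emean + digamma(df2 / 2.0) - math.log(df2 / 2.0))
    return {"scale": scale, "df2": df2}


def simulate_scaled_f(n, s0_sq, d1, d0, seed=0):
    """Draw n values of s0^2 * F(d1, d0) as a ratio of scaled chi-square variates."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        num = rng.gammavariate(d1 / 2.0, 2.0) / d1   # chi2(d1) / d1
        den = rng.gammavariate(d0 / 2.0, 2.0) / d0   # chi2(d0) / d0
        out.append(s0_sq * num / den)
    return out


# Simulated check: the estimates should land near s0^2 = 0.5 and d0 = 10.
sample = simulate_scaled_f(20000, s0_sq=0.5, d1=4, d0=10)
fit = moment_fit(sample, df1=4)
```

The sketch returns a dict with the same two quantities documented above (scale and df2) so the correspondence with the Returns section is easy to follow. It does not handle an array-valued df1 or the covariate trend.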

When covariate is provided, the scale (s0^2) is allowed to vary as a smooth function of the covariate, typically the average log-expression.

References

Smyth, G. K. (2004). Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology, 3(1), Article 3.