pylimma.ids2indices

pylimma.ids2indices(gene_sets, identifiers, remove_empty=True)

Map named gene sets of identifier strings to zero-based integer indices.

Port of R limma’s ids2indices. Indices follow Python’s zero-based convention, whereas the R version returns 1-based indices; zero-based indices are what the downstream Python gene-set functions expect.

Parameters:
  • gene_sets (dict or list) – Either a dict mapping set names to iterables of identifiers, or a single iterable (wrapped as {"Set1": gene_sets}, matching R’s if(!is.list(gene.sets)) branch).

  • identifiers (array_like of str) – Identifier vector; the returned indices are positions in this vector.

  • remove_empty (bool, default True) – If True, drop sets for which none of the identifiers are found in identifiers.

Returns:

Dict mapping each set name to an int64 array of zero-based indices into identifiers.

Return type:

dict[str, np.ndarray]
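
The mapping described above can be illustrated with a small pure-Python sketch. It is not pylimma's implementation: the helper name ids2indices_sketch and the gene symbols are hypothetical, plain lists stand in for the int64 arrays the real function returns, and the ascending order of the matched positions is an assumption modelled on R's which(identifiers %in% x).

```python
def ids2indices_sketch(gene_sets, identifiers, remove_empty=True):
    """Illustrative stand-in (hypothetical helper, not pylimma's code)."""
    # A bare iterable is wrapped as a single set named "Set1",
    # as the gene_sets parameter description states.
    if not isinstance(gene_sets, dict):
        gene_sets = {"Set1": gene_sets}

    identifiers = [str(ident) for ident in identifiers]
    result = {}
    for name, members in gene_sets.items():
        wanted = {str(member) for member in members}
        # Zero-based positions in `identifiers` whose value belongs to the set.
        # Assumption: positions are reported in ascending order.
        hits = [pos for pos, ident in enumerate(identifiers) if ident in wanted]
        if hits or not remove_empty:
            result[name] = hits
    return result


identifiers = ["TP53", "BRCA1", "EGFR", "MYC"]
gene_sets = {
    "repair": ["BRCA1", "TP53", "ATM"],  # "ATM" is absent from identifiers
    "growth": ["EGFR", "MYC"],
    "unmatched": ["FOO"],                # no matches at all
}

indices = ids2indices_sketch(gene_sets, identifiers)
print(indices)
# {'repair': [0, 1], 'growth': [2, 3]}
```

With remove_empty=False the "unmatched" set would be kept with an empty index list, and passing a bare list such as ["MYC"] would yield a single entry named "Set1".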