senid.preprocessing

Functions

binomial_deviance_selection(adata[, layer, ...])

Python implementation of the binomial deviance feature selection method developed by Will Townes

calculate_binomial_deviance_batch(counts, size_factors)

Package Contents

senid.preprocessing.binomial_deviance_selection(adata, layer=None, deviance_key='binomial_deviance', highly_variable_key='highly_deviant', n_top_genes=1000, batch_key=None, sort_genes=True)[source]

Python implementation of the binomial deviance feature selection method developed by Will Townes (see Townes et al. 2019: doi.org/10.1186/s13059-019-1861-6). A binomial deviance, derived from a multinomial model of UMI counts, quantifies the variability of each gene. Only the binomial deviance is calculated here, whereas the scry package developed by Townes can also calculate the Poisson deviance.
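To make the idea concrete, the following is a minimal sketch of a per-gene binomial deviance under a constant-proportion null model. It is not senid's implementation: the helper name `binomial_deviance` is hypothetical, and it assumes that `size_factors` holds the total UMI count of each cell, which the documentation above does not confirm.

```python
import math


def binomial_deviance(counts, size_factors):
    """Per-gene binomial deviance for a cells x genes table of UMI counts.

    Hypothetical helper for illustration only; not part of senid.
    Assumes size_factors[i] is the total UMI count of cell i.
    """
    n_genes = len(counts[0])
    total = sum(size_factors)
    deviances = []
    for j in range(n_genes):
        # Null model: gene j makes up the same fraction pi of every cell.
        pi = sum(row[j] for row in counts) / total
        dev = 0.0
        for row, n in zip(counts, size_factors):
            y = row[j]
            # Terms with y == 0 or y == n contribute nothing (0 * log 0 := 0).
            if y > 0:
                dev += y * math.log(y / (n * pi))
            if n - y > 0:
                dev += (n - y) * math.log((n - y) / (n * (1.0 - pi)))
        deviances.append(2.0 * dev)
    return deviances


# A gene whose share is identical in every cell has (near) zero deviance;
# a gene concentrated in a single cell has a large one.
flat = binomial_deviance([[1, 9], [2, 18], [3, 27]], [10, 20, 30])
peaked = binomial_deviance([[10, 0], [0, 10]], [10, 10])
```

Genes with a larger deviance depart more strongly from the constant-proportion model, which is why they are treated as the more informative features.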

Parameters:
  • adata (AnnData) – Annotated data matrix.

  • highly_variable_key (str, optional (default: 'highly_deviant')) – The key in adata.var under which the highly variable genes are stored.

  • layer (str, optional (default: None)) – The layer of the AnnData object to use. A layer must be given: if None, this method does not work.

  • n_top_genes (int, optional (default: 1000)) – Number of top genes to select when assigning genes as ‘highly variable’.

  • batch_key (str, optional (default: None)) – The batch label in adata.obs. If used, we calculate the binomial deviance per batch and then define the binomial deviance per gene as the sum of the per-batch deviances.

  • sort_genes (bool, optional (default: True)) – If True, sort genes by binomial deviance.

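  • deviance_key (Optional[str] (default: 'binomial_deviance')) – The key in adata.var under which the binomial deviance per gene is stored.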

  • highly_variable_key (Optional[str])
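The interplay of batch_key and n_top_genes can be sketched as follows: per-batch deviances are summed gene by gene, and the genes with the largest totals are flagged. This is an illustration under stated assumptions rather than senid's code; the helper `flag_highly_deviant` is hypothetical, and how the package breaks ties is not documented here.

```python
def flag_highly_deviant(per_batch_deviances, n_top_genes):
    """Sum per-batch deviances per gene and flag the top genes.

    Hypothetical helper for illustration only; not part of senid.
    per_batch_deviances: one list of per-gene deviances for each batch.
    """
    n_genes = len(per_batch_deviances[0])
    # Deviance per gene = sum of that gene's deviance over all batches.
    total = [sum(batch[j] for batch in per_batch_deviances) for j in range(n_genes)]
    # Rank gene indices from largest to smallest total deviance.
    ranked = sorted(range(n_genes), key=lambda j: total[j], reverse=True)
    top = set(ranked[:n_top_genes])
    flags = [j in top for j in range(n_genes)]
    return total, flags


# Two batches, three genes; keep the two most deviant genes.
total, flags = flag_highly_deviant([[1.0, 5.0, 0.5], [2.0, 1.0, 0.25]], n_top_genes=2)
```

In this sketch the boolean list plays the role of the column written to adata.var[highly_variable_key], and the summed values play the role of the deviance column.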

Returns:

adata – Annotated data matrix with the binomial deviance per gene stored in adata.var[deviance_key] (by default adata.var[‘binomial_deviance’]). The n_top_genes highly variable genes are stored in adata.var[highly_variable_key].

Return type:

AnnData

senid.preprocessing.calculate_binomial_deviance_batch(counts, size_factors, batch_keys=None)[source]
Parameters:
Return type:

numpy.ndarray