Genes are allocated to bins based on their average log-level of expression; then, within each bin, the dispersion (variance over mean of the log-levels) is z-scored. Genes with a sufficiently high dispersion z-score are returned.
This method replicates FindVariableFeatures from the Seurat package.
Argument | Description |
---|---|
zscore_threshold | a numeric value indicating the z-scored dispersion threshold above which the gene names are returned. Defaults to 0. |
num_bins | an integer value indicating the number of bins into which genes are grouped before the dispersion is z-scored. Defaults to 20. |
data_status | character string. Specifies whether the gene expression levels used for the calculation are raw ("Raw"), normalized ("Normalized") or imputed ("Smoothed"). Defaults to "Raw". |
invert | logical. If FALSE (default), genes with a z-scored dispersion higher than the threshold are returned. If TRUE, the complementary set is returned. |
a character vector of the names of genes whose z-scored dispersion is above the threshold (or the complementary set when invert is TRUE).
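
The binning and z-scoring steps described above can be sketched as follows. This is a minimal Python illustration using only the standard library, not the package's own code or Seurat's implementation. The function name `high_dispersion_genes`, the dict-of-lists input, the use of equal-width bins, and the handling of non-positive means and single-gene bins are all assumptions made for the example. The `data_status` argument is not modelled: the caller is assumed to pass whichever expression levels they want to use.

```python
from statistics import mean, stdev, variance


def high_dispersion_genes(expr, zscore_threshold=0.0, num_bins=20, invert=False):
    """Hypothetical sketch. expr maps gene name -> list of log-expression levels."""
    # Per-gene mean and dispersion (variance over mean).
    stats = {}
    for gene, values in expr.items():
        mu = mean(values)
        if mu <= 0:
            continue  # assumption: dispersion is undefined here, so skip the gene
        stats[gene] = (mu, variance(values) / mu)
    if not stats:
        return []

    # Assumption: equal-width bins over the range of mean expression.
    lo = min(mu for mu, _ in stats.values())
    hi = max(mu for mu, _ in stats.values())
    width = (hi - lo) / num_bins or 1.0  # avoid a zero width when all means match
    bins = {}
    for gene, (mu, _) in stats.items():
        idx = min(int((mu - lo) / width), num_bins - 1)
        bins.setdefault(idx, []).append(gene)

    # Z-score the dispersion within each bin.
    zscores = {}
    for genes in bins.values():
        disps = [stats[g][1] for g in genes]
        centre = mean(disps)
        spread = stdev(disps) if len(disps) > 1 else 0.0
        for g in genes:
            # Assumption: a bin with no spread gives every gene a z-score of 0.
            zscores[g] = (stats[g][1] - centre) / spread if spread > 0 else 0.0

    # Keep genes above the threshold, or the complementary set if invert is True.
    return [g for g, z in zscores.items() if (z > zscore_threshold) != invert]


toy = {
    "A": [1, 1, 1, 1],
    "B": [0, 2, 0, 2],
    "C": [0.5, 1.5, 0.5, 1.5],
    "D": [3, 3, 3, 3],
    "E": [2, 4, 2, 4],
}
print(high_dispersion_genes(toy, num_bins=2))  # ['B', 'E']
```

In the toy data, genes A, B and C share one bin and D and E share the other, so B and E are selected because each has the highest dispersion relative to genes of similar mean expression. Passing `invert=True` returns the remaining genes instead.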