Genes are assigned to bins based on their average log-expression level; within each bin, the dispersion (variance over mean of the log-levels) is then z-scored. Genes whose dispersion z-score is not above the threshold (zscore_threshold) are excluded from the dataset. This method replicates FindVariableFeatures from the Seurat package.

Arguments

zscore_threshold

a numeric value indicating the z-scored dispersion threshold above which gene names are returned. Defaults to 0.

num_bins

an integer value indicating the number of bins into which genes are grouped before their dispersions are z-scored. Defaults to 20.

data_status

character string. Specifies whether the gene expression levels used for the calculation are raw ("Raw"), normalized ("Normalized"), or imputed ("Smoothed"). Defaults to "Raw".
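
Illustration

The binning and z-scoring procedure described above can be sketched in a few lines of standard-library Python. This is only an illustrative approximation, not the package's implementation and not Seurat's exact algorithm. The function names dispersion_zscores and select_genes are hypothetical, and several details are assumptions made for the sketch: levels are log-transformed with log1p, bins are equal-width intervals over the mean log-level, the sample standard deviation is used, and genes in a bin with a single member or zero spread receive a z-score of 0.

```python
import math
from statistics import mean, stdev, variance


def dispersion_zscores(expression, num_bins=20):
    """Return {gene: z-scored dispersion} (hypothetical helper).

    expression maps a gene name to a list of at least two
    non-negative expression levels.
    """
    # Per-gene mean and dispersion (variance / mean) of the log-levels.
    stats = {}
    for gene, levels in expression.items():
        logs = [math.log1p(v) for v in levels]
        m = mean(logs)
        if m > 0:  # assumption: genes with no expression are skipped
            stats[gene] = (m, variance(logs) / m)
    if not stats:
        return {}

    # Assumption: equal-width bins spanning the range of mean log-levels.
    lo = min(m for m, _ in stats.values())
    hi = max(m for m, _ in stats.values())
    width = (hi - lo) / num_bins or 1.0  # fall back to 1.0 if all means are equal
    bins = {}
    for gene, (m, d) in stats.items():
        idx = min(int((m - lo) / width), num_bins - 1)
        bins.setdefault(idx, []).append((gene, d))

    # Z-score the dispersions separately within each bin.
    zscores = {}
    for members in bins.values():
        disp = [d for _, d in members]
        mu = mean(disp)
        sd = stdev(disp) if len(disp) > 1 else 0.0
        for gene, d in members:
            # Assumption: an undefined z-score is reported as 0.
            zscores[gene] = (d - mu) / sd if sd > 0 else 0.0
    return zscores


def select_genes(expression, zscore_threshold=0.0, num_bins=20):
    """Return the sorted gene names whose z-score exceeds the threshold."""
    z = dispersion_zscores(expression, num_bins)
    return sorted(g for g, s in z.items() if s > zscore_threshold)


# Toy data: gene "a" is constant, genes "b" and "c" vary across cells.
toy = {"a": [1, 1, 1, 1], "b": [0, 3, 0, 3], "c": [0, 5, 0, 5]}
selected = select_genes(toy, zscore_threshold=0.0, num_bins=1)
```

With a single bin, all three toy genes are compared against each other, so the constant gene gets a negative z-score and is dropped. Larger values of num_bins compare each gene only with genes of similar average expression, which is the point of binning: it stops the selection from simply tracking expression level.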