The raw gene expression levels are normalized according to one of the following methods:

  • 'CPM' or 'Count-Per-Million': The raw gene expression levels are divided by the total read count of the cell and multiplied by 1e6. This method corrects for sequencing depth.

  • 'FPKM': The CPM levels are divided by the gene length in kilobases. This method corrects for sequencing depth and gene length.

  • 'MR' or 'Median-of-ratios': The raw gene expression levels are divided by per-cell size factors, each equal to the median ratio of the cell's gene counts to the per-gene geometric mean. This method corrects for library size and RNA composition bias. It is a reimplementation of DESeq2's `estimateSizeFactors()` with the "poscounts" estimator, which handles genes that have zero counts in some cells.
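
The three methods above can be illustrated with a minimal sketch. It is written in plain Python for readability and is not the package's own code. The function names (`cpm`, `fpkm`, `size_factors`, `median_of_ratios`) are hypothetical, the count matrix is assumed to be a list of gene rows with one column per cell, gene lengths are assumed to be given in base pairs, and every cell is assumed to contain at least one read. The size factor computation follows the description above in a simplified form (zeros are skipped when taking logarithms) and is not claimed to reproduce DESeq2's output exactly.

```python
import math
from statistics import median


def cpm(counts):
    """Divide each count by its cell (column) total, then scale by 1e6."""
    n_cells = len(counts[0])
    totals = [sum(row[j] for row in counts) for j in range(n_cells)]
    return [[row[j] / totals[j] * 1e6 for j in range(n_cells)] for row in counts]


def fpkm(counts, gene_length_bp):
    """Divide CPM values by gene length in kilobases (lengths assumed in bp)."""
    return [
        [value / (length / 1000.0) for value in row]
        for row, length in zip(cpm(counts), gene_length_bp)
    ]


def size_factors(counts):
    """Per-cell size factors from a zero-tolerant median of ratios (simplified)."""
    n_cells = len(counts[0])
    # Log geometric mean per gene: sum logs of the positive counts only,
    # but still divide by the total number of cells.
    log_geo_means = []
    for row in counts:
        positive = [c for c in row if c > 0]
        if positive:
            log_geo_means.append(sum(math.log(c) for c in positive) / n_cells)
        else:
            log_geo_means.append(None)  # gene with no reads in any cell: ignored
    factors = []
    for j in range(n_cells):
        # Log ratios of this cell's positive counts to the gene geometric means.
        log_ratios = [
            math.log(row[j]) - lgm
            for row, lgm in zip(counts, log_geo_means)
            if lgm is not None and row[j] > 0
        ]
        factors.append(math.exp(median(log_ratios)))
    return factors


def median_of_ratios(counts):
    """Divide each raw count by the size factor of its cell."""
    sf = size_factors(counts)
    return [[row[j] / sf[j] for j in range(len(sf))] for row in counts]


# Toy data: 3 genes x 2 cells; the second cell was sequenced twice as deeply.
counts = [[10, 20], [30, 60], [60, 120]]
cpm_values = cpm(counts)
fpkm_values = fpkm(counts, [1000, 2000, 500])
mr_values = median_of_ratios(counts)
```

In this toy matrix the second cell holds exactly twice the reads of the first, so both CPM and median-of-ratios yield identical values for the two cells, while FPKM additionally rescales each gene by its length.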

Arguments

method

a character string. The method to use for normalization: one of 'CPM', 'Count-Per-Million', 'FPKM', 'MR' or 'Median-of-ratios' (defaults to 'CPM').

gene_length

if the normalization method is 'FPKM', a numeric vector containing the length of each gene (defaults to NULL).