Class for holding parameters used to simulate data using rescueSim
Slots
nTimepoints
Number of timepoints (i.e. samples) per subject. Holds a single numeric value >0 representing the number of timepoints for all subjects.
twoGroupDesign
Logical value indicating whether to simulate two groups (e.g., a treatment group and a control group).
nSubjsPerGroup
Number of subjects per group (if using two group design) or number of total subjects (if only single group). Holds a single numeric value >0 indicating the number of subjects per group.
maxCellsPerSamp
Maximum parameter used when drawing number of cells per sample from a discrete uniform distribution. Holds a single numeric value >0 indicating the maximum number of cells per sample, or a vector with length equal to the number of conditions (group/timepoint combinations) with each value representing the maximum for a single condition, or a vector with length equal to the number of total samples with each value representing the maximum for a single sample.
minCellsPerSamp
Minimum parameter used when drawing number of cells per sample from a discrete uniform distribution. Holds a single numeric value >0 indicating the minimum number of cells per sample, or a vector with length equal to the number of conditions (group/timepoint combinations) with each value representing the minimum for a single condition, or a vector with length equal to the number of total samples with each value representing the minimum for a single sample.
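The three accepted shapes for minCellsPerSamp and maxCellsPerSamp (a single value, one value per condition, or one value per sample) can be illustrated with a minimal Python sketch. This is not the package's implementation. The function names are hypothetical, and the ordering of samples (grouped by condition, with one sample per subject in each condition) is an assumption made for illustration.

```python
import random


def expand_to_samples(value, n_conditions, subjects_per_group):
    """Expand a scalar, per-condition vector, or per-sample vector
    to one integer per sample (assumed order: grouped by condition)."""
    n_samples = n_conditions * subjects_per_group
    if isinstance(value, (int, float)):
        return [int(value)] * n_samples
    values = [int(v) for v in value]
    if len(values) == n_samples:
        return values
    if len(values) == n_conditions:
        # Repeat each condition's value for every sample in that condition.
        return [v for v in values for _ in range(subjects_per_group)]
    raise ValueError("length must be 1, n_conditions, or n_samples")


def draw_cells_per_sample(min_cells, max_cells, n_conditions,
                          subjects_per_group, seed=None):
    """Draw one cell count per sample from a discrete uniform distribution."""
    rng = random.Random(seed)
    lows = expand_to_samples(min_cells, n_conditions, subjects_per_group)
    highs = expand_to_samples(max_cells, n_conditions, subjects_per_group)
    counts = []
    for lo, hi in zip(lows, highs):
        if lo <= 0 or hi < lo:
            raise ValueError("need 0 < min <= max for every sample")
        counts.append(rng.randint(lo, hi))  # both endpoints inclusive
    return counts


# Two conditions, three subjects each: per-condition minimum, shared maximum.
counts = draw_cells_per_sample([50, 100], 200, n_conditions=2,
                               subjects_per_group=3, seed=1)
print(counts)
```

Setting the minimum equal to the maximum fixes the number of cells for that sample, since the uniform range then contains a single value.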
logLibMean
Mean library size (log scale) parameter. Used to draw library size from a log-normal distribution. Holds a single value >0 indicating the mean library size on a log scale.
logLibSD
Library size standard deviation (log scale) parameter. Used to draw library size from a log-normal distribution. Holds a single value >=0 indicating the standard deviation of library size on a log scale.
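As a rough illustration of how logLibMean and logLibSD parameterize a log-normal draw, the Python sketch below exponentiates normal draws on the log scale. It is not the package's code. The function name is hypothetical, rounding to whole counts is an assumption, and the sample-level factors controlled by logLibFacVar are left out.

```python
import math
import random


def draw_library_sizes(log_lib_mean, log_lib_sd, n_cells, seed=None):
    """Draw n_cells library sizes from a log-normal distribution."""
    if log_lib_sd < 0:
        raise ValueError("log_lib_sd must be >= 0")
    rng = random.Random(seed)
    # A normal draw on the log scale, exponentiated, is log-normal.
    return [round(math.exp(rng.gauss(log_lib_mean, log_lib_sd)))
            for _ in range(n_cells)]


sizes = draw_library_sizes(log_lib_mean=8.0, log_lib_sd=0.5, n_cells=5, seed=1)
print(sizes)
```

With a standard deviation of 0, every cell receives the same library size, exp(logLibMean).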
customLibSizes
An optional numeric vector of user-provided library sizes. When provided, library sizes for simulation are sampled (with replacement) from this vector instead of being drawn from the default log-normal distribution. The log-transformed values are then adjusted by a sample-specific factor so that each sample's average log library size equals the overall average multiplied by that sample's factor.
logLibFacVar
Variance used for drawing sample-level multiplicative factors which give different library size distributions for each sample. Larger values result in larger variation in library size distributions by sample. Holds a single value >=0 with 0 indicating no difference in the library size distributions by sample.
exprsMean
Gene-specific mean expression value representing the average expression of each gene in the dataset. Holds a vector of numeric values >=0 with length equal to desired number of genes in the simulated data where each value indicates the average expression for a single gene. If a named vector is given, these names will be used as the gene names in the simulated SingleCellExperiment object.
dispersion
Gene-specific dispersion value representing the variation in expression for each gene in the dataset. Holds a vector of numeric values >0 with length equal to desired number of genes in the simulated data where each value indicates the dispersion for a single gene.
sampleFacVarMean
Mean used for drawing variance (log-scale) of sample-level multiplicative factors. Larger values result in more between-sample variation.
sampleFacVarSD
Standard deviation used for drawing variance (log-scale) of sample-level multiplicative factors. Larger values result in more variation in amount of between-sample variation across different genes. Must be a value >=0.
subjectFacVarMean
Mean used for drawing variance (log-scale) of subject-level multiplicative factors. Larger values result in larger between-subject variation.
subjectFacVarSD
Standard deviation used for drawing variance (log-scale) of subject-level multiplicative factors. Larger values result in more variation in amount of between-subject variation across different genes. Must be a value >=0.
propDE
Proportion of genes differentially expressed between timepoints/groups. Must be a numeric value between 0 and 1.
deLog2FC
Fold change values used for differentially expressed genes.
Specifies the log2 fold changes for differentially expressed (DE) genes. All values are interpreted as relative to a common baseline condition: time0 and/or group0.
Acceptable formats include:
A single numeric value >= 0, indicating the absolute log2 fold change between baseline and the final condition. Log2 fold changes will be randomly assigned as positive or negative across DE genes.
A numeric vector of possible log2 fold change values (positive and/or negative) from which values will be randomly drawn for DE genes.
A named list of numeric vectors specifying gene-specific log2 fold changes for each experimental condition, relative to the baseline ("time0", "group0", or "time0_group0" depending on design). Each vector must have length equal to the number of genes. Valid names for list elements depend on the experimental design:
For single-group, multi-timepoint designs: "time1", "time2", etc.
For two-group, single-timepoint designs: "group1"
For multi-timepoint, two-group designs: "time1_group0", "time1_group1", etc.
The reference condition ("time0", "group0", or "time0_group0") must not be included in the list, as it is implicitly treated as having log2FC = 0.
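The naming rules for the list form can be summarized in a short Python sketch. This is a hedged illustration, not the package's validation logic. The function names are hypothetical, timepoints are assumed to be numbered from 0, and for two-group designs every time/group combination other than the reference (including "time0_group1") is assumed to be a valid name.

```python
def valid_condition_names(n_timepoints, two_group_design):
    """Return the assumed set of valid list names for a given design."""
    if n_timepoints < 1:
        raise ValueError("n_timepoints must be >= 1")
    if not two_group_design:
        # Single group: every timepoint after the baseline "time0".
        return [f"time{t}" for t in range(1, n_timepoints)]
    if n_timepoints == 1:
        # Two groups, one timepoint: only the non-reference group.
        return ["group1"]
    names = [f"time{t}_group{g}"
             for t in range(n_timepoints) for g in (0, 1)]
    names.remove("time0_group0")  # the reference is implicit (log2FC = 0)
    return names


def check_fc_list(fc_list, n_genes, n_timepoints, two_group_design):
    """Raise ValueError if a name is not valid or a vector has the wrong length."""
    allowed = set(valid_condition_names(n_timepoints, two_group_design))
    for name, values in fc_list.items():
        if name not in allowed:
            raise ValueError(f"unexpected condition name: {name}")
        if len(values) != n_genes:
            raise ValueError(f"{name}: expected {n_genes} values")


print(valid_condition_names(3, two_group_design=False))
print(valid_condition_names(2, two_group_design=True))
```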
When a single value or vector is provided (i.e., not a list), log2 fold changes are applied in a structured way:
For two-group, two-timepoint designs: group 0 shows no change over time, while group 1 exhibits a linear change over time from 0 to the specified log2FC.
For designs with more than two timepoints: a linear trajectory is simulated, and the log2FC value represents the total change from time 0 to the final timepoint.
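The structured behaviour described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the package's implementation. The function names are hypothetical, timepoints are assumed to be equally spaced, and random signs are assumed to be drawn with equal probability.

```python
import random


def linear_log2fc(total_log2fc, n_timepoints):
    """Per-timepoint log2FC on a straight line from 0 to total_log2fc."""
    if n_timepoints < 2:
        raise ValueError("need at least two timepoints")
    step = total_log2fc / (n_timepoints - 1)
    return [step * t for t in range(n_timepoints)]


def two_group_two_timepoint(total_log2fc):
    """Group 0 stays flat; group 1 moves from 0 to total_log2fc."""
    return {
        "time0_group0": 0.0,
        "time1_group0": 0.0,
        "time0_group1": 0.0,
        "time1_group1": float(total_log2fc),
    }


def random_signed(abs_log2fc, n_de_genes, seed=None):
    """Assign a random sign to a single absolute log2FC for each DE gene."""
    rng = random.Random(seed)
    return [abs_log2fc * rng.choice((-1, 1)) for _ in range(n_de_genes)]


print(linear_log2fc(1.0, 3))         # [0.0, 0.5, 1.0]
print(two_group_two_timepoint(2.0))
print(random_signed(1.5, 4, seed=0))
```

In the three-timepoint case, a total log2FC of 1 places the middle timepoint at 0.5, so the supplied value is reached only at the final timepoint.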