splisosm.hyptest_glmm

GLMM-based hypothesis tests for spatial isoform usage.

Classes

SplisosmGLMM

Parametric spatial isoform statistical modeling using GLMM.

Module Contents

class splisosm.hyptest_glmm.SplisosmGLMM(model_type='glmm-full', share_variance=True, var_parameterization_sigma_theta=True, var_fix_sigma=False, var_prior_model='none', var_prior_model_params={}, init_ratio='observed', fitting_method='joint_gd', fitting_configs={'max_epochs': -1})

Parametric spatial isoform statistical modeling using GLMM.

A convenience class that wraps splisosm.model.MultinomGLMM for batched model fitting, spatial variability testing, and differential usage testing.

Examples

Setup data:

>>> from splisosm import SplisosmGLMM
>>> import torch
>>> # Simulate data for 10 genes with different numbers of isoforms
>>> data_3_iso = [torch.randint(low=0, high=5, size=(100, 3)) for _ in range(5)]  # 5 genes with 3 isoforms
>>> data_4_iso = [torch.randint(low=0, high=5, size=(100, 4)) for _ in range(5)]  # 5 genes with 4 isoforms
>>> data = data_3_iso + data_4_iso
>>> coordinates = torch.rand(100, 2)  # 100 spots with 2D coordinates
>>> design_mtx = torch.rand(100, 2)  # 100 spots with 2 covariates

Model fitting:

>>> model = SplisosmGLMM(model_type='glmm-full')
>>> model.setup_data(data, coordinates, design_mtx=design_mtx, group_gene_by_n_iso=True)
>>> model.fit(n_jobs=1, batch_size=5, with_design_mtx=False)
>>> fitted_models = model.get_fitted_models()
>>> print(fitted_models[0])  # print the fitted model for the first gene

Differential usage test:

>>> model.test_differential_usage(method='score')
>>> du_results = model.get_formatted_test_results('du')
>>> print(du_results.head())

Parameters:

model_type (Literal['glmm-full', 'glmm-null', 'glm']) – Which model to fit. Can be one of 'glmm-full' (Multinomial GLMM with spatial random effects), 'glmm-null' (Multinomial GLMM with white noise), 'glm' (Multinomial GLM).
share_variance (bool) – Whether to share the variance component across isoforms.
var_parameterization_sigma_theta (bool) – Whether to parameterize the variance components as (sigma, theta_logit) or (sigma_sp, sigma_nsp). If True, the variance components will be (sigma, theta_logit), where sigma is the total variance and theta_logit is the logit of the spatial variance proportion. If False, the variance components will be (sigma_sp, sigma_nsp), where sigma_sp is the spatial variance and sigma_nsp is the non-spatial variance.
var_fix_sigma (bool) – Whether to fix the total variance (sigma) or not. If True, the total variance will be fixed to the initial value, which is the average per-spot variance of isoform counts normalized by its mean expression. See MultinomGLMM._initialize_params for details.
var_prior_model (str) – The prior model on the total variance sigma. Default is 'none' with no prior. Other options are 'gamma' (Gamma prior) and 'inv_gamma' (Inverse Gamma prior).
var_prior_model_params (dict) – The parameters for the prior model on the total variance sigma. For 'gamma', the default parameters are {'alpha': 2.0, 'beta': 0.3}. For 'inv_gamma', the default parameters are {'alpha': 3, 'beta': 0.5}.
init_ratio (str) – The initialization method for the logit isoform usage ratio. Options are 'observed' (initialize using observed counts) and 'uniform' (equal isoform usage across space).
fitting_method (str) – The fitting method to use when model_type='glmm-full' or 'glmm-null'. Options are 'joint_gd' (joint likelihood with gradient descent), 'joint_newton' (joint likelihood with Newton’s method), 'marginal_gd' (marginal likelihood with gradient descent), and 'marginal_newton' (marginal likelihood with Newton’s method).
fitting_configs (dict) – A dictionary of fitting configurations with the following keys:
- 'lr': float, learning rate
- 'optim': str, optimization method (one of 'adam', 'sgd', or 'lbfgs')
- 'tol': float, tolerance for convergence
- 'max_epochs': int, maximum number of epochs
- 'patience': int, number of epochs to wait for improvement before stopping
- 'update_nu_every_k': int, number of iterations between updates of nu when using fitting_method='marginal_newton'
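
The two parameterizations described for var_parameterization_sigma_theta carry the same information. The following is a minimal sketch of the conversion, taking the parameter description at face value (sigma as the total variance, theta_logit as the logit of the spatial variance proportion). The function names are hypothetical and not part of splisosm, and the library's internal scaling may differ.

```python
import math


def to_sp_nsp(sigma, theta_logit):
    """Convert (sigma, theta_logit) to (sigma_sp, sigma_nsp).

    Assumes sigma is the total variance and theta_logit is the logit of
    the spatial variance proportion, as in the parameter description.
    """
    theta = 1.0 / (1.0 + math.exp(-theta_logit))  # spatial proportion in (0, 1)
    return theta * sigma, (1.0 - theta) * sigma


def to_sigma_theta(sigma_sp, sigma_nsp):
    """Inverse mapping: (sigma_sp, sigma_nsp) to (sigma, theta_logit)."""
    sigma = sigma_sp + sigma_nsp
    # logit(sigma_sp / sigma) simplifies to log(sigma_sp / sigma_nsp)
    theta_logit = math.log(sigma_sp / sigma_nsp)
    return sigma, theta_logit


# theta_logit = 0 corresponds to half of the total variance being spatial
sigma_sp, sigma_nsp = to_sp_nsp(sigma=2.0, theta_logit=0.0)
print(sigma_sp, sigma_nsp)
```

One practical difference is that theta_logit is unconstrained on the real line, whereas sigma_sp and sigma_nsp must both stay positive.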
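
To get a feel for the default values listed under var_prior_model_params, the sketch below evaluates the two prior log densities on sigma. It assumes beta is a rate for the Gamma prior and a scale for the Inverse Gamma prior. This page does not state which convention splisosm uses, so treat both the convention and the hypothetical function names as assumptions.

```python
import math


def gamma_logpdf(x, alpha, beta):
    """Log density of Gamma(alpha, beta), with beta treated as a rate (assumption)."""
    return (alpha * math.log(beta) - math.lgamma(alpha)
            + (alpha - 1) * math.log(x) - beta * x)


def inv_gamma_logpdf(x, alpha, beta):
    """Log density of Inverse-Gamma(alpha, beta), with beta treated as a scale (assumption)."""
    return (alpha * math.log(beta) - math.lgamma(alpha)
            - (alpha + 1) * math.log(x) - beta / x)


# Default parameters quoted in the parameter description
gamma_params = {'alpha': 2.0, 'beta': 0.3}
inv_gamma_params = {'alpha': 3, 'beta': 0.5}

# Prior modes under the assumed conventions
gamma_mode = (gamma_params['alpha'] - 1) / gamma_params['beta']
inv_gamma_mode = inv_gamma_params['beta'] / (inv_gamma_params['alpha'] + 1)

print(gamma_logpdf(gamma_mode, **gamma_params))
print(inv_gamma_logpdf(inv_gamma_mode, **inv_gamma_params))
```

Under these conventions the Gamma default places its mode near 3.33 and the Inverse Gamma default at 0.125, so the two priors favor quite different values of sigma.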
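
Because fitting_configs is a plain dictionary, a misspelled key is easy to miss. The sketch below builds a configuration from the documented keys and checks it before use. The helper check_fitting_configs is hypothetical (not part of splisosm), and the numeric values are illustrative placeholders rather than recommended settings.

```python
# Keys and optimizer names taken from the fitting_configs description
DOCUMENTED_KEYS = {'lr', 'optim', 'tol', 'max_epochs', 'patience', 'update_nu_every_k'}
DOCUMENTED_OPTIMS = {'adam', 'sgd', 'lbfgs'}


def check_fitting_configs(configs):
    """Raise ValueError for keys or optimizers not listed in the documentation."""
    unknown = set(configs) - DOCUMENTED_KEYS
    if unknown:
        raise ValueError(f"Unknown fitting_configs keys: {sorted(unknown)}")
    if 'optim' in configs and configs['optim'] not in DOCUMENTED_OPTIMS:
        raise ValueError(f"Unknown optimizer: {configs['optim']!r}")
    return configs


fitting_configs = check_fitting_configs({
    'lr': 1e-2,         # illustrative value
    'optim': 'adam',
    'tol': 1e-5,        # illustrative value
    'max_epochs': 500,  # illustrative value
    'patience': 5,      # illustrative value
})

# The checked dictionary would then be passed to the constructor, e.g.
# SplisosmGLMM(model_type='glmm-full', fitting_configs=fitting_configs)
```

The 'update_nu_every_k' key is omitted here because it is documented as applying only to fitting_method='marginal_newton'.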