splisosm.model#

Multinomial GLM/GLMM model implementations for SPLISOSM.

Classes#

MultinomGLM

The Multinomial Generalized Linear Model for spatial isoform expression.

MultinomGLMM

The Multinomial Generalized Linear Mixed Model for spatial isoform expression.

Module Contents#

class splisosm.model.MultinomGLM(fitting_method='iwls', fitting_configs={})#

Bases: BaseModel, torch.nn.Module

The Multinomial Generalized Linear Model for spatial isoform expression.

Compared to MultinomGLMM, this model does not have a random effect term:

Y ~ Multinomial(alpha, Y.sum(1))
eta = multinomial-logit(alpha) = X @ beta + bias_eta

Given isoform counts of a gene Y (n_spots, n_isos) and design matrix X (n_spots, n_factors), MultinomGLM.fit will find the MAP estimates of the following learnable parameters:

  • beta: (n_factors, n_isos - 1) covariate coefficients of the fixed effect term.

  • bias_eta: (n_isos - 1) intercepts of the fixed effect term.

Inference is performed by maximizing the log likelihood using fitting_method (default: 'iwls')

Example

>>> from splisosm.model import MultinomGLM
>>> import torch
>>> # Generate synthetic data
>>> counts = torch.randint(0, 10, (5, 100, 3))  # 5 genes, 100 spots, each 3 isoforms
>>> # Fit the GLM model
>>> model = MultinomGLM(fitting_method='iwls')
>>> model.setup_data(counts, design_mtx=None)
>>> model.fit()
>>> print(model)
>>> # Extract the fitted isoform ratios
>>> isoform_ratios = model.get_isoform_ratio()  # shape (5, 100, 3)
>>> # Fitted parameters
>>> print(model.beta.shape)  # shape (5, 0, 2)
>>> print(model.bias_eta.shape)  # shape (5, 2)
Parameters:
  • fitting_method (Literal['iwls', 'newton', 'gd']) – Method for fitting the model. 'iwls': Iteratively reweighted least squares. 'newton': Newton’s method. 'gd': Gradient descent.

  • fitting_configs (dict) –

    Dictionary of fitting configurations. Keys include

    • 'lr': float, Learning rate for gradient descent or Newton’s method.

    • 'optim': str, Optimizer type, one of 'adam', 'sgd', or 'lbfgs'.

    • 'tol': float, Tolerance for convergence.

    • 'max_epochs': int, Maximum number of epochs for fitting.

    • 'patience': int, Number of epochs to wait for improvement before stopping.

clone()#

Clone a model with the same set of parameters.

Return type:

MultinomGLM

fit(diagnose=False, verbose=False, quiet=False, random_seed=None)#

Fit the model using all data

Parameters:
  • diagnose (bool) – Whether to store parameter changes during training (passed to PatienceLogger).

  • verbose (bool) – Whether to print verbose information during fitting.

  • quiet (bool) – Whether to suppress output during fitting.

  • random_seed (Optional[int]) – Random seed for reproducibility.

Returns:

params_iter – If diagnose is True, returns a dictionary of parameter changes during training. Otherwise returns None.

Return type:

dict or None

forward()#

Calculate log probability given data.

Returns:

log_prob – Shape (n_genes,), the log probability for each gene.

Return type:

Tensor

get_isoform_ratio()#

Extract the fitted isoform ratio across space.

Returns:

ratio – Shape (n_genes, n_spots, n_isos), the fitted isoform ratio across space.

Return type:

Tensor

setup_data(counts, design_mtx=None, device='cpu')#

Set up the data for the model.

Parameters:
  • counts (Tensor) – Shape (n_genes, n_spots, n_isos) or (n_spots, n_isos). For batched calculations, all genes in the batch must have the same number of isoforms.

  • design_mtx (Optional[Tensor]) – Shape (n_spots, n_factors). Design matrix of spatial covariates. If None, an intercept-only design matrix will be used.

  • device (Literal['cpu', 'cuda']) – ‘cpu’ or ‘cuda’. ‘mps’ currently not supported (torch.lgamma not supported on mps).

Return type:

None

update_params_from_dict(params)#

Update a subset of model parameters with a dictionary of parameters.

Parameters:

params (dict) – A dictionary of parameters to be updated. The keys must be existing parameter names in the model.

Return type:

None

fitting_configs: dict#

Dictionary of fitting configurations.

fitting_method: Literal['iwls', 'newton', 'gd']#

Method for fitting the model.

fitting_time: float#

Time taken for fitting the model.

n_factors: int | None#

Number of covariates in the design matrix

n_genes: int#

Number of genes in the batch.

n_isos: int#

Number of isoforms per gene in the batch.

n_spots: int#

Number of samples/spots

class splisosm.model.MultinomGLMM(share_variance=True, var_parameterization_sigma_theta=True, var_fix_sigma=True, var_prior_model='none', var_prior_model_params={}, init_ratio='observed', fitting_method='joint_gd', fitting_configs={})#

Bases: MultinomGLM, BaseModel, torch.nn.Module

The Multinomial Generalized Linear Mixed Model for spatial isoform expression.

The model is defined as follows:

Y ~ Multinomial(alpha, Y.sum(1))
eta = multinomial-logit(alpha) = X @ beta + bias_eta + nu
nu ~ MVN(0, sigma^2 * (theta * V_sp + (1-theta) * I) =
     MVN(0, sigma_sp^2 * V_sp + sigma_nsp^2 * I)

Given isoform counts of a gene Y (n_spots, n_isos), design matrix X (n_spots, n_factors), and spatial covariance matrix V_sp (n_spots, n_spots), the model estimates the isoform usage ratio alpha (n_spots, n_isos) across space. Specifically, MultinomGLMM.fit will find the MAP estimates of the following learnable parameters:

  • beta: (n_factors, n_isos - 1) covariate coefficients of the fixed effect term.

  • bias_eta: (n_isos - 1) intercepts of the fixed effect term.

  • nu: (n_spots, n_isos - 1) the random effect term.

  • variance components: each of length n_isos - 1 (or 1 if share_variance is True). If var_parameterization_sigma_theta is True, they are (sigma, theta_logit), representing total variance and logit of spatial variance proportion theta. Otherwise they are (sigma_sp, sigma_nsp), representing spatial and non-spatial variance components.

Inference algorithms can be categorized into two types based on the optimization objective:

  • Joint: Maximize the joint likelihood (with the random effect nu). This is equivalent to the first-order Laplace approximation of the marginal likelihood.

  • Marginal: Maximize the marginal likelihood (with the random effect nu integrated out). The integral is approximated by a second-order Laplace approximation.

Methods implemented:

  • 'joint_gd': Maximize the joint likelihood using gradient descent.

  • 'joint_newton': Maximize the joint likelihood using Newton’s method.

  • 'marginal_gd': Maximize the marginal likelihood using gradient descent.

  • 'marginal_newton': Maximize the marginal likelihood using Newton’s method. In this method, nu is first updated using Newton’s method every 'update_nu_every_k' iterations, and beta, bias_eta, and variance components are updated using gradient descent.

Notes

It is also possible to implement held-out likelihood for model selection.

Example

>>> from splisosm.model import MultinomGLMM
>>> from splisosm.utils import get_cov_sp
>>> import torch
>>> # Generate synthetic data
>>> counts = torch.randint(0, 10, (5, 100, 3))  # 5 genes, 100 spots, each 3 isoforms
>>> coords = torch.rand(100, 2)  # 100 spots with 2D coordinates
>>> K_sp = get_cov_sp(coords, k=4, rho=0.9) # spatial covariance matrix of shape (100, 100)
>>> # Fit the GLMM model
>>> model = MultinomGLMM(fitting_method='joint_gd')
>>> model.setup_data(counts, corr_sp=K_sp, design_mtx=None)
>>> model.fit()
>>> print(model)
>>> # Extract the fitted isoform ratios
>>> isoform_ratios = model.get_isoform_ratio()  # shape (5, 100, 3)
>>> # Fitted parameters
>>> print(model.beta.shape)  # shape (5, 0, 2)
>>> print(model.bias_eta.shape)  # shape (5, 2)
>>> print(model.nu.shape)  # shape (5, 100, 2)
>>> print(model.sigma.shape)  # shape (5, 1)
>>> print(model.theta_logit.shape)  # shape (5, 1)
Parameters:
  • share_variance (bool) – Whether to use the same variance across isoforms. If True, the variance components will be of length 1. If False, the variance components will be of length n_isos - 1.

  • var_parameterization_sigma_theta (bool) – Whether to parameterize the variance components as (sigma, theta_logit) or (sigma_sp, sigma_nsp). If True, the variance components will be (sigma, theta_logit), where sigma is the total variance and theta_logit is the logit of the spatial variance proportion. If False, the variance components will be (sigma_sp, sigma_nsp), where sigma_sp is the spatial variance and sigma_nsp is the non-spatial variance.

  • var_fix_sigma (bool) – Whether to fix the total variance (sigma) or not. If True, the total variance will be fixed to the initial value, which is the average per-spot variance of isoform counts normalized by its mean expression. See MultinomGLMM._initialize_params for details.

  • var_prior_model (Literal['none', 'gamma', 'inv_gamma']) – The prior model on the total variance sigma. Default is 'none' with no prior. Other options are 'gamma' (Gamma prior) and 'inv_gamma' (Inverse Gamma prior).

  • var_prior_model_params (dict) – The parameters for the prior model on the total variance sigma. For 'gamma', the default parameters are {'alpha': 2.0, 'beta': 0.3}. For 'inv_gamma', the default parameters are {'alpha': 3, 'beta': 0.5}.

  • init_ratio (Literal['observed', 'uniform']) – The initialization method for the logit isoform usage ratio. Options are 'observed' (initialize using observed counts) and 'uniform' (equal isoform usage across space).

  • fitting_method (Literal['joint_gd', 'joint_newton', 'marginal_gd', 'marginal_newton']) – The fitting method to use. Options are 'joint_gd' (joint likelihood with gradient descent), 'joint_newton' (joint likelihood with Newton’s method), 'marginal_gd' (marginal likelihood with gradient descent), and 'marginal_newton' (marginal likelihood with Newton’s method).

  • fitting_configs (dict) –

    A dictionary of fitting configurations with the following keys:

    • 'lr': float, learning rate

    • 'optim': str, optimization method (one of 'adam', 'sgd', or 'lbfgs')

    • 'tol': float, tolerance for convergence

    • 'max_epochs': int, maximum number of epochs

    • 'patience': int, number of epochs to wait for improvement before stopping

    • 'update_nu_every_k': int, number of iterations to update nu when using fitting_method='marginal_newton'

clone()#

Clone a Multinomial GLMM model with the same set of parameters.

Return type:

MultinomGLMM

fit(diagnose=False, verbose=False, quiet=False, random_seed=None)#

Fit the model using all data

Parameters:
  • diagnose (bool) – Whether to store parameter changes during training (passed to PatienceLogger).

  • verbose (bool) – Whether to print verbose information during fitting.

  • quiet (bool) – Whether to suppress output during fitting.

  • random_seed (Optional[int]) – Random seed for reproducibility.

Returns:

params_iter – If diagnose is True, returns a dictionary of parameter changes during training. Otherwise returns None.

Return type:

dict or None

forward()#

Calculate the log-likelihood or log-marginal-likelihood of the model.

Returns:

log_prob – Shape (n_genes,), the log probability for each gene.

Return type:

Tensor

setup_data(counts, corr_sp=None, design_mtx=None, device='cpu', corr_sp_eigvals=None, corr_sp_eigvecs=None)#

Set up the data for the model.

Parameters:
  • counts (Tensor) – Shape (n_genes, n_spots, n_isoforms) or (n_spots, n_isoforms). For batched calculations, all genes in the batch must have the same number of isoforms.

  • corr_sp (Optional[Tensor]) – Shape (n_spots, n_spots), spatial covariance matrix. If None, the eigendecomposition of the spatial covariance matrix must be provided.

  • design_mtx (Optional[Tensor]) – Shape (n_spots, n_factors). Design matrix of spatial covariates. If None, an intercept-only design matrix will be used.

  • device (Literal['cpu', 'cuda']) – ‘cpu’ or ‘cuda’. ‘mps’ currently not supported (torch.lgamma not supported on mps).

  • corr_sp_eigvals (Optional[Tensor]) – Shape (n_spots,), eigenvalues of spatial covariance. If None, the spatial covariance matrix corr_sp must be provided.

  • corr_sp_eigvecs (Optional[Tensor]) – Shape (n_spots, n_spots), eigenvectors of spatial covariance. If None, the spatial covariance matrix corr_sp must be provided.

Return type:

None

var_sp_prop()#

Output the proporptions of the spatial variance.

Returns:

var_sp_prop – The proportion theta of spatial variance of shape (n_genes, n_var_components).

Return type:

Tensor

var_total()#

Output the total variance.

Returns:

var_total – The total variance sigma of shape (n_genes, n_var_components).

Return type:

Tensor

fitting_configs: dict#

A dictionary of fitting configurations.

fitting_method: Literal['joint_gd', 'joint_newton', 'marginal_gd', 'marginal_newton']#

The fitting method to use.

fitting_time: float#

The time taken to fit the model.

init_ratio: Literal['observed', 'uniform']#

The initialization method for the logit isoform usage ratio gamma.

share_variance: bool#

Whether to use the same variance across isoforms.

var_fix_sigma: bool#

Whether to fix the total variance (sigma) or not.

var_parameterization_sigma_theta: bool#

Whether variance components are parameterized as (sigma, theta_logit) or (sigma_sp, sigma_nsp).

var_prior_model: Literal['none', 'gamma', 'inv_gamma']#

The prior model on the total variance sigma.

var_prior_model_params: dict#

The parameters for the prior model on the total variance sigma.