GP Regression Backends#
SPLISOSM uses spatial Gaussian-process (GP) residualization for conditional
differential-usage tests with method="hsic-gp". In routine workflows,
choose a backend through gpr_backend and tune it through gpr_configs on
splisosm.SplisosmNP.test_differential_usage() or
splisosm.SplisosmFFT.test_differential_usage().
This page documents the lower-level operator and residualizer classes for users
who need backend-specific options or direct residualization calls. Shapes use
n_spots for spatial locations and m for response columns.
Backend selection guide:
sklearn: dense reference backend for small to moderate datasets.gpytorch: exact or FITC sparse GP with optional GPU support.fft: regular raster grids, used bysplisosm.SplisosmFFT.nufft/finufft: large irregular 2-D coordinates without a dense kernel.
For NUFFT, lml_approx_rank controls only the approximate log marginal
likelihood used during hyperparameter fitting on irregular coordinates. It is
not a low-rank spatial test. The NUFFT matvec still uses the selected Fourier
grid; larger ranks improve GP hyperparameter accuracy but use more memory.
Kernel operators#
Kernel operators provide matrix-vector products, solves, residualization, and spectral summaries without requiring each backend to expose a dense kernel matrix.
Abstract spatial kernel linear operator. |
|
Dense kernel matrix wrapped as a |
|
Implicit RBF kernel linear operator on a periodic 2-D grid via FFT. |
|
Implicit periodic RBF kernel on irregular 2-D coordinates via FINUFFT. |
- class splisosm.gpr.SpatialKernelOp#
Bases:
ABCAbstract spatial kernel linear operator.
Provides kernel-vector products and linear system solves without requiring the full n x n kernel matrix to be materialised in memory.
Concrete subclasses#
DenseKernelOpStores K explicitly as a
torch.Tensor; solves via Cholesky. Suitable for n <= ~10,000.FFTKernelOp(splisosm.hyptest.fft)Operates entirely in the spectral domain; O(N log N) per operation; no n x n matrix formed at any point.
Extending
SplisosmNPto large datasets#SplisosmNPcurrently buildsDenseKernelOpinstances and passes them toSklearnKernelGPR. To support n > 10,000, replace the operator with an implicit subclass (e.g.LanczosKernelOp) and pair it with an iterativeKernelGPRthat only callsmatvec; the rest of the pipeline is unchanged.- abstractmethod matvec(v)#
Compute
K @ v.
- abstractmethod solve(v, epsilon)#
Solve
(K + epsilon * I) u = vand returnu.
- residuals(v, epsilon)#
Apply the kernel regression residual operator.
Computes
epsilon * (K + epsilon * I)**(-1) @ v, i.e. the part ofvthat is not explained by the GP mean.
- abstractmethod eigenvalues(k=None)#
Return eigenvalues of K in descending order.
- Parameters:
k (int or None) – Number of leading eigenvalues to return. None returns all.
- Returns:
Eigenvalues in descending order.
- Return type:
Tensor, shape (k,) or (n_obs,)
- trace()#
Return
trace(K)(sum of eigenvalues).- Return type:
float
- square_trace()#
Return
trace(K**2).- Return type:
float
- class splisosm.gpr.DenseKernelOp(K)#
Bases:
SpatialKernelOpDense kernel matrix wrapped as a
SpatialKernelOp.Stores the full n x n kernel matrix and solves linear systems via Cholesky factorization. Suitable for n <= ~10,000.
- Parameters:
K (Tensor, shape (n_obs, n_obs)) – Symmetric positive semi-definite kernel matrix.
- matvec(v)#
Compute
K @ v.
- solve(v, epsilon)#
Solve
(K + epsilon * I) u = vand returnu.
- residuals(v, epsilon)#
Apply the kernel regression residual operator.
Computes
epsilon * (K + epsilon * I)**(-1) @ v, i.e. the part ofvthat is not explained by the GP mean.
- class splisosm.gpr.FFTKernelOp(ny, nx, dy, dx, constant_value, length_scale, workers=None)#
Bases:
SpatialKernelOpImplicit RBF kernel linear operator on a periodic 2-D grid via FFT.
Eigenvalues are computed as
FFT2(K_row)whereK_row[i, j] = sigma^2 * exp(-(di^2 + dj^2) / (2 * l^2))with periodic (torus) distances — no/n_gridnormalization. This matches the convention used byDenseKernelOpandSklearnKernelGPR.All linear-algebra operations are O(N log N) via FFT; the n x n kernel matrix is never formed.
- Parameters:
ny (int) – Grid shape.
nx (int) – Grid shape.
dy (float) – Physical spacing between neighboring grid cells (used to compute torus distances).
dx (float) – Physical spacing between neighboring grid cells (used to compute torus distances).
constant_value (float) – RBF signal variance sigma^2.
length_scale (float) – RBF length scale (in the same units as dy/dx).
workers (int or None) – Number of
scipy.fftworkers.
- matvec(v)#
Compute
K @ vin O(N log N).- Parameters:
v (np.ndarray, shape (n_obs,) or (n_obs, m))
- Return type:
np.ndarray, same shape as v.
- solve(v, epsilon)#
Solve
(K + epsilon * I) u = vin O(N log N).- Parameters:
v (np.ndarray, shape (n_obs,) or (n_obs, m))
epsilon (float) – Regularization level (> 0).
- Return type:
np.ndarray, same shape as v.
- residuals(v, epsilon)#
Apply
epsilon * (K + epsilon * I)^{-1} @ vin O(N log N).- Parameters:
v (np.ndarray, shape (n_obs,) or (n_obs, m))
epsilon (float)
- Return type:
np.ndarray, same shape as v.
- eigenvalues(k=None)#
Return eigenvalues in descending order.
- Parameters:
k (int or None) – Number of leading eigenvalues.
Nonereturns all.- Return type:
np.ndarray, shape (k,) or (n_obs,).
- trace()#
Return
trace(K).- Return type:
float
- square_trace()#
Return
trace(K^2).- Return type:
float
- class splisosm.gpr.NUFFTKernelOp(coords, constant_value, length_scale, n_modes=None, max_auto_modes=None, nufft_eps=1e-06, nufft_opts=None, period_margin=0.5, cg_rtol=1e-05, cg_maxiter=None, workers=None)#
Bases:
SpatialKernelOpImplicit periodic RBF kernel on irregular 2-D coordinates via FINUFFT.
The operator approximates a squared-exponential kernel with periodic boundary conditions over the observed coordinate bounding box. Matrix-vector products use two FINUFFT calls:
nonuniform points -> Fourier modes -> nonuniform points.The dense
n_obs x n_obskernel matrix is never formed. Linear solves use conjugate gradients against the implicit operator.- Parameters:
coords (Tensor or np.ndarray, shape (n_obs, 2)) – Spatial coordinates. Units match
length_scale.constant_value (float) – RBF signal variance sigma².
length_scale (float) – RBF length scale in the same coordinate units.
n_modes (int, tuple[int, int], or None) – Fourier mode grid.
Nonechooses the full effective grid from the observed coordinate density, with aspect ratio matching the padded periodic box. For example, roughly square data with 1M spots and no padding uses about(1000, 1000)modes before FINUFFT’s internal oversampling.max_auto_modes (int or None) – Optional cap on the total automatically chosen Fourier modes. Ignored when
n_modesis explicit.Noneuses the full effective grid.nufft_eps (float) – FINUFFT requested precision.
nufft_opts (dict or None) – Extra FINUFFT options, such as
{"upsampfac": 2.0}. Thread count is controlled byworkersinstead ofnufft_opts["nthreads"].period_margin (float) – Fractional padding added to each side of the coordinate bounding box before wrapping onto
[-pi, pi).cg_rtol (float) – Relative tolerance for conjugate-gradient solves.
cg_maxiter (int or None) – Maximum CG iterations.
workers (int or None) – FINUFFT thread count passed as
nthreads.Noneuses one FINUFFT thread, which avoids OpenMP runtime conflicts on common macOS PyTorch/conda stacks.
- matvec(v)#
Compute
K @ vwithout materializingK.
- solve(v, epsilon)#
Solve
(K + epsilon * I) u = vwith conjugate gradients.
- residuals(v, epsilon)#
Apply
epsilon * (K + epsilon * I)^{-1} @ v.
- eigenvalues(k=None)#
Return an FFT-style spectral proxy, sorted descending.
Irregular coordinates do not diagonalize in the Fourier basis, so these values are useful only as a rough low-rank spectrum summary. They should not be interpreted as exact eigenvalues of the sampled kernel matrix.
- Parameters:
k (int | None)
- Return type:
- trace()#
Return the exact trace implied by the normalized kernel diagonal.
- Return type:
float
- square_trace()#
Return a spectral proxy for
trace(K**2).- Return type:
float
Residualizer classes#
Abstract GP residualizer. |
|
Dense sklearn GP residualizer. |
|
GPyTorch GP residualizer. |
|
FFT GP residualizer for regular |
|
FINUFFT GP residualizer for irregular 2-D coordinates. |
- class splisosm.gpr.KernelGPR#
Bases:
ABCAbstract GP residualizer.
Subclasses determine how hyperparameters are optimized (MLE, variational, fixed) and whether the kernel matrix is materialized (dense) or kept implicit.
All subclasses expose the same
fit_residualsinterface so that the DU-test pipeline inSplisosmNPandSplisosmFFTis backend-agnostic.- abstractmethod fit_residuals(coords, Y)#
Fit a spatial GP and return residuals.
- Parameters:
- Returns:
residuals – Shape
(n_spots, m). ResidualsY - GP_smooth(Y | coords). NaN rows inYare preserved as NaN.- Return type:
- fit_residuals_batch(coords, Y_list)#
Fit residuals for many targets sharing the same coordinates.
The default implementation calls
fit_residuals()sequentially. Subclasses may override this to amortize work across targets (e.g.SklearnKernelGPRprecomputes a shared eigendecomposition when signal bounds are fixed).
- get_kernel_op(coords)#
Build the kernel operator for given coordinates.
Returns a
SpatialKernelOpwhose hyperparameters have been set (but not yet optimized for a specific target Y). Useful when the caller needs the operator for its own purposes (e.g. computing eigenvalues for the HSIC null distribution).Not all backends support this method.
- Parameters:
coords (Tensor) – Shape
(n_spots, d). Spatial coordinates.- Returns:
Kernel operator for the given coordinates.
- Return type:
- Raises:
NotImplementedError – If not implemented by the subclass.
- class splisosm.gpr.SklearnKernelGPR(constant_value=1.0, constant_value_bounds=(0.001, 1000.0), length_scale=1.0, length_scale_bounds='fixed', n_inducing=5000)#
Bases:
KernelGPRDense sklearn GP residualizer.
Uses
GaussianProcessRegressorwith aConstantKernel x RBF + WhiteKernelto learn spatial hyperparameters and residualize targets.When both signal bounds are
"fixed"(signal_bounds_fixed), the base kernel matrix is the same for every target sharing the same coordinates. Callingprecompute_shared_kernel()once before a loop over many targets reduces per-target cost fromO(n_spots^3)GP fitting toO(n_spots^2)spectral operations.- Parameters:
constant_value (float) – Initial signal amplitude.
constant_value_bounds (tuple or
"fixed") – Search bounds for the signal amplitude."fixed"disables optimization of this parameter.length_scale (float) – Initial RBF length scale.
length_scale_bounds (tuple or
"fixed") – Search bounds for the length scale.n_inducing (int or None) – Maximum number of observations used for hyperparameter fitting. When
n_spots <= n_inducing(orn_inducingisNone), all observations are used for an exact GP fit. Whenn_spots > n_inducing, a randomly sampled subset-of-data ofn_inducingpoints is used for hyperparameter search, then the fitted kernel is applied to predict on all observations. This is not an inducing-point (FITC/VFE) approximation; the subset is only used to determine hyperparameters. Settingn_inducing=Nonedisables the subset shortcut and always uses the full dataset; a warning is issued whenn_spots > 10_000.
Notes
This backend always materializes a dense
n_spots x n_spotskernel matrix (up ton_inducingxn_inducingwhen the subset path is taken). For large data, useNUFFTKernelGPRfor irregular 2-D coordinates,FFTKernelGPRfor regular raster grids, orGPyTorchKernelGPRwith FITC inducing points.- classmethod from_config(config)#
Construct from a configuration dictionary.
- Parameters:
config (dict) – Configuration dict with optional keys:
constant_value,constant_value_bounds,length_scale,length_scale_bounds,n_inducing.- Returns:
New instance constructed from the configuration.
- Return type:
Precompute and cache the eigendecomposition of the fixed-param kernel.
Call this once before a loop over many targets that share the same coordinates and have
signal_bounds_fixed == True. Subsequent calls tofit_residuals()will use the cached decomposition and run inO(n_spots^2)rather thanO(n_spots^3).Has no effect (and logs a warning) when
signal_bounds_fixedis False.- Parameters:
coords (Tensor) – Shape
(n_spots, d). Normalized spatial coordinates.- Return type:
None
- fit_residuals(coords, Y)#
Fit GP residuals for a single target matrix.
Chooses the fast spectral path when
signal_bounds_fixedand the shared eigendecomposition has been precomputed; otherwise falls back to a full sklearn GP fit whenn_spots <= n_inducing, or a subset-of-data hyperparameter fit whenn_spots > n_inducing.NaN rows in
Yare excluded from fitting and reinserted as NaN in the output.
- fit_residuals_batch(coords, Y_list)#
Fit residuals for multiple targets, amortizing kernel setup.
If
signal_bounds_fixedand no eigendecomposition has been precomputed yet, this method precomputes it before the loop.
- class splisosm.gpr.GPyTorchKernelGPR(constant_value=1.0, constant_value_bounds=(0.001, 1000.0), length_scale=1.0, length_scale_bounds='fixed', n_inducing=None, n_iter=100, lr=0.01, device='cpu')#
Bases:
KernelGPRGPyTorch GP residualizer.
Uses GPyTorch’s lazy-tensor infrastructure so that all linear-system solves during training use Lanczos-based conjugate gradients rather than dense Cholesky. Hyperparameters are optimized with Adam on the marginal log-likelihood.
For moderate datasets, the exact-GP path is used. Passing
n_inducing > 0enables the FITC sparse GP approximation via InducingPointKernel.- Parameters:
constant_value (float) – Initial signal amplitude (outputscale).
constant_value_bounds (tuple or "fixed") – Search bounds for the signal amplitude.
"fixed"disables optimization of this parameter.length_scale (float) – Initial RBF length scale.
length_scale_bounds (tuple or "fixed") – Search bounds for the length scale.
n_inducing (int or None) – Number of inducing points for FITC sparse GP approximation. When set, memory scales as
O(n_spots * n_inducing)rather thanO(n_spots^2). Set toNoneto use exact GP.n_iter (int) – Adam optimizer iterations for hyperparameter learning.
lr (float) – Adam learning rate.
device (str) – PyTorch device string, e.g.
"cpu"or"cuda".
Notes
Requires
gpytorch(optional dependency):pip install gpytorch
Hyperparameter search uses the mean of all target columns as the training signal, then applies the fitted kernel to every column. This matches the sklearn backend behavior and amortizes GP fitting across output columns.
The noise (epsilon) search is handled by GPyTorch’s
GaussianLikelihood, which is always optimized regardless of the signal bounds setting.- fit_residuals(coords, Y)#
Fit GP residuals using GPyTorch’s exact-GP backend.
Hyperparameters are optimized on the mean of all target columns. The fitted kernel and noise level are then applied to every column of
Yto compute residuals.NaN rows are excluded from fitting and reinserted as NaN in output.
- class splisosm.gpr.FFTKernelGPR(constant_value=1.0, constant_value_bounds=(0.001, 1000.0), length_scale=1.0, length_scale_bounds='fixed', epsilon_bounds=(1e-05, 10.0), workers=None)#
Bases:
KernelGPRFFT GP residualizer for regular
(ny, nx)grids.Accepts the same hyperparameter interface as
SklearnKernelGPR. Kernel eigenvalues are computed withFFTKernelOp; matrix-vector operations costO(n_grid log n_grid)and no densen_grid x n_gridkernel is formed.Notes
Hyperparameter optimization strategy
The spectral log-marginal likelihood (LML) for
Ywithn_grid = ny * nxcells andmresponse channels is:-2 LML = m * sum_k log(lambda_k + eps) + (1 / n_grid) * sum_k ||Y_hat_k||^2 / (lambda_k + eps)
where
lambda_k = sigma^2 * s_k(l)are the kernel eigenvalues with signal variancesigma^2and unit-spectrum eigenvaluess_k(l)at length scalel;epsis the white-noise variance; andY_hat = FFT2(Y_cube)(unnormalized 2-D DFT).Four optimization cases based on which bounds (constant_value_bounds, length_scale_bounds) are
"fixed":Both fixed → 1-D bounded scalar search over
log(eps)only. Fastest; unit spectrum cached across repeated calls.cv free, ls fixed → 2-D L-BFGS-B over
(log_sigma2, log_eps). Unit spectrum cached.cv fixed, ls free → 2-D L-BFGS-B over
(log_l, log_eps). A freshFFTKernelOpis built at each optimizer step.Both free → 3-D L-BFGS-B over
(log_sigma2, log_l, log_eps). A freshFFTKernelOpis built at each optimizer step.
- Parameters:
constant_value (float) – Initial signal variance
sigma^2.constant_value_bounds (tuple or "fixed") – Bounds for
sigma^2."fixed"pinssigma^2.length_scale (float) – Initial RBF length scale.
length_scale_bounds (tuple or "fixed") – Bounds for the RBF length scale.
"fixed"pins it; a 2-tuple(lo, hi)of positive floats enables length-scale optimization.epsilon_bounds (tuple[float, float]) – Search bounds for the white-noise / regularization level.
workers (int or None) – Number of
scipy.fftworkers.
- fit_residuals(coords, Y)#
Fit GP and return spatial residuals.
Auto-detects the regular grid from
coords, rasterizesY, optimizes hyperparameters via spectral LML, then de-rasterizes.
- fit_residuals_batch(coords, Y_list)#
Residualize a list of targets, amortizing grid detection and unit spectrum.
When
signal_bounds_fixedis True the unit-spectrum eigenvalues are shared; only eps is re-optimised per target (1-D search). Otherwise each target gets full 2-D optimisation.
- fit_residuals_cube(data_cube, spacing=(1.0, 1.0), epsilon=None)#
Residualize a raster cube in O(N log N).
Kept for backward compatibility with
SplisosmFFT.- Parameters:
data_cube (np.ndarray, shape (ny, nx) or (ny, nx, m))
spacing ((dy, dx)) – Physical grid spacing.
epsilon (float or None) – If given, skip optimization and use this value directly.
- Returns:
residuals (np.ndarray, same shape as data_cube)
epsilon (float) – Regularization level used.
- Return type:
tuple[ndarray, float]
- class splisosm.gpr.NUFFTKernelGPR(constant_value=1.0, constant_value_bounds=(0.001, 1000.0), length_scale=1.0, length_scale_bounds='fixed', epsilon_bounds=(1e-05, 10.0), n_modes=None, max_auto_modes=None, nufft_eps=1e-06, nufft_opts=None, lml_approx_rank=256, lml_exact_max_n=512, eigsh_tol=0.0001, period_margin=0.5, cg_rtol=1e-05, cg_maxiter=None, workers=None)#
Bases:
KernelGPRFINUFFT GP residualizer for irregular 2-D coordinates.
The backend keeps the RBF kernel implicit: FINUFFT evaluates kernel-vector products and conjugate gradients solve
(K + epsilon * I)systems. Regular grids with matchingn_modesuse the same spectral likelihood asFFTKernelGPR; irregular grids use a leading-eigenpair summary plustr(K)/tr(K**2)tail correction for marginal likelihood fitting.- Parameters:
constant_value (float) – Initial RBF signal variance.
constant_value_bounds (tuple or "fixed") – Bounds for the signal variance.
"fixed"pins it.length_scale (float) – Initial RBF length scale in coordinate units.
length_scale_bounds (tuple or "fixed") – Bounds for the length scale.
"fixed"pins it.epsilon_bounds (tuple[float, float]) – Search bounds for the white-noise level.
n_modes (int, tuple[int, int], or None) – Fourier mode grid.
Nonechooses the full effective grid from the observed point count and aspect ratio.max_auto_modes (int or None) – Optional cap on automatic
n_modes. Ignored whenn_modesis explicit.nufft_eps (float) – FINUFFT requested precision.
nufft_opts (dict or None) – Extra FINUFFT options, such as
{"upsampfac": 2.0}. Useworkersfor thread count.lml_approx_rank (int or None) – Leading eigenpair count for irregular-grid marginal likelihoods. Memory is
O(n_spots * lml_approx_rank). Ignored on compatible regular grids.Noneforces exact dense eigendecomposition for small diagnostics.lml_exact_max_n (int) – Maximum
n_spotsfor exact dense eigendecomposition.eigsh_tol (float) – Relative tolerance for
eigshin the irregular-grid eigensummary path.period_margin (float) – Fractional padding around the coordinate bounding box before periodic wrapping.
cg_rtol (float) – Relative tolerance for conjugate-gradient solves.
cg_maxiter (int or None) – Maximum CG iterations.
workers (int or None) – FINUFFT thread count passed as
nthreads.Noneuses one FINUFFT thread.
Notes
Requires
finufft(optional dependency):pip install "splisosm[gp]"
Results should match
FFTKernelGPRon compatible regular grids up to numerical tolerance. Compared withSklearnKernelGPR, differences near the bounding-box edges are expected because this backend uses periodic boundary conditions.- fit_residuals(coords, Y)#
Fit GP residuals for one target matrix on irregular 2-D coordinates.
- fit_residuals_batch(coords, Y_list)#
Residualize multiple targets while reusing coordinates and mode grid.
Factory helper#
- splisosm.gpr.make_kernel_gpr(backend='sklearn', **kwargs)#
Construct a GP residualizer from a backend name.
- Parameters:
backend ({"sklearn", "gpytorch", "fft", "nufft", "finufft"}) – Backend to use.
**kwargs – GP backend configuration. Common keys are
constant_value,constant_value_bounds,length_scale, andlength_scale_bounds.n_inducingis used by"sklearn"and"gpytorch"only. NUFFT-specific keys includen_modes,max_auto_modes,lml_approx_rank,period_margin, and conjugate-gradient / FINUFFT controls. Backend-irrelevant known keys from_DEFAULT_GPR_CONFIGSare ignored.
- Returns:
GPR residualizer instance for the specified backend.
- Return type:
- Raises:
ValueError – If backend is not one of the supported options. If an unknown configuration key is supplied.