splisosm.kernel_gpr

splisosm.kernel_gpr#

Kernel Gaussian Process Regression backends for SPLISOSM spatial residualization.

This module provides an abstract interface and concrete implementations for kernel-based conditional independence testing (via spatial residualization) used by the SPLISOSM differential-usage tests.

Kernel-operator hierarchy#

SpatialKernelOp (abstract):

DenseKernelOp - stores K explicitly; Cholesky solve; O(n^2) memory.

FFTKernelOp - operates in the spectral domain; O(N log N); no matrix.

GPR-residualizer hierarchy#

KernelGPR (abstract):

SklearnKernelGPR - sklearn backend; dense; suitable for n <= ~10,000.

GPyTorchKernelGPR - GPyTorch backend (optional dep); lazy tensors.

FFTKernelGPR - FFT-based kernel; suitable for regular grids.

Classes#

`DenseKernelOp`	Dense kernel matrix wrapped as a `SpatialKernelOp`.
`FFTKernelGPR`	FFT-based GPR residualizer for regular (ny, nx) grids using an RBF kernel.
`FFTKernelOp`	Implicit RBF kernel linear operator on a periodic 2-D grid via FFT.
`GPyTorchKernelGPR`	GPyTorch-based GP residualizer.
`KernelGPR`	Abstract GPR residualizer: learns a spatial kernel and returns residuals.
`SklearnKernelGPR`	sklearn-based GPR residualizer.
`SpatialKernelOp`	Abstract spatial kernel linear operator.

Functions#

`fit_kernel_gpr`(X, Y[, normalize_x, normalize_y, ...])	Fit a Gaussian process regression and optionally return residuals.
`linear_hsic_test`(X, Y[, centering])	The linear HSIC test (multivariate RV coefficient).
`make_kernel_gpr`([backend])	Construct a KernelGPR from a backend name.

Module Contents#

class splisosm.kernel_gpr.DenseKernelOp(K)#

Bases: SpatialKernelOp

Dense kernel matrix wrapped as a SpatialKernelOp.

Stores the full n x n kernel matrix and solves linear systems via Cholesky factorization. Suitable for n <= ~10,000.

Parameters:: K (Tensor, shape (n_obs, n_obs)) – Symmetric positive semi-definite kernel matrix.

eigenvalues(k=None)#

Return eigenvalues of K in descending order.

Parameters:: k (int or None) – Number of leading eigenvalues to return. None returns all.
Returns:: Eigenvalues in descending order.
Return type:: Tensor, shape (k,) or (n_obs,)

eigh()#

Return (eigenvalues, eigenvectors) in ascending order.

Useful for spectral operations such as the fast epsilon search in _kernel_residuals_from_eigdecomp().

Returns:

eigenvalues (torch.Tensor, shape (n_obs,)) – Eigenvalues in ascending order.
eigenvectors (torch.Tensor, shape (n_obs, n_obs)) – Corresponding orthonormal eigenvectors.

Return type:

tuple[Tensor, Tensor]

matvec(v)#

Compute K @ v.

Parameters:: v (Tensor, shape (n_obs,) or (n_obs, m)) – Input vector or matrix.
Returns:: Result of K @ v.
Return type:: Tensor, shape (n_obs,) or (n_obs, m)

solve(v, epsilon)#

Solve (K + epsilon * I) u = v and return u.

Parameters:

v (Tensor, shape (n_obs, m)) – Right-hand side vector or matrix.
epsilon (float) – Regularization / noise level (> 0).

Returns:

Solution u.

Return type:

Tensor, shape (n_obs, m)

property n: int#

Number of data points.

Return type:: int

class splisosm.kernel_gpr.FFTKernelGPR(constant_value=1.0, constant_value_bounds=(0.001, 1000.0), length_scale=1.0, length_scale_bounds='fixed', epsilon_bounds=(1e-05, 10.0), workers=None)#

Bases: KernelGPR

FFT-based GPR residualizer for regular (ny, nx) grids using an RBF kernel.

Accepts the same hyperparameter interface as SklearnKernelGPR. Kernel eigenvalues are computed via FFTKernelOp; all linear-algebra is O(N log N) with no n x n matrix formed.

Hyperparameter optimization strategy#

The spectral log-marginal-likelihood (LML) for Y (n observations, m channels) is:

-2 LML = m * sum_k log(lam_k + eps)
         + (1/n) * sum_k ||Y_hat_k||^2 / (lam_k + eps)

where lam_k = sigma^2 * s_k(l) are the kernel eigenvalues, s_k(l) the unit-spectrum eigenvalues at length scale l, and Y_hat = FFT2(Y_cube) (unnormalized 2-D DFT).

Four optimization cases based on which bounds are "fixed":

Both fixed → 1-D bounded scalar search over log(eps) only. Fastest; unit spectrum cached across repeated calls.
cv free, ls fixed → 2-D L-BFGS-B over (log_sigma2, log_eps). Unit spectrum cached.
cv fixed, ls free → 2-D L-BFGS-B over (log_l, log_eps). A fresh FFTKernelOp is built at each optimizer step.
Both free → 3-D L-BFGS-B over (log_sigma2, log_l, log_eps). A fresh FFTKernelOp is built at each optimizer step.

param constant_value:: Initial signal variance sigma^2.
type constant_value:: float
param constant_value_bounds:: Bounds for sigma^2. "fixed" pins sigma^2.
type constant_value_bounds:: tuple or “fixed”
param length_scale:: Initial RBF length scale.
type length_scale:: float
param length_scale_bounds:: Bounds for the RBF length scale. "fixed" pins it; a 2-tuple (lo, hi) of positive floats enables length-scale optimization.
type length_scale_bounds:: tuple or “fixed”
param epsilon_bounds:: Search bounds for the white-noise / regularization level.
type epsilon_bounds:: tuple[float, float]
param workers:: Number of scipy.fft workers.
type workers:: int or None

fit_residuals(coords, Y)#

Fit GP and return spatial residuals.

Auto-detects the regular grid from coords, rasterizes Y, optimizes hyperparameters via spectral LML, then de-rasterizes.

Parameters:

coords (Tensor, shape (n_obs, 2)) – Spatial coordinates forming a regular grid.
Y (Tensor, shape (n_obs, m)) – Target values (may contain NaN rows).

Returns:

Spatial residuals. NaN rows are preserved.

Return type:

Tensor, shape (n_obs, m)

fit_residuals_batch(coords, Y_list)#

Residualize a list of targets, amortizing grid detection and unit spectrum.

When signal_bounds_fixed is True the unit-spectrum eigenvalues are shared; only eps is re-optimised per target (1-D search). Otherwise each target gets full 2-D optimisation.

Parameters:

coords (Tensor, shape (n_obs, 2))
Y_list (list of Tensor, each (n_obs, m_i))

Return type:

list of Tensor, each (n_obs, m_i)

fit_residuals_cube(data_cube, spacing=(1.0, 1.0), epsilon=None)#

Residualize a raster cube in O(N log N).

Kept for backward compatibility with SplisosmFFT.

Parameters:

data_cube (np.ndarray, shape (ny, nx) or (ny, nx, m))
spacing ((dy, dx)) – Physical grid spacing.
epsilon (float or None) – If given, skip optimization and use this value directly.

Returns:

residuals (np.ndarray, same shape as data_cube)
epsilon (float) – Regularization level used.

Return type:

tuple[ndarray, float]

get_kernel_op(coords=None)#

Return the cached unit-spectrum FFTKernelOp.

Raises:: RuntimeError – If called before any residualization.
Parameters:: coords (Optional[Tensor])
Return type:: FFTKernelOp

property n: int | None#

Total grid cells (ny * nx).

Return type:: Optional[int]

property nx: int | None#

Grid width (set after first residualization).

Return type:: Optional[int]

property ny: int | None#

Grid height (set after first residualization).

Return type:: Optional[int]

property signal_bounds_fixed: bool#

True when BOTH constant_value_bounds and length_scale_bounds are "fixed".

Return type:: bool

Parameters:

constant_value (float)
constant_value_bounds (Union[tuple, str])
length_scale (float)
length_scale_bounds (Union[tuple, str])
epsilon_bounds (tuple)
workers (Optional[int])

class splisosm.kernel_gpr.FFTKernelOp(ny, nx, dy, dx, constant_value, length_scale, workers=None)#

Bases: SpatialKernelOp

Implicit RBF kernel linear operator on a periodic 2-D grid via FFT.

Eigenvalues are computed as FFT2(K_row) where K_row[i, j] = sigma^2 * exp(-(di^2 + dj^2) / (2 * l^2)) with periodic (torus) distances — no /n_grid normalization. This matches the convention used by DenseKernelOp and SklearnKernelGPR.

All linear-algebra operations are O(N log N) via FFT; the n x n kernel matrix is never formed.

Parameters:

ny (int) – Grid shape.
nx (int) – Grid shape.
dy (float) – Physical spacing between neighboring grid cells (used to compute torus distances).
dx (float) – Physical spacing between neighboring grid cells (used to compute torus distances).
constant_value (float) – RBF signal variance sigma^2.
length_scale (float) – RBF length scale (in the same units as dy/dx).
workers (int or None) – Number of scipy.fft workers.

eigenvalues(k=None)#

Return eigenvalues in descending order.

Parameters:: k (int or None) – Number of leading eigenvalues. None returns all.
Return type:: np.ndarray, shape (k,) or (n_obs,).

matvec(v)#

Compute K @ v in O(N log N).

Parameters:: v (np.ndarray, shape (n_obs,) or (n_obs, m))
Return type:: np.ndarray, same shape as v.

residuals(v, epsilon)#

Apply epsilon * (K + epsilon * I)^{-1} @ v in O(N log N).

Parameters:

v (np.ndarray, shape (n_obs,) or (n_obs, m))
epsilon (float)

Return type:

np.ndarray, same shape as v.

solve(v, epsilon)#

Solve (K + epsilon * I) u = v in O(N log N).

Parameters:

v (np.ndarray, shape (n_obs,) or (n_obs, m))
epsilon (float) – Regularization level (> 0).

Return type:

np.ndarray, same shape as v.

square_trace()#

Return trace(K^2).

Return type:: float

trace()#

Return trace(K).

Return type:: float

property eigenvalues_2d: ndarray#

2-D eigenvalue array, shape (ny, nx). Computed lazily.

Return type:: ndarray

property n: int#

Total grid cells ny * nx.

Return type:: int

property nx: int#

Grid width.

Return type:: int

property ny: int#

Grid height.

Return type:: int

class splisosm.kernel_gpr.GPyTorchKernelGPR(constant_value=1.0, constant_value_bounds=(0.001, 1000.0), length_scale=1.0, length_scale_bounds='fixed', n_inducing=None, n_iter=100, lr=0.01, device='cpu')#

Bases: KernelGPR

GPyTorch-based GP residualizer.

Uses GPyTorch’s lazy-tensor infrastructure so that all linear-system solves during training use Lanczos-based conjugate gradients rather than dense Cholesky. Hyperparameters are optimized with Adam on the marginal log-likelihood.

For most datasets (n <= ~10,000) the exact-GP path is used. Passing n_inducing > 0 enables the FITC sparse GP approximation via InducingPointKernel.

Parameters:

constant_value (float) – Initial signal amplitude (outputscale).
constant_value_bounds (tuple or "fixed") – Search bounds for the signal amplitude. "fixed" disables optimization of this parameter.
length_scale (float) – Initial RBF length scale.
length_scale_bounds (tuple or "fixed") – Search bounds for the length scale.
n_inducing (int or None) – Number of inducing points for FITC sparse GP approximation. When set, memory scales as O(n x n_inducing) rather than O(n²). Set to None to use exact GP.
n_iter (int) – Adam optimizer iterations for hyperparameter learning.
lr (float) – Adam learning rate.
device (str) – PyTorch device string, e.g. "cpu" or "cuda".

Notes

Requires gpytorch (optional dependency):

pip install gpytorch

Hyperparameter search uses the mean of all target columns as the training signal, then applies the fitted kernel to every column. This matches the sklearn backend behavior and amortizes O(n^3) GP fitting across outputs.

The noise (epsilon) search is handled by GPyTorch’s GaussianLikelihood, which is always optimized regardless of the signal bounds setting.

fit_residuals(coords, Y)#

Fit GP residuals using GPyTorch’s exact-GP backend.

Hyperparameters are optimized on the mean of all target columns. The fitted kernel and noise level are then applied to every column of Y to compute residuals.

NaN rows are excluded from fitting and reinserted as NaN in output.

Parameters:

coords (Tensor, shape (n_obs, d)) – Normalized spatial coordinates.
Y (Tensor, shape (n_obs, m)) – Target values; may contain NaN rows.

Returns:

Spatial residuals epsilon * (K + epsilon * I)**(-1) @ Y. NaN rows from Y are preserved as NaN.

Return type:

Tensor, shape (n_obs, m)

device = 'cpu'#

lr = 0.01#

n_inducing = None#

n_iter = 100#

property signal_bounds_fixed: bool#

True when both constant_value and length_scale bounds are "fixed".

Return type:: bool

class splisosm.kernel_gpr.KernelGPR#

Bases: abc.ABC

Abstract GPR residualizer: learns a spatial kernel and returns residuals.

Subclasses determine how hyperparameters are optimized (MLE, variational, fixed) and whether the kernel matrix is materialized (dense) or kept implicit.

All subclasses expose the same fit_residuals interface so that the DU-test pipeline in SplisosmNP and SplisosmFFT is backend-agnostic.

abstractmethod fit_residuals(coords, Y)#

Fit GP to (coords, Y) and return spatial residuals.

Parameters:

coords (Tensor, shape (n_obs, d)) – Spatial coordinates (should be pre-normalized by the caller).
Y (Tensor, shape (n_obs, m)) – Target values (may contain NaN rows, which are handled internally).

Returns:

residuals – Residuals Y - GP_smooth(Y | coords). NaN rows in Y are preserved as NaN.

Return type:

Tensor, shape (n_obs, m)

fit_residuals_batch(coords, Y_list)#

Fit residuals for many targets sharing the same coordinates.

The default implementation calls fit_residuals() sequentially. Subclasses may override this to amortize work across targets (e.g. SklearnKernelGPR precomputes a shared eigendecomposition when signal bounds are fixed).

Parameters:

coords (Tensor, shape (n_obs, d)) – Spatial coordinates.
Y_list (list of Tensor) – List of target matrices, each shape (n_obs, m_i).

Returns:

List of residual matrices, each shape (n_obs, m_i).

Return type:

list of Tensor

abstractmethod get_kernel_op(coords)#

Build the kernel operator for given coordinates.

Returns a SpatialKernelOp whose hyperparameters have been set (but not yet optimized for a specific target Y). Useful when the caller needs the operator for its own purposes (e.g. computing eigenvalues for the HSIC null distribution).

Not all backends support this method.

Parameters:: coords (Tensor, shape (n_obs, d)) – Spatial coordinates.
Returns:: Kernel operator for the given coordinates.
Return type:: SpatialKernelOp
Raises:: NotImplementedError – If not implemented by the subclass.

class splisosm.kernel_gpr.SklearnKernelGPR(constant_value=1.0, constant_value_bounds=(0.001, 1000.0), length_scale=1.0, length_scale_bounds='fixed', n_inducing=5000)#

Bases: KernelGPR

sklearn-based GPR residualizer.

Uses GaussianProcessRegressor with a ConstantKernel x RBF + WhiteKernel to learn spatial hyperparameters and residualize targets.

When both signal bounds are "fixed" (signal_bounds_fixed), the base kernel matrix is the same for every target sharing the same coordinates. Calling precompute_shared_kernel() once before a loop over many targets reduces per-target cost from O(n^3) GP fitting to O(n^2) spectral operations (see _kernel_residuals_from_eigdecomp()).

Parameters:

constant_value (float) – Initial signal amplitude.
constant_value_bounds (tuple or "fixed") – Search bounds for the signal amplitude. "fixed" disables optimization of this parameter.
length_scale (float) – Initial RBF length scale.
length_scale_bounds (tuple or "fixed") – Search bounds for the length scale.
n_inducing (int) – Number of inducing points. When n_obs <= n_inducing the full exact GP is used. When n_obs > n_inducing, a subset of n_inducing randomly sampled points is used as the inducing set for an approximate GP (subset-of-data / Nyström-style approximation).

Notes

This backend always materialises a dense n x n kernel matrix up to n_inducing points. For very large n consider GPyTorchKernelGPR with inducing points.

fit_residuals(coords, Y)#

Fit GP residuals for a single target matrix.

Chooses the fast spectral path when signal_bounds_fixed and the shared eigendecomposition has been precomputed; otherwise falls back to a full sklearn GP fit when n_obs <= n_inducing, or an inducing-point approximation when n_obs > n_inducing.

NaN rows in Y are excluded from fitting and reinserted as NaN in the output.

Parameters:

coords (Tensor, shape (n_obs, d)) – Normalized spatial coordinates.
Y (Tensor, shape (n_obs, m)) – Target values; may contain NaN rows.

Returns:

Spatial residuals. NaN rows from Y are preserved as NaN.

Return type:

Tensor, shape (n_obs, m)

fit_residuals_batch(coords, Y_list)#

Fit residuals for multiple targets, amortizing kernel setup.

If signal_bounds_fixed and no eigendecomposition has been precomputed yet, this method precomputes it before the loop.

Parameters:

coords (Tensor, shape (n_obs, d)) – Normalized spatial coordinates.
Y_list (list of Tensor) – List of target matrices, each shape (n_obs, m_i).

Returns:

List of residual matrices, each shape (n_obs, m_i).

Return type:

list of Tensor

classmethod from_config(config)#

Construct from a configuration dictionary.

Parameters:: config (dict) – Configuration dict with optional keys: constant_value, constant_value_bounds, length_scale, length_scale_bounds, n_inducing.
Returns:: New instance constructed from the configuration.
Return type:: SklearnKernelGPR

get_kernel_op(coords)#

Return a DenseKernelOp built from the current (fixed) params.

Parameters:: coords (Tensor, shape (n_obs, d)) – Normalized spatial coordinates.
Returns:: Kernel operator for the given coordinates.
Return type:: DenseKernelOp

precompute_shared_kernel(coords)#

Precompute and cache the eigendecomposition of the fixed-param kernel.

Call this once before a loop over many targets that share the same coordinates and have signal_bounds_fixed == True. Subsequent calls to fit_residuals() will use the cached decomposition and run in O(n^2) rather than O(n^3).

Has no effect (and logs a warning) when signal_bounds_fixed is False.

Parameters:: coords (Tensor, shape (n_obs, d)) – Normalized spatial coordinates, shape (n_obs, d).
Return type:: None

n_inducing = 5000#

property signal_bounds_fixed: bool#

True when both constant_value and length_scale bounds are fixed.

Return type:: bool

class splisosm.kernel_gpr.SpatialKernelOp#

Bases: abc.ABC

Abstract spatial kernel linear operator.

Provides kernel-vector products and linear system solves without requiring the full n x n kernel matrix to be materialised in memory.

Concrete subclasses#

DenseKernelOp: Stores K explicitly as a torch.Tensor; solves via Cholesky. Suitable for n <= ~10,000.
FFTKernelOp (splisosm.hyptest_fft): Operates entirely in the spectral domain; O(N log N) per operation; no n x n matrix formed at any point.

Extending `SplisosmNP` to large datasets#

SplisosmNP currently builds DenseKernelOp instances and passes them to SklearnKernelGPR. To support n > 10,000, replace the operator with an implicit subclass (e.g. LanczosKernelOp) and pair it with an iterative KernelGPR that only calls matvec; the rest of the pipeline is unchanged.

abstractmethod eigenvalues(k=None)#

Return eigenvalues of K in descending order.

Parameters:: k (int or None) – Number of leading eigenvalues to return. None returns all.
Returns:: Eigenvalues in descending order.
Return type:: Tensor, shape (k,) or (n_obs,)

abstractmethod matvec(v)#

Compute K @ v.

Parameters:: v (Tensor, shape (n_obs,) or (n_obs, m)) – Input vector or matrix.
Returns:: Result of K @ v.
Return type:: Tensor, shape (n_obs,) or (n_obs, m)

residuals(v, epsilon)#

Apply the kernel regression residual operator.

Computes epsilon * (K + epsilon * I)**(-1) @ v, i.e. the part of v that is not explained by the GP mean.

Parameters:

v (Tensor, shape (n_obs, m)) – Input vector or matrix.
epsilon (float) – Regularization / noise level (> 0).

Returns:

Residual vector or matrix.

Return type:

Tensor, shape (n_obs, m)

abstractmethod solve(v, epsilon)#

Solve (K + epsilon * I) u = v and return u.

Parameters:

v (Tensor, shape (n_obs, m)) – Right-hand side vector or matrix.
epsilon (float) – Regularization / noise level (> 0).

Returns:

Solution u.

Return type:

Tensor, shape (n_obs, m)

square_trace()#

Return trace(K**2).

Return type:: float

trace()#

Return trace(K) (sum of eigenvalues).

Return type:: float

property n: int#

Abstractmethod:
Return type:: int

Number of data points.

splisosm.kernel_gpr.fit_kernel_gpr(X, Y, normalize_x=True, normalize_y=True, return_residuals=True, constant_value=1.0, constant_value_bounds=(0.001, 1000.0), length_scale=1.0, length_scale_bounds=(0.01, 100.0))#

Fit a Gaussian process regression and optionally return residuals.

Deprecated since version Use: SklearnKernelGPR directly. This wrapper is kept for backward compatibility and will be removed in a future release.

Parameters:

X (Tensor, shape (n_obs, d)) – Input spatial coordinates.
Y (Tensor, shape (n_obs, m)) – Target values.
normalize_x (bool) – Whether to z-score normalize X before fitting.
normalize_y (bool) – Whether to z-score normalize Y before fitting.
return_residuals (bool) – If True, return residuals; if False, return (Kxy, epsilon).
constant_value (float) – Initial signal amplitude.
constant_value_bounds (tuple or "fixed") – Search bounds for the signal amplitude.
length_scale (float) – Initial RBF length scale.
length_scale_bounds (tuple or "fixed") – Search bounds for the length scale.

Returns:

If return_residuals is False: (Kxy, epsilon) kernel matrix and noise level. If return_residuals is True: residual tensor of shape (n_obs, m).

Return type:

tuple[Tensor, float] or Tensor

splisosm.kernel_gpr.linear_hsic_test(X, Y, centering=True)#

The linear HSIC test (multivariate RV coefficient).

Equivalent to a multivariate extension of Pearson correlation.

Parameters:

X (Tensor, shape (n_samples, n_x))
Y (Tensor, shape (n_samples, n_y))
centering (bool) – Whether to mean-centre X and Y.

Returns:

hsic (float) – HSIC test statistic (scaled by 1 / (n - 1)**2).
pvalue (float) – P-value from the asymptotic chi-squared mixture distribution.

Return type:

tuple[float, float]

splisosm.kernel_gpr.make_kernel_gpr(backend='sklearn', **kwargs)#

Construct a KernelGPR from a backend name.

Parameters:

backend ({"sklearn", "gpytorch", "fft"}) – Backend to use. "fft" accepts the same kwargs as "sklearn" (constant_value, constant_value_bounds, length_scale, length_scale_bounds) plus epsilon_bounds and workers. n_inducing is supported by both "sklearn" and "gpytorch" with the same name (see _DEFAULT_GPR_CONFIGS).
**kwargs – Passed to the backend constructor.

Returns:

GPR residualizer instance for the specified backend.

Return type:

KernelGPR

Raises:

ValueError – If backend is not one of the supported options.

splisosm.kernel_gpr

Contents

splisosm.kernel_gpr#

Kernel-operator hierarchy#

GPR-residualizer hierarchy#

Classes#

Functions#

Module Contents#

Hyperparameter optimization strategy#

Concrete subclasses#

Extending SplisosmNP to large datasets#

Extending `SplisosmNP` to large datasets#