splisosm.kernel_gpr#
Kernel Gaussian Process Regression backends for SPLISOSM spatial residualization.
This module provides an abstract interface and concrete implementations for kernel-based conditional independence testing (via spatial residualization) used by the SPLISOSM differential-usage tests.
GPR-residualizer hierarchy#
KernelGPR (abstract):
SklearnKernelGPR- sklearn backend; dense; suitable for n <= ~10,000.
GPyTorchKernelGPR- GPyTorch backend (optional dep); lazy tensors.
FFTKernelGPR- FFT-based kernel; suitable for regular grids.
Classes#
FFT-based GPR residualizer for regular (ny, nx) grids using an RBF kernel. |
|
GPyTorch-based GP residualizer. |
|
Abstract GPR residualizer: learns a spatial kernel and returns residuals. |
|
sklearn-based GPR residualizer. |
Functions#
|
Fit a Gaussian process regression and optionally return residuals. |
|
The linear HSIC test (multivariate RV coefficient). |
|
Construct a KernelGPR from a backend name. |
Module Contents#
- class splisosm.kernel_gpr.FFTKernelGPR(constant_value=1.0, constant_value_bounds=(0.001, 1000.0), length_scale=1.0, length_scale_bounds='fixed', epsilon_bounds=(1e-05, 10.0), workers=None)#
Bases:
KernelGPRFFT-based GPR residualizer for regular (ny, nx) grids using an RBF kernel.
Accepts the same hyperparameter interface as
SklearnKernelGPR. Kernel eigenvalues are computed viaFFTKernelOp; all linear-algebra is O(N log N) with no n x n matrix formed.Notes
Hyperparameter optimization strategy
The spectral log-marginal-likelihood (LML) for Y (n = nx * ny observations, m channels) is:
-2 LML = m * sum_k log(lambda_k + eps) + (1/n) * sum_k ||Y_hat_k||^2 / (lambda_k + eps)
where
lambda_k = sigma^2 * s_k(l)are the kernel eigenvalues withsigmabeing the constant value (signal variance) ands_k(l)the unit-spectrum eigenvalues at length scalel, andY_hat = FFT2(Y_cube)(unnormalized 2-D DFT).Four optimization cases based on which bounds (constant_value_bounds, length_scale_bounds) are
"fixed":Both fixed → 1-D bounded scalar search over
log(eps)only. Fastest; unit spectrum cached across repeated calls.cv free, ls fixed → 2-D L-BFGS-B over
(log_sigma2, log_eps). Unit spectrum cached.cv fixed, ls free → 2-D L-BFGS-B over
(log_l, log_eps). A freshFFTKernelOpis built at each optimizer step.Both free → 3-D L-BFGS-B over
(log_sigma2, log_l, log_eps). A freshFFTKernelOpis built at each optimizer step.
- Parameters:
constant_value (float) – Initial signal variance sigma^2.
constant_value_bounds (tuple or "fixed") – Bounds for sigma^2.
"fixed"pins sigma^2.length_scale (float) – Initial RBF length scale.
length_scale_bounds (tuple or "fixed") – Bounds for the RBF length scale.
"fixed"pins it; a 2-tuple(lo, hi)of positive floats enables length-scale optimization.epsilon_bounds (tuple[float, float]) – Search bounds for the white-noise / regularization level.
workers (int or None) – Number of
scipy.fftworkers.
- fit_residuals(coords, Y)#
Fit GP and return spatial residuals.
Auto-detects the regular grid from
coords, rasterizesY, optimizes hyperparameters via spectral LML, then de-rasterizes.
- fit_residuals_batch(coords, Y_list)#
Residualize a list of targets, amortizing grid detection and unit spectrum.
When
signal_bounds_fixedis True the unit-spectrum eigenvalues are shared; only eps is re-optimised per target (1-D search). Otherwise each target gets full 2-D optimisation.
- fit_residuals_cube(data_cube, spacing=(1.0, 1.0), epsilon=None)#
Residualize a raster cube in O(N log N).
Kept for backward compatibility with
SplisosmFFT.- Parameters:
- Returns:
residuals (np.ndarray, same shape as data_cube)
epsilon (float) – Regularization level used.
- Return type:
tuple[ndarray, float]
- get_kernel_op(coords=None)#
Return the cached unit-spectrum FFTKernelOp.
- Raises:
RuntimeError – If called before any residualization.
- Parameters:
coords (Optional[Tensor])
- Return type:
FFTKernelOp
- property n: int | None#
Total grid cells (ny * nx).
- Return type:
Optional[int]
- property nx: int | None#
Grid width (set after first residualization).
- Return type:
Optional[int]
- property ny: int | None#
Grid height (set after first residualization).
- Return type:
Optional[int]
- property signal_bounds_fixed: bool#
True when BOTH
constant_value_boundsandlength_scale_boundsare"fixed".- Return type:
bool
- class splisosm.kernel_gpr.GPyTorchKernelGPR(constant_value=1.0, constant_value_bounds=(0.001, 1000.0), length_scale=1.0, length_scale_bounds='fixed', n_inducing=None, n_iter=100, lr=0.01, device='cpu')#
Bases:
KernelGPRGPyTorch-based GP residualizer.
Uses GPyTorch’s lazy-tensor infrastructure so that all linear-system solves during training use Lanczos-based conjugate gradients rather than dense Cholesky. Hyperparameters are optimized with Adam on the marginal log-likelihood.
For most datasets (n <= ~10,000) the exact-GP path is used. Passing
n_inducing > 0enables the FITC sparse GP approximation via InducingPointKernel.- Parameters:
constant_value (float) – Initial signal amplitude (outputscale).
constant_value_bounds (tuple or "fixed") – Search bounds for the signal amplitude.
"fixed"disables optimization of this parameter.length_scale (float) – Initial RBF length scale.
length_scale_bounds (tuple or "fixed") – Search bounds for the length scale.
n_inducing (int or None) – Number of inducing points for FITC sparse GP approximation. When set, memory scales as O(n x n_inducing) rather than O(n²). Set to None to use exact GP.
n_iter (int) – Adam optimizer iterations for hyperparameter learning.
lr (float) – Adam learning rate.
device (str) – PyTorch device string, e.g.
"cpu"or"cuda".
Notes
Requires
gpytorch(optional dependency):pip install gpytorch
Hyperparameter search uses the mean of all target columns as the training signal, then applies the fitted kernel to every column. This matches the sklearn backend behavior and amortizes O(n^3) GP fitting across outputs.
The noise (epsilon) search is handled by GPyTorch’s
GaussianLikelihood, which is always optimized regardless of the signal bounds setting.- fit_residuals(coords, Y)#
Fit GP residuals using GPyTorch’s exact-GP backend.
Hyperparameters are optimized on the mean of all target columns. The fitted kernel and noise level are then applied to every column of
Yto compute residuals.NaN rows are excluded from fitting and reinserted as NaN in output.
- device = 'cpu'#
- lr = 0.01#
- n_inducing = None#
- n_iter = 100#
- property signal_bounds_fixed: bool#
True when both constant_value and length_scale bounds are
"fixed".- Return type:
bool
- class splisosm.kernel_gpr.KernelGPR#
Bases:
abc.ABCAbstract GPR residualizer: learns a spatial kernel and returns residuals.
Subclasses determine how hyperparameters are optimized (MLE, variational, fixed) and whether the kernel matrix is materialized (dense) or kept implicit.
All subclasses expose the same
fit_residualsinterface so that the DU-test pipeline inSplisosmNPandSplisosmFFTis backend-agnostic.- abstractmethod fit_residuals(coords, Y)#
Fit GP to (coords, Y) and return spatial residuals.
- Parameters:
- Returns:
residuals – Residuals Y - GP_smooth(Y | coords). NaN rows in Y are preserved as NaN.
- Return type:
Tensor, shape (n_obs, m)
- fit_residuals_batch(coords, Y_list)#
Fit residuals for many targets sharing the same coordinates.
The default implementation calls
fit_residuals()sequentially. Subclasses may override this to amortize work across targets (e.g.SklearnKernelGPRprecomputes a shared eigendecomposition when signal bounds are fixed).
- abstractmethod get_kernel_op(coords)#
Build the kernel operator for given coordinates.
Returns a
SpatialKernelOpwhose hyperparameters have been set (but not yet optimized for a specific target Y). Useful when the caller needs the operator for its own purposes (e.g. computing eigenvalues for the HSIC null distribution).Not all backends support this method.
- Parameters:
coords (Tensor, shape (n_obs, d)) – Spatial coordinates.
- Returns:
Kernel operator for the given coordinates.
- Return type:
SpatialKernelOp
- Raises:
NotImplementedError – If not implemented by the subclass.
- class splisosm.kernel_gpr.SklearnKernelGPR(constant_value=1.0, constant_value_bounds=(0.001, 1000.0), length_scale=1.0, length_scale_bounds='fixed', n_inducing=5000)#
Bases:
KernelGPRsklearn-based GPR residualizer.
Uses
GaussianProcessRegressorwith aConstantKernel x RBF + WhiteKernelto learn spatial hyperparameters and residualize targets.When both signal bounds are
"fixed"(signal_bounds_fixed), the base kernel matrix is the same for every target sharing the same coordinates. Callingprecompute_shared_kernel()once before a loop over many targets reduces per-target cost from O(n^3) GP fitting to O(n^2) spectral operations (see_kernel_residuals_from_eigdecomp()).- Parameters:
constant_value (float) – Initial signal amplitude.
constant_value_bounds (tuple or
"fixed") – Search bounds for the signal amplitude."fixed"disables optimization of this parameter.length_scale (float) – Initial RBF length scale.
length_scale_bounds (tuple or
"fixed") – Search bounds for the length scale.n_inducing (int or None) – Maximum number of observations used for hyperparameter fitting. When
n_obs <= n_inducing(orn_inducingisNone), all observations are used for an exact GP fit. Whenn_obs > n_inducing, a randomly sampled subset-of-data ofn_inducingpoints is used for hyperparameter search, then the fitted kernel is applied to predict on all observations. This is not an inducing-point (FITC/VFE) approximation — the subset is only used to determine hyperparameters. Settingn_inducing=Nonedisables the subset shortcut and always uses the full dataset; a warning is issued whenn_obs > 10_000.
Notes
This backend always materialises a dense n x n kernel matrix (up to
n_inducingxn_inducingwhen the subset path is taken). For very large n, considerGPyTorchKernelGPRwith FITC inducing points orFFTKernelGPRfor regular raster grids.- fit_residuals(coords, Y)#
Fit GP residuals for a single target matrix.
Chooses the fast spectral path when
signal_bounds_fixedand the shared eigendecomposition has been precomputed; otherwise falls back to a full sklearn GP fit when n_obs <= n_inducing, or an inducing-point approximation when n_obs > n_inducing.NaN rows in
Yare excluded from fitting and reinserted as NaN in the output.
- fit_residuals_batch(coords, Y_list)#
Fit residuals for multiple targets, amortizing kernel setup.
If
signal_bounds_fixedand no eigendecomposition has been precomputed yet, this method precomputes it before the loop.
- classmethod from_config(config)#
Construct from a configuration dictionary.
- Parameters:
config (dict) – Configuration dict with optional keys:
constant_value,constant_value_bounds,length_scale,length_scale_bounds,n_inducing.- Returns:
New instance constructed from the configuration.
- Return type:
- get_kernel_op(coords)#
Return a
DenseKernelOpbuilt from the current (fixed) params.- Parameters:
coords (Tensor, shape (n_obs, d)) – Normalized spatial coordinates.
- Returns:
Kernel operator for the given coordinates.
- Return type:
DenseKernelOp
Precompute and cache the eigendecomposition of the fixed-param kernel.
Call this once before a loop over many targets that share the same coordinates and have
signal_bounds_fixed == True. Subsequent calls tofit_residuals()will use the cached decomposition and run in O(n^2) rather than O(n^3).Has no effect (and logs a warning) when
signal_bounds_fixedis False.- Parameters:
coords (Tensor, shape (n_obs, d)) – Normalized spatial coordinates, shape (n_obs, d).
- Return type:
None
- n_inducing = 5000#
- property signal_bounds_fixed: bool#
True when both constant_value and length_scale bounds are fixed.
- Return type:
bool
- splisosm.kernel_gpr.fit_kernel_gpr(X, Y, normalize_x=True, normalize_y=True, return_residuals=True, constant_value=1.0, constant_value_bounds=(0.001, 1000.0), length_scale=1.0, length_scale_bounds=(0.01, 100.0))#
Fit a Gaussian process regression and optionally return residuals.
Deprecated since version 1.0.4: Use
SklearnKernelGPRdirectly. This wrapper is kept for backward compatibility and will be removed in a future release.- Parameters:
X (Tensor, shape (n_obs, d)) – Input spatial coordinates.
Y (Tensor, shape (n_obs, m)) – Target values.
normalize_x (bool) – Whether to z-score normalize X before fitting.
normalize_y (bool) – Whether to z-score normalize Y before fitting.
return_residuals (bool) – If True, return residuals; if False, return
(Kxy, epsilon).constant_value (float) – Initial signal amplitude.
constant_value_bounds (tuple or
"fixed") – Search bounds for the signal amplitude.length_scale (float) – Initial RBF length scale.
length_scale_bounds (tuple or
"fixed") – Search bounds for the length scale.
- Returns:
If
return_residualsis False:(Kxy, epsilon)kernel matrix and noise level. Ifreturn_residualsis True: residual tensor of shape (n_obs, m).- Return type:
- splisosm.kernel_gpr.linear_hsic_test(X, Y, centering=True)#
The linear HSIC test (multivariate RV coefficient).
Equivalent to a multivariate extension of Pearson correlation.
Supports sparse inputs for memory and speed efficiency:
Sparse X (scipy sparse matrix, shape
(n, p)): the cross-productY_c.T @ X_cis computed via a sparse matrix multiply (X.T.dot(Y_c)). BecauseYis mean-centred, theXcentering correction reduces to zero, so only the original sparseXis needed.X_c.T @ X_cis obtained asX.T @ X - n * mean_X ⊗ mean_X, keeping the first term sparse.Sparse Y (scipy sparse or torch sparse COO): densified once upfront before any computation.
- Parameters:
- Returns:
hsic (float) – HSIC test statistic (scaled by 1 / (n - 1)**2).
pvalue (float) – P-value from the asymptotic chi-squared mixture distribution.
- Return type:
tuple[float, float]
- splisosm.kernel_gpr.make_kernel_gpr(backend='sklearn', **kwargs)#
Construct a KernelGPR from a backend name.
- Parameters:
backend ({"sklearn", "gpytorch", "fft"}) – Backend to use.
"fft"accepts the same kwargs as"sklearn"(constant_value,constant_value_bounds,length_scale,length_scale_bounds) plusepsilon_boundsandworkers.n_inducingis supported by both"sklearn"and"gpytorch"with the same name (see_DEFAULT_GPR_CONFIGS).**kwargs – Passed to the backend constructor.
- Returns:
GPR residualizer instance for the specified backend.
- Return type:
- Raises:
ValueError – If backend is not one of the supported options.