GP Regression Backends

GP Regression Backends#

SPLISOSM uses spatial Gaussian-process (GP) residualization for conditional differential-usage tests with method="hsic-gp". In routine workflows, choose a backend through gpr_backend and tune it through gpr_configs on splisosm.SplisosmNP.test_differential_usage() or splisosm.SplisosmFFT.test_differential_usage().

This page documents the lower-level operator and residualizer classes for users who need backend-specific options or direct residualization calls. Shapes use n_spots for spatial locations and m for response columns.

Backend selection guide:

sklearn: dense reference backend for small to moderate datasets.
gpytorch: exact or FITC sparse GP with optional GPU support.
fft: regular raster grids, used by splisosm.SplisosmFFT.
nufft / finufft: large irregular 2-D coordinates without a dense kernel.

For NUFFT, lml_approx_rank controls only the approximate log marginal likelihood used during hyperparameter fitting on irregular coordinates. It is not a low-rank spatial test. The NUFFT matvec still uses the selected Fourier grid; larger ranks improve GP hyperparameter accuracy but use more memory.

Kernel operators#

Kernel operators provide matrix-vector products, solves, residualization, and spectral summaries without requiring each backend to expose a dense kernel matrix.

`splisosm.gpr.SpatialKernelOp`	Abstract spatial kernel linear operator.
`splisosm.gpr.DenseKernelOp`	Dense kernel matrix wrapped as a `SpatialKernelOp`.
`splisosm.gpr.FFTKernelOp`	Implicit RBF kernel linear operator on a periodic 2-D grid via FFT.
`splisosm.gpr.NUFFTKernelOp`	Implicit periodic RBF kernel on irregular 2-D coordinates via FINUFFT.

class splisosm.gpr.SpatialKernelOp#

Bases: ABC

Abstract spatial kernel linear operator.

Provides kernel-vector products and linear system solves without requiring the full n x n kernel matrix to be materialised in memory.

Concrete subclasses#

DenseKernelOp: Stores K explicitly as a torch.Tensor; solves via Cholesky. Suitable for n <= ~10,000.
FFTKernelOp (splisosm.hyptest.fft): Operates entirely in the spectral domain; O(N log N) per operation; no n x n matrix formed at any point.

Extending `SplisosmNP` to large datasets#

SplisosmNP currently builds DenseKernelOp instances and passes them to SklearnKernelGPR. To support n > 10,000, replace the operator with an implicit subclass (e.g. LanczosKernelOp) and pair it with an iterative KernelGPR that only calls matvec; the rest of the pipeline is unchanged.

abstractmethod matvec(v)#

Compute K @ v.

Parameters:: v (Tensor, shape (n_obs,) or (n_obs, m)) – Input vector or matrix.
Returns:: Result of K @ v.
Return type:: Tensor, shape (n_obs,) or (n_obs, m)

abstractmethod solve(v, epsilon)#

Solve (K + epsilon * I) u = v and return u.

Parameters:

v (Tensor, shape (n_obs, m)) – Right-hand side vector or matrix.
epsilon (float) – Regularization / noise level (> 0).

Returns:

Solution u.

Return type:

Tensor, shape (n_obs, m)

residuals(v, epsilon)#

Apply the kernel regression residual operator.

Computes epsilon * (K + epsilon * I)**(-1) @ v, i.e. the part of v that is not explained by the GP mean.

Parameters:

v (Tensor, shape (n_obs, m)) – Input vector or matrix.
epsilon (float) – Regularization / noise level (> 0).

Returns:

Residual vector or matrix.

Return type:

Tensor, shape (n_obs, m)

abstractmethod eigenvalues(k=None)#

Return eigenvalues of K in descending order.

Parameters:: k (int or None) – Number of leading eigenvalues to return. None returns all.
Returns:: Eigenvalues in descending order.
Return type:: Tensor, shape (k,) or (n_obs,)

trace()#

Return trace(K) (sum of eigenvalues).

Return type:: float

square_trace()#

Return trace(K**2).

Return type:: float

class splisosm.gpr.DenseKernelOp(K)#

Bases: SpatialKernelOp

Dense kernel matrix wrapped as a SpatialKernelOp.

Stores the full n x n kernel matrix and solves linear systems via Cholesky factorization. Suitable for n <= ~10,000.

Parameters:: K (Tensor, shape (n_obs, n_obs)) – Symmetric positive semi-definite kernel matrix.

matvec(v)#

Compute K @ v.

Parameters:: v (Tensor, shape (n_obs,) or (n_obs, m)) – Input vector or matrix.
Returns:: Result of K @ v.
Return type:: Tensor, shape (n_obs,) or (n_obs, m)

solve(v, epsilon)#

Solve (K + epsilon * I) u = v and return u.

Parameters:

v (Tensor, shape (n_obs, m)) – Right-hand side vector or matrix.
epsilon (float) – Regularization / noise level (> 0).

Returns:

Solution u.

Return type:

Tensor, shape (n_obs, m)

residuals(v, epsilon)#

Apply the kernel regression residual operator.

Computes epsilon * (K + epsilon * I)**(-1) @ v, i.e. the part of v that is not explained by the GP mean.

Parameters:

v (Tensor, shape (n_obs, m)) – Input vector or matrix.
epsilon (float) – Regularization / noise level (> 0).

Returns:

Residual vector or matrix.

Return type:

Tensor, shape (n_obs, m)

eigenvalues(k=None)#

Return eigenvalues of K in descending order.

Parameters:: k (int or None) – Number of leading eigenvalues to return. None returns all.
Returns:: Eigenvalues in descending order.
Return type:: Tensor, shape (k,) or (n_obs,)

class splisosm.gpr.FFTKernelOp(ny, nx, dy, dx, constant_value, length_scale, workers=None)#

Bases: SpatialKernelOp

Implicit RBF kernel linear operator on a periodic 2-D grid via FFT.

Eigenvalues are computed as FFT2(K_row) where K_row[i, j] = sigma^2 * exp(-(di^2 + dj^2) / (2 * l^2)) with periodic (torus) distances — no /n_grid normalization. This matches the convention used by DenseKernelOp and SklearnKernelGPR.

All linear-algebra operations are O(N log N) via FFT; the n x n kernel matrix is never formed.

Parameters:

ny (int) – Grid shape.
nx (int) – Grid shape.
dy (float) – Physical spacing between neighboring grid cells (used to compute torus distances).
dx (float) – Physical spacing between neighboring grid cells (used to compute torus distances).
constant_value (float) – RBF signal variance sigma^2.
length_scale (float) – RBF length scale (in the same units as dy/dx).
workers (int or None) – Number of scipy.fft workers.

matvec(v)#

Compute K @ v in O(N log N).

Parameters:: v (np.ndarray, shape (n_obs,) or (n_obs, m))
Return type:: np.ndarray, same shape as v.

solve(v, epsilon)#

Solve (K + epsilon * I) u = v in O(N log N).

Parameters:

v (np.ndarray, shape (n_obs,) or (n_obs, m))
epsilon (float) – Regularization level (> 0).

Return type:

np.ndarray, same shape as v.

residuals(v, epsilon)#

Apply epsilon * (K + epsilon * I)^{-1} @ v in O(N log N).

Parameters:

v (np.ndarray, shape (n_obs,) or (n_obs, m))
epsilon (float)

Return type:

np.ndarray, same shape as v.

eigenvalues(k=None)#

Return eigenvalues in descending order.

Parameters:: k (int or None) – Number of leading eigenvalues. None returns all.
Return type:: np.ndarray, shape (k,) or (n_obs,).

trace()#

Return trace(K).

Return type:: float

square_trace()#

Return trace(K^2).

Return type:: float

class splisosm.gpr.NUFFTKernelOp(coords, constant_value, length_scale, n_modes=None, max_auto_modes=None, nufft_eps=1e-06, nufft_opts=None, period_margin=0.5, cg_rtol=1e-05, cg_maxiter=None, workers=None)#

Bases: SpatialKernelOp

Implicit periodic RBF kernel on irregular 2-D coordinates via FINUFFT.

The operator approximates a squared-exponential kernel with periodic boundary conditions over the observed coordinate bounding box. Matrix-vector products use two FINUFFT calls:

nonuniform points -> Fourier modes -> nonuniform points.

The dense n_obs x n_obs kernel matrix is never formed. Linear solves use conjugate gradients against the implicit operator.

Parameters:

coords (Tensor or np.ndarray, shape (n_obs, 2)) – Spatial coordinates. Units match length_scale.
constant_value (float) – RBF signal variance sigma².
length_scale (float) – RBF length scale in the same coordinate units.
n_modes (int, tuple[int, int], or None) – Fourier mode grid. None chooses the full effective grid from the observed coordinate density, with aspect ratio matching the padded periodic box. For example, roughly square data with 1M spots and no padding uses about (1000, 1000) modes before FINUFFT’s internal oversampling.
max_auto_modes (int or None) – Optional cap on the total automatically chosen Fourier modes. Ignored when n_modes is explicit. None uses the full effective grid.
nufft_eps (float) – FINUFFT requested precision.
nufft_opts (dict or None) – Extra FINUFFT options, such as {"upsampfac": 2.0}. Thread count is controlled by workers instead of nufft_opts["nthreads"].
period_margin (float) – Fractional padding added to each side of the coordinate bounding box before wrapping onto [-pi, pi).
cg_rtol (float) – Relative tolerance for conjugate-gradient solves.
cg_maxiter (int or None) – Maximum CG iterations.
workers (int or None) – FINUFFT thread count passed as nthreads. None uses one FINUFFT thread, which avoids OpenMP runtime conflicts on common macOS PyTorch/conda stacks.

matvec(v)#

Compute K @ v without materializing K.

Parameters:: v (Tensor | ndarray)
Return type:: ndarray

solve(v, epsilon)#

Solve (K + epsilon * I) u = v with conjugate gradients.

Parameters:

v (Tensor | ndarray)
epsilon (float)

Return type:

ndarray

residuals(v, epsilon)#

Apply epsilon * (K + epsilon * I)^{-1} @ v.

Parameters:

v (Tensor | ndarray)
epsilon (float)

Return type:

ndarray

eigenvalues(k=None)#

Return an FFT-style spectral proxy, sorted descending.

Irregular coordinates do not diagonalize in the Fourier basis, so these values are useful only as a rough low-rank spectrum summary. They should not be interpreted as exact eigenvalues of the sampled kernel matrix.

Parameters:: k (int | None)
Return type:: ndarray

trace()#

Return the exact trace implied by the normalized kernel diagonal.

Return type:: float

square_trace()#

Return a spectral proxy for trace(K**2).

Return type:: float

Residualizer classes#

`splisosm.gpr.KernelGPR`	Abstract GP residualizer.
`splisosm.gpr.SklearnKernelGPR`	Dense sklearn GP residualizer.
`splisosm.gpr.GPyTorchKernelGPR`	GPyTorch GP residualizer.
`splisosm.gpr.FFTKernelGPR`	FFT GP residualizer for regular `(ny, nx)` grids.
`splisosm.gpr.NUFFTKernelGPR`	FINUFFT GP residualizer for irregular 2-D coordinates.

class splisosm.gpr.KernelGPR#

Bases: ABC

Abstract GP residualizer.

Subclasses determine how hyperparameters are optimized (MLE, variational, fixed) and whether the kernel matrix is materialized (dense) or kept implicit.

All subclasses expose the same fit_residuals interface so that the DU-test pipeline in SplisosmNP and SplisosmFFT is backend-agnostic.

abstractmethod fit_residuals(coords, Y)#

Fit a spatial GP and return residuals.

Parameters:

coords (Tensor) – Shape (n_spots, d). Spatial coordinates, pre-normalized by the caller.
Y (Tensor) – Shape (n_spots, m). Target values; may contain NaN rows.

Returns:

residuals – Shape (n_spots, m). Residuals Y - GP_smooth(Y | coords). NaN rows in Y are preserved as NaN.

Return type:

Tensor

fit_residuals_batch(coords, Y_list)#

Fit residuals for many targets sharing the same coordinates.

The default implementation calls fit_residuals() sequentially. Subclasses may override this to amortize work across targets (e.g. SklearnKernelGPR precomputes a shared eigendecomposition when signal bounds are fixed).

Parameters:

coords (Tensor) – Shape (n_spots, d). Spatial coordinates.
Y_list (list of Tensor) – List of target matrices, each shape (n_spots, m_i).

Returns:

List of residual matrices, each shape (n_spots, m_i).

Return type:

list of Tensor

get_kernel_op(coords)#

Build the kernel operator for given coordinates.

Returns a SpatialKernelOp whose hyperparameters have been set (but not yet optimized for a specific target Y). Useful when the caller needs the operator for its own purposes (e.g. computing eigenvalues for the HSIC null distribution).

Not all backends support this method.

Parameters:: coords (Tensor) – Shape (n_spots, d). Spatial coordinates.
Returns:: Kernel operator for the given coordinates.
Return type:: SpatialKernelOp
Raises:: NotImplementedError – If not implemented by the subclass.

class splisosm.gpr.SklearnKernelGPR(constant_value=1.0, constant_value_bounds=(0.001, 1000.0), length_scale=1.0, length_scale_bounds='fixed', n_inducing=5000)#

Bases: KernelGPR

Dense sklearn GP residualizer.

Uses GaussianProcessRegressor with a ConstantKernel x RBF + WhiteKernel to learn spatial hyperparameters and residualize targets.

When both signal bounds are "fixed" (signal_bounds_fixed), the base kernel matrix is the same for every target sharing the same coordinates. Calling precompute_shared_kernel() once before a loop over many targets reduces per-target cost from O(n_spots^3) GP fitting to O(n_spots^2) spectral operations.

Parameters:

constant_value (float) – Initial signal amplitude.
constant_value_bounds (tuple or "fixed") – Search bounds for the signal amplitude. "fixed" disables optimization of this parameter.
length_scale (float) – Initial RBF length scale.
length_scale_bounds (tuple or "fixed") – Search bounds for the length scale.
n_inducing (int or None) – Maximum number of observations used for hyperparameter fitting. When n_spots <= n_inducing (or n_inducing is None), all observations are used for an exact GP fit. When n_spots > n_inducing, a randomly sampled subset-of-data of n_inducing points is used for hyperparameter search, then the fitted kernel is applied to predict on all observations. This is not an inducing-point (FITC/VFE) approximation; the subset is only used to determine hyperparameters. Setting n_inducing=None disables the subset shortcut and always uses the full dataset; a warning is issued when n_spots > 10_000.

Notes

This backend always materializes a dense n_spots x n_spots kernel matrix (up to n_inducing x n_inducing when the subset path is taken). For large data, use NUFFTKernelGPR for irregular 2-D coordinates, FFTKernelGPR for regular raster grids, or GPyTorchKernelGPR with FITC inducing points.

classmethod from_config(config)#

Construct from a configuration dictionary.

Parameters:: config (dict) – Configuration dict with optional keys: constant_value, constant_value_bounds, length_scale, length_scale_bounds, n_inducing.
Returns:: New instance constructed from the configuration.
Return type:: SklearnKernelGPR

precompute_shared_kernel(coords)#

Precompute and cache the eigendecomposition of the fixed-param kernel.

Call this once before a loop over many targets that share the same coordinates and have signal_bounds_fixed == True. Subsequent calls to fit_residuals() will use the cached decomposition and run in O(n_spots^2) rather than O(n_spots^3).

Has no effect (and logs a warning) when signal_bounds_fixed is False.

Parameters:: coords (Tensor) – Shape (n_spots, d). Normalized spatial coordinates.
Return type:: None

fit_residuals(coords, Y)#

Fit GP residuals for a single target matrix.

Chooses the fast spectral path when signal_bounds_fixed and the shared eigendecomposition has been precomputed; otherwise falls back to a full sklearn GP fit when n_spots <= n_inducing, or a subset-of-data hyperparameter fit when n_spots > n_inducing.

NaN rows in Y are excluded from fitting and reinserted as NaN in the output.

Parameters:

coords (Tensor) – Shape (n_spots, d). Normalized spatial coordinates.
Y (Tensor) – Shape (n_spots, m). Target values; may contain NaN rows.

Returns:

Shape (n_spots, m). Spatial residuals. NaN rows from Y are preserved as NaN.

Return type:

Tensor

fit_residuals_batch(coords, Y_list)#

Fit residuals for multiple targets, amortizing kernel setup.

If signal_bounds_fixed and no eigendecomposition has been precomputed yet, this method precomputes it before the loop.

Parameters:

coords (Tensor) – Shape (n_spots, d). Normalized spatial coordinates.
Y_list (list of Tensor) – List of target matrices, each shape (n_spots, m_i).

Returns:

List of residual matrices, each shape (n_spots, m_i).

Return type:

list of Tensor

get_kernel_op(coords)#

Return a DenseKernelOp built from the current (fixed) params.

Parameters:: coords (Tensor) – Shape (n_spots, d). Normalized spatial coordinates.
Returns:: Kernel operator for the given coordinates.
Return type:: DenseKernelOp

class splisosm.gpr.GPyTorchKernelGPR(constant_value=1.0, constant_value_bounds=(0.001, 1000.0), length_scale=1.0, length_scale_bounds='fixed', n_inducing=None, n_iter=100, lr=0.01, device='cpu')#

Bases: KernelGPR

GPyTorch GP residualizer.

Uses GPyTorch’s lazy-tensor infrastructure so that all linear-system solves during training use Lanczos-based conjugate gradients rather than dense Cholesky. Hyperparameters are optimized with Adam on the marginal log-likelihood.

For moderate datasets, the exact-GP path is used. Passing n_inducing > 0 enables the FITC sparse GP approximation via InducingPointKernel.

Parameters:

constant_value (float) – Initial signal amplitude (outputscale).
constant_value_bounds (tuple or "fixed") – Search bounds for the signal amplitude. "fixed" disables optimization of this parameter.
length_scale (float) – Initial RBF length scale.
length_scale_bounds (tuple or "fixed") – Search bounds for the length scale.
n_inducing (int or None) – Number of inducing points for FITC sparse GP approximation. When set, memory scales as O(n_spots * n_inducing) rather than O(n_spots^2). Set to None to use exact GP.
n_iter (int) – Adam optimizer iterations for hyperparameter learning.
lr (float) – Adam learning rate.
device (str) – PyTorch device string, e.g. "cpu" or "cuda".

Notes

Requires gpytorch (optional dependency):

pip install gpytorch

Hyperparameter search uses the mean of all target columns as the training signal, then applies the fitted kernel to every column. This matches the sklearn backend behavior and amortizes GP fitting across output columns.

The noise (epsilon) search is handled by GPyTorch’s GaussianLikelihood, which is always optimized regardless of the signal bounds setting.

fit_residuals(coords, Y)#

Fit GP residuals using GPyTorch’s exact-GP backend.

Hyperparameters are optimized on the mean of all target columns. The fitted kernel and noise level are then applied to every column of Y to compute residuals.

NaN rows are excluded from fitting and reinserted as NaN in output.

Parameters:

coords (Tensor) – Shape (n_spots, d). Normalized spatial coordinates.
Y (Tensor) – Shape (n_spots, m). Target values; may contain NaN rows.

Returns:

Shape (n_spots, m). Spatial residuals epsilon * (K + epsilon * I)**(-1) @ Y. NaN rows from Y are preserved as NaN.

Return type:

Tensor

class splisosm.gpr.FFTKernelGPR(constant_value=1.0, constant_value_bounds=(0.001, 1000.0), length_scale=1.0, length_scale_bounds='fixed', epsilon_bounds=(1e-05, 10.0), workers=None)#

Bases: KernelGPR

FFT GP residualizer for regular (ny, nx) grids.

Accepts the same hyperparameter interface as SklearnKernelGPR. Kernel eigenvalues are computed with FFTKernelOp; matrix-vector operations cost O(n_grid log n_grid) and no dense n_grid x n_grid kernel is formed.

Notes

Hyperparameter optimization strategy

The spectral log-marginal likelihood (LML) for Y with n_grid = ny * nx cells and m response channels is:

-2 LML = m * sum_k log(lambda_k + eps)
         + (1 / n_grid) * sum_k ||Y_hat_k||^2 / (lambda_k + eps)

where lambda_k = sigma^2 * s_k(l) are the kernel eigenvalues with signal variance sigma^2 and unit-spectrum eigenvalues s_k(l) at length scale l; eps is the white-noise variance; and Y_hat = FFT2(Y_cube) (unnormalized 2-D DFT).

Four optimization cases based on which bounds (constant_value_bounds, length_scale_bounds) are "fixed":

Both fixed → 1-D bounded scalar search over log(eps) only. Fastest; unit spectrum cached across repeated calls.
cv free, ls fixed → 2-D L-BFGS-B over (log_sigma2, log_eps). Unit spectrum cached.
cv fixed, ls free → 2-D L-BFGS-B over (log_l, log_eps). A fresh FFTKernelOp is built at each optimizer step.
Both free → 3-D L-BFGS-B over (log_sigma2, log_l, log_eps). A fresh FFTKernelOp is built at each optimizer step.

Parameters:

constant_value (float) – Initial signal variance sigma^2.
constant_value_bounds (tuple or "fixed") – Bounds for sigma^2. "fixed" pins sigma^2.
length_scale (float) – Initial RBF length scale.
length_scale_bounds (tuple or "fixed") – Bounds for the RBF length scale. "fixed" pins it; a 2-tuple (lo, hi) of positive floats enables length-scale optimization.
epsilon_bounds (tuple[float, float]) – Search bounds for the white-noise / regularization level.
workers (int or None) – Number of scipy.fft workers.

fit_residuals(coords, Y)#

Fit GP and return spatial residuals.

Auto-detects the regular grid from coords, rasterizes Y, optimizes hyperparameters via spectral LML, then de-rasterizes.

Parameters:

coords (Tensor) – Shape (n_spots, 2). Spatial coordinates forming a regular grid.
Y (Tensor) – Shape (n_spots, m). Target values; may contain NaN rows.

Returns:

Shape (n_spots, m). Spatial residuals. NaN rows are preserved.

Return type:

Tensor

fit_residuals_batch(coords, Y_list)#

Residualize a list of targets, amortizing grid detection and unit spectrum.

When signal_bounds_fixed is True the unit-spectrum eigenvalues are shared; only eps is re-optimised per target (1-D search). Otherwise each target gets full 2-D optimisation.

Parameters:

coords (Tensor) – Shape (n_spots, 2). Spatial coordinates forming a regular grid.
Y_list (list of Tensor) – Target matrices, each shape (n_spots, m_i).

Returns:

Residual matrices, each shape (n_spots, m_i).

Return type:

list of Tensor

fit_residuals_cube(data_cube, spacing=(1.0, 1.0), epsilon=None)#

Residualize a raster cube in O(N log N).

Kept for backward compatibility with SplisosmFFT.

Parameters:

data_cube (np.ndarray, shape (ny, nx) or (ny, nx, m))
spacing ((dy, dx)) – Physical grid spacing.
epsilon (float or None) – If given, skip optimization and use this value directly.

Returns:

residuals (np.ndarray, same shape as data_cube)
epsilon (float) – Regularization level used.

Return type:

tuple[ndarray, float]

get_kernel_op(coords=None)#

Return the cached unit-spectrum FFTKernelOp.

Raises:: RuntimeError – If called before any residualization.
Parameters:: coords (Tensor | None)
Return type:: FFTKernelOp

class splisosm.gpr.NUFFTKernelGPR(constant_value=1.0, constant_value_bounds=(0.001, 1000.0), length_scale=1.0, length_scale_bounds='fixed', epsilon_bounds=(1e-05, 10.0), n_modes=None, max_auto_modes=None, nufft_eps=1e-06, nufft_opts=None, lml_approx_rank=256, lml_exact_max_n=512, eigsh_tol=0.0001, period_margin=0.5, cg_rtol=1e-05, cg_maxiter=None, workers=None)#

Bases: KernelGPR

FINUFFT GP residualizer for irregular 2-D coordinates.

The backend keeps the RBF kernel implicit: FINUFFT evaluates kernel-vector products and conjugate gradients solve (K + epsilon * I) systems. Regular grids with matching n_modes use the same spectral likelihood as FFTKernelGPR; irregular grids use a leading-eigenpair summary plus tr(K) / tr(K**2) tail correction for marginal likelihood fitting.

Parameters:

constant_value (float) – Initial RBF signal variance.
constant_value_bounds (tuple or "fixed") – Bounds for the signal variance. "fixed" pins it.
length_scale (float) – Initial RBF length scale in coordinate units.
length_scale_bounds (tuple or "fixed") – Bounds for the length scale. "fixed" pins it.
epsilon_bounds (tuple[float, float]) – Search bounds for the white-noise level.
n_modes (int, tuple[int, int], or None) – Fourier mode grid. None chooses the full effective grid from the observed point count and aspect ratio.
max_auto_modes (int or None) – Optional cap on automatic n_modes. Ignored when n_modes is explicit.
nufft_eps (float) – FINUFFT requested precision.
nufft_opts (dict or None) – Extra FINUFFT options, such as {"upsampfac": 2.0}. Use workers for thread count.
lml_approx_rank (int or None) – Leading eigenpair count for irregular-grid marginal likelihoods. Memory is O(n_spots * lml_approx_rank). Ignored on compatible regular grids. None forces exact dense eigendecomposition for small diagnostics.
lml_exact_max_n (int) – Maximum n_spots for exact dense eigendecomposition.
eigsh_tol (float) – Relative tolerance for eigsh in the irregular-grid eigensummary path.
period_margin (float) – Fractional padding around the coordinate bounding box before periodic wrapping.
cg_rtol (float) – Relative tolerance for conjugate-gradient solves.
cg_maxiter (int or None) – Maximum CG iterations.
workers (int or None) – FINUFFT thread count passed as nthreads. None uses one FINUFFT thread.

Notes

Requires finufft (optional dependency):

pip install "splisosm[gp]"

Results should match FFTKernelGPR on compatible regular grids up to numerical tolerance. Compared with SklearnKernelGPR, differences near the bounding-box edges are expected because this backend uses periodic boundary conditions.

fit_residuals(coords, Y)#

Fit GP residuals for one target matrix on irregular 2-D coordinates.

Parameters:

coords (Tensor) – Shape (n_spots, 2). Normalized spatial coordinates.
Y (Tensor) – Shape (n_spots, m). Target values; may contain NaN rows.

Returns:

Shape (n_spots, m). Spatial residuals. NaN rows from Y are preserved as NaN.

Return type:

Tensor

fit_residuals_batch(coords, Y_list)#

Residualize multiple targets while reusing coordinates and mode grid.

Parameters:

coords (Tensor)
Y_list (list[Tensor])

Return type:

list[Tensor]

get_kernel_op(coords)#

Return a NUFFT kernel operator using the current fixed parameters.

Parameters:: coords (Tensor)
Return type:: NUFFTKernelOp

Factory helper#

splisosm.gpr.make_kernel_gpr(backend='sklearn', **kwargs)#

Construct a GP residualizer from a backend name.

Parameters:

backend ({"sklearn", "gpytorch", "fft", "nufft", "finufft"}) – Backend to use.
**kwargs – GP backend configuration. Common keys are constant_value, constant_value_bounds, length_scale, and length_scale_bounds. n_inducing is used by "sklearn" and "gpytorch" only. NUFFT-specific keys include n_modes, max_auto_modes, lml_approx_rank, period_margin, and conjugate-gradient / FINUFFT controls. Backend-irrelevant known keys from _DEFAULT_GPR_CONFIGS are ignored.

Returns:

GPR residualizer instance for the specified backend.

Return type:

KernelGPR

Raises:

ValueError – If backend is not one of the supported options. If an unknown configuration key is supplied.

GP Regression Backends

Contents

GP Regression Backends#

Kernel operators#

Concrete subclasses#

Extending SplisosmNP to large datasets#

Residualizer classes#

Factory helper#

Extending `SplisosmNP` to large datasets#