splisosm.kernel_gpr
===================

.. py:module:: splisosm.kernel_gpr

.. autoapi-nested-parse::

   Kernel Gaussian Process Regression backends for SPLISOSM spatial residualization.

   This module provides an abstract interface and concrete implementations for
   kernel-based conditional independence testing (via spatial residualization)
   used by the SPLISOSM differential-usage tests.

   Kernel-operator hierarchy
   --------------------------
   ``SpatialKernelOp`` (abstract):

       - ``DenseKernelOp``     - stores K explicitly; Cholesky solve; O(n^2) memory.
       - ``FFTKernelOp``       - operates in the spectral domain; O(N log N); no matrix.

   GPR-residualizer hierarchy
   ---------------------------
   ``KernelGPR`` (abstract):

       - ``SklearnKernelGPR``  - sklearn backend; dense; suitable for n <= ~10,000.
       - ``GPyTorchKernelGPR`` - GPyTorch backend (optional dep); lazy tensors.
       - ``FFTKernelGPR``      - FFT-based kernel; suitable for regular grids.


Classes
-------

.. autoapisummary::

   splisosm.kernel_gpr.DenseKernelOp
   splisosm.kernel_gpr.FFTKernelGPR
   splisosm.kernel_gpr.FFTKernelOp
   splisosm.kernel_gpr.GPyTorchKernelGPR
   splisosm.kernel_gpr.KernelGPR
   splisosm.kernel_gpr.SklearnKernelGPR
   splisosm.kernel_gpr.SpatialKernelOp


Functions
---------

.. autoapisummary::

   splisosm.kernel_gpr.fit_kernel_gpr
   splisosm.kernel_gpr.linear_hsic_test
   splisosm.kernel_gpr.make_kernel_gpr


Module Contents
---------------

.. py:class:: DenseKernelOp(K)

   Bases: :py:obj:`SpatialKernelOp`


   Dense kernel matrix wrapped as a ``SpatialKernelOp``.

   Stores the full n x n kernel matrix and solves linear systems via
   Cholesky factorization.  Suitable for n <= ~10,000.

   :param K: Symmetric positive semi-definite kernel matrix.
   :type K: torch.Tensor, shape (n_obs, n_obs)


   .. py:method:: eigenvalues(k = None)

      Return eigenvalues of K in descending order.

      :param k: Number of leading eigenvalues to return. None returns all.
      :type k: int or None

      :returns: Eigenvalues in descending order.
      :rtype: torch.Tensor, shape (k,) or (n_obs,)


   .. py:method:: eigh()

      Return ``(eigenvalues, eigenvectors)`` in ascending order.

      Useful for spectral operations such as the fast epsilon search in
      :func:`_kernel_residuals_from_eigdecomp`.

      :returns: * **eigenvalues** (*torch.Tensor, shape (n_obs,)*) -- Eigenvalues in ascending order.
                * **eigenvectors** (*torch.Tensor, shape (n_obs, n_obs)*) -- Corresponding orthonormal eigenvectors.


   .. py:method:: matvec(v)

      Compute ``K @ v``.

      :param v: Input vector or matrix.
      :type v: torch.Tensor, shape (n_obs,) or (n_obs, m)

      :returns: Result of K @ v.
      :rtype: torch.Tensor, shape (n_obs,) or (n_obs, m)


   .. py:method:: solve(v, epsilon)

      Solve ``(K + epsilon * I) u = v`` and return ``u``.

      :param v: Right-hand side vector or matrix.
      :type v: torch.Tensor, shape (n_obs, m)
      :param epsilon: Regularization / noise level (> 0).
      :type epsilon: float

      :returns: Solution u.
      :rtype: torch.Tensor, shape (n_obs, m)


   .. py:property:: n
      :type: int


      Number of data points.


.. py:class:: FFTKernelGPR(constant_value = 1.0, constant_value_bounds = (0.001, 1000.0), length_scale = 1.0, length_scale_bounds = 'fixed', epsilon_bounds = (1e-05, 10.0), workers = None)

   Bases: :py:obj:`KernelGPR`


   FFT-based GPR residualizer for regular (ny, nx) grids using an RBF kernel.

   Accepts the same hyperparameter interface as :class:`SklearnKernelGPR`.
   Kernel eigenvalues are computed via :class:`FFTKernelOp`; all linear-algebra
   is O(N log N) with no n x n matrix formed.

   Hyperparameter optimization strategy
   -------------------------------------
   The spectral log-marginal-likelihood (LML) for Y (n observations, m channels)
   is::

       -2 LML = m * sum_k log(lam_k + eps)
                + (1/n) * sum_k ||Y_hat_k||^2 / (lam_k + eps)

   where ``lam_k = sigma^2 * s_k(l)`` are the kernel eigenvalues,
   ``s_k(l)`` the unit-spectrum eigenvalues at length scale ``l``, and
   ``Y_hat = FFT2(Y_cube)`` (unnormalized 2-D DFT).

   Four optimization cases based on which bounds are ``"fixed"``:

   1. **Both fixed** → 1-D bounded scalar search over ``log(eps)`` only.
      Fastest; unit spectrum cached across repeated calls.
   2. **cv free, ls fixed** → 2-D L-BFGS-B over ``(log_sigma2, log_eps)``.
      Unit spectrum cached.
   3. **cv fixed, ls free** → 2-D L-BFGS-B over ``(log_l, log_eps)``.
      A fresh ``FFTKernelOp`` is built at each optimizer step.
   4. **Both free** → 3-D L-BFGS-B over ``(log_sigma2, log_l, log_eps)``.
      A fresh ``FFTKernelOp`` is built at each optimizer step.

   :param constant_value: Initial signal variance sigma^2.
   :type constant_value: float
   :param constant_value_bounds: Bounds for sigma^2.  ``"fixed"`` pins sigma^2.
   :type constant_value_bounds: tuple or "fixed"
   :param length_scale: Initial RBF length scale.
   :type length_scale: float
   :param length_scale_bounds: Bounds for the RBF length scale.  ``"fixed"`` pins it; a 2-tuple
                               ``(lo, hi)`` of positive floats enables length-scale optimization.
   :type length_scale_bounds: tuple or "fixed"
   :param epsilon_bounds: Search bounds for the white-noise / regularization level.
   :type epsilon_bounds: tuple[float, float]
   :param workers: Number of ``scipy.fft`` workers.
   :type workers: int or None


   .. py:method:: fit_residuals(coords, Y)

      Fit GP and return spatial residuals.

      Auto-detects the regular grid from ``coords``, rasterizes ``Y``,
      optimizes hyperparameters via spectral LML, then de-rasterizes.

      :param coords: Spatial coordinates forming a regular grid.
      :type coords: torch.Tensor, shape (n_obs, 2)
      :param Y: Target values (may contain NaN rows).
      :type Y: torch.Tensor, shape (n_obs, m)

      :returns: Spatial residuals. NaN rows are preserved.
      :rtype: torch.Tensor, shape (n_obs, m)


   .. py:method:: fit_residuals_batch(coords, Y_list)

      Residualize a list of targets, amortizing grid detection and unit spectrum.

      When :attr:`signal_bounds_fixed` is True the unit-spectrum
      eigenvalues are shared; only eps is re-optimised per target (1-D
      search).  Otherwise each target gets full 2-D optimisation.

      :param coords:
      :type coords: torch.Tensor, shape (n_obs, 2)
      :param Y_list:
      :type Y_list: list of torch.Tensor, each (n_obs, m_i)

      :rtype: list of torch.Tensor, each (n_obs, m_i)


   .. py:method:: fit_residuals_cube(data_cube, spacing = (1.0, 1.0), epsilon = None)

      Residualize a raster cube in O(N log N).

      Kept for backward compatibility with :class:`SplisosmFFT`.

      :param data_cube:
      :type data_cube: np.ndarray, shape (ny, nx) or (ny, nx, m)
      :param spacing: Physical grid spacing.
      :type spacing: (dy, dx)
      :param epsilon: If given, skip optimization and use this value directly.
      :type epsilon: float or None

      :returns: * **residuals** (*np.ndarray, same shape as data_cube*)
                * **epsilon** (*float*) -- Regularization level used.


   .. py:method:: get_kernel_op(coords = None)

      Return the cached unit-spectrum FFTKernelOp.

      :raises RuntimeError: If called before any residualization.


   .. py:property:: n
      :type: Optional[int]


      Total grid cells (ny * nx).


   .. py:property:: nx
      :type: Optional[int]


      Grid width (set after first residualization).


   .. py:property:: ny
      :type: Optional[int]


      Grid height (set after first residualization).


   .. py:property:: signal_bounds_fixed
      :type: bool


      True when BOTH ``constant_value_bounds`` and ``length_scale_bounds`` are ``"fixed"``.


.. py:class:: FFTKernelOp(ny, nx, dy, dx, constant_value, length_scale, workers = None)

   Bases: :py:obj:`SpatialKernelOp`


   Implicit RBF kernel linear operator on a periodic 2-D grid via FFT.

   Eigenvalues are computed as ``FFT2(K_row)`` where
   ``K_row[i, j] = sigma^2 * exp(-(di^2 + dj^2) / (2 * l^2))`` with
   periodic (torus) distances — no ``/n_grid`` normalization.  This
   matches the convention used by :class:`DenseKernelOp` and
   :class:`SklearnKernelGPR`.

   All linear-algebra operations are O(N log N) via FFT; the n x n kernel
   matrix is never formed.

   :param ny: Grid shape.
   :type ny: int
   :param nx: Grid shape.
   :type nx: int
   :param dy: Physical spacing between neighboring grid cells (used to compute
              torus distances).
   :type dy: float
   :param dx: Physical spacing between neighboring grid cells (used to compute
              torus distances).
   :type dx: float
   :param constant_value: RBF signal variance sigma^2.
   :type constant_value: float
   :param length_scale: RBF length scale (in the same units as dy/dx).
   :type length_scale: float
   :param workers: Number of ``scipy.fft`` workers.
   :type workers: int or None


   .. py:method:: eigenvalues(k = None)

      Return eigenvalues in descending order.

      :param k: Number of leading eigenvalues. ``None`` returns all.
      :type k: int or None

      :rtype: np.ndarray, shape (k,) or (n_obs,).


   .. py:method:: matvec(v)

      Compute ``K @ v`` in O(N log N).

      :param v:
      :type v: np.ndarray, shape (n_obs,) or (n_obs, m)

      :rtype: np.ndarray, same shape as v.


   .. py:method:: residuals(v, epsilon)

      Apply ``epsilon * (K + epsilon * I)^{-1} @ v`` in O(N log N).

      :param v:
      :type v: np.ndarray, shape (n_obs,) or (n_obs, m)
      :param epsilon:
      :type epsilon: float

      :rtype: np.ndarray, same shape as v.


   .. py:method:: solve(v, epsilon)

      Solve ``(K + epsilon * I) u = v`` in O(N log N).

      :param v:
      :type v: np.ndarray, shape (n_obs,) or (n_obs, m)
      :param epsilon: Regularization level (> 0).
      :type epsilon: float

      :rtype: np.ndarray, same shape as v.


   .. py:method:: square_trace()

      Return ``trace(K^2)``.


   .. py:method:: trace()

      Return ``trace(K)``.


   .. py:property:: eigenvalues_2d
      :type: numpy.ndarray


      2-D eigenvalue array, shape (ny, nx). Computed lazily.


   .. py:property:: n
      :type: int


      Total grid cells ny * nx.


   .. py:property:: nx
      :type: int


      Grid width.


   .. py:property:: ny
      :type: int


      Grid height.


.. py:class:: GPyTorchKernelGPR(constant_value = 1.0, constant_value_bounds = (0.001, 1000.0), length_scale = 1.0, length_scale_bounds = 'fixed', n_inducing = None, n_iter = 100, lr = 0.01, device = 'cpu')

   Bases: :py:obj:`KernelGPR`


   GPyTorch-based GP residualizer.

   Uses GPyTorch's lazy-tensor infrastructure so that all linear-system
   solves during training use Lanczos-based conjugate gradients rather
   than dense Cholesky.  Hyperparameters are optimized with Adam on the
   marginal log-likelihood.

   For most datasets (n <= ~10,000) the exact-GP path is used.  Passing
   ``n_inducing > 0`` enables the FITC sparse GP approximation via
   InducingPointKernel.

   :param constant_value: Initial signal amplitude (outputscale).
   :type constant_value: float
   :param constant_value_bounds: Search bounds for the signal amplitude.  ``"fixed"`` disables
                                 optimization of this parameter.
   :type constant_value_bounds: tuple or "fixed"
   :param length_scale: Initial RBF length scale.
   :type length_scale: float
   :param length_scale_bounds: Search bounds for the length scale.
   :type length_scale_bounds: tuple or "fixed"
   :param n_inducing: Number of inducing points for FITC sparse GP approximation. When set,
                      memory scales as O(n x n_inducing) rather than O(n²). Set to None to
                      use exact GP.
   :type n_inducing: int or None
   :param n_iter: Adam optimizer iterations for hyperparameter learning.
   :type n_iter: int
   :param lr: Adam learning rate.
   :type lr: float
   :param device: PyTorch device string, e.g. ``"cpu"`` or ``"cuda"``.
   :type device: str

   .. rubric:: Notes

   Requires ``gpytorch`` (optional dependency)::

       pip install gpytorch

   Hyperparameter search uses the mean of all target columns as the
   training signal, then applies the fitted kernel to every column.
   This matches the sklearn backend behavior and amortizes O(n^3) GP
   fitting across outputs.

   The noise (epsilon) search is handled by GPyTorch's
   ``GaussianLikelihood``, which is always optimized regardless of the
   signal bounds setting.


   .. py:method:: fit_residuals(coords, Y)

      Fit GP residuals using GPyTorch's exact-GP backend.

      Hyperparameters are optimized on the mean of all target columns.
      The fitted kernel and noise level are then applied to every column
      of ``Y`` to compute residuals.

      NaN rows are excluded from fitting and reinserted as NaN in output.

      :param coords: Normalized spatial coordinates.
      :type coords: torch.Tensor, shape (n_obs, d)
      :param Y: Target values; may contain NaN rows.
      :type Y: torch.Tensor, shape (n_obs, m)

      :returns: Spatial residuals ``epsilon * (K + epsilon * I)**(-1) @ Y``.
                NaN rows from Y are preserved as NaN.
      :rtype: torch.Tensor, shape (n_obs, m)


   .. py:attribute:: device
      :value: 'cpu'


   .. py:attribute:: lr
      :value: 0.01


   .. py:attribute:: n_inducing
      :value: None


   .. py:attribute:: n_iter
      :value: 100


   .. py:property:: signal_bounds_fixed
      :type: bool


      True when both constant_value and length_scale bounds are ``"fixed"``.


.. py:class:: KernelGPR

   Bases: :py:obj:`abc.ABC`


   Abstract GPR residualizer: learns a spatial kernel and returns residuals.

   Subclasses determine how hyperparameters are optimized (MLE, variational,
   fixed) and whether the kernel matrix is materialized (dense) or kept
   implicit.

   All subclasses expose the same ``fit_residuals`` interface so that the
   DU-test pipeline in ``SplisosmNP`` and ``SplisosmFFT`` is backend-agnostic.


   .. py:method:: fit_residuals(coords, Y)
      :abstractmethod:


      Fit GP to (coords, Y) and return spatial residuals.

      :param coords: Spatial coordinates (should be pre-normalized by the caller).
      :type coords: torch.Tensor, shape (n_obs, d)
      :param Y: Target values (may contain NaN rows, which are handled internally).
      :type Y: torch.Tensor, shape (n_obs, m)

      :returns: **residuals** -- Residuals Y - GP_smooth(Y | coords). NaN rows in Y are preserved as NaN.
      :rtype: torch.Tensor, shape (n_obs, m)


   .. py:method:: fit_residuals_batch(coords, Y_list)

      Fit residuals for many targets sharing the same coordinates.

      The default implementation calls :meth:`fit_residuals` sequentially.
      Subclasses may override this to amortize work across targets
      (e.g. ``SklearnKernelGPR`` precomputes a shared eigendecomposition
      when signal bounds are fixed).

      :param coords: Spatial coordinates.
      :type coords: torch.Tensor, shape (n_obs, d)
      :param Y_list: List of target matrices, each shape (n_obs, m_i).
      :type Y_list: list of torch.Tensor

      :returns: List of residual matrices, each shape (n_obs, m_i).
      :rtype: list of torch.Tensor


   .. py:method:: get_kernel_op(coords)
      :abstractmethod:


      Build the kernel operator for given coordinates.

      Returns a :class:`SpatialKernelOp` whose hyperparameters have been set
      (but not yet optimized for a specific target Y).  Useful when the
      caller needs the operator for its own purposes (e.g. computing
      eigenvalues for the HSIC null distribution).

      Not all backends support this method.

      :param coords: Spatial coordinates.
      :type coords: torch.Tensor, shape (n_obs, d)

      :returns: Kernel operator for the given coordinates.
      :rtype: SpatialKernelOp

      :raises NotImplementedError: If not implemented by the subclass.


.. py:class:: SklearnKernelGPR(constant_value = 1.0, constant_value_bounds = (0.001, 1000.0), length_scale = 1.0, length_scale_bounds = 'fixed', n_inducing = 5000)

   Bases: :py:obj:`KernelGPR`


   sklearn-based GPR residualizer.

   Uses ``GaussianProcessRegressor`` with a ``ConstantKernel x RBF +
   WhiteKernel`` to learn spatial hyperparameters and residualize targets.

   When both signal bounds are ``"fixed"`` (:attr:`signal_bounds_fixed`),
   the base kernel matrix is the same for every target sharing the same
   coordinates.  Calling :meth:`precompute_shared_kernel` once before a
   loop over many targets reduces per-target cost from O(n^3) GP fitting
   to O(n^2) spectral operations (see :func:`_kernel_residuals_from_eigdecomp`).

   :param constant_value: Initial signal amplitude.
   :type constant_value: float
   :param constant_value_bounds: Search bounds for the signal amplitude.  ``"fixed"`` disables
                                 optimization of this parameter.
   :type constant_value_bounds: tuple or ``"fixed"``
   :param length_scale: Initial RBF length scale.
   :type length_scale: float
   :param length_scale_bounds: Search bounds for the length scale.
   :type length_scale_bounds: tuple or ``"fixed"``
   :param n_inducing: Number of inducing points.  When ``n_obs <= n_inducing`` the full
                      exact GP is used.  When ``n_obs > n_inducing``, a subset of
                      ``n_inducing`` randomly sampled points is used as the inducing set
                      for an approximate GP (subset-of-data / Nyström-style approximation).
   :type n_inducing: int

   .. rubric:: Notes

   This backend always materialises a dense n x n kernel matrix up to
   ``n_inducing`` points.  For very large n consider ``GPyTorchKernelGPR``
   with inducing points.


   .. py:method:: fit_residuals(coords, Y)

      Fit GP residuals for a single target matrix.

      Chooses the fast spectral path when :attr:`signal_bounds_fixed` and
      the shared eigendecomposition has been precomputed; otherwise falls
      back to a full sklearn GP fit when n_obs <= n_inducing, or an
      inducing-point approximation when n_obs > n_inducing.

      NaN rows in ``Y`` are excluded from fitting and reinserted as NaN
      in the output.

      :param coords: Normalized spatial coordinates.
      :type coords: torch.Tensor, shape (n_obs, d)
      :param Y: Target values; may contain NaN rows.
      :type Y: torch.Tensor, shape (n_obs, m)

      :returns: Spatial residuals. NaN rows from Y are preserved as NaN.
      :rtype: torch.Tensor, shape (n_obs, m)


   .. py:method:: fit_residuals_batch(coords, Y_list)

      Fit residuals for multiple targets, amortizing kernel setup.

      If :attr:`signal_bounds_fixed` and no eigendecomposition has been
      precomputed yet, this method precomputes it before the loop.

      :param coords: Normalized spatial coordinates.
      :type coords: torch.Tensor, shape (n_obs, d)
      :param Y_list: List of target matrices, each shape (n_obs, m_i).
      :type Y_list: list of torch.Tensor

      :returns: List of residual matrices, each shape (n_obs, m_i).
      :rtype: list of torch.Tensor


   .. py:method:: from_config(config)
      :classmethod:


      Construct from a configuration dictionary.

      :param config: Configuration dict with optional keys: ``constant_value``,
                     ``constant_value_bounds``, ``length_scale``, ``length_scale_bounds``,
                     ``n_inducing``.
      :type config: dict

      :returns: New instance constructed from the configuration.
      :rtype: SklearnKernelGPR


   .. py:method:: get_kernel_op(coords)

      Return a ``DenseKernelOp`` built from the current (fixed) params.

      :param coords: Normalized spatial coordinates.
      :type coords: torch.Tensor, shape (n_obs, d)

      :returns: Kernel operator for the given coordinates.
      :rtype: DenseKernelOp


   .. py:method:: precompute_shared_kernel(coords)

      Precompute and cache the eigendecomposition of the fixed-param kernel.

      Call this once before a loop over many targets that share the same
      coordinates and have ``signal_bounds_fixed == True``.  Subsequent calls
      to :meth:`fit_residuals` will use the cached decomposition and run in
      O(n^2) rather than O(n^3).

      Has no effect (and logs a warning) when ``signal_bounds_fixed`` is
      False.

      :param coords: Normalized spatial coordinates, shape (n_obs, d).
      :type coords: torch.Tensor, shape (n_obs, d)

      :rtype: None


   .. py:attribute:: n_inducing
      :value: 5000


   .. py:property:: signal_bounds_fixed
      :type: bool


      True when both constant_value and length_scale bounds are fixed.


.. py:class:: SpatialKernelOp

   Bases: :py:obj:`abc.ABC`


   Abstract spatial kernel linear operator.

   Provides kernel-vector products and linear system solves without requiring
   the full n x n kernel matrix to be materialised in memory.

   Concrete subclasses
   -------------------
   ``DenseKernelOp``
       Stores K explicitly as a ``torch.Tensor``; solves via Cholesky.
       Suitable for n <= ~10,000.
   ``FFTKernelOp`` (``splisosm.hyptest_fft``)
       Operates entirely in the spectral domain; O(N log N) per operation;
       no n x n matrix formed at any point.

   Extending ``SplisosmNP`` to large datasets
   ------------------------------------------
   ``SplisosmNP`` currently builds ``DenseKernelOp`` instances and passes them
   to ``SklearnKernelGPR``.  To support n > 10,000, replace the operator with
   an implicit subclass (e.g. ``LanczosKernelOp``) and pair it with an
   iterative ``KernelGPR`` that only calls ``matvec``; the rest of the
   pipeline is unchanged.


   .. py:method:: eigenvalues(k = None)
      :abstractmethod:


      Return eigenvalues of K in descending order.

      :param k: Number of leading eigenvalues to return. None returns all.
      :type k: int or None

      :returns: Eigenvalues in descending order.
      :rtype: torch.Tensor, shape (k,) or (n_obs,)


   .. py:method:: matvec(v)
      :abstractmethod:


      Compute ``K @ v``.

      :param v: Input vector or matrix.
      :type v: torch.Tensor, shape (n_obs,) or (n_obs, m)

      :returns: Result of K @ v.
      :rtype: torch.Tensor, shape (n_obs,) or (n_obs, m)


   .. py:method:: residuals(v, epsilon)

      Apply the kernel regression residual operator.

      Computes ``epsilon * (K + epsilon * I)**(-1) @ v``, i.e. the part
      of ``v`` that is not explained by the GP mean.

      :param v: Input vector or matrix.
      :type v: torch.Tensor, shape (n_obs, m)
      :param epsilon: Regularization / noise level (> 0).
      :type epsilon: float

      :returns: Residual vector or matrix.
      :rtype: torch.Tensor, shape (n_obs, m)


   .. py:method:: solve(v, epsilon)
      :abstractmethod:


      Solve ``(K + epsilon * I) u = v`` and return ``u``.

      :param v: Right-hand side vector or matrix.
      :type v: torch.Tensor, shape (n_obs, m)
      :param epsilon: Regularization / noise level (> 0).
      :type epsilon: float

      :returns: Solution u.
      :rtype: torch.Tensor, shape (n_obs, m)


   .. py:method:: square_trace()

      Return ``trace(K**2)``.


   .. py:method:: trace()

      Return ``trace(K)`` (sum of eigenvalues).


   .. py:property:: n
      :type: int

      :abstractmethod:


      Number of data points.


.. py:function:: fit_kernel_gpr(X, Y, normalize_x = True, normalize_y = True, return_residuals = True, constant_value = 1.0, constant_value_bounds = (0.001, 1000.0), length_scale = 1.0, length_scale_bounds = (0.01, 100.0))

   Fit a Gaussian process regression and optionally return residuals.

   .. deprecated::
       Use :class:`SklearnKernelGPR` directly.  This wrapper is kept for
       backward compatibility and will be removed in a future release.

   :param X: Input spatial coordinates.
   :type X: torch.Tensor, shape (n_obs, d)
   :param Y: Target values.
   :type Y: torch.Tensor, shape (n_obs, m)
   :param normalize_x: Whether to z-score normalize X before fitting.
   :type normalize_x: bool
   :param normalize_y: Whether to z-score normalize Y before fitting.
   :type normalize_y: bool
   :param return_residuals: If True, return residuals; if False, return ``(Kxy, epsilon)``.
   :type return_residuals: bool
   :param constant_value: Initial signal amplitude.
   :type constant_value: float
   :param constant_value_bounds: Search bounds for the signal amplitude.
   :type constant_value_bounds: tuple or ``"fixed"``
   :param length_scale: Initial RBF length scale.
   :type length_scale: float
   :param length_scale_bounds: Search bounds for the length scale.
   :type length_scale_bounds: tuple or ``"fixed"``

   :returns: If ``return_residuals`` is False: ``(Kxy, epsilon)`` kernel matrix and
             noise level.  If ``return_residuals`` is True: residual tensor of shape
             (n_obs, m).
   :rtype: tuple[torch.Tensor, float] or torch.Tensor


.. py:function:: linear_hsic_test(X, Y, centering = True)

   The linear HSIC test (multivariate RV coefficient).

   Equivalent to a multivariate extension of Pearson correlation.

   :param X:
   :type X: torch.Tensor, shape (n_samples, n_x)
   :param Y:
   :type Y: torch.Tensor, shape (n_samples, n_y)
   :param centering: Whether to mean-centre X and Y.
   :type centering: bool

   :returns: * **hsic** (*float*) -- HSIC test statistic (scaled by 1 / (n - 1)**2).
             * **pvalue** (*float*) -- P-value from the asymptotic chi-squared mixture distribution.


.. py:function:: make_kernel_gpr(backend = 'sklearn', **kwargs)

   Construct a KernelGPR from a backend name.

   :param backend: Backend to use.  ``"fft"`` accepts the same kwargs as ``"sklearn"``
                   (``constant_value``, ``constant_value_bounds``, ``length_scale``,
                   ``length_scale_bounds``) plus ``epsilon_bounds`` and ``workers``.
                   ``n_inducing`` is supported by both ``"sklearn"`` and ``"gpytorch"``
                   with the same name (see :data:`_DEFAULT_GPR_CONFIGS`).
   :type backend: {"sklearn", "gpytorch", "fft"}
   :param \*\*kwargs: Passed to the backend constructor.

   :returns: GPR residualizer instance for the specified backend.
   :rtype: KernelGPR

   :raises ValueError: If backend is not one of the supported options.