SpatialData Loaders
These functions load common spatial transcriptomics outputs into the data structures expected by SPLISOSM.
splisosm.io.load_visium_sp_meta | Helper function to load Visium spatial metadata.
splisosm.io.load_visium_probe | Load standard Visium Space Ranger probe-based outputs.
splisosm.io.load_visiumhd_probe | Load Visium HD outputs as SpatialData with probe-level binned tables.
splisosm.io.load_xenium_codeword | Load Xenium outputs and append multi-resolution codeword bin tables.
- splisosm.io.load_visium_sp_meta(adata, path_to_spatial, library_id=None)
Helper function to load Visium spatial metadata.
- splisosm.io.load_visium_probe(path, *, counts_file='raw_probe_bc_matrix.h5', library_id=None, load_spatial=True, counts_layer_name='counts', filtered_counts_file=True, return_type='anndata')
Load standard Visium Space Ranger probe-based outputs.
Reads the probe-level count matrix (raw_probe_bc_matrix.h5 by default) from a Space Ranger outs directory and optionally attaches spatial metadata (coordinates, images, scale factors) from the spatial/ subfolder.
This is the standard-resolution Visium counterpart of load_visiumhd_probe(), which handles Visium HD multi-bin outputs.
- Parameters:
path (str | Path) – Path to the Space Ranger outs directory, e.g. <run_id>/outs. Must contain the HDF5 count matrix and, when load_spatial=True, a spatial/ subfolder.
counts_file (str) – Name of the HDF5 count matrix file inside path. Typical choices:
  "raw_probe_bc_matrix.h5" – all barcodes, probe-level features (default; preserves per-probe information).
  "raw_feature_bc_matrix.h5" – all barcodes, gene-level features.
  "filtered_feature_bc_matrix.h5" – tissue barcodes only, gene-level features.
library_id (str | None) – Library identifier stored in adata.uns["spatial"] (AnnData mode) or used to name SpatialData elements (SpatialData mode). Defaults to the parent directory name of path.
load_spatial (bool) – Whether to load spatial metadata (tissue positions, images, scale factors). Only used when return_type="anndata".
counts_layer_name (str) – Layer name for the raw count matrix. The counts are stored in adata.layers[counts_layer_name].
filtered_counts_file (bool) – If True (default), keep only in-tissue barcodes that appear in filtered_feature_bc_matrix.h5. If False, keep all barcodes from counts_file (including background spots).
return_type (str) – Output format.
  "anndata" (default) – return an AnnData with spatial metadata in .obsm["spatial"] and .uns["spatial"].
  "spatialdata" – return a SpatialData object built by spatialdata_io.visium(), with probe-level var metadata restored and a counts layer added. Suitable for use with SplisosmFFT.
- Returns:
When return_type="anndata":
  .X / .layers[counts_layer_name] – sparse count matrix
  .var – feature (probe or gene) metadata
  .obs – barcode metadata with in_tissue, array_row, array_col (when load_spatial=True)
  .obsm["spatial"] – (n_spots, 2) pixel coordinates
  .uns["spatial"] – images and scale factors
When return_type="spatialdata":
  sdata.tables["table"] – AnnData with probe-level counts in .layers[counts_layer_name] and full probe metadata in .var
  sdata.shapes[dataset_id] – spot geometries
  sdata.images – tissue images at multiple resolutions
- Return type:
AnnData | SpatialData
- Raises:
FileNotFoundError – If the counts file or spatial directory is missing.
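The layout requirements above can be checked before calling the loader. The following is only a sketch under stated assumptions: check_visium_outs is a hypothetical helper, not part of splisosm, and the fallback for library_id (the name of the directory containing outs/) is inferred from the parameter description rather than taken from the library's code.

```python
from pathlib import Path


def check_visium_outs(path, counts_file="raw_probe_bc_matrix.h5", load_spatial=True):
    """Hypothetical pre-flight check mirroring the documented requirements."""
    outs = Path(path)
    counts_path = outs / counts_file
    if not counts_path.is_file():
        raise FileNotFoundError(f"Counts file not found: {counts_path}")
    spatial_dir = outs / "spatial"
    if load_spatial and not spatial_dir.is_dir():
        raise FileNotFoundError(f"Spatial directory not found: {spatial_dir}")
    # Assumption: the default library_id is the name of the directory holding outs/.
    return outs.resolve().parent.name
```

Running a check like this first gives a clear error message about which file is missing before any HDF5 reading starts.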
Examples
Load as AnnData (for SplisosmNP):
>>> from splisosm.io import load_visium_probe
>>> adata = load_visium_probe("sample/outs")
Load as SpatialData (for SplisosmFFT):
>>> sdata = load_visium_probe("sample/outs", return_type="spatialdata")
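To make the effect of filtered_counts_file concrete, here is a minimal pure-Python sketch of the barcode selection it describes. The helper keep_in_tissue and the toy barcodes are illustrative assumptions; the loader itself works on the HDF5 matrices and may implement the selection differently.

```python
def keep_in_tissue(all_barcodes, in_tissue_barcodes, filtered=True):
    """Return indices of barcodes to keep, preserving the raw-matrix order."""
    if not filtered:
        # filtered_counts_file=False: keep every barcode, background included.
        return list(range(len(all_barcodes)))
    in_tissue = set(in_tissue_barcodes)
    return [i for i, bc in enumerate(all_barcodes) if bc in in_tissue]


# Toy barcodes (hypothetical): the raw matrix has four, the filtered matrix two.
raw_barcodes = ["AAAC-1", "AAAG-1", "AACT-1", "AAGT-1"]
tissue_barcodes = ["AACT-1", "AAAC-1"]

kept = keep_in_tissue(raw_barcodes, tissue_barcodes)
print([raw_barcodes[i] for i in kept])  # ['AAAC-1', 'AACT-1']
```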
- splisosm.io.load_visiumhd_probe(path, dataset_id=None, bin_sizes=None, bins_as_squares=True, fullres_image_file=None, load_all_images=False, var_names_make_unique=True, filtered_counts_file=True, counts_layer_name='counts', path_to_feature_2um_h5=None)
Load Visium HD outputs as SpatialData with probe-level binned tables.
This wrapper uses binned_outputs/square_002um/raw_probe_bc_matrix.h5 (or a custom path_to_feature_2um_h5) as the source feature count matrix. It aggregates probe/peak/isoform counts to coarser bins or cells (square_008um, square_016um and, when available, cell_id) according to the spatial mapping in barcode_mappings.parquet (Space Ranger v4.0+ required).
- Parameters:
path (str | Path) – Path to Space Ranger outs directory for Visium HD.
dataset_id (str | None) – Optional dataset ID passed to the SpatialData reader.
bin_sizes (list[int | str] | None) – Bin resolutions to include. Each entry can be int (for example 8) or Visium HD bin string (for example "square_008um"). If None, all available square_*um bins under binned_outputs are used.
bins_as_squares (bool) – Whether bins are represented as squares when loading shapes.
fullres_image_file (str | Path | None) – Path to the full-resolution image.
load_all_images (bool) – Whether to load all optional images via spatialdata_io reader.
var_names_make_unique (bool) – Whether to call var_names_make_unique() on probe table variables.
filtered_counts_file (bool) – Whether to keep only in-tissue 2um barcodes prior to aggregation. If True, barcodes are taken from the source bin table loaded by visium_hd (square_002um). If unavailable, the function falls back to binned_outputs/square_002um/filtered_feature_bc_matrix.h5.
counts_layer_name (str) – Layer name used to store aggregated probe counts in each output table.
path_to_feature_2um_h5 (str | Path | None) – Optional path to the raw 2um probe/peak/isoform counts matrix H5 or H5AD. If not provided, will look for binned_outputs/square_002um/raw_feature_bc_matrix.h5.
- Returns:
A SpatialData object with probe-level tables for requested bins and, if available, cell-level segmentation.
- Return type:
SpatialData
- Raises:
ImportError – If required optional dependencies are not installed.
ValueError – If required files or requested bins are missing.
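The two ideas in this entry, accepting bin sizes as either integers or bin strings and summing 2um counts into coarser bins through a barcode mapping, can be sketched in plain Python. Both helpers and the toy data are hypothetical illustrations. The real function reads barcode_mappings.parquet and works on sparse matrices, and its exact handling of bin_sizes entries is not shown in this page.

```python
from collections import defaultdict


def normalize_bin_size(entry):
    """Map 8 or "square_008um" to the directory-style name "square_008um"."""
    if isinstance(entry, int):
        return f"square_{entry:03d}um"
    return str(entry)


def aggregate_counts(counts_2um, mapping):
    """Sum per-probe counts of 2um barcodes into their coarser bins.

    counts_2um: {barcode_2um: {probe: count}}
    mapping:    {barcode_2um: coarse_bin_id}
    """
    out = defaultdict(lambda: defaultdict(int))
    for barcode, probe_counts in counts_2um.items():
        target = mapping.get(barcode)
        if target is None:  # barcode not assigned to a bin at this resolution
            continue
        for probe, n in probe_counts.items():
            out[target][probe] += n
    return {bin_id: dict(probes) for bin_id, probes in out.items()}


# Toy data (hypothetical identifiers): three 2um barcodes, two 8um bins.
counts_2um = {
    "b0": {"probeA": 2, "probeB": 1},
    "b1": {"probeA": 3},
    "b2": {"probeB": 4},
}
mapping_8um = {"b0": "bin0", "b1": "bin0", "b2": "bin1"}

print(normalize_bin_size(8))                      # square_008um
print(aggregate_counts(counts_2um, mapping_8um))
# {'bin0': {'probeA': 5, 'probeB': 1}, 'bin1': {'probeB': 4}}
```

The same aggregation applies to cells when the mapping values are cell_id entries instead of bin identifiers.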
- splisosm.io.load_xenium_codeword(path, spatial_resolutions=(8.0, 16.0), quality_threshold=20.0, n_jobs=-1, chunk_batch_size=64, counts_layer_name='counts', build_cell_codeword_table=True, create_square_shapes=True, cells_boundaries=True, nucleus_boundaries=True, cells_as_circles=False, cells_labels=True, nucleus_labels=True, transcripts=True, morphology_mip=True, morphology_focus=True, aligned_images=True, cells_table=True, gex_only=True, show_progress=True)
Load Xenium outputs and append multi-resolution codeword bin tables.
This wrapper reads Xenium Ranger outs with spatialdata-io and then quantifies codewords into square spatial bins at one or more user-defined resolutions using transcript-level chunk data (grids/0/*). Counting is implemented with vectorized sparse aggregation over (spot, codeword) pairs to reduce Python overhead while avoiding dependence on optional precomputed density matrices.
For each resolution, a table named square_XXXum is added to sdata.tables; optional square geometries with a _bins suffix are added to sdata.shapes so the tables can be used directly with spatialdata.rasterize_bins().
transcripts.zarr.zip is expected to contain the density/codeword group for codeword indexing (Xenium Ranger v3.1+ required). If build_cell_codeword_table=True and the transcripts.parquet file is available, a cell-by-codeword AnnData named table_codeword will also be built and added to sdata.tables.
- Parameters:
path (str | Path) – Path to Xenium Ranger output directory, or its parent containing outs/.
spatial_resolutions (Sequence[float] | None) – Spatial bin sizes in microns. Pass None or an empty sequence to skip bin table creation entirely (cell-segmentation-only mode).
quality_threshold (float) – Minimum transcript quality score to retain.
n_jobs (int) – Parallel worker count for chunk processing. Use -1 for all cores.
chunk_batch_size (int) – Number of transcript chunks submitted per processing batch.
counts_layer_name (str) – Layer name used to store codeword counts in each output table.
build_cell_codeword_table (bool) – Whether to build a cell-by-codeword table from the transcripts parquet file.
create_square_shapes (bool) – Whether to create square bin shapes for each table key.
cells_boundaries (bool) – Passed to spatialdata_io.readers.xenium.xenium.
nucleus_boundaries (bool) – Passed to spatialdata_io.readers.xenium.xenium.
cells_as_circles (bool) – Passed to spatialdata_io.readers.xenium.xenium.
cells_labels (bool) – Passed to spatialdata_io.readers.xenium.xenium.
nucleus_labels (bool) – Passed to spatialdata_io.readers.xenium.xenium.
transcripts (bool) – Passed to spatialdata_io.readers.xenium.xenium.
morphology_mip (bool) – Passed to spatialdata_io.readers.xenium.xenium.
morphology_focus (bool) – Passed to spatialdata_io.readers.xenium.xenium.
aligned_images (bool) – Passed to spatialdata_io.readers.xenium.xenium.
cells_table (bool) – Passed to spatialdata_io.readers.xenium.xenium.
gex_only (bool) – Passed to spatialdata_io.readers.xenium.xenium.
show_progress (bool) – Whether to display progress bars while binning codewords.
- Returns:
SpatialData object augmented with bin-by-codeword count tables at each requested resolution and, when requested, a cell-by-codeword table named table_codeword.
- Return type:
SpatialData
- Raises:
ImportError – If required optional dependencies are not installed.
ValueError – If path/layout/arguments are invalid.
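The binning step described for this loader, counting (spot, codeword) pairs after a quality filter, can be illustrated with a small pure-Python sketch. It is not the library's implementation, which is described as vectorized sparse aggregation over chunked transcript data. The function bin_codewords, the tuple layout, the bin origin at (0, 0), and the inclusive threshold comparison are all assumptions made for illustration.

```python
import math
from collections import Counter


def bin_codewords(transcripts, resolution_um, quality_threshold=20.0):
    """Count (bin, codeword) pairs for transcripts passing the quality filter.

    transcripts: iterable of (x_um, y_um, codeword_index, qv) tuples.
    Returns a Counter keyed by ((bin_col, bin_row), codeword_index).
    """
    counts = Counter()
    for x, y, codeword, qv in transcripts:
        if qv < quality_threshold:  # assumed: scores equal to the threshold are kept
            continue
        # Assumed grid: square bins anchored at the coordinate origin.
        spot = (math.floor(x / resolution_um), math.floor(y / resolution_um))
        counts[(spot, codeword)] += 1
    return counts


# Toy transcripts (hypothetical values): x, y in microns, codeword index, quality score.
transcripts = [
    (1.0, 2.0, 5, 40.0),
    (7.9, 0.1, 5, 25.0),
    (8.0, 0.0, 5, 30.0),
    (3.0, 3.0, 7, 10.0),   # dropped: below the quality threshold
    (20.0, 17.0, 7, 20.0),
]

# One count table per requested resolution, as with spatial_resolutions=(8.0, 16.0).
tables = {res: bin_codewords(transcripts, res) for res in (8.0, 16.0)}
print(tables[8.0][((0, 0), 5)])   # 2
print(tables[16.0][((0, 0), 5)])  # 3
```

Each Counter corresponds conceptually to one bin-by-codeword table; in the loader these are stored as sparse matrices in sdata.tables.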