5. Interoperability

Summary

Interoperability between languages allows analysts to take advantage of the strengths of different ecosystems.
On-disk interoperability uses standard file formats to transfer data and is typically more reliable.
In-memory interoperability transfers data directly between parallel sessions and is convenient for interactive analysis.
While interoperability is currently possible, developers continue to improve the experience.

5.1. Motivation

As discussed in the analysis frameworks and tools chapter, there are three main ecosystems for single-cell analysis: the Bioconductor and Seurat ecosystems in R and the Python-based scverse ecosystem. A common question from new analysts is which ecosystem they should focus on learning and using. While it makes sense to focus on one to start with, and a successful standard analysis can be performed in any ecosystem, we promote the idea that competent analysts should be familiar with all three ecosystems and comfortable moving between them. This approach allows analysts to use the best-performing tools and methods regardless of how they were implemented. When analysts are not comfortable moving between ecosystems, they tend to use packages that are easy to access, even when those packages have been shown to have shortcomings compared to packages in another ecosystem.

The ability of analysts to move between ecosystems also allows developers to take advantage of the different strengths of programming languages. For example, R has strong inbuilt support for complex statistical modelling while the majority of deep learning libraries are focused on Python. By supporting common on-disk data formats and in-memory data structures, developers can be confident that analysts can access their package and can use the platform that is most appropriate for their method.

Another motivation for being comfortable with multiple ecosystems is the accessibility and availability of data, results and documentation. Often data or results are only made available in one format and analysts will need to be familiar with that format in order to access them. A basic understanding of other ecosystems is also necessary to understand package documentation and tutorials when deciding which methods to use.

While we encourage analysts to be comfortable with all the major ecosystems, moving between them is only possible when they are interoperable. Thankfully, a lot of work has been done in this area and, in most cases, it is now relatively simple using standard packages. In this chapter, we discuss the various ways data can be moved between ecosystems, either via disk or in memory, the differences between them and their advantages. We focus on single-modality data and moving between R and Python as these are the most common cases, but we also touch on multimodal data and other languages.

5.2. Nomenclature

Because talking about different languages can get confusing we try to use the following conventions:

{package} - An R package
package::function() - A function in an R package
package - A Python package
package.function() - A function in a Python package
Emphasised - Some other important concept
code - Other parts of code including objects, variables etc. This is also used for files or directories.

import tempfile
from pathlib import Path

import anndata
import anndata2ri
import mudata
import numpy
import rpy2.robjects
import scanpy
from scipy.sparse import csr_matrix

anndata2ri.activate()
%load_ext rpy2.ipython

5.3. Disk-based interoperability

The first approach to moving between languages is via disk-based interoperability. This involves writing a file to disk in one language and then reading that file into a second language. In many cases, this approach is simpler, more reliable and more scalable than in-memory interoperability (which we discuss below) but it comes at the cost of greater storage requirements and reduced interactivity. Disk-based interoperability tends to work particularly well when there are established processes for each stage of analysis and you want to pass objects from one to the next (especially as part of a pipeline developed using a workflow manager such as Nextflow or Snakemake). However, disk-based interoperability is less convenient for interactive steps such as data exploration or experimenting with methods, as you need to write a new file whenever you want to move between languages.

5.3.1. Simple formats

Before discussing file formats specifically developed for single-cell data we want to briefly mention that common simple text file formats (such as CSV, TSV, JSON etc.) can often be the answer to transferring data between languages. They work well in cases where some analysis has been performed and what you want to transfer is a subset of the information about an experiment. For example, you may want to transfer only the cell metadata but do not require the feature metadata, expression matrices etc. The advantage of using simple text formats is that they are well supported by almost any language and do not require single-cell specific packages. However, they can quickly become impractical as what you want to transfer becomes more complex.
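
As a minimal sketch of this idea, the following writes a small table of cell metadata to a CSV file and reads it back. The table is a hypothetical stand-in for something like the obs slot of an AnnData object, and the column names and file name are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical cell metadata standing in for a table such as `adata.obs`
cell_metadata = pd.DataFrame(
    {"total_counts": [1965, 2006, 1958], "cluster": ["A", "B", "A"]},
    index=pd.Index(["Cell_0", "Cell_1", "Cell_2"], name="cell"),
)

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_file = Path(tmp_dir) / "cell_metadata.csv"
    # Write the table with the cell names as the first column
    cell_metadata.to_csv(csv_file)
    # Read it back, restoring the cell names as the index
    restored = pd.read_csv(csv_file, index_col="cell")

print(restored)
```

A file like this could equally be read from R, for example with read.csv(). Note that plain text formats do not record richer column types (such as categoricals), so these may need to be restored after reading.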

5.3.2. HDF5-based formats

The most common disk formats for single-cell data are based on Hierarchical Data Format version 5 or HDF5. This is an open-source file format designed for storing large, complex and heterogeneous data. It has a file directory type structure (similar to how files and folders are organised on your computer) which allows many different kinds of data to be stored in a single file with an arbitrarily complex hierarchy. While this format is very flexible, to properly interact with it you need to know where and how the different information is stored. For this reason, standard specifications for storing single-cell data in HDF5 files have been developed.
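
To make the directory-like structure more concrete, here is a small sketch using the h5py package (an assumption, as it is not otherwise used in this chapter). The group and dataset names are invented for illustration and do not follow any single-cell specification.

```python
import tempfile
from pathlib import Path

import h5py
import numpy as np

with tempfile.TemporaryDirectory() as tmp_dir:
    hdf5_file = Path(tmp_dir) / "toy.h5"

    # Write two datasets in different groups (intermediate groups are created as needed)
    with h5py.File(hdf5_file, "w") as f:
        f.create_dataset(
            "matrix/values", data=np.arange(6, dtype=np.float32).reshape(2, 3)
        )
        f.create_dataset("cells/names", data=np.array([b"Cell_0", b"Cell_1"]))
        f["matrix"].attrs["description"] = "toy expression values"

    # Walk the hierarchy and collect the path of every group and dataset
    stored_paths = []
    with h5py.File(hdf5_file, "r") as f:
        f.visit(stored_paths.append)
        values = f["matrix/values"][:]

print(stored_paths)
print(values.shape)
```

Without knowing that the values live at matrix/values, a reader would have to explore the file to find them, which is the problem that standard specifications solve.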

5.3.2.1. H5AD

The H5AD format is the HDF5 disk representation of the AnnData object used by scverse packages and is commonly used to share single-cell datasets. As it is part of the scverse ecosystem, reading and writing these files from Python is well-supported and is part of the core functionality of the anndata package (read more about the format here).

To demonstrate interoperability we will use a small, randomly generated dataset that has gone through some of the steps of a standard analysis workflow to populate the various slots.

# Create a randomly generated AnnData object to use as an example
counts = csr_matrix(
    numpy.random.default_rng().poisson(1, size=(100, 2000)), dtype=numpy.float32
)
adata = anndata.AnnData(counts)
adata.obs_names = [f"Cell_{i:d}" for i in range(adata.n_obs)]
adata.var_names = [f"Gene_{i:d}" for i in range(adata.n_vars)]
# Do some standard processing to populate the object
scanpy.pp.calculate_qc_metrics(adata, inplace=True)
adata.layers["counts"] = adata.X.copy()
scanpy.pp.normalize_total(adata, inplace=True)
scanpy.pp.log1p(adata)
scanpy.pp.highly_variable_genes(adata, inplace=True)
scanpy.tl.pca(adata)
scanpy.pp.neighbors(adata)
scanpy.tl.umap(adata)
adata

AnnData object with n_obs × n_vars = 100 × 2000
    obs: 'n_genes_by_counts', 'log1p_n_genes_by_counts', 'total_counts', 'log1p_total_counts', 'pct_counts_in_top_50_genes', 'pct_counts_in_top_100_genes', 'pct_counts_in_top_200_genes', 'pct_counts_in_top_500_genes'
    var: 'n_cells_by_counts', 'mean_counts', 'log1p_mean_counts', 'pct_dropout_by_counts', 'total_counts', 'log1p_total_counts', 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'log1p', 'hvg', 'pca', 'neighbors', 'umap'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'
    layers: 'counts'
    obsp: 'distances', 'connectivities'

We will write this mock object to disk as an H5AD file to demonstrate how those files can be read from R.

temp_dir = tempfile.TemporaryDirectory()
h5ad_file = Path(temp_dir.name) / "example.h5ad"

adata.write_h5ad(h5ad_file)

Several packages exist for reading and writing H5AD files from R. While they result in a file on disk, these packages usually rely on wrapping the Python anndata package to handle the actual reading and writing of files, with an in-memory conversion step to convert between R and Python.

5.3.2.1.1. Reading/writing H5AD with Bioconductor

The Bioconductor {zellkonverter} package helps make this easier by using the {basilisk} package to manage creating an appropriate Python environment. If that all sounds a bit technical, the end result is that Bioconductor users can read and write H5AD files using commands like below without requiring any knowledge of Python.

Unfortunately, because of the way this book is made, we are unable to run the code directly here. Instead, we will show the code and what the output looks like when run in an R session:

sce <- zellkonverter::readH5AD(h5ad_file, verbose = TRUE)

ℹ Using the Python reader
ℹ Using anndata version 0.8.0
✔ Read /.../luke.zappia/Downloads/example.h5ad [113ms]
✔ uns$hvg$flavor converted [17ms]
✔ uns$hvg converted [50ms]
✔ uns$log1p converted [25ms]
✔ uns$neighbors converted [18ms]
✔ uns$pca$params$use_highly_variable converted [16ms]
✔ uns$pca$params$zero_center converted [16ms]
✔ uns$pca$params converted [80ms]
✔ uns$pca$variance converted [17ms]
✔ uns$pca$variance_ratio converted [16ms]
✔ uns$pca converted [184ms]
✔ uns$umap$params$a converted [16ms]
✔ uns$umap$params$b converted [16ms]
✔ uns$umap$params converted [80ms]
✔ uns$umap converted [112ms]
✔ uns converted [490ms]
✔ Converting uns to metadata ... done
✔ X matrix converted to assay [29ms]
✔ layers$counts converted [27ms]
✔ Converting layers to assays ... done
✔ var converted to rowData [25ms]
✔ obs converted to colData [24ms]
✔ varm$PCs converted [18ms]
✔ varm converted [47ms]
✔ Converting varm to rowData$varm ... done
✔ obsm$X_pca converted [15ms]
✔ obsm$X_umap converted [16ms]
✔ obsm converted [80ms]
✔ Converting obsm to reducedDims ... done
ℹ varp is empty and was skipped
✔ obsp$connectivities converted [22ms]
✔ obsp$distances converted [23ms]
✔ obsp converted [92ms]
✔ Converting obsp to colPairs ... done
✔ SingleCellExperiment constructed [164ms]
ℹ Skipping conversion of raw
✔ Converting AnnData to SingleCellExperiment ... done

Because we have turned on the verbose output you can see how {zellkonverter} reads the file using Python and converts each part of the AnnData object to a Bioconductor SingleCellExperiment object. We can see what the result looks like:

sce

class: SingleCellExperiment
dim: 2000 100
metadata(5): hvg log1p neighbors pca umap
assays(2): X counts
rownames(2000): Gene_0 Gene_1 ... Gene_1998 Gene_1999
rowData names(11): n_cells_by_counts mean_counts ... dispersions_norm
  varm
colnames(100): Cell_0 Cell_1 ... Cell_98 Cell_99
colData names(8): n_genes_by_counts log1p_n_genes_by_counts ...
  pct_counts_in_top_200_genes pct_counts_in_top_500_genes
reducedDimNames(2): X_pca X_umap
mainExpName: NULL
altExpNames(0):

This object can then be used as normal by any Bioconductor package. If we want to write a new H5AD file we can use the writeH5AD() function:

zellkonverter_h5ad_file <- tempfile(fileext = ".h5ad")
zellkonverter::writeH5AD(sce, zellkonverter_h5ad_file, verbose = TRUE)

ℹ Using anndata version 0.8.0
ℹ Using the 'X' assay as the X matrix
✔ Selected X matrix [29ms]
✔ assays$X converted to X matrix [50ms]
✔ additional assays converted to layers [30ms]
✔ rowData$varm converted to varm [28ms]
✔ reducedDims converted to obsm [68ms]
✔ metadata converted to uns [24ms]
ℹ rowPairs is empty and was skipped
✔ Converting AnnData to SingleCellExperiment ... done
✔ Wrote '/.../.../rj/.../T/.../file102cfa97cc51.h5ad ' [133ms]

We can then read this file in Python:

scanpy.read_h5ad(zellkonverter_h5ad_file)

AnnData object with n_obs × n_vars = 100 × 2000
    obs: 'n_genes_by_counts', 'log1p_n_genes_by_counts', 'total_counts', 'log1p_total_counts', 'pct_counts_in_top_50_genes', 'pct_counts_in_top_100_genes', 'pct_counts_in_top_200_genes', 'pct_counts_in_top_500_genes'
    var: 'n_cells_by_counts', 'mean_counts', 'log1p_mean_counts', 'pct_dropout_by_counts', 'total_counts', 'log1p_total_counts', 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'X_name', 'hvg', 'log1p', 'neighbors', 'pca', 'umap'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'
    layers: 'counts'
    obsp: 'connectivities', 'distances'

If this is the first time that you run a {zellkonverter} function you will see that it first creates a special conda environment to use (which can take a while). Once that environment exists it will be re-used by subsequent function calls. {zellkonverter} has additional options such as allowing you to selectively read or write parts of an object. Please refer to the package documentation for more details. Similar functionality for writing a SingleCellExperiment object to an H5AD file can be found in the {sceasy} package. While these packages are effective, wrapping Python requires some overhead which should be addressed by native R H5AD writers/readers in the future.

5.3.2.1.2. Reading/writing H5AD with {Seurat}

Converting between a Seurat object and an H5AD file is a two-step process as suggested by this tutorial. First, the H5AD file is converted to an H5Seurat file (a custom HDF5 format for Seurat objects) using the {SeuratDisk} package and then this file is read as a Seurat object.

%%R -i h5ad_file

message("Converting H5AD to H5Seurat...")
SeuratDisk::Convert(h5ad_file, dest = "h5seurat", overwrite = TRUE)
message("Reading H5Seurat...")
# Replace only the literal ".h5ad" extension at the end of the file name
h5seurat_file <- sub("\\.h5ad$", ".h5seurat", h5ad_file)
seurat <- SeuratDisk::LoadH5Seurat(h5seurat_file, assays = "RNA")
message("Read Seurat object:")
seurat

R[write to console]: Converting H5AD to H5Seurat...

R[write to console]: The legacy packages maptools, rgdal, and rgeos, underpinning the sp package,
which was just loaded, will retire in October 2023.
Please refer to R-spatial evolution reports for details, especially
https://r-spatial.org/r/2023/05/15/evolution4.html.
It may be desirable to make the sf package available;
package maintainers should consider adding sf to Suggests:.
The sp package is now running under evolution status 2
     (status 2 uses the sf package in place of rgdal)

    WARNING: The R package "reticulate" only fixed recently
    an issue that caused a segfault when used with rpy2:
    https://github.com/rstudio/reticulate/pull/1188
    Make sure that you use a version of that package that includes
    the fix.
    

R[write to console]: Registered S3 method overwritten by 'SeuratDisk':
  method            from  
  as.sparse.H5Group Seurat

R[write to console]: Warning:
R[write to console]:  Unknown file type: h5ad

R[write to console]: Warning:
R[write to console]:  'assay' not set, setting to 'RNA'

R[write to console]: Creating h5Seurat file for version 3.1.5.9900

R[write to console]: Adding X as data

R[write to console]: Adding X as counts

R[write to console]: Adding meta.features from var

R[write to console]: Adding X_pca as cell embeddings for pca

R[write to console]: Adding X_umap as cell embeddings for umap

R[write to console]: Adding PCs as feature loadings fpr pca

R[write to console]: Adding miscellaneous information for pca

R[write to console]: Adding standard deviations for pca

R[write to console]: Adding miscellaneous information for umap

R[write to console]: Adding hvg to miscellaneous data

R[write to console]: Adding log1p to miscellaneous data

R[write to console]: Adding layer counts as data in assay counts

R[write to console]: Adding layer counts as counts in assay counts

R[write to console]: Reading H5Seurat...

R[write to console]: Validating h5Seurat file

R[write to console]: Warning:
R[write to console]:  Feature names cannot have underscores ('_'), replacing with dashes ('-')

R[write to console]: Initializing RNA with data

R[write to console]: Adding counts for RNA

R[write to console]: Adding feature-level metadata for RNA

R[write to console]: Adding reduction pca

R[write to console]: Adding cell embeddings for pca

R[write to console]: Adding feature loadings for pca

R[write to console]: Adding miscellaneous information for pca

R[write to console]: Adding reduction umap

R[write to console]: Adding cell embeddings for umap

R[write to console]: Adding miscellaneous information for umap

R[write to console]: Adding command information

R[write to console]: Adding cell-level metadata

R[write to console]: Read Seurat object:

An object of class Seurat 
2000 features across 100 samples within 1 assay 
Active assay: RNA (2000 features, 0 variable features)
 2 dimensional reductions calculated: pca, umap

Note that because the structure of a Seurat object is quite different from AnnData and SingleCellExperiment objects the conversion process is more complex. See the documentation of the conversion function for more details on how this is done.
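
One visible consequence of the conversion is the warning above that underscores in feature names are replaced with dashes. The following sketch previews that renaming from Python and checks whether it would make two names identical. It only mimics the replacement described in the warning and is not the code used by any of the conversion packages; the function name is hypothetical.

```python
from collections import defaultdict


def preview_seurat_names(feature_names):
    """Map each name to its underscore-free form and report any collisions."""
    renamed = {name: name.replace("_", "-") for name in feature_names}
    # Group the original names by the name they would be given
    sources = defaultdict(list)
    for original, new in renamed.items():
        sources[new].append(original)
    collisions = {new: olds for new, olds in sources.items() if len(olds) > 1}
    return renamed, collisions


renamed, collisions = preview_seurat_names(["Gene_0", "Gene_1", "Gene-1"])
print(renamed["Gene_0"])  # Gene-0
print(collisions)  # {'Gene-1': ['Gene_1', 'Gene-1']}
```

Keeping the returned mapping makes it possible to match results back to the original feature names after returning to Python.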

The {sceasy} package also provides a function for reading H5AD files as Seurat or SingleCellExperiment objects in a single step. {sceasy} also wraps Python functions but unlike {zellkonverter} it doesn’t use a special Python environment. This means you need to be responsible for setting up the environment, making sure that R can find it and that the correct packages are installed (again, this code is not run here).

sceasy_seurat <- sceasy::convertFormat(h5ad_file, from="anndata", to="seurat")
sceasy_seurat

Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
X -> counts
An object of class Seurat
2000 features across 100 samples within 1 assay
Active assay: RNA (2000 features, 0 variable features)
 2 dimensional reductions calculated: pca, umap

5.3.2.1.3. Reading/writing H5AD with {anndata}

The R {anndata} package can also be used to read H5AD files. However, unlike the packages above it does not convert to a native R object. Instead it provides an R interface to the Python object. This is useful for accessing the data but few analysis packages will accept this as input so further in-memory conversion is usually required.

5.3.2.2. Loom

The Loom file format is an older HDF5 specification for omics data. Unlike H5AD, it is not linked to a specific analysis ecosystem, although the structure is similar to AnnData and SingleCellExperiment objects. Packages implementing the Loom format exist for both R and Python, and there is also a Bioconductor package for writing Loom files. However, it is often more convenient to use the higher-level interfaces provided by the core ecosystem packages. Apart from shared datasets, Loom files are also commonly encountered when spliced/unspliced reads are quantified using velocyto for RNA velocity analysis.

5.3.3. RDS files

Another file format you may see used to share single-cell datasets is the RDS format. This is a binary format used to serialise arbitrary R objects (similar to Python Pickle files). As SingleCellExperiment and Seurat objects did not always have matching on-disk representations, RDS files are sometimes used to share the results from R analyses. While this is acceptable within an analysis project, we discourage its use for sharing data publicly or with collaborators due to the lack of interoperability with other ecosystems. Instead, we recommend using one of the HDF5 formats mentioned above that can be read from multiple languages.

5.3.4. New on-disk formats

While HDF5-based formats are currently the standard for on-disk representations of single-cell data, newer technologies such as Zarr and TileDB have some advantages, particularly for very large datasets and other modalities. We expect specifications to be developed for these formats in the future, which may be adopted by the community (anndata already provides support for Zarr files).

5.4. In-memory interoperability

The second approach to interoperability is to work on in-memory representations of an object. This approach involves active sessions from two programming languages running at the same time and either accessing the same object from both or converting between them as needed. Usually, one language acts as the main environment and there is an interface to the other language. This can be very useful for interactive analysis as it allows an analyst to work in two languages simultaneously. It is also often used when creating documents that use multiple languages (such as this book). However, in-memory interoperability has some drawbacks as it requires the analyst to be familiar with setting up and using both environments, more complex objects are often not supported by both languages and there is a greater memory overhead as data can easily become duplicated (making it difficult to use for larger datasets).
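
The memory overhead can be illustrated without leaving Python. In this sketch, converting a sparse matrix to a different storage layout stands in for a conversion between languages; this is an assumption made for illustration, and a real conversion may copy more or less than this.

```python
import numpy as np
from scipy.sparse import csr_matrix


def sparse_nbytes(matrix):
    """Bytes used by the three arrays backing a CSR/CSC matrix."""
    return matrix.data.nbytes + matrix.indices.nbytes + matrix.indptr.nbytes


rng = np.random.default_rng(0)
# Cells x genes counts stored in row-compressed form
counts = csr_matrix(rng.poisson(1, size=(100, 2000)).astype(np.float32))
# Converting to column-compressed form builds new arrays rather than reusing the old ones
converted = counts.tocsc()

print(np.shares_memory(counts.data, converted.data))
print(sparse_nbytes(counts), sparse_nbytes(converted))
```

While both objects are alive, roughly twice the memory is needed to hold the same values, which quickly becomes limiting for larger datasets.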

5.4.1. Interoperability between R ecosystems

Before we look at in-memory interoperability between R and Python first let’s consider the simpler case of converting between the two R ecosystems. The {Seurat} package provides functions for performing this conversion as described in this vignette.

%%R
sce_from_seurat <- Seurat::as.SingleCellExperiment(seurat)
sce_from_seurat

class: SingleCellExperiment 
dim: 2000 100 
metadata(0):
assays(2): X logcounts
rownames(2000): Gene-0 Gene-1 ... Gene-1998 Gene-1999
rowData names(0):
colnames(100): Cell_0 Cell_1 ... Cell_98 Cell_99
colData names(9): n_genes_by_counts log1p_n_genes_by_counts ...
  pct_counts_in_top_500_genes ident
reducedDimNames(2): PCA UMAP
mainExpName: NULL
altExpNames(0):

%%R
seurat_from_sce <- Seurat::as.Seurat(sce_from_seurat)
seurat_from_sce

An object of class Seurat 
2000 features across 100 samples within 1 assay 
Active assay: RNA (2000 features, 0 variable features)
 2 dimensional reductions calculated: PCA, UMAP

The difficult part here is due to the differences between the structures of the two objects. It is important to make sure the arguments are set correctly so that the conversion functions know which information to convert and where to place it.

In many cases it may not be necessary to convert a Seurat object to a SingleCellExperiment. This is because many of the core Bioconductor packages for single-cell analysis have been designed to also accept a matrix as input.

%%R
# Calculate Counts Per Million using the Bioconductor scuttle package
# with a matrix in a Seurat object
cpm <- scuttle::calculateCPM(Seurat::GetAssayData(seurat, slot = "counts"))
cpm[1:10, 1:10]

10 x 10 sparse Matrix of class "dgCMatrix"
                                                                       
 [1,] 602.1263 622.1264  600.5326 1416.439   .         .       965.8600
 [2,] 602.1263   .         .       613.435 618.2562  943.8910  609.1107
 [3,] 602.1263 982.8946 1202.6879  969.506 618.2562  943.8910 1219.1005
 [4,]   .        .       600.5326  613.435 976.3175  594.3451    .     
 [5,] 602.1263 622.1264    .      1221.384 618.2562  594.3451  609.1107
 [6,] 954.8394 982.8946 1202.6879  613.435 618.2562  943.8910 1219.1005
 [7,] 954.8394 622.1264  952.6379  613.435 618.2562    .       965.8600
 [8,]   .      982.8946    .         .     618.2562  594.3451    .     
 [9,] 954.8394   .       600.5326  613.435   .      1192.4227 1219.1005
[10,] 954.8394 622.1264  952.6379  613.435 618.2562  943.8910  609.1107
                                  
 [1,]  958.4081 599.5090  610.1969
 [2,]  958.4081 952.2526  610.1969
 [3,]    .        .       965.9120
 [4,]    .        .         .     
 [5,]  605.0013 952.2526    .     
 [6,]    .        .         .     
 [7,]  605.0013 599.5090  965.9120
 [8,] 1209.0169   .       965.9120
 [9,]    .      599.5090 1413.3168
[10,] 1209.0169   .       610.1969

However, it is important to be sure you are accessing the right information and storing any results in the correct place if needed.

5.4.2. Accessing R from Python

The Python interface to R is provided by the rpy2 package. This allows you to access R functions and objects from Python. For example:

counts_mat = adata.layers["counts"].T
rpy2.robjects.globalenv["counts_mat"] = counts_mat
cpm = rpy2.robjects.r("scuttle::calculateCPM(counts_mat)")
cpm

<2000x100 sparse matrix of type '<class 'numpy.float64'>'
	with 126146 stored elements in Compressed Sparse Column format>

Common Python objects (lists, matrices, DataFrames etc.) can also be passed to R.

If you are using a Jupyter notebook (as we are for this book) you can use the IPython magic interface to create cells with native R code (passing objects as required). For example, starting a cell with %%R -i input -o output says to take input as input, run R code and then return output as output.

%%R -i counts_mat -o magic_cpm
# R code running using IPython magic
magic_cpm <- scuttle::calculateCPM(counts_mat)

# Python code accessing the results
magic_cpm

<2000x100 sparse matrix of type '<class 'numpy.float64'>'
	with 126146 stored elements in Compressed Sparse Column format>

This is the approach you will most commonly see in later chapters. For more information about using rpy2 please refer to the documentation.

To work with single-cell data in this way the anndata2ri package is especially useful. This is an extension to rpy2 which allows R to see AnnData objects as SingleCellExperiment objects. This avoids unnecessary conversion and makes it easy to run R code on a Python object. It also enables the conversion of sparse scipy matrices like we saw above.

In this example, we pass an AnnData object in the Python session to R which views it as a SingleCellExperiment that can be used by R functions.

%%R -i adata
qc <- scuttle::perCellQCMetrics(adata)
head(qc)

/Users/luke.zappia/miniconda3/envs/interoperability2/lib/python3.9/functools.py:888: NotConvertedWarning: Conversion 'py2rpy' not defined for objects of type '<class 'NoneType'>'
  return dispatch(args[0].__class__)(*args, **kw)

        sum detected total
Cell_0 2005     1274  2005
Cell_1 1941     1233  1941
Cell_2 2011     1270  2011
Cell_3 1947     1268  1947
Cell_4 1933     1265  1933
Cell_5 2031     1289  2031

Note that you will still run into issues if an object (or part of it) cannot be interfaced correctly (for example if there is an unsupported data type). In that case, you may need to modify your object first before it can be accessed.
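
The warning above is caused by a None value, which has no defined conversion to R. A sketch for locating such values in a nested dictionary is shown below. The dictionary is a hypothetical stand-in for unstructured metadata such as the uns slot, and the helper function is not part of rpy2 or anndata2ri.

```python
def find_none_entries(obj, path=""):
    """Return the paths of all None values in a nested dict."""
    found = []
    if obj is None:
        found.append(path)
    elif isinstance(obj, dict):
        for key, value in obj.items():
            found.extend(find_none_entries(value, f"{path}/{key}"))
    return found


# Hypothetical stand-in for the unstructured metadata in `adata.uns`
uns = {"log1p": {"base": None}, "pca": {"params": {"zero_center": True}}}
print(find_none_entries(uns))  # ['/log1p/base']
```

Entries found this way could be removed or replaced with a supported value before the object is passed to R.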

5.4.3. Accessing Python from R

Accessing Python from an R session is similar to accessing R from Python but here the interface is provided by the {reticulate} package. Once it is loaded we can access Python functions and objects from R.

%%R
reticulate_list <- reticulate::r_to_py(LETTERS)
print(reticulate_list)
py_builtins <- reticulate::import_builtins()
py_builtins$zip(letters, LETTERS)

List (26 items)
<zip object at 0x18a243040>

If you are working in an RMarkdown or Quarto document you can also write native Python chunks using the {reticulate} Python engine. When we do this we can use the magic r and py variables to access objects in the other language (the following code is an example that is not run).

```{r}
# An R chunk that accesses a Python object
print(py$py_object)
```

```{python}
# A Python chunk that accesses an R object
print(r.r_object)
```

Unlike anndata2ri, there are no R packages that provide a direct interface for Python to view SingleCellExperiment or Seurat objects as AnnData objects. However, we can still access most parts of an AnnData using {reticulate} (this code is not run).

# Print an AnnData object in a Python environment
py$adata

AnnData object with n_obs × n_vars = 100 × 2000
    obs: 'n_genes_by_counts', 'log1p_n_genes_by_counts', 'total_counts', 'log1p_total_counts', 'pct_counts_in_top_50_genes', 'pct_counts_in_top_100_genes', 'pct_counts_in_top_200_genes', 'pct_counts_in_top_500_genes'
    var: 'n_cells_by_counts', 'mean_counts', 'log1p_mean_counts', 'pct_dropout_by_counts', 'total_counts', 'log1p_total_counts', 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'hvg', 'log1p', 'neighbors', 'pca', 'umap'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'
    layers: 'counts'
    obsp: 'connectivities', 'distances'

# Alternatively use the Python anndata package to read an H5AD file
anndata <- reticulate::import("anndata")
adata <- anndata$read_h5ad(h5ad_file)
adata

AnnData object with n_obs × n_vars = 100 × 2000
    obs: 'n_genes_by_counts', 'log1p_n_genes_by_counts', 'total_counts', 'log1p_total_counts', 'pct_counts_in_top_50_genes', 'pct_counts_in_top_100_genes', 'pct_counts_in_top_200_genes', 'pct_counts_in_top_500_genes'
    var: 'n_cells_by_counts', 'mean_counts', 'log1p_mean_counts', 'pct_dropout_by_counts', 'total_counts', 'log1p_total_counts', 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'hvg', 'log1p', 'neighbors', 'pca', 'umap'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'
    layers: 'counts'
    obsp: 'connectivities', 'distances'

# Access the obs slot, pandas DataFrames are automatically converted to R data.frames
head(adata$obs)

       n_genes_by_counts log1p_n_genes_by_counts total_counts
Cell_0              1246                7.128496         1965
Cell_1              1262                7.141245         2006
Cell_2              1262                7.141245         1958
Cell_3              1240                7.123673         1960
Cell_4              1296                7.167809         2027
Cell_5              1231                7.116394         1898
       log1p_total_counts pct_counts_in_top_50_genes
Cell_0           7.583756                  10.025445
Cell_1           7.604396                   9.521436
Cell_2           7.580189                   9.959142
Cell_3           7.581210                   9.183673
Cell_4           7.614805                   9.718796
Cell_5           7.549083                  10.168599
       pct_counts_in_top_100_genes pct_counts_in_top_200_genes
Cell_0                    17.65903                    30.89059
Cell_1                    16.99900                    29.71087
Cell_2                    17.62002                    30.28601
Cell_3                    16.83673                    30.45918
Cell_4                    17.11889                    30.04440
Cell_5                    18.07165                    30.29505
       pct_counts_in_top_500_genes
Cell_0                    61.42494
Cell_1                    59.62114
Cell_2                    60.92952
Cell_3                    61.07143
Cell_4                    59.64480
Cell_5                    61.48577

As mentioned above the R {anndata} package provides an R interface for AnnData objects but it is not currently used by many analysis packages.

For more complex analysis that requires a whole object to work on, it may be necessary to completely convert an object from R to Python (or the reverse). This is not memory efficient as it creates a duplicate of the data but it does provide access to a greater range of packages. The {zellkonverter} package provides a function for doing this conversion. Note that, unlike the function for reading H5AD files, this uses the normal Python environment rather than a specially created one (code not run).

# Convert an AnnData to a SingleCellExperiment
sce <- zellkonverter::AnnData2SCE(adata, verbose = TRUE)
sce

✔ uns$hvg$flavor converted [21ms]
✔ uns$hvg converted [62ms]
✔ uns$log1p converted [22ms]
✔ uns$neighbors converted [21ms]
✔ uns$pca$params$use_highly_variable converted [22ms]
✔ uns$pca$params$zero_center converted [31ms]
✔ uns$pca$params converted [118ms]
✔ uns$pca$variance converted [17ms]
✔ uns$pca$variance_ratio converted [17ms]
✔ uns$pca converted [224ms]
✔ uns$umap$params$a converted [15ms]
✔ uns$umap$params$b converted [17ms]
✔ uns$umap$params converted [80ms]
✔ uns$umap converted [115ms]
✔ uns converted [582ms]
✔ Converting uns to metadata ... done
✔ X matrix converted to assay [44ms]
✔ layers$counts converted [29ms]
✔ Converting layers to assays ... done
✔ var converted to rowData [37ms]
✔ obs converted to colData [23ms]
✔ varm$PCs converted [18ms]
✔ varm converted [49ms]
✔ Converting varm to rowData$varm ... done
✔ obsm$X_pca converted [17ms]
✔ obsm$X_umap converted [17ms]
✔ obsm converted [80ms]
✔ Converting obsm to reducedDims ... done
ℹ varp is empty and was skipped
✔ obsp$connectivities converted [21ms]
✔ obsp$distances converted [22ms]
✔ obsp converted [89ms]
✔ Converting obsp to colPairs ... done
✔ SingleCellExperiment constructed [241ms]
ℹ Skipping conversion of raw
✔ Converting AnnData to SingleCellExperiment ... done
class: SingleCellExperiment
dim: 2000 100
metadata(5): hvg log1p neighbors pca umap
assays(2): X counts
rownames(2000): Gene_0 Gene_1 ... Gene_1998 Gene_1999
rowData names(11): n_cells_by_counts mean_counts ... dispersions_norm
  varm
colnames(100): Cell_0 Cell_1 ... Cell_98 Cell_99
colData names(8): n_genes_by_counts log1p_n_genes_by_counts ...
  pct_counts_in_top_200_genes pct_counts_in_top_500_genes
reducedDimNames(2): X_pca X_umap
mainExpName: NULL
altExpNames(0):

The same can also be done in reverse:

adata2 <- zellkonverter::SCE2AnnData(sce, verbose = TRUE)
adata2

ℹ Using the 'X' assay as the X matrix
✔ Selected X matrix [27ms]
✔ assays$X converted to X matrix [38ms]
✔ additional assays converted to layers [31ms]
✔ rowData$varm converted to varm [15ms]
✔ reducedDims converted to obsm [63ms]
✔ metadata converted to uns [23ms]
ℹ rowPairs is empty and was skipped
✔ Converting AnnData to SingleCellExperiment ... done
AnnData object with n_obs × n_vars = 100 × 2000
    obs: 'n_genes_by_counts', 'log1p_n_genes_by_counts', 'total_counts', 'log1p_total_counts', 'pct_counts_in_top_50_genes', 'pct_counts_in_top_100_genes', 'pct_counts_in_top_200_genes', 'pct_counts_in_top_500_genes'
    var: 'n_cells_by_counts', 'mean_counts', 'log1p_mean_counts', 'pct_dropout_by_counts', 'total_counts', 'log1p_total_counts', 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'X_name', 'hvg', 'log1p', 'neighbors', 'pca', 'umap'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'
    layers: 'counts'
    obsp: 'connectivities', 'distances'
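
The verbose logs above spell out which AnnData slot ends up in which SingleCellExperiment slot. As a reading aid, that correspondence can be written down as a small lookup table. The sketch below is plain Python that only restates the log messages; the names `ANNDATA_TO_SCE` and `sce_slot_for` are hypothetical and are not part of {zellkonverter} or anndata. The `varp` and `raw` slots are left out because the log shows them being skipped for this object.

```python
# Slot correspondence as reported by the verbose conversion log above.
# Illustrative only: this table is not an API of zellkonverter or anndata.
ANNDATA_TO_SCE = {
    "X": "assay",
    "layers": "assays",
    "var": "rowData",
    "obs": "colData",
    "varm": "rowData$varm",
    "obsm": "reducedDims",
    "obsp": "colPairs",
    "uns": "metadata",
}


def sce_slot_for(anndata_slot: str) -> str:
    """Return the SingleCellExperiment slot that the log reports for an AnnData slot."""
    try:
        return ANNDATA_TO_SCE[anndata_slot]
    except KeyError:
        raise KeyError(f"No mapping recorded for AnnData slot {anndata_slot!r}") from None


for slot in ("obs", "obsm", "obsp"):
    print(f"{slot} -> {sce_slot_for(slot)}")
```

Keeping such a table at hand can make it easier to find where a result has moved to after a conversion, for example that an embedding stored in `obsm` should be looked for in `reducedDims`.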

5.5. Interoperability for multimodal data#

The complexity of multimodal data presents additional challenges for interoperability. Both the SingleCellExperiment (through “alternative experiments”, which must match in the column dimension (cells)) and Seurat (using “assays”) objects can store multiple modalities but the AnnData object is restricted to unimodal data.

To address this limitation, the MuData object (introduced in the analysis frameworks and tools chapter) was developed as an extension of AnnData for multimodal datasets. The developers have considered interoperability in their design. While the main platform for MuData is Python, the authors have provided the MuDataSeurat R package for reading the on-disk H5MU format as Seurat objects and the MuData R package for doing the same with Bioconductor MultiAssayExperiment objects. This official support is very useful, but there are still some inconsistencies due to differences between the objects. The MuData authors also provide a Julia implementation of AnnData and MuData.

Below is an example of reading and writing a small example MuData dataset using the Python and R packages.

5.5.1. Python#

# Read file
mdata = mudata.read_h5mu("../../datasets/original.h5mu")
print(mdata)

# Write new file
python_h5mu_file = Path(temp_dir.name) / "python.h5mu"
mdata.write_h5mu(python_h5mu_file)

MuData object with n_obs × n_vars = 411 × 56
  obs:	'louvain', 'leiden', 'leiden_wnn', 'celltype'
  var:	'gene_ids', 'feature_types', 'highly_variable'
  obsm:	'X_mofa', 'X_mofa_umap', 'X_umap', 'X_wnn_umap'
  varm:	'LFs'
  obsp:	'connectivities', 'distances', 'wnn_connectivities', 'wnn_distances'
  2 modalities
    prot:	411 x 29
      var:	'gene_ids', 'feature_types', 'highly_variable'
      uns:	'neighbors', 'pca', 'umap'
      obsm:	'X_pca', 'X_umap'
      varm:	'PCs'
      layers:	'counts'
      obsp:	'connectivities', 'distances'
    rna:	411 x 27
      obs:	'n_genes_by_counts', 'total_counts', 'total_counts_mt', 'pct_counts_mt', 'leiden', 'celltype'
      var:	'gene_ids', 'feature_types', 'mt', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'highly_variable', 'means', 'dispersions', 'dispersions_norm', 'mean', 'std'
      uns:	'celltype_colors', 'hvg', 'leiden', 'leiden_colors', 'neighbors', 'pca', 'rank_genes_groups', 'umap'
      obsm:	'X_pca', 'X_umap'
      varm:	'PCs'
      obsp:	'connectivities', 'distances'

/Users/luke.zappia/miniconda3/envs/interoperability2/lib/python3.9/site-packages/anndata/_core/anndata.py:1230: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df[key] = c

5.5.2. R#

5.5.2.1. Bioconductor#

Read from and write to a MultiAssayExperiment object.

%%R
mae <- MuData::readH5MU("../../datasets/original.h5mu")
print(mae)

bioc_h5mu_file <- tempfile(fileext = ".h5mu")
MuData::writeH5MU(mae, bioc_h5mu_file)

A MultiAssayExperiment object of 2 listed
 experiments with user-defined names and respective classes.
 Containing an ExperimentList class object of length 2:
 [1] prot: SingleCellExperiment with 29 rows and 411 columns
 [2] rna: SingleCellExperiment with 27 rows and 411 columns
Functionality:
 experiments() - obtain the ExperimentList instance
 colData() - the primary/phenotype DataFrame
 sampleMap() - the sample coordination DataFrame
 `$`, `[`, `[[` - extract colData columns, subset, or experiment
 *Format() - convert into a long or wide DataFrame
 assays() - convert ExperimentList to a SimpleList of matrices
 exportClass() - save data to flat files
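
Note that the two printouts report dimensions in opposite orders: MuData (like AnnData) lists cells × features (`prot: 411 x 29`), while the Bioconductor objects list features × cells (`29 rows and 411 columns`). The following is a minimal sketch of a sanity check that accounts for this when comparing objects across languages. The shapes are copied by hand from the outputs above, and the helper name `same_data_shape` is hypothetical.

```python
# Shapes copied from the printed objects above.
# MuData/AnnData report (n_obs, n_vars), i.e. cells x features.
mudata_shapes = {"prot": (411, 29), "rna": (411, 27)}
# MultiAssayExperiment/SingleCellExperiment report (rows, columns),
# i.e. features x cells.
mae_shapes = {"prot": (29, 411), "rna": (27, 411)}


def same_data_shape(anndata_shape, sce_shape):
    """True if a cells x features shape matches a features x cells shape."""
    n_obs, n_vars = anndata_shape
    n_rows, n_cols = sce_shape
    return (n_obs, n_vars) == (n_cols, n_rows)


for modality, shape in mudata_shapes.items():
    assert same_data_shape(shape, mae_shapes[modality]), modality
print("All modalities agree after accounting for orientation")
```

A check like this only confirms that the dimensions line up; it says nothing about whether the values or annotations were transferred faithfully.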

5.5.2.2. Seurat#

Read from and write to a Seurat object.

%%R
seurat <- MuDataSeurat::ReadH5MU("../../datasets/original.h5mu")
print(seurat)

seurat_h5mu_file <- tempfile(fileext = ".h5mu")
MuDataSeurat::WriteH5MU(seurat, seurat_h5mu_file)

R[write to console]: The legacy packages maptools, rgdal, and rgeos, underpinning the sp package,
which was just loaded, will retire in October 2023.
Please refer to R-spatial evolution reports for details, especially
https://r-spatial.org/r/2023/05/15/evolution4.html.
It may be desirable to make the sf package available;
package maintainers should consider adding sf to Suggests:.
The sp package is now running under evolution status 2
     (status 2 uses the sf package in place of rgdal)

    WARNING: The R package "reticulate" only fixed recently
    an issue that caused a segfault when used with rpy2:
    https://github.com/rstudio/reticulate/pull/1188
    Make sure that you use a version of that package that includes
    the fix.
    

R[write to console]: Warning:
R[write to console]:  Keys should be one or more alphanumeric characters followed by an underscore, setting key from prot to prot_

R[write to console]: Warning:
R[write to console]:  No columnames present in cell embeddings, setting to 'MOFA_1:30'

R[write to console]: Warning:
R[write to console]:  Keys should be one or more alphanumeric characters followed by an underscore, setting key from MOFA_UMAP_ to MOFAUMAP_

R[write to console]: Warning:
R[write to console]:  All keys should be one or more alphanumeric characters followed by an underscore '_', setting key to MOFAUMAP_

R[write to console]: Warning:
R[write to console]:  No columnames present in cell embeddings, setting to 'MOFAUMAP_1:2'

R[write to console]: Warning:
R[write to console]:  No columnames present in cell embeddings, setting to 'UMAP_1:2'

R[write to console]: Warning:
R[write to console]:  Keys should be one or more alphanumeric characters followed by an underscore, setting key from WNN_UMAP_ to WNNUMAP_

R[write to console]: Warning:
R[write to console]:  All keys should be one or more alphanumeric characters followed by an underscore '_', setting key to WNNUMAP_

R[write to console]: Warning:
R[write to console]:  No columnames present in cell embeddings, setting to 'WNNUMAP_1:2'

R[write to console]: Warning:
R[write to console]:  No columnames present in cell embeddings, setting to 'protPCA_1:31'

R[write to console]: Warning:
R[write to console]:  No columnames present in cell embeddings, setting to 'protUMAP_1:2'

R[write to console]: Warning:
R[write to console]:  No columnames present in cell embeddings, setting to 'rnaPCA_1:50'

R[write to console]: Warning:
R[write to console]:  No columnames present in cell embeddings, setting to 'rnaUMAP_1:2'

An object of class Seurat 
56 features across 411 samples within 2 assays 
Active assay: prot (29 features, 29 variable features)
 1 other assay present: rna
 8 dimensional reductions calculated: MOFA, MOFA_UMAP, UMAP, WNN_UMAP, protPCA, protUMAP, rnaPCA, rnaUMAP

R[write to console]: Added .var['highly_variable'] with highly variable features

R[write to console]: Added .var['highly_variable'] with highly variable features
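
Several of the warnings above show Seurat renaming keys so that they consist of alphanumeric characters followed by a single underscore. This is one concrete source of the inconsistencies between objects mentioned earlier. The sketch below imitates that rule in Python to make the renames easier to anticipate. It is an assumption based only on the warning text, not Seurat's actual implementation, and `sanitize_key` is a hypothetical name.

```python
import re


def sanitize_key(key: str) -> str:
    """Drop non-alphanumeric characters and end the key with one underscore.

    Hypothetical re-creation of the rule described in the warnings above;
    it is not taken from the Seurat source code.
    """
    return re.sub(r"[^A-Za-z0-9]", "", key) + "_"


# Keys that the warnings above report as being changed
for original in ("prot", "MOFA_UMAP_", "WNN_UMAP_"):
    print(f"{original} -> {sanitize_key(original)}")
```

When the same dataset is later read back in Python, names derived from these keys may therefore differ from the originals and may need to be mapped back by hand.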

5.6. Interoperability with other languages#

Here we briefly list some resources and tools for the interoperability of single-cell data with languages other than R and Python.

5.6.1. Julia#

Muon.jl provides Julia implementations of AnnData and MuData objects, as well as IO for the H5AD and H5MU formats
scVI.jl provides a Julia implementation of AnnData as well as IO for the H5AD format

5.6.2. JavaScript#

Vitessce contains loaders for AnnData objects stored using the Zarr format
The kana family supports reading H5AD files and SingleCellExperiment objects saved as RDS files

5.6.3. Rust#

anndata-rs provides a Rust implementation of AnnData as well as advanced IO support for the H5AD format

5.7. Session information#

5.8. Python#

import session_info

session_info.show()

-----
anndata             0.9.2
anndata2ri          1.2
mudata              0.2.3
numpy               1.24.4
rpy2                3.5.11
scanpy              1.9.3
scipy               1.9.3
session_info        1.0.0
-----

Modules imported as dependencies

CoreFoundation              NA
Foundation                  NA
PIL                         10.0.0
PyObjCTools                 NA
anyio                       NA
appnope                     0.1.3
argcomplete                 NA
arrow                       1.2.3
asttokens                   NA
attr                        23.1.0
attrs                       23.1.0
babel                       2.12.1
backcall                    0.2.0
beta_ufunc                  NA
binom_ufunc                 NA
brotli                      1.0.9
certifi                     2023.07.22
cffi                        1.15.1
charset_normalizer          3.2.0
colorama                    0.4.6
comm                        0.1.4
cycler                      0.10.0
cython_runtime              NA
dateutil                    2.8.2
debugpy                     1.6.8
decorator                   5.1.1
defusedxml                  0.7.1
executing                   1.2.0
fastjsonschema              NA
fqdn                        NA
h5py                        3.9.0
hypergeom_ufunc             NA
idna                        3.4
importlib_metadata          NA
importlib_resources         NA
ipykernel                   6.25.0
ipython_genutils            0.2.0
ipywidgets                  8.1.0
isoduration                 NA
jedi                        0.19.0
jinja2                      3.1.2
joblib                      1.3.0
json5                       NA
jsonpointer                 2.0
jsonschema                  4.18.6
jsonschema_specifications   NA
jupyter_events              0.7.0
jupyter_server              2.7.0
jupyterlab_server           2.24.0
kiwisolver                  1.4.4
llvmlite                    0.40.1
markupsafe                  2.1.3
matplotlib                  3.7.2
mpl_toolkits                NA
natsort                     8.4.0
nbformat                    5.9.2
nbinom_ufunc                NA
ncf_ufunc                   NA
numba                       0.57.1
objc                        9.2
overrides                   NA
packaging                   23.1
pandas                      2.0.3
parso                       0.8.3
pexpect                     4.8.0
pickleshare                 0.7.5
pkg_resources               NA
platformdirs                3.10.0
prometheus_client           NA
prompt_toolkit              3.0.39
psutil                      5.9.5
ptyprocess                  0.7.0
pure_eval                   0.2.2
pydev_ipython               NA
pydevconsole                NA
pydevd                      2.9.5
pydevd_file_utils           NA
pydevd_plugins              NA
pydevd_tracing              NA
pygments                    2.15.1
pyparsing                   3.0.9
pythonjsonlogger            NA
pytz                        2023.3
referencing                 NA
requests                    2.31.0
rfc3339_validator           0.1.4
rfc3986_validator           0.1.1
rpds                        NA
send2trash                  NA
six                         1.16.0
sklearn                     1.3.0
sniffio                     1.3.0
socks                       1.7.1
stack_data                  0.6.2
threadpoolctl               3.2.0
tornado                     6.3.2
traitlets                   5.9.0
typing_extensions           NA
tzlocal                     NA
uri_template                NA
urllib3                     2.0.4
wcwidth                     0.2.6
webcolors                   1.13
websocket                   1.6.1
yaml                        6.0
zipp                        NA
zmq                         25.1.0
zoneinfo                    NA

-----
IPython             8.14.0
jupyter_client      8.3.0
jupyter_core        5.3.1
jupyterlab          3.6.3
notebook            6.5.4
-----
Python 3.9.16 | packaged by conda-forge | (main, Feb  1 2023, 21:42:20) [Clang 14.0.6 ]
macOS-13.4.1-x86_64-i386-64bit
-----
Session information updated at 2023-11-15 15:48

5.9. R#

%%R
sessioninfo::session_info()

─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.2.3 (2023-03-15)
 os       macOS Ventura 13.4.1
 system   x86_64, darwin13.4.0
 ui       unknown
 language (EN)
 collate  C
 ctype    UTF-8
 tz       Europe/Berlin
 date     2023-11-15
 pandoc   2.19.2 @ /Users/luke.zappia/miniconda3/envs/interoperability2/bin/pandoc

─ Packages ───────────────────────────────────────────────────────────────────
 package              * version    date (UTC) lib source
 abind                  1.4-5      2016-07-21 [1] CRAN (R 4.2.3)
 Biobase                2.58.0     2022-11-01 [1] Bioconductor
 BiocGenerics           0.44.0     2022-11-01 [1] Bioconductor
 bit                    4.0.5      2022-11-15 [1] CRAN (R 4.2.3)
 bit64                  4.0.5      2020-08-30 [1] CRAN (R 4.2.3)
 bitops                 1.0-7      2021-04-24 [1] CRAN (R 4.2.3)
 cli                    3.6.1      2023-03-23 [1] CRAN (R 4.2.3)
 cluster                2.1.4      2022-08-22 [1] CRAN (R 4.2.3)
 codetools              0.2-19     2023-02-01 [1] CRAN (R 4.2.3)
 colorspace             2.1-0      2023-01-23 [1] CRAN (R 4.2.3)
 cowplot                1.1.1      2020-12-30 [1] CRAN (R 4.2.3)
 data.table             1.14.8     2023-02-17 [1] CRAN (R 4.2.3)
 DelayedArray           0.24.0     2022-11-01 [1] Bioconductor
 deldir                 1.0-9      2023-05-17 [1] CRAN (R 4.2.3)
 digest                 0.6.33     2023-07-07 [1] CRAN (R 4.2.3)
 dplyr                  1.1.2      2023-04-20 [1] CRAN (R 4.2.3)
 ellipsis               0.3.2      2021-04-29 [1] CRAN (R 4.2.3)
 fansi                  1.0.4      2023-01-22 [1] CRAN (R 4.2.3)
 fastmap                1.1.1      2023-02-24 [1] CRAN (R 4.2.3)
 fitdistrplus           1.1-11     2023-04-25 [1] CRAN (R 4.2.3)
 future                 1.33.0     2023-07-01 [1] CRAN (R 4.2.3)
 future.apply           1.11.0     2023-05-21 [1] CRAN (R 4.2.3)
 generics               0.1.3      2022-07-05 [1] CRAN (R 4.2.3)
 GenomeInfoDb           1.34.9     2023-02-02 [1] Bioconductor
 GenomeInfoDbData       1.2.9      2023-11-10 [1] Bioconductor
 GenomicRanges          1.50.0     2022-11-01 [1] Bioconductor
 ggplot2                3.4.2      2023-04-03 [1] CRAN (R 4.2.3)
 ggrepel                0.9.3      2023-02-03 [1] CRAN (R 4.2.3)
 ggridges               0.5.4      2022-09-26 [1] CRAN (R 4.2.3)
 globals                0.16.2     2022-11-21 [1] CRAN (R 4.2.3)
 glue                   1.6.2      2022-02-24 [1] CRAN (R 4.2.3)
 goftest                1.2-3      2021-10-07 [1] CRAN (R 4.2.3)
 gridExtra              2.3        2017-09-09 [1] CRAN (R 4.2.3)
 gtable                 0.3.3      2023-03-21 [1] CRAN (R 4.2.3)
 hdf5r                  1.3.8      2023-01-21 [1] CRAN (R 4.2.3)
 htmltools              0.5.5      2023-03-23 [1] CRAN (R 4.2.3)
 htmlwidgets            1.6.2      2023-03-17 [1] CRAN (R 4.2.3)
 httpuv                 1.6.11     2023-05-11 [1] CRAN (R 4.2.3)
 httr                   1.4.6      2023-05-08 [1] CRAN (R 4.2.3)
 ica                    1.0-3      2022-07-08 [1] CRAN (R 4.2.3)
 igraph                 1.5.0.1    2023-07-23 [1] CRAN (R 4.2.3)
 IRanges                2.32.0     2022-11-01 [1] Bioconductor
 irlba                  2.3.5.1    2022-10-03 [1] CRAN (R 4.2.3)
 jsonlite               1.8.7      2023-06-29 [1] CRAN (R 4.2.3)
 KernSmooth             2.23-22    2023-07-10 [1] CRAN (R 4.2.3)
 later                  1.3.1      2023-05-02 [1] CRAN (R 4.2.3)
 lattice                0.21-8     2023-04-05 [1] CRAN (R 4.2.3)
 lazyeval               0.2.2      2019-03-15 [1] CRAN (R 4.2.3)
 leiden                 0.4.3      2022-09-10 [1] CRAN (R 4.2.3)
 lifecycle              1.0.3      2022-10-07 [1] CRAN (R 4.2.3)
 listenv                0.9.0      2022-12-16 [1] CRAN (R 4.2.3)
 lmtest                 0.9-40     2022-03-21 [1] CRAN (R 4.2.3)
 magrittr               2.0.3      2022-03-30 [1] CRAN (R 4.2.3)
 MASS                   7.3-60     2023-05-04 [1] CRAN (R 4.2.3)
 Matrix                 1.6-0      2023-07-08 [1] CRAN (R 4.2.3)
 MatrixGenerics         1.10.0     2022-11-01 [1] Bioconductor
 matrixStats            1.0.0      2023-06-02 [1] CRAN (R 4.2.3)
 mime                   0.12       2021-09-28 [1] CRAN (R 4.2.3)
 miniUI                 0.1.1.1    2018-05-18 [1] CRAN (R 4.2.3)
 MuData                 1.2.0      2022-11-01 [1] Bioconductor
 MuDataSeurat           0.0.0.9000 2023-11-15 [1] Github (PMBio/MuDataSeurat@e34e908)
 MultiAssayExperiment   1.24.0     2022-11-01 [1] Bioconductor
 munsell                0.5.0      2018-06-12 [1] CRAN (R 4.2.3)
 nlme                   3.1-162    2023-01-31 [1] CRAN (R 4.2.3)
 parallelly             1.36.0     2023-05-26 [1] CRAN (R 4.2.3)
 patchwork              1.1.2      2022-08-19 [1] CRAN (R 4.2.3)
 pbapply                1.7-2      2023-06-27 [1] CRAN (R 4.2.3)
 pillar                 1.9.0      2023-03-22 [1] CRAN (R 4.2.3)
 pkgconfig              2.0.3      2019-09-22 [1] CRAN (R 4.2.3)
 plotly                 4.10.2     2023-06-03 [1] CRAN (R 4.2.3)
 plyr                   1.8.8      2022-11-11 [1] CRAN (R 4.2.3)
 png                    0.1-8      2022-11-29 [1] CRAN (R 4.2.3)
 polyclip               1.10-4     2022-10-20 [1] CRAN (R 4.2.3)
 progressr              0.13.0     2023-01-10 [1] CRAN (R 4.2.3)
 promises               1.2.0.1    2021-02-11 [1] CRAN (R 4.2.3)
 purrr                  1.0.1      2023-01-10 [1] CRAN (R 4.2.3)
 R6                     2.5.1      2021-08-19 [1] CRAN (R 4.2.3)
 RANN                   2.6.1      2019-01-08 [1] CRAN (R 4.2.3)
 RColorBrewer           1.1-3      2022-04-03 [1] CRAN (R 4.2.3)
 Rcpp                   1.0.11     2023-07-06 [1] CRAN (R 4.2.3)
 RcppAnnoy              0.0.21     2023-07-02 [1] CRAN (R 4.2.3)
 RCurl                  1.98-1.12  2023-03-27 [1] CRAN (R 4.2.3)
 reshape2               1.4.4      2020-04-09 [1] CRAN (R 4.2.3)
 reticulate             1.30       2023-06-09 [1] CRAN (R 4.2.3)
 rhdf5                  2.42.0     2022-11-01 [1] Bioconductor
 rhdf5filters           1.10.0     2022-11-01 [1] Bioconductor
 Rhdf5lib               1.20.0     2022-11-01 [1] Bioconductor
 rlang                  1.1.1      2023-04-28 [1] CRAN (R 4.2.3)
 ROCR                   1.0-11     2020-05-02 [1] CRAN (R 4.2.3)
 Rtsne                  0.16       2022-04-17 [1] CRAN (R 4.2.3)
 S4Vectors              0.36.0     2022-11-01 [1] Bioconductor
 scales                 1.2.1      2022-08-20 [1] CRAN (R 4.2.3)
 scattermore            1.2        2023-06-12 [1] CRAN (R 4.2.3)
 sctransform            0.3.5      2022-09-21 [1] CRAN (R 4.2.3)
 sessioninfo            1.2.2      2021-12-06 [1] CRAN (R 4.2.3)
 Seurat                 4.3.0.1    2023-06-22 [1] CRAN (R 4.2.3)
 SeuratObject           4.1.3      2022-11-07 [1] CRAN (R 4.2.3)
 shiny                  1.7.4.1    2023-07-06 [1] CRAN (R 4.2.3)
 SingleCellExperiment   1.20.0     2022-11-01 [1] Bioconductor
 sp                     2.0-0      2023-06-22 [1] CRAN (R 4.2.3)
 spatstat.data          3.0-1      2023-03-12 [1] CRAN (R 4.2.3)
 spatstat.explore       3.2-1      2023-05-13 [1] CRAN (R 4.2.3)
 spatstat.geom          3.2-4      2023-07-20 [1] CRAN (R 4.2.3)
 spatstat.random        3.1-5      2023-05-11 [1] CRAN (R 4.2.3)
 spatstat.sparse        3.0-2      2023-06-25 [1] CRAN (R 4.2.3)
 spatstat.utils         3.0-3      2023-05-09 [1] CRAN (R 4.2.3)
 stringi                1.7.12     2023-01-11 [1] CRAN (R 4.2.3)
 stringr                1.5.0      2022-12-02 [1] CRAN (R 4.2.3)
 SummarizedExperiment   1.28.0     2022-11-01 [1] Bioconductor
 survival               3.5-5      2023-03-12 [1] CRAN (R 4.2.3)
 tensor                 1.5        2012-05-05 [1] CRAN (R 4.2.3)
 tibble                 3.2.1      2023-03-20 [1] CRAN (R 4.2.3)
 tidyr                  1.3.0      2023-01-24 [1] CRAN (R 4.2.3)
 tidyselect             1.2.0      2022-10-10 [1] CRAN (R 4.2.3)
 utf8                   1.2.3      2023-01-31 [1] CRAN (R 4.2.3)
 uwot                   0.1.16     2023-06-29 [1] CRAN (R 4.2.3)
 vctrs                  0.6.3      2023-06-14 [1] CRAN (R 4.2.3)
 viridisLite            0.4.2      2023-05-02 [1] CRAN (R 4.2.3)
 xtable                 1.8-4      2019-04-21 [1] CRAN (R 4.2.3)
 XVector                0.38.0     2022-11-01 [1] Bioconductor
 zlibbioc               1.44.0     2022-11-01 [1] Bioconductor
 zoo                    1.8-12     2023-04-13 [1] CRAN (R 4.2.3)

 [1] /Users/luke.zappia/miniconda3/envs/interoperability2/lib/R/library

──────────────────────────────────────────────────────────────────────────────

5.10. References#

5.11. Contributors#

We gratefully acknowledge the contributions of:

5.11.1. Authors#

Luke Zappia

5.11.2. Reviewers#

Lukas Heumos
Isaac Virshup
Anastasia Litinetskaya
Ludwig Geistlinger
Peter Hickey