stereoAlign.metrics.silhouette_batch

stereoAlign.metrics.silhouette_batch(adata, batch_key, label_key, embed, metric='euclidean', return_all=False, scale=True, verbose=True)

Batch ASW

Modified average silhouette width (ASW) of batch

This metric measures the silhouette of a given batch. It assumes that a silhouette width close to 0 represents perfect overlap of the batches, thus the absolute value of the silhouette width is used to measure how well batches are mixed. For all cells \(i\) of a cell type \(C_j\), the batch ASW of that cell type is:

\[\begin{split}batch\, ASW_j=\frac{1}{|C_j|}\sum_{i\in C_j}|silhouette(i)|\end{split}\]

The final score is the average of the per-cell-type batch ASW values over the set of cell types \(M\).

\[\begin{split}batch\, ASW =\frac{1}{|M|}\sum_{j\in M} batch\, ASW_j\end{split}\]
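As a small worked example with made-up values (not taken from any dataset): suppose cell type \(C_1\) contains three cells with silhouette widths 0.2, -0.4 and 0.0, and a second cell type \(C_2\) has a batch ASW of 0.4. The unscaled score would then be:

\[\begin{split}batch\, ASW_1 &= \frac{1}{3}\left(|0.2| + |-0.4| + |0.0|\right) = 0.2 \\ batch\, ASW &= \frac{1}{2}\left(0.2 + 0.4\right) = 0.3\end{split}\]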

For a scaled metric (which is the default), the absolute silhouette width is subtracted from 1 before averaging, so that 0 indicates strong batch separation and 1 indicates optimal batch mixing.

\[\begin{split}batch\, ASW_j =\frac{1}{|C_j|}\sum_{i\in C_j} \left(1 - |silhouette(i)|\right)\end{split}\]
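The formulas above can be illustrated with a minimal pure-Python sketch. This is not the stereoAlign implementation: the helper names `silhouette_widths` and `batch_asw` are hypothetical, the distance is fixed to Euclidean, and skipping cell types that contain fewer than two batches is an assumption made here so that the silhouette is defined.

```python
import math
from collections import defaultdict


def silhouette_widths(points, groups):
    """Euclidean silhouette width of each point with respect to `groups`.

    Hypothetical helper; expects at least two distinct groups.
    """
    members = defaultdict(list)
    for idx, group in enumerate(groups):
        members[group].append(idx)

    widths = []
    for i, group in enumerate(groups):
        own = [j for j in members[group] if j != i]
        if not own:
            widths.append(0.0)  # convention: a singleton group gets width 0
            continue
        # a: mean distance to the other points of the same group
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the points of any other group
        b = min(
            sum(math.dist(points[i], points[j]) for j in ids) / len(ids)
            for other, ids in members.items()
            if other != group
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return widths


def batch_asw(points, batches, labels, scale=True):
    """Average (scaled) absolute batch silhouette per cell type, then overall."""
    per_label = {}
    for label in sorted(set(labels)):
        rows = [i for i, l in enumerate(labels) if l == label]
        sub_batches = [batches[i] for i in rows]
        if len(set(sub_batches)) < 2:
            continue  # assumption: skip cell types with a single batch
        widths = silhouette_widths([points[i] for i in rows], sub_batches)
        scores = [1 - abs(w) if scale else abs(w) for w in widths]
        per_label[label] = sum(scores) / len(scores)
    if not per_label:
        raise ValueError("no cell type contains at least two batches")
    return sum(per_label.values()) / len(per_label), per_label


# Toy 2-D embedding: batches interleaved in one cell type, far apart in the other.
points = [(0, 0), (1, 1), (1, 0), (0, 1),      # cell type "mixed"
          (0, 0), (0, 1), (10, 0), (10, 1)]    # cell type "separated"
batches = ["b1", "b1", "b2", "b2"] * 2
labels = ["mixed"] * 4 + ["separated"] * 4

score, per_label = batch_asw(points, batches, labels)
print(per_label, score)
```

In this toy data the cell type with interleaved batches receives a higher scaled score than the one whose batches sit far apart, which matches the interpretation of 1 as well-mixed batches.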

Parameters

batch_key:

batch labels to be compared against

label_key:

group labels to be subset by e.g. cell type

embed:

key of the embedding in adata.obsm

metric:

distance metric; see the metric parameter of sklearn's silhouette_score

scale:

if True, scale between 0 and 1

return_all:

if True, return all silhouette scores and label means. Default False: return only the average silhouette width (ASW)

verbose:

if True, print the silhouette score per group

Returns

Batch ASW (always)

Mean silhouette per group in pd.DataFrame (additionally, if return_all=True)

Absolute silhouette scores per group label (additionally, if return_all=True)

The function requires an embedding to be stored in adata.obsm and can only be applied to feature and embedding integration outputs. Please note that the metric cannot be used to evaluate kNN graph outputs.