Clustering
There are two functions to cluster cells.
>>>label=loaded_data.clustering(
... n_clusters=4, clustering_method='kmeans', similarity_method='innerproduct',
... aggregation='median', n_strata=None
... )
clustering
function returns a numpy array of cell labels clustered.
-
n_clusters (int): Number of clusters.
-
clustering_method (str): Clustering method in ‘kmeans’, ‘spectral_clustering’ or ‘HAC’(hierarchical agglomerative clustering).
-
similarity_method (str): Reproducibility measure. Value in ‘InnerProduct’, ‘HiCRep’ or ‘Selfish’.
-
aggregation (str): Method to aggregate different chromosomes. Value is either ‘mean’ or ‘median’. Default: ‘median’.
-
n_strata (int or None): Only consider contacts within this genomic distance. If it is None, it will use the all strata kept (the argument keep_n_strata) from previous loading process. Default: None.
-
print_time (bool): Whether to print the processing time. Default: False.
>>>hicluster=loaded_data.scHiCluster(dim=2,cutoff=0.8,n_PCs=10,k=4)
scHiCluster
function returns two componments.
First componment is a numpy array of embedding of cells using HiCluster.
Second componment is a numpy of cell labels clustered by HiCluster.
-
dim (int): Number of dimension of embedding. Default: 2.
-
cutoff (float): The cutoff proportion to convert the real contact matrix into binary matrix. Default: 0.8.
-
n_PCs (int): Number of principal components. Default: 10.
-
k (int): Number of clusters. Default: 4.