We often use scikit-learn (sklearn) in Python to implement K-Means clustering. Here is a tutorial:
Implement K-Means Clustering Using sklearn.cluster.KMeans in Python
However, sklearn's KMeans cannot run on a GPU. In this article, we will use PyTorch to implement a fast K-Means clustering that can use a GPU to speed up the clustering.
How to implement a fast K-Means clustering in PyTorch?
We can use the fast-pytorch-kmeans package.
First, we should install it.
pip install fast-pytorch-kmeans
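Then, we can implement K-Means as follows. This example creates the data with device='cuda', so it needs a CUDA-capable GPU:
from fast_pytorch_kmeans import KMeans
import torch

# Cluster into 8 groups using Euclidean distance
kmeans = KMeans(n_clusters=8, mode='euclidean', verbose=1)
# 100,000 random 64-dimensional points, created directly on the GPU
x = torch.randn(100000, 64, device='cuda')
# Fit the model and get one cluster label per point
labels = kmeans.fit_predict(x)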
Note:
fast-pytorch-kmeans supports both 'euclidean' and 'cosine' distance through the mode argument, so we can choose the metric that suits our data.
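To see why the choice of distance matters, here is a minimal pure-Python sketch of the assignment step of K-Means under each metric. It is an illustration only, not the package's implementation. The function names (euclidean_distance, cosine_similarity, assign) and the sample values are hypothetical.

```python
import math

def euclidean_distance(a, b):
    # Straight-line distance between two points
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors (ignores their lengths)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def assign(point, centroids, mode="euclidean"):
    # Hypothetical helper: return the index of the centroid the point belongs to
    indices = range(len(centroids))
    if mode == "euclidean":
        # Smallest distance wins
        return min(indices, key=lambda i: euclidean_distance(point, centroids[i]))
    if mode == "cosine":
        # Largest similarity wins
        return max(indices, key=lambda i: cosine_similarity(point, centroids[i]))
    raise ValueError("mode must be 'euclidean' or 'cosine'")

# Illustrative values chosen so that the two metrics disagree
centroids = [[1.0, 0.0], [5.0, 5.0]]
point = [10.0, 1.0]

euclidean_label = assign(point, centroids, mode="euclidean")
cosine_label = assign(point, centroids, mode="cosine")
print(euclidean_label, cosine_label)
```

Here the point is closer to the second centroid in straight-line distance, but its direction is closer to the first centroid, so the two modes give different labels. Cosine distance is usually the better fit when only the direction of a vector matters, for example with normalized embeddings.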