Hierarchical clustering in Python

If, like me, you are looking for a way to produce a heat map in Python similar to the one you get in R, then this post is for you.

My particular goal here was to cluster both samples and genes in a gene expression data table, but the same approach should be useful in plenty of other applications.

It turns out Python offers several ways to approach this problem. I picked out three of them: the quick and easy numpy+scipy combination, the fastcluster module (the name speaks for itself), and BioPython, which provides superior visualization possibilities.
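To give a feel for the numpy+scipy route, here is a minimal sketch of clustering both rows and columns of a table. The random `data` matrix, the Euclidean metric, and average linkage are my assumptions for illustration, not necessarily what the notebook uses.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list
from scipy.spatial.distance import pdist

# Hypothetical expression table: 6 genes (rows) x 4 samples (columns).
rng = np.random.default_rng(0)
data = rng.normal(size=(6, 4))

# Cluster genes on pairwise row distances and samples on pairwise
# column distances (metric and linkage method are assumed choices).
gene_linkage = linkage(pdist(data, metric="euclidean"), method="average")
sample_linkage = linkage(pdist(data.T, metric="euclidean"), method="average")

# Leaf order of each tree, i.e. the order the rows/columns take in a heat map.
gene_order = leaves_list(gene_linkage)
sample_order = leaves_list(sample_linkage)

# Reorder the table so that similar genes and similar samples sit together.
clustered = data[np.ix_(gene_order, sample_order)]
```

The reordered `clustered` array is what you would pass to a plotting function. Swapping `method` or `metric` changes the tree, so it is worth trying a few combinations on real data.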

Unfortunately, there does not seem to be an easy way to plot a heat map together with its dendrograms in a single figure, so I opened an issue on the matplotlib GitHub page.
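In the meantime, a manual workaround is to lay out three axes by hand: one for each dendrogram and one for the heat map. The sketch below is only one possible arrangement. The axes rectangles, the random `data` matrix, and the average linkage are assumptions I picked for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to view the figure
import matplotlib.pyplot as plt
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

# Hypothetical table: 8 genes (rows) x 5 samples (columns).
rng = np.random.default_rng(0)
data = rng.normal(size=(8, 5))

# linkage() accepts the observation matrix directly (Euclidean by default).
row_link = linkage(data, method="average")
col_link = linkage(data.T, method="average")

# Hand-placed axes: [left, bottom, width, height] in figure coordinates.
fig = plt.figure(figsize=(6, 6))
ax_rows = fig.add_axes([0.05, 0.10, 0.20, 0.60])  # row dendrogram, left
ax_cols = fig.add_axes([0.30, 0.75, 0.60, 0.20])  # column dendrogram, top
ax_heat = fig.add_axes([0.30, 0.10, 0.60, 0.60])  # heat map

row_dendro = dendrogram(row_link, orientation="left", no_labels=True, ax=ax_rows)
col_dendro = dendrogram(col_link, orientation="top", no_labels=True, ax=ax_cols)
ax_rows.axis("off")
ax_cols.axis("off")

# Reorder the data to match the leaf order of both dendrograms.
ordered = data[np.ix_(row_dendro["leaves"], col_dendro["leaves"])]

# origin="lower" because the left dendrogram places its leaves bottom to top.
ax_heat.imshow(ordered, aspect="auto", origin="lower", cmap="viridis")
```

Call `fig.savefig(...)` or `plt.show()` afterwards to see the result. It works, but the amount of manual layout is exactly why a built-in option would be welcome.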

To follow along, see the instructions I wrote on using these packages for hierarchical clustering in Python, available as an IPython notebook, or just download the source code from the GitHub repo.

