22 Jun 2004    The MCL family 1.004, 04-174


NAME

mclfamily - a description of the mcl family of cluster applications.

mcl is the Amsterdam implementation of the Markov Cluster Algorithm. It is described in the mcl manual. Several other utilities are part of the MCL distribution. This manual page gives an overview.

mcl  
the cluster algorithm
mclfaq  
MCL Frequently Asked Questions
mcxio  
the graph/matrix input/output format
mcxassemble  
create matrices from raw data
mcxarray  
transform array data to MCL matrices
   
mcx  
general matrix operations
mcxsubs  
extracting submatrices in various ways
mcxmap  
relabel indices in a graph/matrix
   
clmformat  
display clusters as html or txt files
clmdist  
compute split/join distance between clusterings
clmmate  
find best matching clusters between clusterings
clminfo  
compute performance measure for clusterings
clmmeet  
compute intersection of clusterings
clmimac  
interpret MCL iterand/matrix as clustering
clmresidue  
extend subgraph clustering
   
mclpipeline  
parsing/assembly/clustering/display
mclblastline *  
BLAST pipeline
mcxdeblast *  
parse BLAST files

Entries marked * are not available if only a default install is done.

DESCRIPTION

mclfaq - Frequently Asked Questions.

mcxio - a description of the mcl matrix format.

mcxassemble - assemble a matrix/graph from partial edge weight scores. Its input is a useful intermediate format when transforming application-specific data into an mcl input matrix.

mcxarray - transform array data to MCL matrices. The data may be of rectangular M x N type. Either an M x M or an N x N dimensional matrix can be made, by computing correlation scores between the vectors in one of the two domains. The Pearson correlation coefficient and the cosine are supported, and further tearing and pruning options can be applied.
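
The two scores named above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the mcxarray implementation: the function names and the cutoff parameter (a stand-in for the pruning options) are hypothetical.

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    # A constant vector has no defined correlation; report 0.0 here.
    return cov / (su * sv) if su and sv else 0.0

def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def similarity_matrix(rows, score=pearson, cutoff=None):
    """Square matrix of scores between all pairs of row vectors.

    cutoff is a hypothetical pruning knob: scores below it become 0.0.
    """
    m = len(rows)
    out = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            s = score(rows[i], rows[j])
            out[i][j] = s if cutoff is None or s >= cutoff else 0.0
    return out

def transpose(rows):
    """Swap the two domains, so columns can be scored instead of rows."""
    return [list(col) for col in zip(*rows)]
```

For M x N data, similarity_matrix(data) yields the M x M matrix and similarity_matrix(transpose(data)) the N x N one.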

mcx - an interpreter for a stack language that enables interaction with the mcl matrix libraries. It can be used both from the command line and interactively, and supports a rich set of operations such as transposition, scaling, column scaling, multiplication, Hadamard powers and products, et cetera. The general aim is to provide handles for simple number and matrix arithmetic, and for graph, set, and clustering operations. The following is a very simple example of implementing and using mcl in this language.

 2.0 .i def                    # define inflation value.
 /small lm                     # load matrix in file 'small'.
 dim id add                    # add identity matrix.
 st .x def                     # make stochastic, bind to x.
 { xpn .i infl vm } .mcl def   # define one mcl iteration.
 20 .x .mcl repeat             # iterate 20 times.
 imac                          # interpret matrix as clustering.
 vm                            # view matrix (clustering).

One of the more interesting possibilities is to run mcl with more complicated inflation profiles than the two-constant approach used in mcl itself.
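
For readers unfamiliar with the stack language, the same loop can be sketched in plain Python. The sketch assumes that xpn denotes expansion (squaring the matrix) and infl denotes inflation (an entrywise power followed by column rescaling). All function names are hypothetical, and interpret is a much simplified stand-in for imac that ignores overlap. The inflation value is passed as a list, one entry per iteration, so that arbitrary inflation profiles can be tried.

```python
def make_stochastic(m):
    """Rescale every column so that it sums to 1."""
    n = len(m)
    out = [row[:] for row in m]
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        if total:
            for i in range(n):
                out[i][j] = m[i][j] / total
    return out

def expand(m):
    """Expansion: multiply the matrix by itself."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inflate(m, r):
    """Inflation: raise each entry to the power r, then rescale columns."""
    return make_stochastic([[x ** r for x in row] for row in m])

def mcl(adjacency, inflation_profile):
    """Run one expansion/inflation step per value in inflation_profile."""
    n = len(adjacency)
    # Add the identity matrix (self loops), then make column stochastic.
    m = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = make_stochastic(m)
    for r in inflation_profile:
        m = inflate(expand(m), r)
    return m

def interpret(m, eps=1e-6):
    """Simplified reading of a (near) limit matrix as a clustering.

    Each row with entries above eps is treated as an attractor, and the
    columns where it is nonzero form its cluster. Duplicates are merged.
    """
    clusters = set()
    for row in m:
        members = frozenset(j for j, x in enumerate(row) if x > eps)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)
```

Calling mcl(graph, [2.0] * 20) mirrors the mcx example above, while a list such as [1.4] * 5 + [2.0] * 15, or a gradual ramp, gives an inflation profile that varies over the iterations.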

mcxsubs - compute a submatrix of a given matrix, where row and column index sets can be specified as lists of indices combined with lists of clusters in a given clustering. Useful for inspecting local cluster structure.

mcxmap - relabel indices in a graph.

clmformat - display clusters suitable for scrutinizing.

clmdist - compute the split/join distance between two partitions. The split/join distance is better suited for measuring partition similarity than the long-known equivalence mismatch coefficient. The former measures the number of node moves required to transform one partition into the other; the latter measures differences between volumes of edges of unions of complete graphs associated with partitions.
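
As an illustration, the split/join distance is commonly written as the sum of two projection distances: for each cluster in one partition, take its largest overlap with any cluster in the other, and count the nodes that are left uncovered. The Python sketch below follows that formulation. It assumes both arguments are partitions of the same node set, and its function names are hypothetical. It is not the clmdist implementation.

```python
def projection_distance(p, q):
    """Nodes of p left uncovered when each cluster of p is matched
    with the cluster of q that it overlaps most."""
    n = sum(len(c) for c in p)
    covered = sum(max(len(set(c) & set(d)) for d in q) for c in p)
    return n - covered

def split_join_distance(p, q):
    """Sum of the projection distances in both directions."""
    return projection_distance(p, q) + projection_distance(q, p)

p = [[0, 1, 2], [3, 4, 5]]
q = [[0, 1], [2, 3], [4, 5]]
print(split_join_distance(p, q))  # prints 3
```

The two projection distances are 2 and 1 in this example. Reporting them separately shows whether one partition is close to being a subclustering of the other.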

clmmate - find best matching clusters between two different clusterings.

clminfo - compute a performance measure saying how well a clustering captures the edge weights of the input graph. Useful for comparing different clusterings on the same graph. It is best used in conjunction with clmdist, because comparing clusterings at different levels of granularity should somewhat change the performance interpretation. The latter issue is discussed in the clmdist entry.

clmmeet - compute the intersection of a set of clusterings, i.e. the largest clustering that is a subclustering of all. Useful for measuring the consistency of a set of different clusterings at supposedly different levels of granularity (in conjunction with clmdist).
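
The intersection of clusterings can be sketched as follows: two nodes stay together only if they share a cluster in every input clustering. The Python sketch assumes each clustering is a partition of the same node set. The function name is hypothetical and this is not the clmmeet implementation.

```python
def meet(clusterings):
    """Largest clustering that is a subclustering of all inputs."""
    # For every node, record the index of its cluster in each clustering.
    labels = {}
    for clustering in clusterings:
        for idx, cluster in enumerate(clustering):
            for node in cluster:
                labels.setdefault(node, []).append(idx)
    # Nodes with identical index tuples are together in every clustering.
    groups = {}
    for node, key in labels.items():
        groups.setdefault(tuple(key), []).append(node)
    return sorted(sorted(g) for g in groups.values())

p = [[0, 1, 2], [3, 4, 5]]
q = [[0, 1], [2, 3], [4, 5]]
print(meet([p, q]))  # prints [[0, 1], [2], [3], [4, 5]]
```

If the inputs really are nested levels of granularity, the result equals the finest of them. Extra splits, such as the singletons [2] and [3] here, indicate inconsistency between the clusterings.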

clmimac - interpret MCL iterands as clusterings. The clusterings associated with early iterands may contain overlap, should you be interested therein.

clmresidue - extend a clustering of a subgraph onto a clustering of the larger graph.

mclpipeline - set up a pipeline from the data parsing stage up to the clustering format/display stage.

mcxdeblast - BLAST parser. Can be used in conjunction with mcxassemble (for full control over how MCL input matrices are generated) or through mclblastline, with which it is integrated.

mclblastline - BLAST specific pipeline.