C++ interface to fast hierarchical clustering algorithms
========================================================
This is a simplified C++ interface to fast implementations of hierarchical
clustering by Daniel Müllner. The original library with interfaces to R
and Python is described in:
Daniel Müllner: "fastcluster: Fast Hierarchical, Agglomerative Clustering
Routines for R and Python." Journal of Statistical Software 53 (2013),
no. 9, pp. 1–18, http://www.jstatsoft.org/v53/i09/
Usage of the library
--------------------
To use the library, the following source files are needed:
fastcluster_dm.cpp, fastcluster_R_dm.cpp
   original code by Daniel Müllner
   these are included by fastcluster.cpp via #include, and therefore
   need not be compiled to object code
fastcluster.[h|cpp]
   simplified C++ interface
   fastcluster.cpp is the only file that must be compiled
The library provides the clustering function *hclust_fast* for
creating the dendrogram information in an encoding as used by the
R function *hclust*. For a description of the parameters, see fastcluster.h.
Its parameter *method* can be one of
HCLUST_METHOD_SINGLE
   single link with the minimum spanning tree algorithm (Rohlf, 1973)
HCLUST_METHOD_COMPLETE
   complete link with the nearest-neighbor-chain algorithm (Murtagh, 1984)
HCLUST_METHOD_AVERAGE
   average link with the nearest-neighbor-chain algorithm (Murtagh, 1984)
HCLUST_METHOD_MEDIAN
   median link with the generic algorithm (Müllner, 2011)
For splitting the dendrogram into clusters, the two functions *cutree_k*
and *cutree_cdist* are provided.
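To illustrate what cutting a dendrogram in the R *hclust* encoding involves,
here is a minimal, self-contained sketch. It is not the library's *cutree_k*.
The helper name cut_into_k and the pair-based storage of the merge steps are
assumptions made for this example only. In the R convention, a negative entry
-j in a merge step denotes observation j (counted from 1), and a positive
entry s denotes the cluster formed in the earlier step s. Replaying only the
first n-k merge steps leaves exactly k clusters.

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Find the representative of x's set, with path halving.
static int find_root(std::vector<int>& parent, int x) {
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

// Hypothetical helper, NOT the library's cutree_k: cut an R-style dendrogram
// into k clusters by replaying only the first n-k merge steps.
// merge[i] holds the two entries of merge step i+1: a negative entry -j is
// observation j (1-based), a positive entry s is the cluster formed in step s.
std::vector<int> cut_into_k(int n, const std::vector<std::pair<int, int>>& merge, int k) {
    assert(k >= 1 && k <= n && static_cast<int>(merge.size()) == n - 1);

    std::vector<int> parent(n);
    std::iota(parent.begin(), parent.end(), 0);
    // One member observation (0-based) of the cluster formed at each step.
    std::vector<int> member_of_step(merge.size(), 0);

    auto observation = [&](int entry) {
        return entry < 0 ? -entry - 1 : member_of_step[entry - 1];
    };

    for (int step = 0; step < n - k; ++step) {
        int a = find_root(parent, observation(merge[step].first));
        int b = find_root(parent, observation(merge[step].second));
        parent[b] = a;
        member_of_step[step] = a;
    }

    // Number the clusters 0, 1, ... in order of first appearance.
    std::vector<int> labels(n, -1), label_of_root(n, -1);
    int next_label = 0;
    for (int i = 0; i < n; ++i) {
        int r = find_root(parent, i);
        if (label_of_root[r] < 0) label_of_root[r] = next_label++;
        labels[i] = label_of_root[r];
    }
    return labels;
}
```

For four observations with the merge steps (-1,-2), (-3,-4), (1,2), cutting
into two clusters puts observations 1 and 2 in one cluster and observations
3 and 4 in the other. The label numbering used by the library's own functions
may differ from this sketch.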
Note that output parameters must be allocated beforehand, e.g.
   int* merge = new int[2*(npoints-1)];
For a complete usage example, see lines 135-142 of demo.cpp.
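The following sketch shows one way to prepare the input and output buffers
with std::vector instead of new[]. Only the size of *merge* is taken from
this README. The condensed (upper triangle, row by row) layout of the distance
matrix, the sizes of the height and label buffers, and the argument order in
the commented call are assumptions that should be checked against
fastcluster.h. The names Point, condensed_distances, HclustBuffers and
make_buffers are hypothetical helpers for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point { double x, y; };

// Hypothetical helper: pairwise Euclidean distances in condensed form, i.e.
// the entries above the diagonal, row by row: d(0,1), d(0,2), ..., d(n-2,n-1).
// ASSUMPTION: this is the layout hclust_fast expects; verify in fastcluster.h.
std::vector<double> condensed_distances(const std::vector<Point>& pts) {
    const std::size_t n = pts.size();
    std::vector<double> dist;
    dist.reserve(n < 2 ? 0 : n * (n - 1) / 2);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = i + 1; j < n; ++j)
            dist.push_back(std::hypot(pts[i].x - pts[j].x, pts[i].y - pts[j].y));
    return dist;
}

// Output buffers for n observations.
struct HclustBuffers {
    std::vector<int> merge;      // 2*(n-1) entries, as stated in the README
    std::vector<double> height;  // ASSUMPTION: one height per merge step (n-1)
    std::vector<int> labels;     // ASSUMPTION: one cluster label per observation
};

HclustBuffers make_buffers(int n) {
    assert(n >= 1);
    HclustBuffers buf;
    buf.merge.resize(2 * (n - 1));
    buf.height.resize(n - 1);
    buf.labels.resize(n);
    return buf;
}

// With the library compiled in, the calls would then look roughly like this
// (ASSUMED argument order; fastcluster.h is authoritative):
//   hclust_fast(n, dist.data(), HCLUST_METHOD_COMPLETE,
//               buf.merge.data(), buf.height.data());
//   cutree_k(n, buf.merge.data(), 2, buf.labels.data());
```

Using std::vector avoids the matching delete[] calls that raw new[]
allocations require, while .data() still yields the plain pointers a C-style
interface takes.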
Demonstration program
---------------------
A simple demo is implemented in demo.cpp, which can be compiled and run with
   make
   ./hclust-demo -m complete lines.csv
It creates two clusters of line segments such that the segment angle between
line segments of different clusters has a maximum (cosine) dissimilarity.
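As an illustration of what a cosine dissimilarity between line segments can
look like, the sketch below uses 1 - |cos(theta)|, where theta is the angle
between the two segment directions. This is one plausible definition chosen
for the example and is not taken from demo.cpp, which may define the distance
differently. The names Segment and cosine_dissimilarity are hypothetical.

```cpp
#include <algorithm>
#include <cmath>

struct Segment { double x1, y1, x2, y2; };

// ASSUMED definition, not taken from demo.cpp: 1 - |cos(theta)| for the angle
// theta between the two segment directions. Parallel segments give 0,
// perpendicular segments give 1. The absolute value makes the result
// independent of which endpoint of a segment is listed first.
double cosine_dissimilarity(const Segment& a, const Segment& b) {
    const double ax = a.x2 - a.x1, ay = a.y2 - a.y1;
    const double bx = b.x2 - b.x1, by = b.y2 - b.y1;
    const double norm = std::hypot(ax, ay) * std::hypot(bx, by);
    if (norm == 0.0) return 0.0;  // zero-length segment: direction undefined
    // Clamp to guard against a tiny negative value caused by rounding.
    return std::max(0.0, 1.0 - std::fabs(ax * bx + ay * by) / norm);
}
```

A dissimilarity of this kind depends only on segment orientation, not on
position or length, which fits a demo that separates segments by their angle.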
For visualizing the result, plotresult.r can be used as follows
(requires R <https://r-project.org> to be installed):
   ./hclust-demo -m complete lines.csv | Rscript plotresult.r
Authors & Copyright
-------------------
Daniel Müllner, 2011, <http://danifold.net>
Christoph Dalitz, 2018, <http://www.hsnr.de/ipattern/>
License
-------
This code is provided under a BSD-style license.
See the file LICENSE for details.