C++ interface to fast hierarchical clustering algorithms
========================================================

This is a simplified C++ interface to fast implementations of hierarchical
clustering by Daniel Müllner. The original library with interfaces to R
and Python is described in:

Daniel Müllner: "fastcluster: Fast Hierarchical, Agglomerative Clustering
Routines for R and Python." Journal of Statistical Software 53 (2013),
no. 9, pp. 1–18, http://www.jstatsoft.org/v53/i09/


Usage of the library
--------------------

To use the library, the following source files are needed:

fastcluster_dm.cpp, fastcluster_R_dm.cpp
   original code by Daniel Müllner
   these are included by fastcluster.cpp via #include, and therefore
   need not be compiled to object code

fastcluster.[h|cpp]
   simplified C++ interface
   fastcluster.cpp is the only file that must be compiled

The library provides the clustering function *hclust_fast* for
creating the dendrogram information in the encoding used by the
R function *hclust*. For a description of the parameters, see fastcluster.h.
Its parameter *method* can be one of

HCLUST_METHOD_SINGLE
  single link with the minimum spanning tree algorithm (Rohlf, 1973)

HCLUST_METHOD_COMPLETE
  complete link with the nearest-neighbor-chain algorithm (Murtagh, 1984)

HCLUST_METHOD_AVERAGE
  average link with the nearest-neighbor-chain algorithm (Murtagh, 1984)

HCLUST_METHOD_MEDIAN
  median link with the generic algorithm (Müllner, 2011)
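
As an illustration of the *hclust* encoding mentioned above, the following
sketch prints the merge steps of a small, hand-written dendrogram. In R's
convention, a negative entry -j refers to the original observation j and a
positive entry k refers to the cluster formed in step k (both counted from
one). The helper names describe_entry and describe_merges are hypothetical
and not part of the library, and the column-wise array layout is an
assumption that should be checked against fastcluster.h.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Describe one side of a merge step in R's hclust convention:
// a negative entry -j is the original observation j (1-based),
// a positive entry k is the cluster created in step k (1-based).
std::string describe_entry(int entry) {
  if (entry < 0) return "point " + std::to_string(-entry);
  return "cluster from step " + std::to_string(entry);
}

// Render all n-1 merge steps as text.
// ASSUMPTION: the array is stored column by column, i.e. step k
// (0-based) joins merge[k] and merge[(n-1)+k].
std::vector<std::string> describe_merges(int n, const std::vector<int>& merge) {
  assert(n >= 1);
  assert(merge.size() == static_cast<std::size_t>(2 * (n - 1)));
  std::vector<std::string> lines;
  for (int k = 0; k < n - 1; k++) {
    lines.push_back("step " + std::to_string(k + 1) + ": " +
                    describe_entry(merge[k]) + " + " +
                    describe_entry(merge[(n - 1) + k]));
  }
  return lines;
}
```

For four points and the array {-1, -3, 1, -2, -4, 2}, the third returned line
reads "step 3: cluster from step 1 + cluster from step 2".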

For splitting the dendrogram into clusters, the two functions *cutree_k*
and *cutree_cdist* are provided.
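
To make the idea of cutting into a fixed number of clusters concrete, here
is a minimal sketch that replays only the first n - nclust merge steps with
a union-find structure. It is an illustrative re-implementation, not the
library's code: the name cut_into_k, the 0-based labels and the column-wise
layout of the merge array are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Find the representative of x's set, with path halving.
static int find_root(std::vector<int>& parent, int x) {
  while (parent[x] != x) {
    parent[x] = parent[parent[x]];
    x = parent[x];
  }
  return x;
}

// Cut a dendrogram into nclust clusters. Returns one label per point,
// numbered 0..nclust-1 in order of first appearance.
// ASSUMPTION: step k (0-based) joins merge[k] and merge[(n-1)+k].
std::vector<int> cut_into_k(int n, const std::vector<int>& merge, int nclust) {
  assert(nclust >= 1 && nclust <= n);
  assert(merge.size() == static_cast<std::size_t>(2 * (n - 1)));
  std::vector<int> parent(n);
  std::iota(parent.begin(), parent.end(), 0);
  std::vector<int> step_member(n - 1, -1);  // one point of each step's cluster

  for (int k = 0; k < n - nclust; k++) {
    const int entries[2] = {merge[k], merge[(n - 1) + k]};
    int members[2];
    for (int s = 0; s < 2; s++) {
      // negative: original point (1-based); positive: earlier step (1-based)
      members[s] = entries[s] < 0 ? -entries[s] - 1
                                  : step_member[entries[s] - 1];
    }
    const int a = find_root(parent, members[0]);
    const int b = find_root(parent, members[1]);
    parent[b] = a;
    step_member[k] = a;
  }

  std::vector<int> root_label(n, -1), labels(n);
  int next = 0;
  for (int i = 0; i < n; i++) {
    const int r = find_root(parent, i);
    if (root_label[r] < 0) root_label[r] = next++;
    labels[i] = root_label[r];
  }
  return labels;
}
```

Cutting at a distance threshold instead of a cluster count would follow the
same pattern, replaying merge steps only while their height stays below the
threshold.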

Note that output parameters must be allocated beforehand, e.g.
  int* merge = new int[2*(npoints-1)];
For a complete usage example, see lines 135-142 of demo.cpp.
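
As an alternative to raw new[], the output arrays can be owned by
std::vector so that they are released automatically. The sketch below is
only a suggestion: ClusterBuffers and allocate_buffers are hypothetical
names, and the sizes of the height and label arrays are assumptions based
on the R *hclust* convention (one height per merge step, one label per
point). Only the size of merge is taken from the note above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type (not part of the library): owns the output
// arrays of a clustering run.
struct ClusterBuffers {
  std::vector<int> merge;      // 2*(npoints-1) entries, as in the note above
  std::vector<double> height;  // ASSUMPTION: one merge distance per step
  std::vector<int> labels;     // ASSUMPTION: one cluster label per point
};

// Allocate zero-initialized output arrays for npoints observations.
ClusterBuffers allocate_buffers(int npoints) {
  assert(npoints >= 2);
  const std::size_t steps = static_cast<std::size_t>(npoints) - 1;
  ClusterBuffers b;
  b.merge.assign(2 * steps, 0);
  b.height.assign(steps, 0.0);
  b.labels.assign(static_cast<std::size_t>(npoints), 0);
  return b;
}
```

A raw pointer for a C-style interface is then available through data(),
for example b.merge.data().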


Demonstration program
---------------------

A simple demo is implemented in demo.cpp, which can be compiled and run with

   make
   ./hclust-demo -m complete lines.csv

It creates two clusters of line segments such that the angle between
line segments of different clusters has a maximum (cosine) dissimilarity.
For visualizing the result, plotresult.r can be used as follows
(requires R <https://r-project.org> to be installed):

  ./hclust-demo -m complete lines.csv | Rscript plotresult.r
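
To give an idea of what a cosine-based dissimilarity between line segments
can look like, here is a minimal sketch. It uses 1 - |cos(angle)|, which is
one common choice and not necessarily the formula used in demo.cpp. The
Segment type and the function name angle_dissimilarity are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical segment type for illustration: two end points in the plane.
struct Segment {
  double x1, y1, x2, y2;
};

// 1 - |cos(angle)| between the directions of two segments:
// 0 for parallel (or anti-parallel) segments, 1 for perpendicular ones.
double angle_dissimilarity(const Segment& a, const Segment& b) {
  const double ax = a.x2 - a.x1, ay = a.y2 - a.y1;
  const double bx = b.x2 - b.x1, by = b.y2 - b.y1;
  const double la = std::hypot(ax, ay);
  const double lb = std::hypot(bx, by);
  assert(la > 0.0 && lb > 0.0);  // degenerate segments have no direction
  const double c = (ax * bx + ay * by) / (la * lb);
  return 1.0 - std::fabs(c);
}
```

Taking the absolute value makes the measure independent of the order of a
segment's end points, so a segment and its reversed copy count as parallel.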


Authors & Copyright
-------------------

Daniel Müllner, 2011, <http://danifold.net>
Christoph Dalitz, 2018, <http://www.hsnr.de/ipattern/>


License
-------

This code is provided under a BSD-style license.
See the file LICENSE for details.