An objective function for order preserving hierarchical clustering

Bakkelund, Daniel

arXiv:2109.04266 (cs)

[Submitted on 9 Sep 2021 (v1), last revised 1 May 2022 (this version, v3)]

Abstract: We present an objective function for similarity-based hierarchical clustering of partially ordered data that preserves the partial order. That is, if $x \le y$, and if $[x]$ and $[y]$ are the respective clusters of $x$ and $y$, then there is an order relation $\le'$ on the clusters for which $[x] \le' [y]$. The theory distinguishes itself from existing theories for clustering of ordered data in that the order relation and the similarity are combined into a bi-objective optimisation problem to obtain a hierarchical clustering seeking to satisfy both. In particular, the order relation is weighted in the range $[0,1]$, and if the similarity and the order relation are not aligned, then order preservation may have to yield in favor of clustering. Finding an optimal solution is NP-hard, so we provide a polynomial-time approximation algorithm, with a relative performance guarantee of $O\!\left(\log^{3/2} n\right)$, based on successive applications of directed sparsest cut. We provide a demonstration on a benchmark dataset, showing that our method outperforms existing methods for order preserving hierarchical clustering by a significant margin. The theory is an extension of the Dasgupta cost function for divisive hierarchical clustering.
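
For orientation, the plain Dasgupta cost that the abstract names as its starting point can be sketched as follows. This is only the standard similarity-based cost (each pair's similarity multiplied by the number of leaves under the pair's lowest common ancestor), not the order preserving objective of the paper, whose exact form is not given in the abstract. The nested-tuple tree encoding and the names `leaves`, `dasgupta_cost`, and `sim` are illustrative assumptions.

```python
from itertools import combinations


def leaves(tree):
    """Return the leaves of a nested-tuple tree as a flat list."""
    if not isinstance(tree, tuple):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]


def dasgupta_cost(tree, sim):
    """Plain Dasgupta cost of a hierarchy (assumed encoding: nested tuples).

    Each pair (i, j) contributes sim[i][j] times the number of leaves
    under the node where i and j are first separated. Lower is better.
    """
    if not isinstance(tree, tuple):
        return 0.0  # a single leaf separates no pairs
    child_leaves = [leaves(child) for child in tree]
    n_leaves = sum(len(ls) for ls in child_leaves)
    # Total similarity of the pairs that are split at this node.
    cut = 0.0
    for a, b in combinations(child_leaves, 2):
        cut += sum(sim[i][j] for i in a for j in b)
    return n_leaves * cut + sum(dasgupta_cost(child, sim) for child in tree)


# Hypothetical toy data: elements 0 and 1 are similar, 2 is dissimilar to both.
sim = [
    [0.0, 1.0, 0.1],
    [1.0, 0.0, 0.1],
    [0.1, 0.1, 0.0],
]
good = dasgupta_cost(((0, 1), 2), sim)  # splits off 2 first
bad = dasgupta_cost(((0, 2), 1), sim)   # separates 0 and 1 at the root
```

On this toy input the tree that keeps the similar pair together longest receives the lower cost, which is the behaviour the Dasgupta cost is designed to reward.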

Comments:	39 pages
Subjects:	Machine Learning (cs.LG); Combinatorics (math.CO)
MSC classes:	62H30, 06A06
ACM classes:	G.1.2; G.1.6; G.2.2; I.2.6; I.5.3
Cite as:	arXiv:2109.04266 [cs.LG]
	(or arXiv:2109.04266v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2109.04266

Submission history

From: Daniel Bakkelund
[v1] Thu, 9 Sep 2021 13:35:01 UTC (75 KB)
[v2] Fri, 31 Dec 2021 13:48:11 UTC (84 KB)
[v3] Sun, 1 May 2022 07:32:39 UTC (85 KB)
