Skip to main navigation Skip to search Skip to main content

Contour approximation of data: A duality theory

  • Cem Iyigun
  • , Adi Ben-Israel

Research output: Contribution to journalArticlepeer-review

Abstract

Given a dataset D partitioned in clusters, the joint distance function (JDF) J (x) at any point x is the harmonic mean of the distances between x and the cluster centers. The JDF is a continuous function, capturing the data points in its lower level sets (a property called contour approximation), and is a useful concept in probabilistic clustering and data analysis. In particular, contour approximation allows a compact representation of the data: for a dataset in Rn with N points organized in K clusters, the JDF requires K centers and covariances (if Mahalanobis distances are used), for a total of Kn (n + 3) / 2 parameters, and a considerable reduction of storage if N ≫ K, n. The JDF of the whole dataset, J (D) {colon equals} ∑ {J (x) : x ∈ D}, is a measure of the classifiability of the data, and can be used to determine the "right" number of clusters for D. A duality theory for the JDF J (D) is given, in analogy with Kuhn's geometric duality theory for the Fermat-Weber location problem. The JDF J (D) is the optimal value of a primal problem (P), for which a dual problem (D) is given, with a sharp lower bound on J (D).

Original languageEnglish (US)
Pages (from-to)2771-2780
Number of pages10
JournalLinear Algebra and Its Applications
Volume430
Issue number10
DOIs
StatePublished - May 1 2009

All Science Journal Classification (ASJC) codes

  • Algebra and Number Theory
  • Numerical Analysis
  • Geometry and Topology
  • Discrete Mathematics and Combinatorics

Keywords

  • Clustering
  • Contour approximation of data
  • Duality
  • Harmonic mean
  • Joint distance function
  • Mahalanobis distance
  • Weiszfeld method

Fingerprint

Dive into the research topics of 'Contour approximation of data: A duality theory'. Together they form a unique fingerprint.

Cite this